Merge Time Series#

The merge_time_series function is designed to merge two pandas DataFrames with datetime index on a specified column or index. It’s useful for financial analysts and investors who need to combine different time series data sources, such as stock prices, economic indicators, or other financial instruments.

To use the merge_time_series function, you need to provide two pandas DataFrames with datetime index. The function works with time series data.

The function takes two input DataFrames, an optional on parameter, and a how parameter. If you provide a column name for the on parameter, the function will merge the DataFrames based on the specified column. If you don’t provide the on parameter or pass an empty string, the function will merge on the index. The how parameter determines the type of merge to be performed and has a default value of ‘outer’.

After passing the DataFrames to the merge_time_series function, it will return a merged DataFrame containing the data from both input DataFrames, merged based on the specified column or index.

merge_time_series() function#

Function#

def merge_time_series(df_1, df_2, on='', how='outer'):
    """
    Merge two pandas DataFrames with datetime index on a specified column or index.

    Parameters
    ----------
    df_1 : pandas.DataFrame
        The first DataFrame to be merged. The index should be of datetime type.
    df_2 : pandas.DataFrame
        The second DataFrame to be merged. The index should be of datetime type.
    on : str, optional
        The column to merge the DataFrames on. If not provided or an empty string,
        the function will merge on the index. Default is an empty string.
    how : {'left', 'right', 'outer', 'inner'}, default 'outer'
        The type of merge to be performed:
            - 'left': use only keys from left frame, similar to a SQL left outer
                      join; preserve key order.
            - 'right': use only keys from right frame, similar to a SQL right
                       outer join; preserve key order.
            - 'outer': use union of keys from both frames, similar to a SQL full
                       outer join; sort keys lexicographically.
            - 'inner': use intersection of keys from both frames, similar to a SQL
                       inner join; preserve the order of the left keys.

    Returns
    -------
    pandas.DataFrame
        A merged DataFrame containing the data from both input DataFrames, merged
        based on the specified column or index.
    """
    df = df_1.merge(df_2, how=how, left_index=True, right_index=True)
    return df

Description#

The merge_time_series function merges two pandas DataFrames with datetime index on a specified column or index.

Parameters#

  • df_1 (pandas.DataFrame): The first DataFrame to be merged. The index should be of datetime type.

  • df_2 (pandas.DataFrame): The second DataFrame to be merged. The index should be of datetime type.

  • on (str, optional): The column to merge the DataFrames on. If not provided or an empty string, the function will merge on the index. Default is an empty string.

  • how (str, optional): The type of merge to be performed. Default is ‘outer’.

Returns#

pandas.DataFrame: A merged DataFrame containing the data from both input DataFrames, merged based on the specified column or index. Example

import ds4finance as dsf
import yfinance as yf
import pandas as pd

def download_data(ticker, start, end):
    data = yf.download(ticker, start=start, end=end, progress=False)
    adj_close = data['Adj Close']
    return adj_close

ticker = 'SPY'
start_date = '2015-01-01'
end_date = '2020-12-31'

# Load your price data into a pandas DataFrame
spy = pd.DataFrame(download_data(ticker, start_date, end_date))
spy.columns = ['SPY']

qqq = pd.DataFrame(download_data(ticker, start_date, end_date))
qqq.columns = ['QQQ']

dsf.merge_time_series(spy, qqq, how='inner')
SPY QQQ
Date
2015-01-02 176.788849 176.788849
2015-01-05 173.596146 173.596146
2015-01-06 171.961014 171.961014
2015-01-07 174.103851 174.103851
2015-01-08 177.193375 177.193375
... ... ...
2020-12-23 355.586884 355.586884
2020-12-24 356.970276 356.970276
2020-12-28 360.036926 360.036926
2020-12-29 359.350067 359.350067
2020-12-30 359.862854 359.862854

1510 rows × 2 columns