Merge Time Series#
The merge_time_series
function is designed to merge two pandas DataFrames with datetime index on a specified column or index. It’s useful for financial analysts and investors who need to combine different time series data sources, such as stock prices, economic indicators, or other financial instruments.
To use the merge_time_series
function, you need to provide two pandas DataFrames with datetime index. The function works with time series data.
The function takes two input DataFrames, an optional on parameter, and a how parameter. If you provide a column name for the on parameter, the function will merge the DataFrames based on the specified column. If you don’t provide the on parameter or pass an empty string, the function will merge on the index. The how parameter determines the type of merge to be performed and has a default value of ‘outer’.
After passing the DataFrames to the merge_time_series
function, it will return a merged DataFrame containing the data from both input DataFrames, merged based on the specified column or index.
merge_time_series() function#
Function#
def merge_time_series(df_1, df_2, on='', how='outer'):
"""
Merge two pandas DataFrames with datetime index on a specified column or index.
Parameters
----------
df_1 : pandas.DataFrame
The first DataFrame to be merged. The index should be of datetime type.
df_2 : pandas.DataFrame
The second DataFrame to be merged. The index should be of datetime type.
on : str, optional
The column to merge the DataFrames on. If not provided or an empty string,
the function will merge on the index. Default is an empty string.
how : {'left', 'right', 'outer', 'inner'}, default 'outer'
The type of merge to be performed:
- 'left': use only keys from left frame, similar to a SQL left outer
join; preserve key order.
- 'right': use only keys from right frame, similar to a SQL right
outer join; preserve key order.
- 'outer': use union of keys from both frames, similar to a SQL full
outer join; sort keys lexicographically.
- 'inner': use intersection of keys from both frames, similar to a SQL
inner join; preserve the order of the left keys.
Returns
-------
pandas.DataFrame
A merged DataFrame containing the data from both input DataFrames, merged
based on the specified column or index.
"""
df = df_1.merge(df_2, how=how, left_index=True, right_index=True)
return df
Description#
The merge_time_series function merges two pandas DataFrames with datetime index on a specified column or index.
Parameters#
df_1 (pandas.DataFrame): The first DataFrame to be merged. The index should be of datetime type.
df_2 (pandas.DataFrame): The second DataFrame to be merged. The index should be of datetime type.
on (str, optional): The column to merge the DataFrames on. If not provided or an empty string, the function will merge on the index. Default is an empty string.
how (str, optional): The type of merge to be performed. Default is ‘outer’.
Returns#
pandas.DataFrame: A merged DataFrame containing the data from both input DataFrames, merged based on the specified column or index. Example
import ds4finance as dsf
import yfinance as yf
import pandas as pd
def download_data(ticker, start, end):
data = yf.download(ticker, start=start, end=end, progress=False)
adj_close = data['Adj Close']
return adj_close
ticker = 'SPY'
start_date = '2015-01-01'
end_date = '2020-12-31'
# Load your price data into a pandas DataFrame
spy = pd.DataFrame(download_data(ticker, start_date, end_date))
spy.columns = ['SPY']
qqq = pd.DataFrame(download_data(ticker, start_date, end_date))
qqq.columns = ['QQQ']
dsf.merge_time_series(spy, qqq, how='inner')
SPY | QQQ | |
---|---|---|
Date | ||
2015-01-02 | 176.788849 | 176.788849 |
2015-01-05 | 173.596146 | 173.596146 |
2015-01-06 | 171.961014 | 171.961014 |
2015-01-07 | 174.103851 | 174.103851 |
2015-01-08 | 177.193375 | 177.193375 |
... | ... | ... |
2020-12-23 | 355.586884 | 355.586884 |
2020-12-24 | 356.970276 | 356.970276 |
2020-12-28 | 360.036926 | 360.036926 |
2020-12-29 | 359.350067 | 359.350067 |
2020-12-30 | 359.862854 | 359.862854 |
1510 rows × 2 columns