Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Get the value in a dataframe based on a value and a date in another dataframe

I tried countless answers to similar problems here on SO but couldn’t find anything that works for this scenario. It’s driving me nuts.

I have these two Dataframes:

df_op:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

index Date Close Name LogRet
0 2022-11-29 00:00:00 240.33 MSFT -0.0059
1 2022-11-29 00:00:00 280.57 QQQ -0.0076
2 2022-12-13 00:00:00 342.46 ADBE 0.0126
3 2022-12-13 00:00:00 256.92 MSFT 0.0173

df_quotes:

index Date Close Name
72 2022-11-29 00:00:00 141.17 AAPL
196 2022-11-29 00:00:00 240.33 MSFT
73 2022-11-30 00:00:00 148.03 AAPL
197 2022-11-30 00:00:00 255.14 MSFT
11 2022-11-30 00:00:00 293.36 QQQ
136 2022-12-01 00:00:00 344.11 ADBE
198 2022-12-01 00:00:00 254.69 MSFT
12 2022-12-02 00:00:00 293.72 QQQ

I would like to add a column to df_op that indicates the close of the stock in df_quotes 2 days later. For example, the first row of df_op should become:

index Date Close Name LogRet Next
0 2022-11-29 00:00:00 240.33 MSFT -0.0059 254.69

In other words:
for each row in df_op, find the corresponding Name in df_quotes with Date of 2 days later and copy its Close to df_op in column ‘Next’.

I tried tens of combinations like this without success:

df_quotes[df_quotes['Date'].isin(df_op['Date'] + pd.DateOffset(days=2)) & df_quotes['Name'].isin(df_op['Name'])]

How can I do this without recurring to loops?

>Solution :

Try this:

#first convert to datetime
df_op['Date'] = pd.to_datetime(df_op['Date'])
df_quotes['Date'] = pd.to_datetime(df_quotes['Date'])


#merge on Date and Name, but the date is offset 2 business days
(pd.merge(df_op,
df_quotes[['Date','Close','Name']].rename({'Close':'Next'},axis=1),
left_on=['Date','Name'],
right_on=[df_quotes['Date'] - pd.tseries.offsets.BDay(2),'Name'],
how = 'left')
.drop(['Date_x','Date_y'],axis=1))

Output:

        Date  index   Close  Name  LogRet    Next
0 2022-11-29      0  240.33  MSFT -0.0059  254.69
1 2022-11-29      1  280.57   QQQ -0.0076     NaN
2 2022-12-13      2  342.46  ADBE  0.0126     NaN
3 2022-12-13      3  256.92  MSFT  0.0173     NaN
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading