I have one time series dataset with different daily interest rates. I want to merge the data with a panel dataset, where I also have a daily time component but each day exists a couple of times since it is attributed to different asset classes.
The data looks like this
Dataframe A:
| Time | Asset |
|---|---|
| 01/08/2021 | A |
| 01/08/2021 | B |
| 01/08/2021 | C |
| 01/08/2021 | D |
| 02/08/2021 | A |
| 02/08/2021 | B |
| 02/08/2021 | C |
| 02/08/2021 | D |
| 03/08/2021 | A |
Dataframe B:
| Time | Rate |
|---|---|
| 01/08/2021 | 2.3 |
| 02/08/2021 | 2.34 |
| 03/08/2021 | 2.33 |
What I want to have is:
| Time | Asset | Rate |
|---|---|---|
| 01/08/2021 | A | 2.3 |
| 01/08/2021 | B | 2.3 |
| 01/08/2021 | C | 2.3 |
| 01/08/2021 | D | 2.3 |
| 02/08/2021 | A | 2.34 |
| 02/08/2021 | B | 2.34 |
| 02/08/2021 | C | 2.34 |
| 02/08/2021 | D | 2.34 |
| 03/08/2021 | A | 2.33 |
How would you merge these two dataframes?
When I use the merge command:
Merge = pd.merge(Dataframe A, Dataframe B, on="Time", how = "inner"), I get the following error message.
ValueError: You are trying to merge on datetime64[ns] and object columns. If you wish to proceed you should use pd.concat
But I do not want to use pd.concat since it is not the outcome I wish to achieve.
>Solution :
pd.merge on Time column. Merge will work only when both the column have same data type. Your columns may have different type due to which its raising exception.
df1.Time = pd.to_datetime(df1.Time, format = '%d/%m/%Y').dt.strftime('%m/%d/%Y')
df2.Time = pd.to_datetime(df2.Time, format = '%d/%m/%Y').dt.strftime('%m/%d/%Y')
pd.merge(df1, df2, how='inner', on='Time')
This gives us the expected output :
Time Asset Rate
0 08/01/2021 A 2.30
1 08/01/2021 B 2.30
2 08/01/2021 C 2.30
3 08/01/2021 D 2.30
4 08/02/2021 A 2.34
5 08/02/2021 B 2.34
6 08/02/2021 C 2.34
7 08/02/2021 D 2.34
8 08/03/2021 A 2.33