Advertisements
I have the following dataframe "df1":
reference
date
2023-01-01 1
2023-01-02 2
2023-01-03 3
2023-01-04 4
I have another dataframe "df2":
1 2 3 4
date
2023-01-01 9 7 9 3
2023-01-02 6 5 5 6
2023-01-03 9 3 7 5
2023-01-04 2 4 0 4
I want to add a new column in df1 where, for each row, I need to point to the cell value from df2 where the row index is the same as df1 (so the date) and the column is the reference column from df1.
This is the expected output:
reference output
date
2023-01-01 1 9
2023-01-02 2 5
2023-01-03 3 7
2023-01-04 4 4
For example, 9 on 2023-01-01 is the value of column "1", row "2023-01-01" from df2.
5 on 2023-01-02 is the value of column "2", row "2023-01-02" from df2, and so on.
Any ideas of how to do it?
>Solution :
Here is one of the options :
df1["output"] = df2.to_numpy()[np.arange(len(df1)), df1["reference"].sub(1)]
df1["output"] = df1.join(df2).apply(lambda x: x.at[str(x["reference"])], axis=1)
Output :
print(df1)
reference output
date
2023-01-01 1 9
2023-01-02 2 5
2023-01-03 3 7
2023-01-04 4 4