Get the (dynamic) nth column value for each row in pandas dataframe

Advertisements

I have the following dataframe "df1":

            reference
date                 
2023-01-01          1
2023-01-02          2
2023-01-03          3
2023-01-04          4

I have another dataframe "df2":

            1  2  3  4
date                  
2023-01-01  9  7  9  3
2023-01-02  6  5  5  6
2023-01-03  9  3  7  5
2023-01-04  2  4  0  4

I want to add a new column in df1 where, for each row, I need to point to the cell value from df2 where the row index is the same as df1 (so the date) and the column is the reference column from df1.
This is the expected output:

            reference  output
date                         
2023-01-01          1       9
2023-01-02          2       5
2023-01-03          3       7
2023-01-04          4       4 

For example, 9 on 2023-01-01 is the value of column "1", row "2023-01-01" from df2.

5 on 2023-01-02 is the value of column "2", row "2023-01-02" from df2, and so on.

Any ideas of how to do it?

>Solution :

Here is one of the options :

df1["output"] = df2.to_numpy()[np.arange(len(df1)), df1["reference"].sub(1)]

Or with apply/at :

df1["output"] = df1.join(df2).apply(lambda x: x.at[str(x["reference"])], axis=1)


Output :

print(df1)

            reference  output
date                         
2023-01-01          1       9
2023-01-02          2       5
2023-01-03          3       7
2023-01-04          4       4

Leave a ReplyCancel reply