I have a dataframe of varying size, for example
df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]],
columns=['A1', 'A2', 'A3'])
and a list li = [5,6] which I know should be somewhere in the dataframe exactly in its sequence in columns A2 and A3.
I can find that row via
mask = (df['A2'] == li[0]) & (df['A3'] == li[1])
but I am looking for a more general solution for if the column names and the number of columns change.
I know that we have list in_ = [2,3] of variable length and know it must be columns 'A'+str(in_) that we find li in (always of the same length as in_). I want to return the row number or index, how can I do that?
Then I want to proceed to extract the value from column A1 based on the matching other columns.
>Solution :
You can slice the columns, perform the comparison and aggregate with all:
li = [5,6]
in_ = [2,3]
cols = [f'A{i}' for i in in_]
mask = df[cols].eq(li).all(axis=1)
Or, if there is a chance that values in in_ do not match existing columns, rather reindex:
mask = df.reindex(columns=cols).eq(li).all(axis=1)
Output:
0 False
1 True
2 False
dtype: bool