Home pandas search across multiple columns return one column if matches

Questions

pandas search across multiple columns return one column if matches

September 24, 2024

Example data:

df1 = pd.DataFrame({
    'a': [1, 6, 3, 9],
    'b': ['A', 'B', 'C', 'D'],
    'c': [10, 20, 30, 40],
    'd': [100, 200, 300, 400]
})

df2 = pd.DataFrame({
    'm': [1, 5, 3, 7],
    'n': [2, 6, 8, 4],
    'o': [9, 10, 11, 12]
})

Requirement:
df1['a'] can occur anywhere of df2. I want to return df2['m'] irrespective of where the match is found.

After some googling and chatGpt, I found melting df2 and merging with df1 is helpful except for it doesn’t check for a match in df2['m'].

Code:

df2_melted = df2.melt(id_vars=['m'], value_vars=['n', 'o'])
merged_df = df1.merge(df2_melted, left_on='a', right_on='value', how='left')
df1['e'] = merged_df['m']
print(df1)

Output:

a  b   c    d    e
1  A  10  100  NaN     # df1['a'] == df2['m']
6  B  20  200  5.0     # df1['a'] == df2['n']
3  C  30  300  NaN     # df1['a'] == df2['m']
9  D  40  400  1.0     # df1['a'] == df2['o']

Required Output:

a  b   c    d  e
1  A  10  100  1
6  B  20  200  5
3  C  30  300  3
9  D  40  400  1

If df2['m'] could also be added to value_vars while melting, it’d have resolved the issue. I tried it, it didn’t work. Then checked docs, found that whatever is there in the id_vars, the remaining or a subset of the remaining can be part of value_vars. So this approach might not be correct or I’m missing something.

Then I thought, if df1['a'] matches df2['m'], then df1['e'] == df1['a'] == df2['m']. So just replacing NaN value with df1['a'] should work and it worked. But had to convert the column to int; because of NaN, it’s changed to float.

Working complete Code:

df2_melted = df2.melt(id_vars=['m'], value_vars=['n', 'o'])
merged_df = df1.merge(df2_melted, left_on='a', right_on='value', how='left')
df1['e'] = merged_df['m']
df1['e'] = (df1['e'].fillna(df1['a'])).astype(int)

Seemed like even though it’s a working solution, it’s unnecessarily complicated: "try any solution: add more code to fix the issues as you proceed without changing the initial solution".

Any other better approach which can help with my requirement?

PS1: In above example, it’s not mandatory that df1 and df2 will have the same number of rows.

>Solution :

Just duplicate the column m as a new column before melt (also use map instead of merge):

df1['e'] = df1['a'].map(df2.eval('e=m').melt('e')
                           .set_index('value')['e'])

Variant with merge:

out = df1.merge(df2.assign(e=df2['m'])
                   .melt('e', value_name='a')
                   [['a', 'e']]
               )

Output:

   a  b   c    d  e
0  1  A  10  100  1
1  6  B  20  200  5
2  3  C  30  300  3
3  9  D  40  400  1

dataframe

byMR

Published September 24, 2024

Add a comment

Delete kubernetes node pull with Terraform without cluster recreating

byMR

September 24, 2024

Questions

Date function not returning expected Month

byMR

September 24, 2024

Questions

How to remove warnings as comments in vim?

byMR

September 25, 2024

Questions

Laravel Confusion with POST Routes for Ticket Comments and Updates

byMR

September 25, 2024

Questions

sum each value in a list to all previous values including extra value between them

byMR

September 25, 2024

Questions

Issue with Protected Routes Redirecting to Root Path

byMR

September 25, 2024

pandas search across multiple columns return one column if matches

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Like this:

Leave a ReplyCancel reply

Read more

Delete kubernetes node pull with Terraform without cluster recreating

Date function not returning expected Month

How to remove warnings as comments in vim?

Laravel Confusion with POST Routes for Ticket Comments and Updates

sum each value in a list to all previous values including extra value between them

Issue with Protected Routes Redirecting to Root Path

Keep Up to Date with the Most Important News

pandas search across multiple columns return one column if matches

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Share this:

Like this:

Leave a ReplyCancel reply

Keep Up to Date with the Most Important News

Read more

Delete kubernetes node pull with Terraform without cluster recreating

Date function not returning expected Month

How to remove warnings as comments in vim?

Laravel Confusion with POST Routes for Ticket Comments and Updates

sum each value in a list to all previous values including extra value between them

Issue with Protected Routes Redirecting to Root Path

Discover more from Dev solutions