I would like to recover the rows of a dataframe in which two or more columns hold the same value. I want the check to cover every pair of columns: col1 against col2, col3 and col4; then col2 against col3 and col4; and finally col3 against col4.
I have read through this post and I am unsure whether iteration is the solution to my problem. If so, how can this be done?
So far I can display, for instance, the rows where col2 == col3:
# -*- coding: utf-8 -*-
import pandas as pd
## writing a dataframe
rows = {'col1':['5412','5148','5800','2122','5645','1060','4801','1039'],
'col2':['542','512','541','412','565','562','645','152'],
'col3':['542','3120','3410','2112','5650','5620','4801','152'],
'col4':['5800','2122','5645','2112','412','562','562','645']
}
df = pd.DataFrame(rows)
print(f'Unsorted dataframe \n\n{df}')
## print the rows where col2 == col3
dft = df[(df['col2'] == df['col3'])]
print('\n\nupdate - list row of matching row elements')
print(dft)
## print all except the rows where col2 == col3
dft = df.drop(df[(df['col2'] == df['col3'])].index)
print('\n\nupdate - Dropping rows of matching row elements')
print(dft)
With this I am getting back:
col1 col2 col3 col4
0 5412 542 542 5800
7 1039 152 152 645
I would like to get back:
col1 col2 col3 col4
0 5412 542 542 5800
3 2122 412 2112 2112
4 5645 565 5650 412
5 1060 562 5620 562
6 4801 645 4801 562
7 1039 152 152 645
>Solution :
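Use DataFrame.nunique with axis=1 to count the distinct values in each row. A row contains at least one repeated value exactly when that count is smaller than the number of columns, so no explicit iteration is needed: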
import pandas as pd
rows = {
"col1": ["5412", "5148", "5800", "2122", "5645", "1060", "4801", "1039"],
"col2": ["542", "512", "541", "412", "565", "562", "645", "152"],
"col3": ["542", "3120", "3410", "2112", "5650", "5620", "4801", "152"],
"col4": ["5800", "2122", "5645", "2112", "412", "562", "562", "645"],
}
df = pd.DataFrame(rows)
df = df[df.nunique(axis=1) < len(df.columns)]
print(df)
Output:
col1 col2 col3 col4
0 5412 542 542 5800
3 2122 412 2112 2112
5 1060 562 5620 562
6 4801 645 4801 562
7 1039 152 152 645
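If it helps to see the pairwise comparisons described in the question spelled out, the following is a minimal sketch of an alternative, not part of the answer above. It uses itertools.combinations to build one mask per column pair (col1/col2, col1/col3, ..., col3/col4) and keeps the rows where any pair matches. The names pair_masks, has_match and matched are illustrative only.

```python
from itertools import combinations

import pandas as pd

df = pd.DataFrame({
    "col1": ["5412", "5148", "5800", "2122", "5645", "1060", "4801", "1039"],
    "col2": ["542", "512", "541", "412", "565", "562", "645", "152"],
    "col3": ["542", "3120", "3410", "2112", "5650", "5620", "4801", "152"],
    "col4": ["5800", "2122", "5645", "2112", "412", "562", "562", "645"],
})

# One boolean Series per unordered pair of columns.
pair_masks = [df[a] == df[b] for a, b in combinations(df.columns, 2)]

# A row is kept if at least one pair of its columns holds equal values.
has_match = pd.concat(pair_masks, axis=1).any(axis=1)
matched = df[has_match]
print(matched)
```

On this data the sketch selects the same rows as the nunique approach. One difference to be aware of: nunique ignores missing values by default (dropna=True), so a row containing NaN can be flagged even when its remaining values are all distinct, whereas the pairwise comparison treats NaN as unequal to everything.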