How can I use df.loc to combine two columns and return unique values?

I am having trouble combining two columns before generating a list of unique values.

CSV file:

country,half,uniqueTournament
Brazil,1st half,Serie A
England,1st half,Championship
Argentina,2nd half,Primera Liga
Brazil,1st half,Serie A

My attempt:

import pandas as pd

csv_file = '@@@@@@@@@@@@@'
df = pd.read_csv(csv_file)

df.loc[(df['half'] == '1st half'), 'country' + ' - ' + 'uniqueTournament'].unique()

Expected outcome:

Brazil - Serie A
England - Championship

Solution:

If df looked like this:

     country      half uniqueTournament
0     Brazil  1st half          Serie A
1    England  1st half     Championship
2  Argentina  1st half     Primera Liga
3     Brazil  1st half          Serie A
4     Brazil  2nd half          Serie A

then you could create a new column that joins the two, drop the duplicates, and use groupby + agg(list):

df['new'] = df['country'] + ' - ' + df['uniqueTournament']
out = df.drop_duplicates(subset=['half', 'new']).groupby('half')['new'].agg(list).tolist()

or you could use groupby + unique, which returns the same values per group (as NumPy arrays rather than lists):

out = df.groupby('half')['new'].unique().tolist()

Output:

[['Brazil - Serie A', 'England - Championship', 'Argentina - Primera Liga'],
 ['Brazil - Serie A']]
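
If only the '1st half' rows are needed, as in the expected outcome of the question, one option is to filter with df.loc first and combine the columns afterwards. The attempt in the question fails because 'country' + ' - ' + 'uniqueTournament' joins three plain strings into the single label 'country - uniqueTournament', and no column with that name exists. Below is a minimal sketch using the CSV data from the question; io.StringIO stands in for the real file, and the names csv_text, first_half and combined are only illustrative.

```python
import io

import pandas as pd

# Sample data copied from the question; io.StringIO replaces the real CSV file.
csv_text = """country,half,uniqueTournament
Brazil,1st half,Serie A
England,1st half,Championship
Argentina,2nd half,Primera Liga
Brazil,1st half,Serie A
"""
df = pd.read_csv(io.StringIO(csv_text))

# Keep only the first-half rows and the two columns that will be combined.
first_half = df.loc[df['half'] == '1st half', ['country', 'uniqueTournament']]

# Join the two string columns element-wise, then drop repeated values.
# Series.unique() keeps the values in order of first appearance.
combined = (first_half['country'] + ' - ' + first_half['uniqueTournament']).unique()

for value in combined:
    print(value)
```

With the data above this prints:

Brazil - Serie A
England - Championship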