Home pandas filter rows based on atmost matching criteria

Questions

pandas filter rows based on atmost matching criteria

January 12, 2022

I have a dataframe like as shown below

df=pd.DataFrame({'subjects':['A','A','D','B','B','C'],
                'B':['12','12','13','14','14','16'],
                'C':[21,23,24,25,26,27]
                })
df['r_no'] = df.groupby(['subjects','B']).cumcount()+1

Now, I would like to only select rows that had only r_no = 1 (and not r_no > 1).

I tried the below

df[df['subjects'].value_counts() == 1]
df.iloc[df['subjects'].value_counts() == 1:,]
df.ix[df['subjects'].value_counts() == 1:,]
df[(df['r_no'] == 1) & (df['r_no'] < 2)]

None of them worked.

I expect my output to be like as shown below.

You can see that subjects = A and subjects = B is excluded because they also had rows where r_no > 1. Basically, I want to select subjects that had only one record for them (r_no) in dataframe

>Solution :

IIUC, what you want to do it just to keep the groups of size 1:

df[df.groupby(['subjects','B'])['C'].transform(len).le(1)]

Or maybe even just keeping the subjects with unique rows:

df[~df['subjects'].duplicated(keep=False)]

output:

  subjects   B   C
2        D  13  24
5        C  16  27

byMR

Published January 12, 2022

Add a comment

My functions are not working properly in C?

byMR

January 12, 2022

Questions

query the spdlog logger name

byMR

January 12, 2022

Questions

Dragged item going faster than cursor

byMR

January 12, 2022

Questions

How create variable name with other variable's value in it?

byMR

January 12, 2022

Questions

Implement Variadic function in C

byMR

January 12, 2022

Questions

How to rename Pandas columns based on mapping?

byMR

January 12, 2022

pandas filter rows based on atmost matching criteria

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Like this:

Leave a ReplyCancel reply

Read more

My functions are not working properly in C?

query the spdlog logger name

Dragged item going faster than cursor

How create variable name with other variable's value in it?

Implement Variadic function in C

How to rename Pandas columns based on mapping?

Keep Up to Date with the Most Important News

pandas filter rows based on atmost matching criteria

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Share this:

Like this:

Leave a ReplyCancel reply

Keep Up to Date with the Most Important News

Read more

My functions are not working properly in C?

query the spdlog logger name

Dragged item going faster than cursor

How create variable name with other variable's value in it?

Implement Variadic function in C

How to rename Pandas columns based on mapping?

Discover more from Dev solutions