Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Count how many times any value appears in a combination of columns but not in other columns (pandas)

I have a data set that looks like this:

A   B   C
1   2   
    3   4
1       5
1   2   4
1       2
1 

I want to return the amount of times any value appears in a combination of two columns but not the third column. The numeric values in this case are arbitrary. I just care about counting instances.

In other words:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

1.) How many times does any value appear in column A and column B but not column C?

2.) How many times does any value appear in column B and column C but not column A?

3.) How many times does any value appear in column A and column C but not in column B?

My expected answers based on the mock data I have given above:

1.) 1 (row 1)
2.) 1 (row 2)
3.) 2 (rows 3 and 5)

>Solution :

You could use isna to create a boolean DataFrame. Then filter the rows that have only one NaN value (so that we exclude the last row). Finally sum vertically:

df_na = df.isna()
df_na[df_na.sum(axis=1).eq(1)].sum()

Output:

A    1
B    2
C    1
dtype: int64

Then for example, column "A" doesn’t have a value while the other two have values once, "B" doesn’t have a value while the other two have twice, etc.

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading