Home How to estimate count for Pandas dataframe column values based on multiple conditions?

Questions

How to estimate count for Pandas dataframe column values based on multiple conditions?

February 6, 2022

Given a pandas DataFrame df of the following format:-

StudyNo   Condition1      Condition2      ResultVal
1222          25              30              0.5
1245          30              45              0.12
1672          40              15              0.46
1222          25              30              0.95
1245          30              45              0.88
1245          30              45              0.23

I would like to calculate frequency/count of the number of ResultVal attributes existing in the dataset for a specific StudyNo under a given Condition1 and Condition2. There are no exactly duplicate rows in this dataset.

e.g. for StudyNo 1245 at Condition1 of 30 and Condition2 of 45, the count would be 3 (because 3 ResultVal attributes exist for this aspect). Also, I would like to print out a list of StudyNo, Condition1 and Condition2 in cases wherein < =1 ResultVal attributes exist (which in this example would be [1672,40,15].

Any help in this regard would be much appreciated.

>Solution :

You can groupby "StudyNo", "Condition1" and "Condition2" and transform count of "ResultVal" to create column "Frequency".

Then create a boolean mask for rows where "Frequency" is less than or equal to 1 and filter the rows that satisfy it.

cols = ['StudyNo','Condition1','Condition2']
df['Frequency'] = df.groupby(cols)['ResultVal'].transform('count')
appears_only_once = df.loc[df['Frequency'] <= 1, cols].to_numpy().tolist()

Output:

   StudyNo  Condition1  Condition2  ResultVal  Frequency
0     1222          25          30       0.50          2
1     1245          30          45       0.12          3
2     1672          40          15       0.46          1
3     1222          25          30       0.95          2
4     1245          30          45       0.88          3
5     1245          30          45       0.23          3

The list: