Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to check if all grouped by month data is equal to nan?

I have this df:

               DATE  CODE    PP   YEAR_MONTH
9862     1991-01-01  100007  NaN  1991-01
9863     1991-01-02  100007  NaN  1991-01
9864     1991-01-03  100007  NaN  1991-01
9865     1991-01-04  100007  NaN  1991-01
9866     1991-01-05  100007  NaN  1991-01
            ...     ...  ...      ...
10541963 2020-12-27  155217  5.7  2020-12
10541964 2020-12-28  155217  0.0  2020-12
10541965 2020-12-29  155217  0.0  2020-12
10541966 2020-12-30  155217  0.0  2020-12
10541967 2020-12-31  155217  5.7  2020-12

[5347504 rows x 4 columns]

I want to know if for each specific df[‘CODE’] and df[‘YEAR_MONTH’] my df[‘PP’] has only values of nan (that generates the value of true). If in df[‘PP’] there is at least 1 value that is not nan in a specific df[‘CODE’] and df[‘YEAR_MONTH’], it will generate the value of false.

Expected result:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

CODE    NAN_MONTH YEAR_MONTH
100007  True      1991-01
100007  False     1991-02
100007  False     1991-03
100007  False     1991-04
100007  False     1991-05
...     ...  ...      ...
155217  True      2020-09
155217  False     2020-10
155217  False     2020-11
155217  False     2020-12
155217  False     2020-12

I tried this code:

yearmonths=[]
nanmonths=[]
for yearmonth, data in df.groupby(['CODE','YEAR_MONTH']):
    
    checknan=data['PP'].isna().all()

    yearmonths=yearmonths.append(yearmonth)
    nanmonths=nanmonths.append(checknan)

But i get this error:

AttributeError: 'NoneType' object has no attribute 'append'

Would you mind to help me?

Thanks in advance.

>Solution :

Create helper column NAN_MONTH and then call GroupBy.all:

out = (df.assign(NAN_MONTH = df['PP'].isna())
         .groupby(['CODE','YEAR_MONTH'])['NAN_MONTH']
         .all()
         .reset_index())

print (out)
     CODE YEAR_MONTH  NAN_MONTH
0  100007    1991-01       True
1  155217    2020-12      False

Or use Series, then is necessary also groiping by Series, not by columns names:

out = (df['PP'].isna()
         .groupby([df['CODE'],df['YEAR_MONTH']])
         .all()
         .reset_index())
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading