How to calculate cumulative missing values group by an ID (in python)?

byMR

January 24, 2022

a) Given the following "id" and "freq" columns:

import numpy as np
import pandas as pd

df = pd.DataFrame({'id':[1,1,1,1,1,1,2,2,2,2,2,3,3,3],'freq':[1,2,np.nan,np.nan,np.nan,6,7,8,9,10,np.nan,np.nan,13,14]})

df

b) How can I calculate the cumulative count of missing "freq" values, grouped by "id", with a reset to zero whenever freq > 0?

The resulting 'cum_null' column should look like this (values shown in row order):
print(df['cum_null'])
0 0 1 2 3 0 0 0 0 0 1 1 0 0

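c) I've tried this. It is very close, but it does not reset to zero when freq > 0:

# id_grp was not defined in the original post; presumably it is a groupby on 'id'
id_grp = df.groupby('id', group_keys=False)
df['cum_null'] = id_grp['freq'].apply(lambda x: x.isnull().astype(int).cumsum())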

df

>Solution :

In your case, you can use groupby and then mask the result so that rows where freq is not missing are set to 0:

df['cum_null'] = df.freq.isnull().groupby(df['id']).cumsum().where(df.freq.isnull(),0)
0     0
1     0
2     1
3     2
4     3
5     0
6     0
7     0
8     0
9     0
10    1
11    1
12    0
13    0
Name: freq, dtype: int64
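Note that the one-liner above masks the non-missing rows but does not restart the running count, so it matches the expected output only when each id has at most one run of missing values. If an id can have several separate runs, a variant along the following lines might be closer to the "reset to zero" behaviour asked for. This is only a sketch: the function name cum_null_with_reset, its parameters, and the gappy example frame are hypothetical and not part of the original answer.

```python
import numpy as np
import pandas as pd


def cum_null_with_reset(df, id_col="id", value_col="freq"):
    """Count consecutive missing values per id, restarting after each non-missing value.

    Hypothetical helper for illustration; not from the original answer.
    """
    is_null = df[value_col].isnull()
    # Each non-missing row opens a new block, so the running count of
    # non-missing rows gives every run of NaNs its own block label.
    block = (~is_null).cumsum()
    # Cumulative count of missing flags within each (id, block) pair.
    return is_null.astype(int).groupby([df[id_col], block]).cumsum()


df = pd.DataFrame({
    "id": [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3],
    "freq": [1, 2, np.nan, np.nan, np.nan, 6, 7, 8, 9, 10,
             np.nan, np.nan, 13, 14],
})
df["cum_null"] = cum_null_with_reset(df)
print(df["cum_null"].tolist())
# [0, 0, 1, 2, 3, 0, 0, 0, 0, 0, 1, 1, 0, 0]

# Made-up frame with two separate runs of NaN inside one id.
gappy = pd.DataFrame({"id": [1, 1, 1, 1, 1],
                      "freq": [np.nan, 2, np.nan, np.nan, 5]})
print(cum_null_with_reset(gappy).tolist())
# [1, 0, 1, 2, 0]
```

On the gappy frame, the masked-cumsum one-liner would give 1 0 2 3 0 instead, because the count carries over from the first run into the second.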
