Given a dataframe df as follows:
   id  value1  value2  value3
0   0      22       1       7
1   1       0       0       0
2   2       0       0       0
3   3       4       1      25
4   4       5       0      24
5   5       0       0       3
Or:
df = pd.DataFrame([{'id': 0, 'value1': 22, 'value2': 1, 'value3': 7},
                   {'id': 1, 'value1': 0, 'value2': 0, 'value3': 0},
                   {'id': 2, 'value1': 0, 'value2': 0, 'value3': 0},
                   {'id': 3, 'value1': 4, 'value2': 1, 'value3': 25},
                   {'id': 4, 'value1': 5, 'value2': 0, 'value3': 24},
                   {'id': 5, 'value1': 0, 'value2': 0, 'value3': 3}])
I need to take the columns 'value1', 'value2', and 'value3' and replace 0 with NaN, but only in rows where all three values are 0. How can I achieve that in pandas?
The expected result looks like this:
   id  value1  value2  value3
0   0    22.0     1.0     7.0
1   1     NaN     NaN     NaN
2   2     NaN     NaN     NaN
3   3     4.0     1.0    25.0
4   4     5.0     0.0    24.0
5   5     0.0     0.0     3.0
Code I have tried:
cols = ['value1', 'value2', 'value3']
df[cols] = df[cols].replace({'0': np.NaN, 0: np.NaN})
>Solution :
Use mask:
df[cols] = df[cols].mask(df[cols].eq(0).all(axis=1))
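For reference, here is a self-contained sketch of the one-liner above applied to the sample data from the question. The intermediate name all_zero is only introduced here for readability; it is not part of the original answer.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'id': [0, 1, 2, 3, 4, 5],
    'value1': [22, 0, 0, 4, 5, 0],
    'value2': [1, 0, 0, 1, 0, 0],
    'value3': [7, 0, 0, 25, 24, 3],
})
cols = ['value1', 'value2', 'value3']

# Boolean Series: True for rows where every selected column equals 0
all_zero = df[cols].eq(0).all(axis=1)

# mask() puts NaN wherever the condition is True; the row-wise
# condition is applied to every selected column of that row
df[cols] = df[cols].mask(all_zero)

print(df)
```

Rows that contain only some zeros (ids 4 and 5 here) keep them, because the condition is evaluated per row rather than per cell. The value columns end up as floats since NaN cannot be stored in an integer column.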
Original answer:
I’d prefer mask:
>>> df.set_index('id').mask(df[cols].eq(0).all(axis=1))
    value1  value2  value3
id
0     22.0     1.0     7.0
1      NaN     NaN     NaN
2      NaN     NaN     NaN
3      4.0     1.0    25.0
4      5.0     0.0    24.0
5      0.0     0.0     3.0
>>>
To get id back as a regular column, reset the index afterwards:
>>> df.set_index('id').mask(df[cols].eq(0).all(axis=1)).reset_index()
   id  value1  value2  value3
0   0    22.0     1.0     7.0
1   1     NaN     NaN     NaN
2   2     NaN     NaN     NaN
3   3     4.0     1.0    25.0
4   4     5.0     0.0    24.0
5   5     0.0     0.0     3.0
>>>
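In the set_index variant above, the condition is built from the original df while the mask is applied to the re-indexed frame, so the two are matched up by index label. That lines up here because the id values happen to equal the default index 0 to 5. A possible variation, sketched below as an assumption rather than as part of the original answer, builds the condition from the re-indexed frame itself so that both sides always share the same index. The names indexed and result are made up for this example.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'id': [0, 1, 2, 3, 4, 5],
    'value1': [22, 0, 0, 4, 5, 0],
    'value2': [1, 0, 0, 1, 0, 0],
    'value3': [7, 0, 0, 25, 24, 3],
})
cols = ['value1', 'value2', 'value3']

# Move id into the index so only the value columns remain
indexed = df.set_index('id')

# Build the row-wise condition from the same frame that is masked,
# so condition and data are guaranteed to share one index
all_zero = indexed[cols].eq(0).all(axis=1)

# Mask the all-zero rows, then turn id back into a column
result = indexed.mask(all_zero).reset_index()

print(result)
```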