How to (better) get NaN data from pandas dataframe into new dataframe?

I have a dataframe and am currently creating a new dataframe with the column names and number of empty cells like this.

empty = pd.DataFrame(columns=['Column', 'NaNs'])
for (columnName, columnData) in dataset.items():
    empty.loc[-1] = [columnName, columnData.isnull().sum()]
    empty.index = empty.index + 1
    empty = empty.sort_index()

This is 5 lines for a simple overview table.

I wonder if there’s a better, shorter way of achieving the same with transpose and apply, or something else that I couldn’t figure out so far.

Solution:

You can use a vectorized approach with isna and sum, then process the resulting Series to form a DataFrame:

out = df.isna().sum().rename_axis('Column').reset_index(name='NaNs')

Output, using @Dogbert’s example (df = pd.DataFrame({"a": [0, 1, None], "b": [0, None, 2], "c": [0, None, None]})):

  Column  NaNs
0      a     1
1      b     1
2      c     2
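
To see what each step of the one-liner contributes, here is a minimal sketch that splits the chain into intermediate variables, using the same example frame. The names mask, counts and named are only illustrative and are not part of the original answer.

```python
import pandas as pd

# Same example frame as above
df = pd.DataFrame({"a": [0, 1, None], "b": [0, None, 2], "c": [0, None, None]})

mask = df.isna()                      # boolean DataFrame, True where a cell is NaN
counts = mask.sum()                   # Series: NaN count per column, indexed by column name
named = counts.rename_axis("Column")  # give the index a name so it becomes a labelled column
out = named.reset_index(name="NaNs")  # move the index into a column, name the counts "NaNs"

print(out)
```

If only the columns that actually contain missing values are of interest, the result can be filtered afterwards, for example with out[out["NaNs"] > 0].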
