I have a DataFrame and am currently building a new DataFrame that lists each column name with its number of empty cells, like this:
empty = pd.DataFrame(columns=['Column', 'NaNs'])
for (columnName, columnData) in dataset.items():
    # isnull().sum() counts the missing cells; .any().sum() would only yield 0 or 1
    empty.loc[-1] = [columnName, columnData.isnull().sum()]
    empty.index = empty.index + 1
    empty = empty.sort_index()
This is 5 lines for a simple overview table.
I wonder if there’s a better, shorter way of achieving the same with transpose and apply, or something else which I couldn’t figure out so far.
>Solution :
You can use a vectorized approach with isna and sum, then process the resulting Series to form a DataFrame:
out = df.isna().sum().rename_axis('Column').reset_index(name='NaNs')
Output, using @Dogbert’s example (df = pd.DataFrame({"a": [0, 1, None], "b": [0, None, 2], "c": [0, None, None]})):
  Column  NaNs
0      a     1
1      b     1
2      c     2
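In case it helps to see why the one-liner works, here is a rough step-by-step sketch of the same chain on the example frame. The intermediate names (mask, counts, via_apply) are only for illustration and are not part of the original answer; the apply variant is included because the question mentions it, and it should give the same counts.

```python
import pandas as pd

# Example frame from the answer above.
df = pd.DataFrame({"a": [0, 1, None], "b": [0, None, 2], "c": [0, None, None]})

# Step 1: boolean mask, True wherever a cell is missing.
mask = df.isna()

# Step 2: summing booleans column-wise counts the True values per column.
counts = mask.sum()

# Step 3: name the index "Column", then move it into a regular column
# and call the values column "NaNs".
out = counts.rename_axis("Column").reset_index(name="NaNs")

# Alternative with apply (illustrative): same counts, computed one column at a time.
via_apply = df.apply(lambda col: col.isna().sum())

print(out)
```

The isna().sum() form is usually preferable to apply here, since it stays inside pandas' vectorized operations instead of calling a Python function per column.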