noob question
I can't figure out whether (and how) the output of a pandas DataFrame's .info() call can be sorted like a regular DataFrame.
example:
import pandas as pd
temp = pd.DataFrame(data={"x":[1, 2, 3, None, 4], "y":[5, 6, 7, None, None]})
temp.info(null_counts=True).sort_values(by="Non-Null Count")
results in:
AttributeError: 'NoneType' object has no attribute 'sort_values'
(Context: I have a lot of columns with varying numbers of missing values, and I want to sort the columns by their non-null counts.)
>Solution :
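DataFrame.info() prints its summary and returns None, which is why chaining .sort_values() onto it raises the AttributeError. Internally, pandas has a DataFrameInfo class that you can use to get at the info() data programmatically. You can turn this into a DataFrame, which can then be sorted. Keep in mind that DataFrameInfo lives in pandas.io.formats.info, which is not part of the documented public API, so it may change between pandas versions.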
import pandas as pd
from pandas.io.formats.info import DataFrameInfo  # internal module, may change

temp = pd.DataFrame(data={"x": [1, 2, 3], "y": [4, 5, 6]})
info = DataFrameInfo(data=temp)

# Collect the same fields that info() prints into a regular DataFrame.
infodf = pd.DataFrame(
    {
        "Column": info.ids,
        "Non-Null Count": info.non_null_counts,
        "Dtype": info.dtypes,
    }
)
print(infodf)
Output:
  Column  Non-Null Count  Dtype
x      x               3  int64
y      y               3  int64
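The snippet above builds the table but stops before the sort itself. If relying on an internal class is a concern, here is a rough sketch that uses only the public DataFrame.count() and DataFrame.dtypes on the data from the question. The variable name by_missing is just illustrative, and this is one possible approach rather than the way info() itself does it:

```python
import pandas as pd

# Data from the question: the None entries become NaN.
temp = pd.DataFrame(data={"x": [1, 2, 3, None, 4], "y": [5, 6, 7, None, None]})

# count() returns the number of non-null values per column,
# and dtypes returns the dtype per column; both are indexed by column name.
infodf = pd.DataFrame({"Non-Null Count": temp.count(), "Dtype": temp.dtypes})

# Sort ascending so the columns with the most missing values come first:
# y (3 non-null) ends up ahead of x (4 non-null).
by_missing = infodf.sort_values(by="Non-Null Count")
print(by_missing)
```

Pass ascending=False to sort_values if you would rather see the most complete columns first.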