I have this code:
import pandas as pd
df = pd.DataFrame({'consumption': [10.51, 103.11, 55.48], 'co2_emissions': [37.2, 19.66, 1712]}, index=['Pork', 'Wheat Products', 'Beef'])
df['Max'] = df.idxmax(axis=1, skipna=True, numeric_only=True)
df
I need to find the n largest values in each row. I found a technique that uses apply with a lambda, but it raises an error:
df.apply(lambda s: s.abs().nlargest(2).index.tolist(), axis=1,skipna=True, numeric_only=True)
TypeError: <lambda>() got an unexpected keyword argument 'numeric_only'
Is there any way to obtain the top N results using idxmax? Or is there a way to avoid this error when using the apply/lambda method?
>Solution :
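Your error is due to passing the skipna and numeric_only parameters to apply. apply does not use them itself; it forwards any extra keyword arguments to the function it calls, and your lambda does not accept them.
You can fix it by selecting the numeric columns first and dropping the NaN values inside the lambda: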
(df.select_dtypes('number')
   .apply(lambda s: s.dropna().abs().nlargest(2)
                     .index.tolist(), axis=1)
)
Output:
Pork              [co2_emissions, consumption]
Wheat Products    [consumption, co2_emissions]
Beef              [co2_emissions, consumption]
dtype: object
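To see why the original call failed, here is a minimal sketch of how apply treats extra keyword arguments. The helper name `top_columns` and its `n` parameter are made up for illustration; the point is only that a keyword the function accepts is forwarded without complaint, while one it does not accept raises the TypeError from the question.

```python
import pandas as pd

df = pd.DataFrame(
    {'consumption': [10.51, 103.11, 55.48], 'co2_emissions': [37.2, 19.66, 1712]},
    index=['Pork', 'Wheat Products', 'Beef'],
)

def top_columns(row, n=2):
    # Hypothetical helper: labels of the n largest absolute values in one row.
    return row.abs().nlargest(n).index.tolist()

# apply does not consume n itself; it passes n=1 on to top_columns.
result = df.apply(top_columns, axis=1, n=1)

# The lambda takes only one argument, so the forwarded keyword is rejected.
try:
    df.apply(lambda s: s.abs().nlargest(2).index.tolist(), axis=1, skipna=True)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(result)
print(error_message)
```

So if you want N to be configurable, you can pass it through apply as a keyword, as long as the function you hand to apply declares it.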
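A more efficient approach using numpy:
import numpy as np
tmp = df.select_dtypes('number')
# argsort sorts ascending, so the last two positions in each row are the two
# largest values; [:, ::-1] puts the largest first, as nlargest does
idx = np.argsort(tmp.to_numpy(), axis=1)[:, -2:][:, ::-1]
out = pd.Series(np.take_along_axis(
    tmp.columns.to_numpy()[:, None], idx, axis=0
).tolist(), index=df.index)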
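If you need a general top N rather than a hard-coded 2, the numpy idea can be wrapped in a small function. This is only a sketch: `top_n_columns` is a name I made up, and it assumes you want the ranking by absolute value (as in the apply version) with ties resolved in column order.

```python
import numpy as np
import pandas as pd

def top_n_columns(frame, n):
    # Hypothetical helper: for each row, the labels of the n numeric columns
    # with the largest absolute values, largest first.
    numeric = frame.select_dtypes('number')
    values = numeric.abs().to_numpy()
    # Sorting the negated values ascending gives descending order of the
    # originals. NaN sorts last, so it is only picked when n exceeds the
    # number of non-NaN values in a row.
    order = np.argsort(-values, axis=1, kind='stable')[:, :n]
    labels = numeric.columns.to_numpy()[order]
    return pd.Series(labels.tolist(), index=frame.index)

df = pd.DataFrame(
    {'consumption': [10.51, 103.11, 55.48], 'co2_emissions': [37.2, 19.66, 1712]},
    index=['Pork', 'Wheat Products', 'Beef'],
)
# A non-numeric column, like 'Max' in the question, is ignored by the helper.
df['Max'] = ['co2_emissions', 'consumption', 'co2_emissions']

top2 = top_n_columns(df, 2)
top1 = top_n_columns(df, 1)
print(top2)
```

Unlike the apply version, this does not drop NaN values per row, so check that behavior against your data before relying on it.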