Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to extract a category from describe.()

import pandas as pd
df = {'fish': [38,10,23,45],'eggs': [24.0,10,12,8],'fruit': [80.5,60,12,32],'sugar': [0.234,0.8,0.34,0.76],
                  'category':['Price A', 'Price B','Price B','Price A']}
df = pd.DataFrame(data=df)
xy=['fish',
   'eggs',
   'fruit',
   'sugar',
   'category']
         
df=(df[xy].groupby(['category']).describe()).T
df['Ratio'] = df['Price A']/df['Price B']
df

I am trying to extract the category with only the mean with a ratio smaller than 0.5 and bigger than 2.

I tried to use

df.reset_index()

and making two separate df with the respective conditions, then turning the unwanted values to null, removing the null values(but I don’t know how to specifically only choose mean’s ratio), and then concatenating both of the df.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Maybe there are some better ways to do this?

>Solution :

Use DataFrame.loc for select groups by categories:

m = df['Ratio'].xs('mean', level=1).between(0.5,2)
out = df.loc[m.index[~m]]

print(out)
category      Price A    Price B     Ratio
fish count   2.000000   2.000000  1.000000
     mean   41.500000  16.500000  2.515152
     std     4.949747   9.192388  0.538462
     min    38.000000  10.000000  3.800000
     25%    39.750000  13.250000  3.000000
     50%    41.500000  16.500000  2.515152
     75%    43.250000  19.750000  2.189873
     max    45.000000  23.000000  1.956522

Explanation:

Select levels mean by Series.xs:

print(df['Ratio'].xs('mean', level=1))
fish     2.515152
eggs     1.454545
fruit    1.562500
sugar    0.871930
Name: Ratio, dtype: float64

Create mask between 0.5 and 2 by Series.between:

print(df['Ratio'].xs('mean', level=1).between(0.5,2))
fish     False
eggs      True
fruit     True
sugar     True
Name: Ratio, dtype: bool

Get indices if Falses – invert mask and filter m.index:

print (m.index[~m])
Index(['fish'], dtype='object')
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading