Conditional mean while using iloc pandas

September 28, 2022

Assume I have a DataFrame with the columns shown below (the actual data has more columns).

Customer  Group1  jan_revenue  feb_revenue  mar_revenue
Sam       Bank A           40           50            0
Wilson    Bank A           60           70           30
Jay       Bank B           10           40           40
Jim       Bank A            0           40           70
Yan       Bank C            0           40           90
Tim       Bank C           10            0           50

I want to calculate the mean revenue for each customer, counting only the non-zero values.

For example, customer Sam has mean (40+50)/2 = 45 and Wilson has (60+70+30)/3 = 53.3333.

Since I have a large number of columns, I chose to use iloc, but my approach includes all the zeros in the mean:

df['avg_revenue21'] = df.iloc[:,27:39].mean(axis=1)

Is there a way to compute a conditional mean while using iloc?

Thank you

Solution:

You can use select_dtypes to get the numeric columns, replace the zeros with NA, then take the row-wise mean as usual. mean skips missing values by default, so the zeros no longer count:

df.select_dtypes('number').replace(0, pd.NA).mean(axis=1)

Output:

Sam       45.000000
Wilson    53.333333
Jay       30.000000
Jim       55.000000
Yan       65.000000
Tim       30.000000
dtype: float64
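To reproduce this, here is a minimal sketch that rebuilds the sample data from the question. It assumes the customer names are the index (as in the output above) and swaps np.nan in for pd.NA so the revenue columns stay float64; that swap is an assumption of this sketch, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; customer names are used as the index.
df = pd.DataFrame(
    {
        "Group1": ["Bank A", "Bank A", "Bank B", "Bank A", "Bank C", "Bank C"],
        "jan_revenue": [40, 60, 10, 0, 0, 10],
        "feb_revenue": [50, 70, 40, 40, 40, 0],
        "mar_revenue": [0, 30, 40, 70, 90, 50],
    },
    index=["Sam", "Wilson", "Jay", "Jim", "Yan", "Tim"],
)

# Assumption: np.nan instead of pd.NA, so the columns become float64.
nonzero = df.select_dtypes("number").replace(0, np.nan)

# mean() skips NaN by default (skipna=True), so zeros no longer count.
avg = nonzero.mean(axis=1)
print(avg)
```

Note that a row containing only zeros would come out as NaN, since there is nothing left to average.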

As new column:

df['avg_revenue21'] = df.select_dtypes('number').replace(0, pd.NA).mean(axis=1)

Customer  Group1  jan_revenue  feb_revenue  mar_revenue  avg_revenue21
Sam       Bank A           40           50            0      45.000000
Wilson    Bank A           60           70           30      53.333333
Jay       Bank B           10           40           40      30.000000
Jim       Bank A            0           40           70      55.000000
Yan       Bank C            0           40           90      65.000000
Tim       Bank C           10            0           50      30.000000
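Since the question asks specifically about iloc, the same idea can probably be applied to a positional slice. The sketch below is one way to do that, using DataFrame.mask to blank out the zeros. The slice 2:5 is a hypothetical stand-in for the 27:39 in the question, chosen to match the small sample frame.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Customer": ["Sam", "Wilson", "Jay", "Jim", "Yan", "Tim"],
        "Group1": ["Bank A", "Bank A", "Bank B", "Bank A", "Bank C", "Bank C"],
        "jan_revenue": [40, 60, 10, 0, 0, 10],
        "feb_revenue": [50, 70, 40, 40, 40, 0],
        "mar_revenue": [0, 30, 40, 70, 90, 50],
    }
)

# Hypothetical positions: columns 2..4 hold the monthly revenue here;
# the frame in the question would use 27:39 instead.
revenue = df.iloc[:, 2:5]

# mask() replaces every cell where the condition is True (value == 0) with NaN,
# and mean() then skips those NaN cells.
df["avg_revenue21"] = revenue.mask(revenue == 0).mean(axis=1)
print(df)
```

The slice is taken before the new column is added, so avg_revenue21 never feeds back into its own calculation.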

Variants:

If the inputs are strings:

df['avg_revenue21'] = df.apply(pd.to_numeric, errors='coerce').replace(0, pd.NA).mean(axis=1)
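As a rough illustration of the string case, the sketch below uses a small hypothetical frame whose revenue values were read as text, including one unparseable entry ("n/a"). With errors='coerce', pd.to_numeric turns anything it cannot parse into NaN, so such entries are skipped by the mean in the same way as the masked zeros.

```python
import pandas as pd

# Hypothetical input: revenue columns were read as strings.
df = pd.DataFrame(
    {
        "jan_revenue": ["40", "60", "0"],
        "feb_revenue": ["50", "70", "40"],
        "mar_revenue": ["0", "30", "n/a"],
    },
    index=["Sam", "Wilson", "Jim"],
)

# Convert column by column; values that cannot be parsed become NaN.
numeric = df.apply(pd.to_numeric, errors="coerce")

# Blank out the zeros, then average what is left in each row.
avg = numeric.mask(numeric == 0).mean(axis=1)
print(avg)
```

Whether silently dropping unparseable values is acceptable depends on the data; checking numeric.isna().sum() first may be worthwhile.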

If you only want to consider a subset:

df['avg_revenue21'] = df.filter(regex='(feb|mar)_').replace(0, pd.NA).mean(axis=1)

Or, selecting the columns by name:

df['avg_revenue21'] = df[['feb_revenue', 'mar_revenue']].replace(0, pd.NA).mean(axis=1)
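A different route, not taken from the answer above, is to divide the row sum by the number of non-zero cells. The sketch below tries this on the feb/mar subset; the all-zero customer "Zed" is a hypothetical row added only to show what happens when there is nothing to average.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "jan_revenue": [40, 60, 10, 0],
        "feb_revenue": [50, 70, 40, 0],
        "mar_revenue": [0, 30, 40, 0],
    },
    index=["Sam", "Wilson", "Jay", "Zed"],  # "Zed" is a hypothetical all-zero row
)

# Same subset as the filter(regex=...) variant: only feb and mar columns.
revenue = df.filter(regex="(feb|mar)_")

# Count the non-zero cells per row, then divide the row sum by that count.
nonzero_count = revenue.ne(0).sum(axis=1)
avg = revenue.sum(axis=1) / nonzero_count
print(avg)
```

For a row with no non-zero values the division is 0/0, which pandas reports as NaN, matching what the replace-then-mean approach would give.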

by MR