Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

pandas – apply some calculations using multiple columns

I am trying to do calculate price per unit based on price and pack_count columns from a dataframe.

import pandas as pd
# assign data of lists.
data = {'price': ['23', '', '211', '100'], 'pack_count': [23, 10, 1, 1]}

# Create DataFrame.
df = pd.DataFrame(data)
df["price_per_unit"] = (df["price"].str.replace('', '0' ).astype(float) \
                                / df["pack_count"].astype(float))\
                                    .apply(lambda x: '' if x == 0 else round(x, 2))
                                    
print(df)

This is the result I get:

   
  price  pack_count price_per_unit
0    23          23          88.26
1                10               
2   211           1         201010
3   100           1         100000

I used replace to handle empty strings but something weird is happening as you can see in the output above.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

  1. Why is this happening?

  2. How can I modify it such that if price column is an empty string, then we skip the calculation for that column and assign a default value, otherwise do the required calculation.

Any help would be appreciated.
Thank you!

>Solution :

Instead of the str.replace method, use replace method.

str.replace replaces each occurrence of the pattern in the Series, i.e. matches in any part of the strings. So for example, '23' is changed to '02030', which becomes 2030.00 when changed to float, which divided by 23 becomes 88.26.

Meanwhile replace replaces values given in to_replace with value, i.e. looks for exact matches.

df["price_per_unit"] = (df["price"].replace('', '0').astype(float)) \
                                / df["pack_count"].astype(float) \
                                    .apply(lambda x: '' if x == 0 else round(x, 2))

Output:

  price  pack_count  price_per_unit
0    23          23             1.0
1                10             0.0
2   211           1           211.0
3   100           1           100.0
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading