Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Divide by a number only non Nan values in a data frame column

Lets say i have a dataframe that looks like this:

       Col1
0    100,000,000
1    234,343,000
2      NaN
3    213,564,545
4      NaN
5    678,000
6     200

how can divide the non NaN values in Col1 with 1000 and have this desirable output:

        Col1            Col2
0    100,000,000      100,000
1    234,343,000      234,343
2        NaN            NaN
3    213,564,545      213,564
4        NaN            NaN
5      678,000          678
6     200               0.2

I tried to do df['Col1'].div(1000) but it is not working because of the NaN values

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

It looks like you have strings, so dividing by 1000 is just removing the last comma and last 3 digits. You can use a regex:

df['Col2'] = df['Col1'].str.replace(',?\d{,3}$', '', regex=True)

output:

          Col1     Col2
0  100,000,000  100,000
1  234,343,000  234,343
2          NaN      NaN
3  213,564,545  213,564
4          NaN      NaN
5      678,000      678

emulating a true division:

only small numbers as float
df['Col2'] = df['Col1'].str.replace(r'(.*?)(,?(\d{1,3}))$',
                                    lambda m: m.group(1) if m.group(1) else ('0.'+m.group(3).zfill(3)),
                                    regex=True)

output:

          Col1     Col2
0  100,000,000  100,000
1  234,343,000  234,343
2          NaN      NaN
3  213,564,545  213,564
4          NaN      NaN
5      678,000      678
6           23    0.023
all floats
df['Col2'] = df['Col1'].str.replace(r'(.*?)(,?(\d{1,3}))$',
                                    lambda m: (m.group(1) if m.group(1) else '0')+'.'+m.group(3).zfill(3),
                                    regex=True)

output:

          Col1         Col2
0  100,000,000  100,000.000
1  234,343,000  234,343.000
2          NaN          NaN
3  213,564,545  213,564.545
4          NaN          NaN
5      678,000      678.000
6           23        0.023
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading