I need some help processing data inside a pandas DataFrame.
I want to convert the values in "col1" into the form shown in "col2". How can I do that?
import pandas as pd
a = {'col1' : ['0.11K','1011K','0.12M','0','0.3','0.02'],'col2':['110','1011000','120000','0','300000','20000']}
df = pd.DataFrame(a , columns=['col1','col2'])
df['col'] = (df['col'].replace(r'[KM]+$', '', regex=True).astype(float) * \
df['col'].str.extract(r'[\d\.]+([KM]+)', expand=False).fillna(1).replace(['K','M'],[10**3,10**6]).astype(int))
>Solution :
If a missing unit should be treated as M, use .fillna(10**6) instead of .fillna(1), and process column col1 instead of col:
df['col'] = (df['col1'].replace(r'[KM]+$', '', regex=True).astype(float) *
             df['col1'].str.extract(r'[\d\.]+([KM]+)', expand=False)
                       .fillna(10**6)
                       .replace(['K','M'],[10**3,10**6]).astype(int))
print(df)
    col1     col2        col
0  0.11K      110      110.0
1  1011K  1011000  1011000.0
2  0.12M   120000   120000.0
3      0        0        0.0
4    0.3   300000   300000.0
5   0.02    20000    20000.0
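The one-liner above may be easier to follow when split into steps. The following is only a sketch of the same idea, not a different method: the intermediate names (number, unit, factor) are chosen here for illustration, and Series.map with a dict is swapped in for .replace to look up the multipliers.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['0.11K', '1011K', '0.12M', '0', '0.3', '0.02']})

# Numeric part: strip a trailing K/M suffix and parse the rest as float.
number = df['col1'].str.replace(r'[KM]+$', '', regex=True).astype(float)

# Unit part: the captured suffix, or NaN when the value has no suffix.
unit = df['col1'].str.extract(r'[\d\.]+([KM]+)', expand=False)

# Look up the multiplier for each unit; rows without a unit are treated as M.
factor = unit.map({'K': 10**3, 'M': 10**6}).fillna(10**6).astype(int)

df['col'] = number * factor
print(df)
```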
For comparison, your solution from "Convert the string 2.90K to 2900 or 5.2M to 5200000 in pandas dataframe" uses .fillna(1), so values without a unit are left unchanged:
df['col'] = (df['col1'].replace(r'[KM]+$', '', regex=True).astype(float) *
             df['col1'].str.extract(r'[\d\.]+([KM]+)', expand=False)
                       .fillna(1)
                       .replace(['K','M'],[10**3,10**6]).astype(int))
print(df)
    col1     col2         col
0  0.11K      110      110.00
1  1011K  1011000  1011000.00
2  0.12M   120000   120000.00
3      0        0        0.00
4    0.3   300000        0.30
5   0.02    20000        0.02
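Since the two variants differ only in the multiplier used for values without a unit, that choice could be made a parameter. A minimal sketch, assuming a hypothetical helper named to_number with a default_multiplier argument (neither is part of pandas), and a slightly stricter anchored regex than the one above:

```python
import pandas as pd

def to_number(s, default_multiplier=1):
    """Convert strings such as '0.11K' or '0.12M' to floats.

    default_multiplier is applied to values that carry no K/M suffix.
    Hypothetical helper for illustration only.
    """
    # Two capture groups: the numeric part and an optional single K/M suffix.
    parts = s.str.extract(r'^([\d.]+)([KM]?)$')
    number = parts[0].astype(float)
    # An empty suffix is not in the dict, so map() yields NaN for it,
    # which fillna() then replaces with the default multiplier.
    factor = parts[1].map({'K': 10**3, 'M': 10**6}).fillna(default_multiplier)
    return number * factor

df = pd.DataFrame({'col1': ['0.11K', '1011K', '0.12M', '0', '0.3', '0.02']})
df['as_M'] = to_number(df['col1'], default_multiplier=10**6)  # no unit means M
df['as_is'] = to_number(df['col1'])                           # no unit means 1
print(df)
```

Strings that do not match the pattern at all come out as NaN here rather than raising an error, which may or may not be what you want.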