Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Create new data frame columns with return in function

In the dataframe below, i need to check the shop column and add two columns based on shop

datetime dtnum time tnum shop
02-03-2022 05:25 20220302052500 05:25:00 52500 PRESS1
02-03-2022 05:26 20220302052600 05:26:00 52600 BODY1
02-03-2022 05:27 20220302052700 05:27:00 52700 BODY2

I tried with below code. that is using same return to make two columns

# Apply shop_groupcode
def shop_chk(column):
    if column['shop'] == 'PRESS1' or 'PRESS1_A' or 'E176' or 'E177' or 'E184' or 'E185' or 'E186' or 'E187':
        return 1, 1  # shop code , plant code
    if column['shop'] == 'BODY1' or 'BODY1_A' or 'E179' or 'E180' or 'E181' or 'E208' or \
            'E216' or 'E217' or 'E218' or 'E232':
        return 2, 1 # shop code , plant code
    if column['shop'] == 'BODY2' or 'BODY2_A' or 'E196' or 'E197' or 'E198' or 'E199' or 'E200':
        return 8, 2 # shop code , plant code

df['shop_code', 'plant_code'] = df.apply(shop_chk, axis=1, result_type="expand")

Code runs without error, two columns are created- but the column values are all 1

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Requirement:
I want to know if there is mistake in code or if there is any efficient method as I have some more shop conditions to check

datetime dtnum time tnum shop shop_code plant_code
02-03-2022 05:25 20220302052500 05:25:00 52500 PRESS1 1 1
02-03-2022 05:26 20220302052600 05:26:00 52600 BODY1 2 1
02-03-2022 05:27 20220302052700 05:27:00 52700 BODY2 8 2

>Solution :

You can use isin instead of multiple ORs and store those conditions in a list and use numpy.select:

import numpy as np
conditions = [df['shop'].isin(['PRESS1','PRESS1_A','E176','E177','E184','E185','E186','E187']), 
              df['shop'].isin(['BODY1','BODY1_A','E179','E180','E181','E208','E216','E217','E218','E232']),
              df['shop'].isin(['BODY2','BODY2_A','E196','E197','E198','E199','E200'])
             ]
df['shop_code'] = np.select(conditions, [1, 2, 8])
df['plant_code'] = np.select(conditions, [1, 1, 2])

Output:

           datetime           dtnum      time   tnum    shop  shop_code  plant_code  
0  02-03-2022 05:25  20220302052500  05:25:00  52500  PRESS1          1           1  
1  02-03-2022 05:26  20220302052600  05:26:00  52600   BODY1          2           1
2  02-03-2022 05:27  20220302052700  05:27:00  52700   BODY2          8           2
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading