Create new column by using a list comprehension with two 'for' loops in Pandas DataFrame

I have the following DataFrame:

import pandas as pd

df = pd.DataFrame({'col1': ['aaaa', 'aabb', 'bbcc', 'ccdd'],
                   'col2': ['ab12', 'cd15', 'kf25', 'zx78']})
df
    col1    col2
0   aaaa    ab12
1   aabb    cd15
2   bbcc    kf25
3   ccdd    zx78

I want to create 'col3' from 'col1' and 'col2'. The expected result is:

df
    col1    col2    col3
0   aaaa    ab12    aa-12
1   aabb    cd15    aa-15
2   bbcc    kf25    bb-25
3   ccdd    zx78    cc-78

I tried to use a list comprehension, but I got the error: ValueError: Length of values (16) does not match length of index (4)

The code I used is:

df['col3']=[x[0:2]+'-'+y[2:4] for x in df['col1'] for y in df['col2']]

Solution:

Use simple slicing with the str accessor, and concatenation:

df['col3'] = df['col1'].str[:2] + '-' + df['col2'].str[2:4]

Or, if you want the last two characters of col2:

df['col3'] = df['col1'].str[:2] + '-' + df['col2'].str[-2:]

Output:

   col1  col2   col3
0  aaaa  ab12  aa-12
1  aabb  cd15  aa-15
2  bbcc  kf25  bb-25
3  ccdd  zx78  cc-78
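
The two variants give the same result here only because every value in col2 is exactly four characters long. As a minimal sketch of where they diverge, the example below uses a made-up longer value ('abc123', which is not in the question's data):

```python
import pandas as pd

# Hypothetical data: the second value is longer than four characters.
s = pd.Series(['ab12', 'abc123'])

fixed_positions = s.str[2:4]  # characters at positions 2 and 3
last_two = s.str[-2:]         # last two characters, whatever the length

print(fixed_positions.tolist())  # ['12', 'c1']
print(last_two.tolist())         # ['12', '23']
```

If the values in col2 can vary in length, the choice between the two slices depends on whether the wanted digits sit at fixed positions or at the end of the string.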

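Why your approach did not work

Two 'for' clauses in a single list comprehension act as nested loops, so every value of col1 is combined with every value of col2. That produces 4 × 4 = 16 strings, which cannot be assigned to a column of 4 rows, hence the ValueError. To pair the values row by row, you would have needed to zip: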

df['col3'] = [x[0:2]+'-'+y[2:4] for x,y in zip(df['col1'], df['col2'])]
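
The difference between the two comprehensions can be checked directly by comparing the lengths of the lists they build. This is a small sketch on the question's own data; the variable names nested and paired are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({'col1': ['aaaa', 'aabb', 'bbcc', 'ccdd'],
                   'col2': ['ab12', 'cd15', 'kf25', 'zx78']})

# Two 'for' clauses nest: every x is combined with every y (4 * 4 items).
nested = [x[0:2] + '-' + y[2:4] for x in df['col1'] for y in df['col2']]

# zip walks both columns in lockstep: one item per row.
paired = [x[0:2] + '-' + y[2:4] for x, y in zip(df['col1'], df['col2'])]

print(len(nested))  # 16
print(len(paired))  # 4
print(nested[:4])   # ['aa-12', 'aa-15', 'aa-25', 'aa-78']
print(paired)       # ['aa-12', 'aa-15', 'bb-25', 'cc-78']
```

Only the zipped list has one entry per row, so only it can be assigned to df['col3'].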