Create new column by using a list comprehension with two 'for' loops in Pandas DataFrame

I have the following dataframe:

import pandas as pd
df = pd.DataFrame({'col1': ['aaaa', 'aabb', 'bbcc', 'ccdd'],
                   'col2': ['ab12', 'cd15', 'kf25', 'zx78']})
df
    col1    col2
0   aaaa    ab12
1   aabb    cd15
2   bbcc    kf25
3   ccdd    zx78

I want to create ‘col3’ based on ‘col1’ and ‘col2’. The expected result is:

df
    col1    col2    col3
0   aaaa    ab12    aa-12
1   aabb    cd15    aa-15
2   bbcc    kf25    bb-25
3   ccdd    zx78    cc-78

I tried to use a list comprehension, but I got this error: ValueError: Length of values (16) does not match length of index (4)

The code I used is:

df['col3']=[x[0:2]+'-'+y[2:4] for x in df['col1'] for y in df['col2']]

Solution:

Use simple slicing with the str accessor, and concatenation:

df['col3'] = df['col1'].str[:2] + '-' + df['col2'].str[2:4]

Or, if you want the last two characters of col2 regardless of the string length:

df['col3'] = df['col1'].str[:2] + '-' + df['col2'].str[-2:]

Output:

   col1  col2   col3
0  aaaa  ab12  aa-12
1  aabb  cd15  aa-15
2  bbcc  kf25  bb-25
3  ccdd  zx78  cc-78
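
Both variants give the same output here only because every value in col2 is exactly four characters long. As a small sketch of how they would differ otherwise, the example below uses a made-up Series with values of varying length (these values are an assumption for illustration, not data from the question):

```python
import pandas as pd

# Hypothetical values of different lengths, not taken from the question
s = pd.Series(['ab12', 'cd1567', 'k25'])

# Positional slice: characters at index 2 and 3 (fewer if the string is short)
by_position = s.str[2:4].tolist()

# Slice from the end: always the last two characters
from_end = s.str[-2:].tolist()

print(by_position)  # ['12', '15', '5']
print(from_end)     # ['12', '67', '25']
```

If the column is guaranteed to hold four-character strings, either form works; otherwise pick the one that matches what "the last part" means for your data.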

Why your approach did not work

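A list comprehension with two 'for' clauses is a nested loop: it pairs every value of col1 with every value of col2, which yields 4 x 4 = 16 values instead of 4, hence the ValueError. To iterate over both columns in parallel, you would have needed to zip: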

df['col3'] = [x[0:2]+'-'+y[2:4] for x,y in zip(df['col1'], df['col2'])]
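
To see where the 16 in the error message comes from, the following sketch builds both lists from the dataframe in the question and compares their lengths. The variable names `nested` and `paired` are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({'col1': ['aaaa', 'aabb', 'bbcc', 'ccdd'],
                   'col2': ['ab12', 'cd15', 'kf25', 'zx78']})

# Two 'for' clauses nest: every x is combined with every y (4 * 4 values)
nested = [x[0:2] + '-' + y[2:4] for x in df['col1'] for y in df['col2']]

# zip walks both columns in step, producing one value per row
paired = [x[0:2] + '-' + y[2:4] for x, y in zip(df['col1'], df['col2'])]

print(len(nested))  # 16
print(len(paired))  # 4
print(paired)       # ['aa-12', 'aa-15', 'bb-25', 'cc-78']
```

Only the zipped list has one entry per row, so only it can be assigned to a new column of this dataframe.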
