Extracting first 3 elements from list of strings in pandas df

I want to extract the first 3 elements from each list of strings in the '1/1' column.
My df_unique looks like this:

                                                                                    1/1                                                             0/0  count
0              ['P1-12', 'P1-22', 'P1-25', 'P1-26', 'P1-28', 'P1-6', 'P1-88', 'P1-93']                            ['P1-89', 'P1-90', 'P1-92', 'P1-95']      1
1              ['P1-12', 'P1-22', 'P1-25', 'P1-26', 'P1-6', 'P1-89', 'P1-92', 'P1-95']                                     ['P1-28', 'P1-90', 'P1-93']      1
2    ['P1-12', 'P1-22', 'P1-25', 'P1-26', 'P1-88', 'P1-89', 'P1-92', 'P1-93', 'P1-95']                                      ['P1-28', 'P1-6', 'P1-90']      1
3             ['P1-12', 'P1-22', 'P1-25', 'P1-26', 'P1-88', 'P1-89', 'P1-92', 'P1-93']                                      ['P1-28', 'P1-6', 'P1-90']      1                                                                                                                                         

I’ve tried a few different approaches. The first one:

df_extract_3 = df_unique['1/1'].str.split().map(lambda lst: [string[0:3] for string in lst])

but the result looks like this:

0           [['P, 'P1, 'P1, 'P1, 'P1, 'P1, 'P1, 'P1]
1           [['P, 'P1, 'P1, 'P1, 'P1, 'P1, 'P1, 'P1]
2      [['P, 'P1, 'P1, 'P1, 'P1, 'P1, 'P1, 'P1, 'P1]
3           [['P, 'P1, 'P1, 'P1, 'P1, 'P1, 'P1, 'P1]

And the second solution:

df_extract_3 = df_unique['1/1'].str[0:3]

gives:

0      ['P
1      ['P
2      ['P
3      ['P

When I add split:

df_extract_3 = df_unique['1/1'].str.split().str[0:3]

the final result is:

0      [['P1-12',, 'P1-22',, 'P1-25',]
1      [['P1-12',, 'P1-22',, 'P1-25',]
2      [['P1-12',, 'P1-22',, 'P1-25',]
3      [['P1-12',, 'P1-22',, 'P1-25',]

What should I change to get ‘normal’ output like this?

0      ['P1-12', 'P1-22', 'P1-25']
1      ['P1-12', 'P1-22', 'P1-25']
2      ['P1-12', 'P1-22', 'P1-25']
3      ['P1-12', 'P1-22', 'P1-25']

I know it is probably an easy modification, but I’m stuck on it.
Thanks a lot!

>Solution :

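The values in your columns are strings that only look like lists, which is why slicing with str returns characters instead of elements. First convert the strings to real lists with ast.literal_eval, then slice with str: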

import ast
df_unique['1/1'] = df_unique['1/1'].apply(ast.literal_eval)
df_unique['0/0'] = df_unique['0/0'].apply(ast.literal_eval)

df_extract_3 = df_unique['1/1'].str[:3]
print(df_extract_3)

Or in one shot:

df_extract_3 = df_unique['1/1'].apply(lambda x: ast.literal_eval(x)[:3])

Output:

0    [P1-12, P1-22, P1-25]
1    [P1-12, P1-22, P1-25]
2    [P1-12, P1-22, P1-25]
3    [P1-12, P1-22, P1-25]
Name: 1/1, dtype: object
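
To see the difference between the two cases side by side, here is a minimal self-contained sketch. The two-row frame is a hypothetical stand-in for df_unique (only the '1/1' column, with shortened lists), and the names as_text, parsed and first_three are made up for illustration:

```python
import ast

import pandas as pd

# Hypothetical stand-in for df_unique: each cell is a string, not a list.
df_unique = pd.DataFrame({
    "1/1": [
        "['P1-12', 'P1-22', 'P1-25', 'P1-26', 'P1-28']",
        "['P1-12', 'P1-22', 'P1-25', 'P1-26', 'P1-6']",
    ],
})

# On strings, .str[:3] slices characters, which reproduces the ['P result.
as_text = df_unique["1/1"].str[:3]

# After parsing, each cell is a real Python list,
# so the same .str[:3] now slices list elements.
parsed = df_unique["1/1"].apply(ast.literal_eval)
first_three = parsed.str[:3]

print(as_text.tolist())
print(first_three.tolist())
```

ast.literal_eval only evaluates Python literals, so it is a safer choice than eval for parsing cells like these.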