List of values from dataframe column where each row has multiple values

I have a dataframe that looks like this

index     column
A         41 13 4 61 12 35
B         16 35 56 24
C         12

And I want to end up with a long list of all values like [41, 13, 4, 61, 12, 35, 16, 35, 56, 24, 12]

So first I converted that column into a dict with DataFrame.to_dict()

And then what I was trying to do was to split each dict value into a list rather than a long string:

for key,val in d.items():
    d[key] = val.split[' ']

but it’s throwing an error: TypeError: 'builtin_function_or_method' object is not subscriptable

Then I would proceed to append all the values into a long list. But given the error, I’m suspecting that there is a simpler way that I am missing. Does anyone know what that could be?

Solution:

Split the column on whitespace, then use np.hstack to flatten the resulting lists:

import numpy as np
np.hstack(df['column'].str.split()).tolist()
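For reference, here is a minimal self-contained sketch of that approach. The dataframe is rebuilt from the sample in the question, and the final cast to int is an added assumption: the question asks for a list of numbers, while splitting alone yields strings.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question; each cell is a space-separated string.
df = pd.DataFrame(
    {"column": ["41 13 4 61 12 35", "16 35 56 24", "12"]},
    index=["A", "B", "C"],
)

# Split each row on whitespace, then flatten the per-row lists into one array.
flat = np.hstack(df["column"].str.split())

# The pieces are still strings; cast to int to match the list in the question.
values = flat.astype(int).tolist()
print(values)
```

If the column may contain missing values, they would need to be dropped or filled first, since str.split leaves them as NaN.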

Alternatively, you can use a pure-Python approach:

from itertools import chain

list(chain(*map(str.split, df['column'])))

['41', '13', '4', '61', '12', '35', '16', '35', '56', '24', '12']
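Note that both approaches return strings. As for the TypeError in the question, it most likely comes from indexing the split method with square brackets instead of calling it with parentheses. A rough sketch of the dict-based route with that fixed, where d is a hypothetical stand-in for the result of df['column'].to_dict() and the int conversion is an assumption about the desired output:

```python
from itertools import chain

# Hypothetical stand-in for the dict produced by df['column'].to_dict().
d = {"A": "41 13 4 61 12 35", "B": "16 35 56 24", "C": "12"}

# split is a method, so it is called with parentheses, not indexed with brackets.
split_values = {key: val.split() for key, val in d.items()}

# Flatten the per-key lists and convert each piece to int.
values = [int(v) for v in chain.from_iterable(split_values.values())]
print(values)  # [41, 13, 4, 61, 12, 35, 16, 35, 56, 24, 12]
```

Building a new dict also avoids reassigning values in d while iterating over it, which is legal here but easy to get wrong.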