Create Dataframe from list of strings of delimited column names and values

I have a list of strings:

data = ['col1:abc col2:def col3:ghi',
        'col4:123 col2:qwe col10:xyz',
        'col3:asd']

I would like to convert this to a dataframe, where each string in the list is a row in the dataframe, like so:

desired_out = pd.DataFrame({'col1':  ['abc',  np.nan, np.nan],
                            'col2':  ['def',  'qwe',  np.nan],
                            'col3':  ['ghi',  np.nan, 'asd'],
                            'col4':  [np.nan, '123',  np.nan],
                            'col10': [np.nan, 'xyz',  np.nan]})

Solution:

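Use a nested list comprehension: split each string on whitespace into `name:value` tokens, split each token on `:`, and convert the resulting pairs into a dictionary. Each dictionary becomes one row, and keys missing from a row are filled with NaN: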

df = pd.DataFrame([dict([y.split(':') for y in x.split()]) for x in data])
print(df)
  col1 col2 col3 col4 col10
0  abc  def  ghi  NaN   NaN
1  NaN  qwe  NaN  123   xyz
2  NaN  NaN  asd  NaN   NaN
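
If a value can itself contain a colon, or a token might have no colon at all, `dict([y.split(':') ...])` will raise a `ValueError`, because `dict` needs exactly two items per pair. A slightly more defensive variant is sketched below. It assumes that only the first colon separates name from value and that tokens without a colon should be skipped; `parse_row` is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

data = ['col1:abc col2:def col3:ghi',
        'col4:123 col2:qwe col10:xyz',
        'col3:asd']

def parse_row(line):
    """Turn 'k1:v1 k2:v2' into {'k1': 'v1', 'k2': 'v2'} (hypothetical helper)."""
    row = {}
    for token in line.split():
        # partition splits on the first colon only, so 'a:b:c' -> ('a', ':', 'b:c')
        key, sep, value = token.partition(':')
        if sep:  # assumption: tokens without any colon are ignored
            row[key] = value
    return row

# Each dict is one row; keys absent from a row become NaN
df = pd.DataFrame([parse_row(line) for line in data])
print(df)
```

For the sample `data` this should give the same frame as the one-liner above; the two only differ on malformed or colon-containing tokens.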