Convert lists to a DataFrame where the distinct list elements are the column headers and each row is a binary indicator of whether that list contained the element

I have multiple lists and want to convert them to a DataFrame:

import pandas as pd

ss1 = ['a','b','c']
ss2 = ['d','c','b','a','f']
ss3 = ['a','g']
ss4 = ['a','d','g','h']
# one inner list per row; ss2 and ss3 are sorted first
ls=[ss1,sorted(ss2),sorted(ss3),ss4]
pd.DataFrame(ls)

However, I want the distinct elements of the combined lists as the column headers, and a 1/0 value in each cell showing whether that column's element appears in that row's list.

I want:

pd.DataFrame({'a':[1,1,1,1],
              'b':[1,1,None,None],
              'c':[1,1,None,None],
              'd':[None,1,None,1],
              'f':[None,1,None,None],
              'g':[None,None,1,1],
              'h':[None,None,None,1]})

>Solution :

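You could turn each list into a dictionary with dict.fromkeys, which maps every element to 1, and pass the resulting list of dictionaries to pd.DataFrame. Any element missing from a given list becomes NaN in that row: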

out = pd.DataFrame([dict.fromkeys(x, 1) for x in ls])

Output:

   a    b    c    d    f    g    h
0  1  1.0  1.0  NaN  NaN  NaN  NaN
1  1  1.0  1.0  1.0  1.0  NaN  NaN
2  1  NaN  NaN  NaN  NaN  1.0  NaN
3  1  NaN  NaN  1.0  NaN  1.0  1.0
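
The question asks for a 1/0 response, while the output above holds NaN where an element is absent. A minimal sketch of one way to get integer 0/1 columns, assuming it is acceptable to fill the missing values with 0 (the name `binary` is only illustrative and does not appear in the original answer):

```python
import pandas as pd

ss1 = ['a', 'b', 'c']
ss2 = ['d', 'c', 'b', 'a', 'f']
ss3 = ['a', 'g']
ss4 = ['a', 'd', 'g', 'h']
ls = [ss1, sorted(ss2), sorted(ss3), ss4]

# dict.fromkeys(x, 1) maps every element of x to 1, e.g. {'a': 1, 'g': 1}
out = pd.DataFrame([dict.fromkeys(x, 1) for x in ls])

# Elements missing from a list show up as NaN; replace them with 0
# and cast back to integers so every column is a clean 0/1 indicator.
binary = out.fillna(0).astype(int)

# Optional: sort the columns alphabetically, in case the lists
# introduce elements in a different order.
binary = binary.sort_index(axis=1)

print(binary)
```

If the distinction between "absent" and 0 matters downstream, keeping the NaN values as in the original answer may be the better choice.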