I have multiple lists and want to convert them to a DataFrame:
import pandas as pd
ss1 = ['a','b','c']
ss2 = ['d','c','b','a','f']
ss3 = ['a','g']
ss4 = ['a','d','g','h']
ls=[ss1,sorted(ss2),sorted(ss3),ss4]
pd.DataFrame(ls)
However, I want the distinct elements of the combined lists as the column headers, with a 1/0 value in each cell showing whether that column's element is in that row's list.
I want:
pd.DataFrame({'a':[1,1,1,1],
'b':[1,1,None,None],
'c':[1,1,None,None],
'd':[None,1,None,1],
'f':[None,1,None,None],
'g':[None,None,1,1],
'h':[None,None,None,1]})
>Solution :
You could rework the list into a list of dictionaries with dict.fromkeys:
out = pd.DataFrame([dict.fromkeys(x, 1) for x in ls])
Output:
   a    b    c    d    f    g    h
0  1  1.0  1.0  NaN  NaN  NaN  NaN
1  1  1.0  1.0  1.0  1.0  NaN  NaN
2  1  NaN  NaN  NaN  NaN  1.0  NaN
3  1  NaN  NaN  1.0  NaN  1.0  1.0
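The output above marks absent elements as NaN, which also turns most columns into floats. If a strict 1/0 table is wanted, as the question describes, one possible follow-up is to fill the gaps with 0 and cast to integers. This is only a sketch built on the same lists; the name `out01` is made up for illustration.

```python
import pandas as pd

ss1 = ['a', 'b', 'c']
ss2 = ['d', 'c', 'b', 'a', 'f']
ss3 = ['a', 'g']
ss4 = ['a', 'd', 'g', 'h']
ls = [ss1, sorted(ss2), sorted(ss3), ss4]

# dict.fromkeys(x, 1) maps every element of one list to 1,
# e.g. ['a', 'g'] becomes {'a': 1, 'g': 1}.
# Keys missing from a row become NaN in the DataFrame, so
# fill them with 0 and cast so every cell is an integer 0 or 1.
out01 = (
    pd.DataFrame([dict.fromkeys(x, 1) for x in ls])
    .fillna(0)
    .astype(int)
)
print(out01)
```

Keep the NaN version instead if "missing" needs to stay distinguishable from an explicit 0.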

