I have a DataFrame in which one of the columns contains strings of the form
0 word1|category1 word2|category2
1 word3|category3 word4|category4 word2|category2 ..
2 word1|category1 word4|category4 word3|category3 ..
where "word1|category1 word4|category4 word3|category3 .." is a single string.
I need an output dictionary that maps each unique word to its respective category.
I tried using series.apply(ast.literal_eval), but it throws an invalid syntax error.
>Solution :
If you need a dictionary for each row, use a nested list comprehension:
df['col'] = [dict(y.split('|') for y in x.split()) for x in df['col']]
print (df)
col
0 {'word1': 'category1', 'word2': 'category2'}
1 {'word3': 'category3', 'word4': 'category4', '...
2 {'word1': 'category1', 'word4': 'category4', '...
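For anyone who wants to try the per-row approach end to end, here is a minimal self-contained sketch. The sample rows are an assumption modelled on the question (the elided ".." parts are simply left out), and the names `row_dicts`, `text` and `token` are only illustrative.

```python
import pandas as pd

# Sample data modelled on the question; the real column may hold more tokens per row.
df = pd.DataFrame({
    "col": [
        "word1|category1 word2|category2",
        "word3|category3 word4|category4 word2|category2",
        "word1|category1 word4|category4 word3|category3",
    ]
})

# Outer loop: one dictionary per row.
# Inner generator: split the row on whitespace, then split each token on '|'
# into a [word, category] pair that dict() accepts as a key/value pair.
row_dicts = [dict(token.split("|") for token in text.split()) for text in df["col"]]

print(row_dicts[0])
```

Note that this assumes every token contains exactly one '|'; a token with none or with several would make dict() raise a ValueError.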
Or, if you need one big dictionary from all values, use Series.str.split with Series.explode, create a two-column DataFrame, and convert it to a dictionary:
d = df['col'].str.split().explode().str.split('|', expand=True).set_index(0)[1].to_dict()
print (d)
{'word1': 'category1', 'word2': 'category2', 'word3': 'category3', 'word4': 'category4'}
Another alternative:
d = dict(df['col'].str.split().explode().str.split("|").to_numpy())
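As a rough cross-check, the single-dictionary result can also be built with a plain Python loop and compared against the pandas chain. This is only a sketch on assumed sample data (the ".." parts from the question are omitted), and `d_pandas` and `d_plain` are illustrative names. Series.explode needs pandas 0.25 or newer.

```python
import pandas as pd

# Sample data modelled on the question; the original strings must still be in the column.
df = pd.DataFrame({
    "col": [
        "word1|category1 word2|category2",
        "word3|category3 word4|category4 word2|category2",
        "word1|category1 word4|category4 word3|category3",
    ]
})

# pandas chain: one token per row, then two columns (word, category), then a dict.
d_pandas = (
    df["col"]
    .str.split()                    # list of 'word|category' tokens per row
    .explode()                      # one token per row
    .str.split("|", expand=True)    # column 0 = word, column 1 = category
    .set_index(0)[1]
    .to_dict()
)

# Plain-Python equivalent: later occurrences of a word overwrite earlier ones.
d_plain = {}
for text in df["col"]:
    for token in text.split():
        word, category = token.split("|", 1)
        d_plain[word] = category

print(d_pandas == d_plain)
```

Both versions keep a single category per word, so if the same word could appear with different categories, only the last one seen survives.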