How can I calculate the value counts within a string column?
df col
0 fruit["apple"], colour["green", "yellow" ]
1 colour["yellow"]
2 colour["brown"]
Expected Output
fruit 1
colour 3
>Solution :
Use Series.str.extractall with the substrings joined by | (regex alternation), so that every occurrence in each row is captured:
s = df['col'].str.extractall('(fruit|colour)')[0].value_counts()
print(s)
colour 3
fruit 1
Name: 0, dtype: int64
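If the substrings come from a list rather than being typed into the pattern by hand, the alternation can be built with "|".join. A minimal sketch, assuming a hypothetical list named keys and rebuilding the sample frame from the question; re.escape is added here as a precaution in case a key contains regex metacharacters:

```python
import re

import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "col": [
        'fruit["apple"], colour["green", "yellow" ]',
        'colour["yellow"]',
        'colour["brown"]',
    ]
})

# Hypothetical list of substrings to count (not part of the original post).
keys = ["fruit", "colour"]

# Build "(fruit|colour)"; re.escape guards against special characters in keys.
pattern = "(" + "|".join(re.escape(k) for k in keys) + ")"

# extractall returns one row per match; column 0 holds the captured group.
counts = df["col"].str.extractall(pattern)[0].value_counts()
print(counts.to_dict())
```

Keys that never occur are simply absent from the result; counts.reindex(keys, fill_value=0) is one way to show them with a zero count.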
Or, for a more dynamic solution, capture the word immediately before each [:
s = df['col'].str.extractall(r'(\w+)\[')[0].value_counts()
print(s)
colour 3
fruit 1
Name: 0, dtype: int64
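For reference, the dynamic variant can be wrapped in a small helper and run end to end. This is only a sketch: the function name count_labels is hypothetical, and it assumes every label is a run of word characters directly followed by [, as in the sample data:

```python
import pandas as pd


def count_labels(series: pd.Series) -> pd.Series:
    """Count every word that is immediately followed by '['."""
    # One row per match; column 0 is the single capture group.
    matches = series.str.extractall(r"(\w+)\[")
    return matches[0].value_counts()


# Sample data rebuilt from the question.
df = pd.DataFrame({
    "col": [
        'fruit["apple"], colour["green", "yellow" ]',
        'colour["yellow"]',
        'colour["brown"]',
    ]
})

label_counts = count_labels(df["col"])
print(label_counts.to_dict())
```

Because the pattern is not tied to specific names, any new label that appears before a [ is counted without changing the code.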