How can I count the number of occurrences of a given string in a string array in pandas?

I want to see which tags occur most frequently in my dataset. When I try to do this on my own, I get something like this:

df['tags'].value_counts()

['Startup'] 80
['Bitcoin'] 79
['The Daily Pick'] 78
['Addiction', 'Health', 'Body', 'Alcohol', 'Mental Health'] 62

Some articles have several tags, but I would like to count the occurrences of each tag separately.

Solution:

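IIUC, the tags are stored as strings that look like lists, so you need to parse each one with ast.literal_eval, split the resulting lists into one row per tag with explode(), and then call value_counts().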

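from ast import literal_eval
import pandas as pd

# Parse each string such as "['Addiction', 'Health']" into a real list,
# give every tag its own row, then count how often each tag appears.
res = df['tags'].apply(literal_eval).explode().value_counts()
print(res)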

Output:

Startup      4
Bitcoin      3
Addiction    2
Health       2
Name: tags, dtype: int64

Sample input DataFrame:

df = pd.DataFrame({
    "tags" : [
        "['Startup']", "['Startup']", "['Startup']", "['Startup']",
        "['Bitcoin']", "['Bitcoin']", "['Bitcoin']", 
        "['Addiction', 'Health']", "['Addiction', 'Health']"
    ]
})
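
For convenience, here is a minimal end-to-end sketch that combines the sample DataFrame with the one-liner from the answer. The helper `to_list` and the variables `tag_counts` and `top_two` are hypothetical names added for illustration; the helper only guards against the case where a cell already holds a real list rather than a string.

```python
from ast import literal_eval

import pandas as pd

df = pd.DataFrame({
    "tags": [
        "['Startup']", "['Startup']", "['Startup']", "['Startup']",
        "['Bitcoin']", "['Bitcoin']", "['Bitcoin']",
        "['Addiction', 'Health']", "['Addiction', 'Health']",
    ]
})


def to_list(value):
    # Hypothetical helper: parse the cell only if it is still a string,
    # so the same code also works when the column already contains lists.
    return literal_eval(value) if isinstance(value, str) else value


# One row per tag, then count how often each tag occurs.
tag_counts = df["tags"].apply(to_list).explode().value_counts()

# value_counts() sorts in descending order, so head(n) gives the n most frequent tags.
top_two = tag_counts.head(2)
print(top_two)
```

If the column already contains lists, df['tags'].explode().value_counts() on its own should be enough.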

Thanks to @ljmc:

NB: ast.literal_eval is not always safe. From the documentation:

This function had been documented as “safe” in the past without defining what that meant. That was misleading. This is specifically designed not to execute Python code, unlike the more general eval(). […] But it is not free from attack: A relatively small input can lead to memory exhaustion or to C stack exhaustion, crashing the process. There is also the possibility for excessive CPU consumption denial of service on some inputs. Calling it on untrusted data is thus not recommended.
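
If the tag strings come from an untrusted source, one possible alternative is to skip literal_eval and pull the tags out with a regular expression. This is only a sketch, and it swaps in a different technique from the answer above: it assumes every tag is wrapped in single quotes and contains no apostrophe, which may not hold for real data. The sample DataFrame and the names `tags` and `counts` are made up for illustration.

```python
import pandas as pd

# Illustrative data in the same string format as the question.
df = pd.DataFrame({
    "tags": [
        "['Startup']",
        "['Bitcoin']",
        "['Addiction', 'Health']",
        "['Startup', 'Health']",
    ]
})

# Assumption: tags are single-quoted and contain no apostrophes.
# str.findall returns, for each row, the list of texts captured between quotes;
# nothing is evaluated as Python.
tags = df["tags"].str.findall(r"'([^']*)'")

# Same counting step as before: one row per tag, then count.
counts = tags.explode().value_counts()
print(counts)
```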
