Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Count columns with multiple values

I have this dataframe

df = pd.DataFrame({
    "col1": ["Kev", "Kev", "Fr"],
    "col2": ["Red; Purple", "Yellow; Purple; Red", "Red; Yellow"], }, index=["1", "2", "3"])

It’ll look like this

    col1 col2
1   Kev  Red; Purple
2   Kev  Yellow; Purple; Red
3   Fr   Red; Yellow

I want to count all the items in col2 according to col1. In this case the final df will be like this:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

    col1 col2   count
1   Kev  Red    2
2   Kev  Purple 2
3   Kev  Yellow 1
4   Fr   Red    1
5   Fr   Yellow 1

I tried using explode:

df2 = (df.explode(df.columns.tolist())
      .apply(lambda col: col.str.split(';'))
      .explode('col1')
      .explode('col2'))

but that only gives me col1 and col2 of my desired dataframe, not the count. If I use crosstab on df2, I’ll get a very different result.

I managed to get the desired output with 2 nested for loops, but my dataframe is so big that it takes almost a minute loading the function. I want to avoid this solution.

>Solution :

After pd.crosstab, you can try melt

df2 = (df.explode(df.columns.tolist())
      .apply(lambda col: col.str.split('; ')) # <-- space here
      .explode('col1')
      .explode('col2'))


out = (pd.crosstab(df2['col1'], df2['col2'])
       .melt(value_name='count', ignore_index=False)
       .reset_index())
print(out)

  col1    col2  count
0   Fr  Purple      0
1  Kev  Purple      2
2   Fr     Red      1
3  Kev     Red      2
4   Fr  Yellow      1
5  Kev  Yellow      1
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading