Pandas new column from counts of column contents

I have a simple data frame and want to add a column that tells how many teams each project has, based on a dictionary that maps member names to teams.

The approach I came up with seems to work OK, but it doesn't look very smart.

What is a better way to do so? Thank you.

import pandas as pd
from io import StringIO

dict_name = {
    "William": "A",
    "James": "C",
    "Ava": "A",
    "Elijah": "A",
    "Mason": "B",
    "Ethan": "B",
    "Noah": "B",
    "Benjamin": "B",
    "Lucas": "B",
    "Oliver": "B",
    "Olivia": "C",
    "Emma": "C",
}

# Tab-separated sample data; the \t escapes stand in for literal tab characters.
csvfile = StringIO(
"""Project ID\tMembers
A58\tNoah, Oliver
A34\tWilliam, Elijah, James, Benjamin
A157\tLucas, Mason, Ethan, Olivia
A49\tEmma, Ava""")

df = pd.read_csv(csvfile, sep='\t')

final_count_list = []
final_which_list = []

for names in df.Members.to_list():
    # Look up the team of every member in this project
    team_list = []
    for each in names.split(', '):
        team_list.append(dict_name[each])

    # De-duplicate to get the distinct teams and how many there are
    final_count_list.append(len(list(set(team_list))))
    final_which_list.append(list(set(team_list)))

df['How many teams?'] = final_count_list
df['Which teams?'] = final_which_list

print(df)

Solution:

Approach 1: (faster; the := operator requires Python 3.8+, and each row's teams are stored as a set)

c = ['Which teams?', 'How many teams?']
df[c] = df['Members'].map(lambda x: (z:={dict_name[y] for y in x.split(', ')}, len(z))).tolist()
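
The one-liner above packs the lookup, de-duplication, and counting into a single lambda. A more spelled-out sketch of the same idea might look like the following. Here teams_for is a hypothetical helper name, the data frame is rebuilt inline so the snippet runs on its own, and sorting the teams is an added assumption to make the output deterministic (the one-liner keeps an unordered set instead).

```python
import pandas as pd

# Same name-to-team lookup as in the question.
dict_name = {
    "William": "A", "James": "C", "Ava": "A", "Elijah": "A",
    "Mason": "B", "Ethan": "B", "Noah": "B", "Benjamin": "B",
    "Lucas": "B", "Oliver": "B", "Olivia": "C", "Emma": "C",
}

# Same rows as the question's sample data, built directly for brevity.
df = pd.DataFrame({
    "Project ID": ["A58", "A34", "A157", "A49"],
    "Members": [
        "Noah, Oliver",
        "William, Elijah, James, Benjamin",
        "Lucas, Mason, Ethan, Olivia",
        "Emma, Ava",
    ],
})


def teams_for(members):
    """Return the sorted distinct teams for a ', '-separated member string."""
    return sorted({dict_name[name] for name in members.split(", ")})


# Hypothetical helper applied row by row; the count is just the list length.
df["Which teams?"] = df["Members"].map(teams_for)
df["How many teams?"] = df["Which teams?"].map(len)

print(df)
```

A name missing from dict_name raises a KeyError in this sketch, which is the same behavior as the one-liner.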

Approach 2: (looks better)

c = ['How many teams?', 'Which teams?']
df[c] = (
    df['Members']
    .str.split(', ')
    .explode()
    .map(dict_name)
    .groupby(level=0)
    .agg(['nunique', 'unique'])
)
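
To see why the chained version works, it can help to run it one step at a time. The sketch below does that on the question's data; the intermediate variable names (split_names, exploded, teams, summary) are illustrative and not part of the original answer, and the data frame is rebuilt inline so the snippet is self-contained.

```python
import pandas as pd

# Same name-to-team lookup as in the question.
dict_name = {
    "William": "A", "James": "C", "Ava": "A", "Elijah": "A",
    "Mason": "B", "Ethan": "B", "Noah": "B", "Benjamin": "B",
    "Lucas": "B", "Oliver": "B", "Olivia": "C", "Emma": "C",
}

df = pd.DataFrame({
    "Project ID": ["A58", "A34", "A157", "A49"],
    "Members": [
        "Noah, Oliver",
        "William, Elijah, James, Benjamin",
        "Lucas, Mason, Ethan, Olivia",
        "Emma, Ava",
    ],
})

# Step 1: turn each "a, b, c" string into a list of names.
split_names = df["Members"].str.split(", ")

# Step 2: one row per name; each name keeps the index of its original row.
exploded = split_names.explode()

# Step 3: replace every name with its team via the dictionary.
teams = exploded.map(dict_name)

# Step 4: group by the original row index (level 0) and summarize each group.
summary = teams.groupby(level=0).agg(["nunique", "unique"])

print(exploded.index.tolist())
print(summary)
```

Because explode repeats the original index for every name, groupby(level=0) collects the teams back into one group per project, and the resulting two columns are assigned to the new column names by position.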

Result (from Approach 2)

  Project ID                           Members  How many teams? Which teams?
0        A58                      Noah, Oliver                1          [B]
1        A34  William, Elijah, James, Benjamin                3    [A, C, B]
2       A157       Lucas, Mason, Ethan, Olivia                2       [B, C]
3        A49                         Emma, Ava                2       [C, A]