Home How to count values from one column based on the description in other column in pandas?

Questions

How to count values from one column based on the description in other column in pandas?

May 21, 2022

I am new to programming and Python. Lately, I am learning to use pandas.

What I would like to know

I am wondering what would be the best approach to work only on numbers related to Group II (in the attached DataFrame). I mean e.g. sum all grades for group II and column ‘Project’. Sure it won’t make sense to sum grades, but the data is just for illustration purposes.

I’d be grateful for any advices and suggestions.

My DataFrame

The code attached will generate random numbers (except for the ‘Group’ column) but the DataFrame will always be like that:

     Name Album  Group  Colloquium_1  Colloquium_2  Project
# 0    B  61738     I             5             4        5
# 1    Z  44071     I             5             5        2
# 2    M  87060     I             5             5        5
# 3    L  67974     I             3             5        3
# 4    Z  15617     I             3             2        3
# 5    Z  91872    II             2             4        5
# 6    H  84685    II             4             2        5
# 7    T  17943    II             2             5        2
# 8    L  54302    II             2             5        3
# 9    O  53433    II             5             4        5

Code to generate my DataFrame:

import pandas as pd
import random as rd

def gen_num():
    num = ""
    for i in range(5):
        num += str(rd.randint(0,9))
    return num

names = ['A','B','C','D','E','F','G','H','I','J','K', 'L','M','N','O', \
             'P','R','S','T','W','Z']

list_names = []
list_album = []
list_group = []
list_coll_1 = []
list_coll_2 = []
list_project = []  

num_of_students = 10

for i in range(num_of_students):
    
    list_names.append(rd.choice(names))
    list_album.append(gen_num())
    list_coll_1.append(rd.randint(2, 5))
    list_coll_2.append(rd.randint(2, 5))
    list_project.append(rd.randint(2, 5)) 
    if i < (num_of_students / 2):
        list_group.append('I')
    else:
        list_group.append('II')
        
group = pd.DataFrame(list_names)
group.set_axis(['Name'], axis=1, inplace=True)
group['Album'] = list_album
group['Group'] = list_group
group['Colloquium_1'] = list_coll_1
group['Colloquium_2'] = list_coll_2
group['Project'] = list_project

>Solution :

One solution to this is to filter the DataFrame first:

group[group["Group"] == "II"]["Project"].sum()
#Out: 18

Breaking this up into parts:

First, this part returns a series of bools (True/False) for each row as to whether the values in "Group" are equal to "II":

group["Group"] == "II"
#0    False
#1    False
#2    False
#3    False
#4    False
#5     True
#6     True
#7     True
#8     True
#9     True
#Name: Group, dtype: bool

Next, writing this into group[] returns a filtered dataframe for those rows that are True:

group[group["Group"] == "II"]
#  Name  Album Group  Colloquium_1  Colloquium_2  Project
#5    E  77371    II             4             5        3
#6    N  90525    II             4             3        3
#7    H  89889    II             3             4        5
#8    T  88154    II             3             4        5
#9    E  56176    II             3             2        2

Using ["Project"] on the end returns a pandas Series of the values in the column:

group[group["Group"] == "II"]["Project"]
#5    3
#6    3
#7    5
#8    5
#9    2
#Name: Project, dtype: int64

And lastly .sum() returns the sum of the series (18).

byMR

Published May 21, 2022

Add a comment

Best way to implement fixed size array in C++

byMR

May 21, 2022

Questions

How to exclude some text between two patterns?

byMR

May 21, 2022

Questions

A pending promise is returned from stripe.checkout.sessions.listLineItems each time

byMR

May 21, 2022

Questions

How do I fix "AttributeError: 'TextChannel' object has no attribute 'news'" in Pycord?

byMR

May 21, 2022

Questions

Why are these centered divs shifted upwards and how to fix this?

byMR

May 21, 2022

Questions

Plot looks different everytime i run the code

byMR

May 21, 2022

How to count values from one column based on the description in other column in pandas?

What I would like to know

MEDevel.com: Open-source for Healthcare and Education

My DataFrame

Code to generate my DataFrame:

>Solution :

Like this:

Leave a ReplyCancel reply

Read more

Best way to implement fixed size array in C++

How to exclude some text between two patterns?

A pending promise is returned from stripe.checkout.sessions.listLineItems each time

How do I fix "AttributeError: 'TextChannel' object has no attribute 'news'" in Pycord?

Why are these centered divs shifted upwards and how to fix this?

Plot looks different everytime i run the code

Keep Up to Date with the Most Important News

How to count values from one column based on the description in other column in pandas?

What I would like to know

MEDevel.com: Open-source for Healthcare and Education

My DataFrame

Code to generate my DataFrame:

>Solution :

Share this:

Like this:

Leave a ReplyCancel reply

Keep Up to Date with the Most Important News

Read more

Best way to implement fixed size array in C++

How to exclude some text between two patterns?

A pending promise is returned from stripe.checkout.sessions.listLineItems each time

How do I fix "AttributeError: 'TextChannel' object has no attribute 'news'" in Pycord?

Why are these centered divs shifted upwards and how to fix this?

Plot looks different everytime i run the code

Discover more from Dev solutions