Home How to adjust the value counts based on a metric?

Questions

How to adjust the value counts based on a metric?

January 12, 2024

My input is this dataframe :

data = {
    'ISSUE': ['#1', '#2', '#3', '#4', '#5', '#6', '#7', '#8', '#9', '#10', '#11', '#12'],
    'CATEGORY': ['CAT1', 'CAT1', 'CAT2', 'CAT3', 'CAT1', 'CAT3', 'CAT3', 'CAT2', 'CAT2', 'CAT2', 'CAT4', 'CAT5'],
    'AGENT': ['A', 'A', 'B', 'A', 'C', 'C', 'A', 'A', 'B', 'C', 'C', 'A'],
    'STATUS': ['FINISHED'] * 12
}
df = pd.DataFrame(data)
print(df)

   ISSUE CATEGORY AGENT    STATUS
0     #1     CAT1     A  FINISHED
1     #2     CAT1     A  FINISHED
2     #3     CAT2     B  FINISHED
3     #4     CAT3     A  FINISHED
4     #5     CAT1     C  FINISHED
5     #6     CAT3     C  FINISHED
6     #7     CAT3     A  FINISHED
7     #8     CAT2     A  FINISHED
8     #9     CAT2     B  FINISHED
9    #10     CAT2     C  FINISHED
10   #11     CAT4     C  FINISHED
11   #12     CAT5     A  FINISHED

Each category has a complexity of resolution time :

# EASY  (around 10 mins) :  CAT1+CAT3+CAT5
# MEDIUM (around 2 hours) : CAT2
# HARD  (>= 1 day) : CAT4

When using a value counts I got this :

AGENT
A    6
C    4
B    2
Name: count, dtype: int64

For example one that interpret these values might think that Agent A is the most productive one. It could be but we need to verify that.

For that I started making a code but ignore how to purse so I can readjust the value counts :

complexity = {
    'CAT1': 10/60,
    'CAT2': 2,
    'CAT3': 10/60,
    'CAT4': 8*60,
    'CAT5': 10/60
}

df['TIME'] = df['CATEGORY'].map(complexity)

df['AGENT'].value_counts() * df.groupby('AGENT')['TIME'].sum()

# AGENT
# A      17.000000
# B       8.000000
# C    1929.333333
# dtype: float64

Have you guys encountered a similar subject ? I’m open to any suggestion.

>Solution :

Since you’re now using a quantitative value, value_counts is no longer appropriate, better use a groupby.sum:

df.groupby('AGENT', sort=False)['TIME'].sum().sort_values(ascending=False)

Output:

AGENT
C    482.333333
B      4.000000
A      2.833333
Name: TIME, dtype: float64

Or if you want to have the productivity relative to the total number of tasks, then a mean:

df.groupby('AGENT', sort=False)['TIME'].mean().sort_values(ascending=False)

Output:

AGENT
C    120.583333
B      2.000000
A      0.472222
Name: TIME, dtype: float64

pandas

byMR

Published January 12, 2024

Add a comment

C++ Adding two uint16_t does not compile in clang

byMR

January 12, 2024

Questions

Stop compiling if a called function is not supported under OS

byMR

January 12, 2024

Questions

"No matching distribution" for pip install on WSL

byMR

January 12, 2024

Questions

Huge delay for *ngIf due to http.get

byMR

January 12, 2024

Questions

Can't find contact in state

byMR

January 12, 2024

Questions

why update is not working when customising the save() in the model class?

byMR

January 12, 2024

How to adjust the value counts based on a metric?

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Like this:

Leave a ReplyCancel reply

Read more

C++ Adding two uint16_t does not compile in clang

Stop compiling if a called function is not supported under OS

"No matching distribution" for pip install on WSL

Huge delay for *ngIf due to http.get

Can't find contact in state

why update is not working when customising the save() in the model class?

Keep Up to Date with the Most Important News

How to adjust the value counts based on a metric?

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Share this:

Like this:

Leave a ReplyCancel reply

Keep Up to Date with the Most Important News

Read more

C++ Adding two uint16_t does not compile in clang

Stop compiling if a called function is not supported under OS

"No matching distribution" for pip install on WSL

Huge delay for *ngIf due to http.get

Can't find contact in state

why update is not working when customising the save() in the model class?

Discover more from Dev solutions