My input is this DataFrame:
import pandas as pd

data = {
    'ISSUE': ['#1', '#2', '#3', '#4', '#5', '#6', '#7', '#8', '#9', '#10', '#11', '#12'],
    'CATEGORY': ['CAT1', 'CAT1', 'CAT2', 'CAT3', 'CAT1', 'CAT3', 'CAT3', 'CAT2', 'CAT2', 'CAT2', 'CAT4', 'CAT5'],
    'AGENT': ['A', 'A', 'B', 'A', 'C', 'C', 'A', 'A', 'B', 'C', 'C', 'A'],
    'STATUS': ['FINISHED'] * 12
}
df = pd.DataFrame(data)
print(df)
ISSUE CATEGORY AGENT STATUS
0 #1 CAT1 A FINISHED
1 #2 CAT1 A FINISHED
2 #3 CAT2 B FINISHED
3 #4 CAT3 A FINISHED
4 #5 CAT1 C FINISHED
5 #6 CAT3 C FINISHED
6 #7 CAT3 A FINISHED
7 #8 CAT2 A FINISHED
8 #9 CAT2 B FINISHED
9 #10 CAT2 C FINISHED
10 #11 CAT4 C FINISHED
11 #12 CAT5 A FINISHED
Each category has an estimated resolution time based on its complexity:
# EASY (around 10 mins) : CAT1+CAT3+CAT5
# MEDIUM (around 2 hours) : CAT2
# HARD (>= 1 day) : CAT4
When using value_counts I got this:
AGENT
A 6
C 4
B 2
Name: count, dtype: int64
Someone interpreting these values might think that Agent A
is the most productive one. That could be true, but we need to verify it.
I started writing some code for that, but I don't know how to proceed so that I can weight the value counts:
complexity = {
'CAT1': 10/60,
'CAT2': 2,
'CAT3': 10/60,
'CAT4': 8*60,
'CAT5': 10/60
}
df['TIME'] = df['CATEGORY'].map(complexity)
df['AGENT'].value_counts() * df.groupby('AGENT')['TIME'].sum()
# AGENT
# A 17.000000
# B 8.000000
# C 1929.333333
# dtype: float64
Have you encountered a similar problem? I'm open to any suggestions.
>Solution :
Since you're now using a quantitative value, value_counts
is no longer appropriate; better to use a groupby.sum
:
df.groupby('AGENT', sort=False)['TIME'].sum().sort_values(ascending=False)
Output:
AGENT
C 482.333333
B 4.000000
A 2.833333
Name: TIME, dtype: float64
Or, if you want the productivity relative to the total number of tasks, use a mean
:
df.groupby('AGENT', sort=False)['TIME'].mean().sort_values(ascending=False)
Output:
AGENT
C 120.583333
B 2.000000
A 0.472222
Name: TIME, dtype: float64
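For completeness, here is a minimal end-to-end sketch of the whole approach (mapping categories to estimated times, then aggregating per agent), keeping the same time figures used in the question, including its mixed-unit 8*60 value for CAT4:

```python
import pandas as pd

# Rebuild the example DataFrame from the question
data = {
    'ISSUE': [f'#{i}' for i in range(1, 13)],
    'CATEGORY': ['CAT1', 'CAT1', 'CAT2', 'CAT3', 'CAT1', 'CAT3',
                 'CAT3', 'CAT2', 'CAT2', 'CAT2', 'CAT4', 'CAT5'],
    'AGENT': ['A', 'A', 'B', 'A', 'C', 'C', 'A', 'A', 'B', 'C', 'C', 'A'],
    'STATUS': ['FINISHED'] * 12,
}
df = pd.DataFrame(data)

# Map each category to its estimated resolution time (same figures as the question)
complexity = {'CAT1': 10/60, 'CAT2': 2, 'CAT3': 10/60, 'CAT4': 8*60, 'CAT5': 10/60}
df['TIME'] = df['CATEGORY'].map(complexity)

# Total estimated workload handled by each agent
total = df.groupby('AGENT')['TIME'].sum().sort_values(ascending=False)

# Average complexity of the tasks each agent handled
avg = df.groupby('AGENT')['TIME'].mean().sort_values(ascending=False)

print(total)
print(avg)
```

Note that sum and mean answer different questions: the sum measures total estimated work done, while the mean measures how complex an agent's average task was.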