Hi i want to convert my dataframe from
df = pd.DataFrame({
'date': [
'2019-08-01', '2019-12-01',
'2019-23-01', '2019-15-01',
'2020-26-02', '2020-10-10',
'2020-08-10', '2021-01-04'
],
'loc': [
'axp','axp', 'axp', 'axp',
'axe', 'axp', 'axp', 'axe'
],
'category': [
'domestic', 'domestic', 'domestic',
'domestic',
'wild', 'domestic', 'domestic', 'wild'
],
'status': [
'found', 'found', 'not found', 'found',
'found', 'found', 'found', 'not found'
]
})
to this format where each loc
is reduced to the count of its associated category
and status
per month and year
loc_mnth_yr | n_domestic | n_wild | n_found | n_not_found |
---|---|---|---|---|
axp_01_2019 | 4 | 0 | 3 | 1 |
axe_02_2020 | 0 | 1 | 1 | 0 |
axp_10_2020 | 2 | 0 | 2 | 0 |
axe_04_2021 | 0 | 1 | 0 | 1 |
i have tried using the below codes but am not yet getting the desired results
df = df.resample('M')['category'].value_counts().to_frame().rename({'category':'count'},axis=1)
df = df.pivot_table(index='date', columns='category')
>Solution :
Use DataFrame.assign
for loc_mnth_yr
column, then unpivot by DataFrame.melt
ad count values by crosstab
:
df['date'] = pd.to_datetime(df['date'], format='%Y-%d-%m')
df = (df.assign(loc_mnth_yr = df['loc']+ '_' + df['date'].dt.strftime('%m_%Y'))
.melt(id_vars='loc_mnth_yr', value_vars=['category','status']))
df = pd.crosstab(df['loc_mnth_yr'], df['value']).add_prefix('n_')
print (df)
value n_domestic n_found n_not found n_wild
loc_mnth_yr
axe_02_2020 0 1 0 1
axe_04_2021 0 0 1 1
axp_01_2019 4 3 1 0
axp_10_2020 2 2 0 0
If order is important use:
df['date'] = pd.to_datetime(df['date'], format='%Y-%d-%m')
df = (df.assign(mnth_yr = df['date'].dt.strftime('%m_%Y'))
.melt(id_vars=['loc', 'mnth_yr'], value_vars=['category','status']))
df = (pd.crosstab([df['mnth_yr'], df['loc']], df['value'])
.reindex(columns=df['value'].unique())
.add_prefix('n_'))
df.index = [f'{b}_{a}' for a, b in df.index]
df = df.rename_axis(index='loc_mnth_yr', columns=None).reset_index()
print (df)
loc_mnth_yr n_domestic n_wild n_found n_not found
0 axp_01_2019 4 0 3 1
1 axe_02_2020 0 1 1 0
2 axe_04_2021 0 1 0 1
3 axp_10_2020 2 0 2 0