Can you set ValueCount() to output the corresponding entry as well as the value count?

Say you have the following data frame and you needed to know how many Assays were done per month.

type,"Date Tested"
Assay,2022/01/28
Assay,2022/01/31
Assay,2022/02/02
Assay,2022/03/31
Assay,2022/04/21
Assay,2022/05/12
Assay,2022/06/02
Assay,2022/02/03
Assay,2022/06/03

You can use value_counts() from Pandas to easily do this.

data['Date Tested']=pd.to_datetime(data['Date Tested'], format = "%Y/%m/%d")
months = data['Date Tested'].dt.month.value_counts(sort=False)
print(months)

Which outputs:

1    2
2    2
3    1
4    1
5    1
6    2
Name: Date Tested, dtype: int64

The ‘numbers’ in the first column are each month (i.e 01 – Jan, 02 – Feb etc..) but this isn’t great. What if the dataset started at March? Then March = 01.
Or what if I needed to do the same thing but by weeks? How could you workout what, say 12 was in terms of a week?

How can you modify the output of value_count to include the corresponding month/week? This information is present in the dataframe, shown by:

print(data['Date Tested'])

Which gives:

0   2022-01-28
1   2022-01-31
2   2022-02-02
3   2022-03-31
4   2022-04-21
5   2022-05-12
6   2022-06-02
7   2022-02-03
8   2022-06-03
Name: Date Tested, dtype: datetime64[ns]

Ideally, my count output would be something like this:

2022-01   2
2022-02   2
2022-03   1
2022-04   1
2022-05   1
2022-06   2
Name: Date Tested, dtype: datetime64[ns]

>Solution :

You can convert to_period (M = monthly) and count:

data.groupby(data['Date Tested'].dt.to_period('M'))['type'].count()

output:

Date Tested
2022-01    2
2022-02    2
2022-03    1
2022-04    1
2022-05    1
2022-06    2
Freq: M, Name: type, dtype: int64

By week:

data.groupby(data['Date Tested'].dt.to_period('W'))['type'].count()

output:

Date Tested
2022-01-24/2022-01-30    1
2022-01-31/2022-02-06    3
2022-03-28/2022-04-03    1
2022-04-18/2022-04-24    1
2022-05-09/2022-05-15    1
2022-05-30/2022-06-05    2
Freq: W-SUN, Name: type, dtype: int64

Leave a Reply