Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Can you set ValueCount() to output the corresponding entry as well as the value count?

Say you have the following data frame and you needed to know how many Assays were done per month.

type,"Date Tested"
Assay,2022/01/28
Assay,2022/01/31
Assay,2022/02/02
Assay,2022/03/31
Assay,2022/04/21
Assay,2022/05/12
Assay,2022/06/02
Assay,2022/02/03
Assay,2022/06/03

You can use value_counts() from Pandas to easily do this.

data['Date Tested']=pd.to_datetime(data['Date Tested'], format = "%Y/%m/%d")
months = data['Date Tested'].dt.month.value_counts(sort=False)
print(months)

Which outputs:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

1    2
2    2
3    1
4    1
5    1
6    2
Name: Date Tested, dtype: int64

The ‘numbers’ in the first column are each month (i.e 01 – Jan, 02 – Feb etc..) but this isn’t great. What if the dataset started at March? Then March = 01.
Or what if I needed to do the same thing but by weeks? How could you workout what, say 12 was in terms of a week?

How can you modify the output of value_count to include the corresponding month/week? This information is present in the dataframe, shown by:

print(data['Date Tested'])

Which gives:

0   2022-01-28
1   2022-01-31
2   2022-02-02
3   2022-03-31
4   2022-04-21
5   2022-05-12
6   2022-06-02
7   2022-02-03
8   2022-06-03
Name: Date Tested, dtype: datetime64[ns]

Ideally, my count output would be something like this:

2022-01   2
2022-02   2
2022-03   1
2022-04   1
2022-05   1
2022-06   2
Name: Date Tested, dtype: datetime64[ns]

>Solution :

You can convert to_period (M = monthly) and count:

data.groupby(data['Date Tested'].dt.to_period('M'))['type'].count()

output:

Date Tested
2022-01    2
2022-02    2
2022-03    1
2022-04    1
2022-05    1
2022-06    2
Freq: M, Name: type, dtype: int64

By week:

data.groupby(data['Date Tested'].dt.to_period('W'))['type'].count()

output:

Date Tested
2022-01-24/2022-01-30    1
2022-01-31/2022-02-06    3
2022-03-28/2022-04-03    1
2022-04-18/2022-04-24    1
2022-05-09/2022-05-15    1
2022-05-30/2022-06-05    2
Freq: W-SUN, Name: type, dtype: int64
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading