Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Average transactions per active week Pandas

I have a df that looks like this

user_id customer_id transaction_amount  transaction_type    transaction_date
0       0             10000               debit               2020-09-08
1       1             20000               debit               2020-09-08

Now I want to calculate the average amount of transactions of each user in the weeks. So far I have achieved

df.groupby(['user_id', 'transaction_date']).mean()

Which gives me the output in the following format

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel


                                                       transaction_amount
user_id                              transaction_date   
1                                       2021-03-08      36114.913043
                                        2021-03-09      13425.000000
                                        2021-03-23      14915.200000
                                        2021-03-24      9762.000000

2                                       2021-03-08      1234114.913043
                                        2021-02-08      1354325.000000
                                        2021-12-23      141241915.200000
                                        2021-5-21       9735162.000000

How can I get it in the format of

user_id | customer_id | average_transactions_per_week 
_______________________________________________________
1          1                5 # 5 transactions in 1 week
2          2                8 # 8 transactions in 1 week

>Solution :

IIUC, you shouldn’t use sum but count. Also, to group by week, you need to compute a grouper, here we can use a period:

df.groupby(['user_id', 'customer_id', pd.to_datetime(df['transaction_date']).dt.to_period('W')]).agg(transactions_per_week=('user_id', 'count'))

Output:

                                           transactions_per_week
user_id customer_id transaction_date                            
0       0           2020-09-07/2020-09-13                      1
1       1           2020-09-07/2020-09-13                      1

Now to get the average number of transactions, you need a second groupby:

(df
 .groupby(['user_id', 'customer_id', pd.to_datetime(df['transaction_date']).dt.to_period('W')])
 .agg(average_transactions_per_week=('user_id', 'count'))
 .groupby(['user_id', 'customer_id']).mean()
 )

Output:

                     average_transactions_per_week
user_id customer_id                               
0       0                                      1.0
1       1                                      1.0
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading