Group Pandas DataFrame in Time Interval and Plot

Goal

Group a pandas DataFrame into 30-second intervals and extract the data for plotting.

Example

import pandas as pd

log = [
        ['2022/10/10_6:13:39', '6328f0c6ad70889fd28dcd07'],
        ['2022/10/10_6:13:49', '6328f0c6ad70889fd28dcd07'],
        ['2022/10/10_6:14:23', '6328f0c6ad70889fd28dcd07'],
        ['2022/10/10_6:14:25', '6328b959a5745f6fa5206fa6'],
        ['2022/10/10_6:15:4', '6328b959a5745f6fa5206fa6'],
        ['2022/10/10_6:15:52', '628fa4ac88be7ffeb9b7e7e3']]

df = pd.DataFrame(log,
                 columns=['timestamp', 'data'])

# convert to timestamp format
df['timestamp'] = pd.to_datetime(df['timestamp'],format='%Y/%m/%d_%H:%M:%S')

The dataframe:

            timestamp                      data
0 2022-10-10 06:13:39  6328f0c6ad70889fd28dcd07
1 2022-10-10 06:13:49  6328f0c6ad70889fd28dcd07
2 2022-10-10 06:14:23  6328f0c6ad70889fd28dcd07
3 2022-10-10 06:14:25  6328b959a5745f6fa5206fa6
4 2022-10-10 06:15:04  6328b959a5745f6fa5206fa6
5 2022-10-10 06:15:52  628fa4ac88be7ffeb9b7e7e3

My approach

# Group in intervals 
g = df.groupby(pd.Grouper(key='timestamp',freq='30s'))

The issue

  1. I would like to see the grouped dataframe. How do I do that?
  2. I would like to plot how many unique data values there were within each interval.

Solution

A groupby object only produces output once you apply an aggregation function. To inspect the values that fall into each interval, aggregate with list. To get the number of values per interval, use count(). You can then plot the counts as a bar plot.

grouped_data = df.groupby(pd.Grouper(key='timestamp',freq='30s')).agg(list)

grouped_counts = df.groupby(pd.Grouper(key='timestamp',freq='30s')).count()
grouped_counts.plot(kind='bar')
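Another way to look at the grouped data, as a minimal sketch: iterating over the groupby object yields pairs of (interval start, sub-dataframe), so each 30s bucket can be printed on its own. The variable names `groups`, `start` and `chunk` are just illustrative, and the data is the sample from the question.

```python
import pandas as pd

log = [
    ['2022/10/10_6:13:39', '6328f0c6ad70889fd28dcd07'],
    ['2022/10/10_6:13:49', '6328f0c6ad70889fd28dcd07'],
    ['2022/10/10_6:14:23', '6328f0c6ad70889fd28dcd07'],
    ['2022/10/10_6:14:25', '6328b959a5745f6fa5206fa6'],
    ['2022/10/10_6:15:4', '6328b959a5745f6fa5206fa6'],
    ['2022/10/10_6:15:52', '628fa4ac88be7ffeb9b7e7e3'],
]
df = pd.DataFrame(log, columns=['timestamp', 'data'])
df['timestamp'] = pd.to_datetime(df['timestamp'], format='%Y/%m/%d_%H:%M:%S')

g = df.groupby(pd.Grouper(key='timestamp', freq='30s'))

# Collect each interval's rows, keyed by the interval's start time
groups = {start: chunk for start, chunk in g}

for start, chunk in groups.items():
    if chunk.empty:
        # Intervals with no rows may also show up; skip them when printing
        continue
    print(start)
    print(chunk)
```

A single interval can then be looked up by its start time, for example `groups[pd.Timestamp('2022-10-10 06:14:00')]`.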

EDIT for unique values

If you want the number of unique values, aggregate each interval into a set and take the length of each set.

grouped_data = df.groupby(pd.Grouper(key='timestamp',freq='30s')).agg(set)
grouped_data['counts'] = grouped_data['data'].apply(len)
grouped_data.plot(y='counts', kind='bar')
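As an alternative sketch, pandas also has a built-in nunique() aggregation that counts the distinct values per interval directly, which should give the same counts without building the sets first. The name `unique_counts` is only illustrative, and the data is the sample from the question.

```python
import pandas as pd

log = [
    ['2022/10/10_6:13:39', '6328f0c6ad70889fd28dcd07'],
    ['2022/10/10_6:13:49', '6328f0c6ad70889fd28dcd07'],
    ['2022/10/10_6:14:23', '6328f0c6ad70889fd28dcd07'],
    ['2022/10/10_6:14:25', '6328b959a5745f6fa5206fa6'],
    ['2022/10/10_6:15:4', '6328b959a5745f6fa5206fa6'],
    ['2022/10/10_6:15:52', '628fa4ac88be7ffeb9b7e7e3'],
]
df = pd.DataFrame(log, columns=['timestamp', 'data'])
df['timestamp'] = pd.to_datetime(df['timestamp'], format='%Y/%m/%d_%H:%M:%S')

# Number of distinct 'data' values in each 30s interval
unique_counts = (
    df.groupby(pd.Grouper(key='timestamp', freq='30s'))['data']
      .nunique()
)
print(unique_counts)
```

Since the result is a Series indexed by interval start, `unique_counts.plot(kind='bar')` would draw the same kind of bar plot as above (this needs matplotlib installed).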