How to count events in predefined time ranges

December 15, 2022

I want to count the events in each one-second interval of a CSV data file and draw a histogram from the results, but I don't understand how to get the number of events per second.
Can someone please help me with this?

My code:

from matplotlib import pyplot as pl
import pandas as pd
import numpy as np

def read_data():
    df = pd.read_csv("test.csv", usecols=['time', 'unix_time', 'name'])
    df['time'] = pd.to_datetime(df['time'])
    df['unix_time'] = (df['unix_time']).astype(int)
    df.info()

    i = 1

    time_counts = df.groupby((3600 * df.time.dt.minute + df.time.dt.second) // i * i)['time'].count()
    print(time_counts)


if __name__ == "__main__":
    read_data()

The output looks strange:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 33 entries, 0 to 32
Data columns (total 3 columns):
 #   Column     Non-Null Count  Dtype         
---  ------     --------------  -----         
 0   time       33 non-null     datetime64[ns]
 1   unix_time  33 non-null     int32         
 2   name       33 non-null     object        
dtypes: datetime64[ns](1), int32(1), object(1)
memory usage: 788.0+ bytes

time
18        1
25217     1
43209     1
43219     1
46804     1
54047     1
61241     1
64815     1
64833     1
68402     1
75620     1
79235     1
82806     1
82837     2
86407     1
86446     1
93625     1
97254     1
104446    1
140438    1
144050    1
162025    1
169250    1
180050    1
183623    1
183658    1
194404    1
194412    2
194433    1
194438    1
205219    1
Name: time, dtype: int64

The data in the CSV file:

time                    unix_time       name
2022-12-15 08:00:18.034 1671091218034   apple
2022-12-15 08:07:17.376 1671091637376   apple
2022-12-15 08:12:09.648 1671091929648   apple
2022-12-15 08:12:19.320 1671091939320   apple
2022-12-15 08:13:04.623 1671091984623   apple
2022-12-15 08:15:47.103 1671092147103   apple
2022-12-15 08:17:41.878 1671092261878   apple
2022-12-15 08:18:15.842 1671092295842   apple
2022-12-15 08:18:33.786 1671092313786   apple
2022-12-15 08:19:02.022 1671092342022   apple
2022-12-15 08:21:20.350 1671092480350   apple
2022-12-15 08:22:35.603 1671092555603   apple
2022-12-15 08:23:06.009 1671092586009   apple
2022-12-15 08:23:37.101 1671092617101   apple
2022-12-15 08:23:37.334 1671092617334   apple
2022-12-15 08:24:07.645 1671092647645   apple
2022-12-15 08:24:46.978 1671092686978   apple
2022-12-15 08:26:25.430 1671092785430   apple
2022-12-15 08:27:54.027 1671092874027   apple
2022-12-15 08:29:46.712 1671092986712   apple
2022-12-15 08:39:38.742 1671093578742   apple
2022-12-15 08:40:50.310 1671093650310   apple
2022-12-15 08:45:25.007 1671093925007   apple
2022-12-15 08:47:50.770 1671094070770   apple
2022-12-15 08:50:50.856 1671094250856   apple
2022-12-15 08:51:23.914 1671094283914   apple
2022-12-15 08:51:58.572 1671094318572   apple
2022-12-15 08:54:04.959 1671094444959   apple
2022-12-15 08:54:12.424 1671094452424   apple
2022-12-15 08:54:12.807 1671094452807   apple
2022-12-15 08:54:33.562 1671094473562   apple
2022-12-15 08:54:38.531 1671094478531   apple
2022-12-15 08:57:19.777 1671094639777   apple

Solution:

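The strange index comes from the grouping key: 3600 * minute + second is not a timestamp (08:07:17 becomes 3600 * 7 + 17 = 25217). Group on the timestamp itself with pd.Grouper at a one-second frequency; seconds without events appear with a count of 0: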

df['time'] = pd.to_datetime(df['time'])

time_counts = df.groupby(pd.Grouper(freq='1s', key='time'))['time'].count()
print(time_counts)
time
2022-12-15 08:00:18    1
2022-12-15 08:00:19    0
2022-12-15 08:00:20    0
2022-12-15 08:00:21    0
2022-12-15 08:00:22    0
                      ..
2022-12-15 08:57:15    0
2022-12-15 08:57:16    0
2022-12-15 08:57:17    0
2022-12-15 08:57:18    0
2022-12-15 08:57:19    1
Freq: S, Name: time, Length: 3422, dtype: int64
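
To make the Grouper approach easy to try without the CSV file, here is a minimal sketch on four of the rows from the question. The helper name count_events and its seconds parameter are my own, not part of pandas; the sketch only assumes the 'time' column is already datetime.

```python
import pandas as pd

# Small inline sample copied from the CSV rows in the question (no file needed).
df = pd.DataFrame({
    "time": pd.to_datetime([
        "2022-12-15 08:23:06.009",
        "2022-12-15 08:23:37.101",
        "2022-12-15 08:23:37.334",
        "2022-12-15 08:24:07.645",
    ]),
    "name": ["apple"] * 4,
})


def count_events(frame, seconds=1):
    """Count rows per fixed-width bin of `seconds` seconds (hypothetical helper)."""
    grouper = pd.Grouper(key="time", freq=f"{seconds}s")
    return frame.groupby(grouper)["time"].count()


per_second = count_events(df)            # every second in the range, empty ones as 0
non_empty = per_second[per_second > 0]   # keep only the seconds that contain events
per_30s = count_events(df, seconds=30)   # same approach with wider bins

print(non_empty)
print(per_30s)
```

Changing the seconds argument plays the role of the i variable in the question's code, so the same call should cover other predefined bin widths.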

Or use Series.dt.floor to drop the milliseconds, which keeps only the seconds that contain events:

df['time'] = pd.to_datetime(df['time'])

time_counts = df.groupby(df['time'].dt.floor('s'))['time'].count()

print(time_counts)
time
2022-12-15 08:00:18    1
2022-12-15 08:07:17    1
2022-12-15 08:12:09    1
2022-12-15 08:12:19    1
2022-12-15 08:13:04    1
2022-12-15 08:15:47    1
2022-12-15 08:17:41    1
2022-12-15 08:18:15    1
2022-12-15 08:18:33    1
2022-12-15 08:19:02    1
2022-12-15 08:21:20    1
2022-12-15 08:22:35    1
2022-12-15 08:23:06    1
2022-12-15 08:23:37    2
2022-12-15 08:24:07    1
2022-12-15 08:24:46    1
2022-12-15 08:26:25    1
2022-12-15 08:27:54    1
2022-12-15 08:29:46    1
2022-12-15 08:39:38    1
2022-12-15 08:40:50    1
2022-12-15 08:45:25    1
2022-12-15 08:47:50    1
2022-12-15 08:50:50    1
2022-12-15 08:51:23    1
2022-12-15 08:51:58    1
2022-12-15 08:54:04    1
2022-12-15 08:54:12    2
2022-12-15 08:54:33    1
2022-12-15 08:54:38    1
2022-12-15 08:57:19    1
Name: time, dtype: int64
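
The question also asks for a histogram. One possible way, sketched below on a few sample rows, is to draw the per-second counts as a bar chart with matplotlib. The Agg backend line, the axis labels, and the choice of a bar chart are my assumptions, not something given in the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display; drop for interactive use
from matplotlib import pyplot as pl
import pandas as pd

# Small inline sample copied from the CSV rows in the question.
df = pd.DataFrame({
    "time": pd.to_datetime([
        "2022-12-15 08:23:06.009",
        "2022-12-15 08:23:37.101",
        "2022-12-15 08:23:37.334",
        "2022-12-15 08:24:07.645",
    ]),
})

# Count events per second, keeping only the seconds that contain events.
time_counts = df.groupby(df["time"].dt.floor("s"))["time"].count()

# One bar per non-empty second, labelled with the time of day.
fig, ax = pl.subplots()
ax.bar(time_counts.index.strftime("%H:%M:%S"), time_counts.values)
ax.set_xlabel("time (1-second bins)")
ax.set_ylabel("number of events")
fig.autofmt_xdate()  # rotate the tick labels so they do not overlap
```

For an interactive session, remove the matplotlib.use line and call pl.show() at the end. With the Grouper version of time_counts the chart would also include every empty second, which is probably too many bars for an hour of data.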