I have a DataFrame:
- user_id object
- local time datetime64[ns]
- value int32
  user_id          local time  value
0   user1 2023-01-01 00:00:00      3
1   user1 2023-01-01 00:00:00      3
2   user1 2023-01-01 01:00:00      7
3   user1 2023-01-01 01:00:00      2
4   user2 2023-01-01 02:00:00      4
5   user2 2023-01-01 02:00:00     10
6   user2 2023-01-01 03:00:00      7
7   user2 2023-01-01 03:00:00      2
I want to:
- group by user_id
- take the mean of the columns "local time" (time of day only, HH:MM:SS, not the full datetime) and "value"
import pandas as pd
import numpy as np
# Set the random seed for reproducibility
np.random.seed(123)
# Define the number of users and values
num_users = 2
num_values = 4
# Generate the user IDs
user_ids = ['user{}'.format(i+1) for i in range(num_users)]
# Generate the local time values
local_time = pd.date_range(start='2023-01-01 00:00:00', periods=num_values, freq='H')
# Generate the random values
values = np.random.randint(1, 11, size=(num_values*num_users))
# Create the DataFrame
df = pd.DataFrame({
'user_id': np.repeat(user_ids, num_values),
'local time': np.repeat(local_time, num_users),
'value': values})
# Goal: the mean of the time of day in "local time", not of the full datetime
print(df)
print("expected_output")
'''
local time value
user1 00:30:00 3.75
user2 02:30:00 5.75
'''
df.groupby('user_id').mean()
Expected Output:
I want the mean of the time (hours, minutes and seconds, not the date) and the mean of value, grouped by user:
       local time  value
user1    00:30:00   3.75
user2    02:30:00   5.75
>Solution :
In older pandas versions (before 2.0), mean aggregates only numeric columns by default, so the "local time" column is dropped. Pass numeric_only=False to average the datetime column as well:
>>> df.groupby('user_id', as_index=False).mean(numeric_only=False)
  user_id          local time  value
0   user1 2023-01-01 00:30:00   3.75
1   user2 2023-01-01 02:30:00   5.75
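This averages full timestamps, so the result still carries a date. Taking only the time part of that result is not, in general, the same as averaging the time of day. A minimal sketch of the difference, using two made-up timestamps that are not from the question:

```python
import pandas as pd

# Hypothetical timestamps (not from the question) that fall on two different dates
ts = pd.Series(pd.to_datetime(["2023-01-01 23:00:00", "2023-01-02 01:00:00"]))

# Mean of the full timestamps: the midpoint between the two instants
datetime_mean = ts.mean()

# Mean of the time of day: average each timestamp's offset from its own midnight
time_of_day = ts - ts.dt.normalize()
time_mean = time_of_day.mean()

print(datetime_mean)  # 2023-01-02 00:00:00
print(time_mean)      # 0 days 12:00:00
```

When every row falls on the same date, as in the question's data, the two approaches agree on the time part. They diverge as soon as the dates differ.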
Update
I want the mean of the time, not the datetime (so just hours, minutes, seconds)
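# Convert each time of day to a timedelta, average, then strip the leading "0 days " from the string
mean_time = lambda x: str(pd.to_timedelta(x.dt.time.astype(str)).mean())[7:]
out = (df.groupby('user_id', as_index=False)
         .agg({'local time': mean_time, 'value': 'mean'}))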
  user_id local time  value
0   user1   00:30:00   3.75
1   user2   02:30:00   5.75
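As a possible alternative that avoids slicing a string, the time of day can be averaged with timedelta arithmetic and returned as a datetime.time object. This is only a sketch: the helper name mean_time_of_day is made up for illustration, and the DataFrame is written out literally from the values printed in the question rather than regenerated from the random seed.

```python
import pandas as pd

# The question's DataFrame, written out literally from its printed values
df = pd.DataFrame({
    "user_id": ["user1"] * 4 + ["user2"] * 4,
    "local time": pd.to_datetime([
        "2023-01-01 00:00:00", "2023-01-01 00:00:00",
        "2023-01-01 01:00:00", "2023-01-01 01:00:00",
        "2023-01-01 02:00:00", "2023-01-01 02:00:00",
        "2023-01-01 03:00:00", "2023-01-01 03:00:00",
    ]),
    "value": [3, 3, 7, 2, 4, 10, 7, 2],
})


def mean_time_of_day(x):
    """Average the time of day of a datetime Series, ignoring the date.

    Hypothetical helper name, introduced here for illustration only.
    """
    # Offset of each timestamp from midnight of its own day
    since_midnight = x - x.dt.normalize()
    # Add the mean offset to an arbitrary midnight and keep only the time part
    return (pd.Timestamp(0) + since_midnight.mean()).time()


out = (df.groupby("user_id", as_index=False)
         .agg({"local time": mean_time_of_day, "value": "mean"}))
print(out)
```

The "local time" column of out then holds datetime.time objects instead of strings, which may be easier to compare or sort later on.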