I am trying to update a MultiIndex DataFrame with derived data.
My DataFrame is a time series indexed by two levels, ‘Vehicle_ID’ and ‘Frame_ID’. I iterate over each Vehicle_ID in order, compute exponentially weighted averages to clean the data, and try to merge the resulting columns back into the original MultiIndex DataFrame.
Example Code:
v_ids = trajec.index.get_level_values('Vehicle_ID').unique().values
for id in v_ids:
    ewm_x = trajec.loc[(id,), 'Local_X'].ewm(span=T_pos/dt).mean()
    ewm_y = trajec.loc[(id,), 'Local_Y'].ewm(span=T_pos_x/dt).mean()
    smooth = pd.DataFrame({'Vehicle_ID': id, 'Frame_ID': ewm_y.index.values, 'ewm_y': ewm_y, 'ewm_x': ewm_x}).set_index(['Vehicle_ID', 'Frame_ID'])
    trajec.join(smooth)
This works outside the loop: the join adds the values to the trajec DataFrame. Inside the loop, however, the result seems to be overwritten on each iteration.
                     Local_X   Local_Y  v_Length  v_Width  v_Class  v_Vel  v_Acc  Lane_ID  Preceding  Following  Space_Headway  Time_Headway
Vehicle_ID Frame_ID
1          12         16.884    48.213      14.3      6.4        2  12.50    0.0        2          0          0           0.00          0.00
           13         16.938    49.463      14.3      6.4        2  12.50    0.0        2          0          0           0.00          0.00
           14         16.991    50.712      14.3      6.4        2  12.50    0.0        2          0          0           0.00          0.00
           15         17.045    51.963      14.3      6.4        2  12.50    0.0        2          0          0           0.00          0.00
           16         17.098    53.213      14.3      6.4        2  12.50    0.0        2          0          0           0.00          0.00
...        ...           ...       ...       ...      ...      ...    ...    ...      ...        ...        ...            ...           ...
2911       8588       53.693  1520.312      14.9      5.9        2  31.26    0.0        5       2910       2915          78.19          2.50
           8589       53.719  1523.437      14.9      5.9        2  31.26    0.0        5       2910       2915          78.26          2.50
           8590       53.746  1526.564      14.9      5.9        2  31.26    0.0        5       2910       2915          78.41          2.51
           8591       53.772  1529.689      14.9      5.9        2  31.26    0.0        5       2910       2915          78.61          2.51
           8592       53.799  1532.830      14.9      5.9        2  30.70    5.9        5       2910       2915          78.81          2.57
DataFrame excerpt.
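For reference, the symptom can be reproduced on a toy frame. This is only a sketch with made-up data (the names toy and extra are hypothetical, not from my real code). It shows that DataFrame.join returns a new DataFrame and leaves the frame it was called on unchanged:

```python
import pandas as pd

# Hypothetical two-vehicle frame, indexed like `trajec`.
index = pd.MultiIndex.from_tuples(
    [(1, 12), (1, 13), (2, 12), (2, 13)],
    names=["Vehicle_ID", "Frame_ID"],
)
toy = pd.DataFrame({"Local_X": [16.884, 16.938, 20.0, 20.5]}, index=index)

# Derived column for vehicle 1 only (made-up values).
extra = pd.DataFrame({"ewm_x": [16.884, 16.92]}, index=index[:2])

joined = toy.join(extra)      # join returns a new DataFrame
print(list(toy.columns))      # `toy` itself still has no 'ewm_x' column
print(list(joined.columns))   # the new column exists only in `joined`
```

Rows for vehicle 2 in joined hold NaN in ewm_x, because extra only covers vehicle 1.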
>Solution :
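trajec.join(smooth) returns a new DataFrame rather than modifying trajec in place, so inside the loop the result of each iteration is discarded. You can create an empty DataFrame outside the loop to store the results, concatenate the result of each iteration onto it, and join everything to trajec once after the loop.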
v_ids = trajec.index.get_level_values('Vehicle_ID').unique().values
results = pd.DataFrame() # empty dataframe to store results
for id in v_ids:
    ewm_x = trajec.loc[(id,), 'Local_X'].ewm(span=T_pos/dt).mean()
    ewm_y = trajec.loc[(id,), 'Local_Y'].ewm(span=T_pos_x/dt).mean()
    smooth = pd.DataFrame({'Vehicle_ID': id, 'Frame_ID': ewm_y.index.values, 'ewm_y': ewm_y, 'ewm_x': ewm_x}).set_index(['Vehicle_ID', 'Frame_ID'])
    results = pd.concat([results, smooth]) # concatenate results from each iteration
# join the results to the original dataframe
trajec = trajec.join(results)
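As a possible alternative to the explicit loop, the same per-vehicle smoothing can probably be written with groupby and transform, which keeps the original MultiIndex and so needs no join at all. The following is a minimal sketch, not a drop-in replacement: the sample data, the span value, and the helper name ewm_mean are all assumptions made for illustration, since the question does not show T_pos, T_pos_x or dt.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for T_pos / dt; the real values are not shown.
span = 5.0

# Made-up sample data with the same index layout as `trajec`.
index = pd.MultiIndex.from_product(
    [[1, 2], range(12, 17)], names=["Vehicle_ID", "Frame_ID"]
)
trajec = pd.DataFrame(
    {
        "Local_X": np.linspace(16.0, 17.0, len(index)),
        "Local_Y": np.linspace(48.0, 60.0, len(index)),
    },
    index=index,
)

def ewm_mean(series):
    """Exponentially weighted mean of one vehicle's series (hypothetical helper)."""
    return series.ewm(span=span).mean()

# transform applies the helper per vehicle and returns a result
# aligned with the original index.
by_vehicle = trajec.groupby(level="Vehicle_ID")
ewm_x = by_vehicle["Local_X"].transform(ewm_mean)
ewm_y = by_vehicle["Local_Y"].transform(ewm_mean)

# assign returns a new DataFrame, so the result is bound back to `trajec`.
trajec = trajec.assign(ewm_x=ewm_x, ewm_y=ewm_y)
print(trajec.head())
```

Because the smoothing restarts for each Vehicle_ID group, the values should match what the loop computes per vehicle, while avoiding the repeated pd.concat calls, which copy the accumulated results on every iteration.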