Pandas MultiIndex updating with derived values

I am trying to update a MultiIndex dataframe with derived data.

My dataframe is a time series with 'Vehicle_ID' and 'Frame_ID' as the index levels. I iterate through each Vehicle_ID in order, compute exponentially weighted averages to clean the data, and then try to merge the resulting columns back into the original MultiIndex dataframe.

Example Code:

v_ids = trajec.index.get_level_values('Vehicle_ID').unique().values
for id in v_ids:
    ewm_x = trajec.loc[(id,), 'Local_X'].ewm(span=T_pos/dt).mean()
    ewm_y = trajec.loc[(id,), 'Local_Y'].ewm(span=T_pos_x/dt).mean()

    smooth = pd.DataFrame({'Vehicle_ID': id, 'Frame_ID': ewm_y.index.values, 'ewm_y': ewm_y, 'ewm_x': ewm_x}).set_index(['Vehicle_ID', 'Frame_ID'])
    trajec.join(smooth)

This works outside of the loop, where it joins the values to the trajec dataframe. But inside the loop, each iteration seems to overwrite the previous one.

                     Local_X   Local_Y  v_Length  v_Width  v_Class  v_Vel  v_Acc  Lane_ID  Preceding  Following  Space_Headway  Time_Headway
Vehicle_ID Frame_ID
1          12         16.884    48.213      14.3      6.4        2  12.50    0.0        2          0          0           0.00          0.00
           13         16.938    49.463      14.3      6.4        2  12.50    0.0        2          0          0           0.00          0.00
           14         16.991    50.712      14.3      6.4        2  12.50    0.0        2          0          0           0.00          0.00
           15         17.045    51.963      14.3      6.4        2  12.50    0.0        2          0          0           0.00          0.00
           16         17.098    53.213      14.3      6.4        2  12.50    0.0        2          0          0           0.00          0.00
...
2911       8588       53.693  1520.312      14.9      5.9        2  31.26    0.0        5       2910       2915          78.19          2.50
           8589       53.719  1523.437      14.9      5.9        2  31.26    0.0        5       2910       2915          78.26          2.50
           8590       53.746  1526.564      14.9      5.9        2  31.26    0.0        5       2910       2915          78.41          2.51
           8591       53.772  1529.689      14.9      5.9        2  31.26    0.0        5       2910       2915          78.61          2.51
           8592       53.799  1532.830      14.9      5.9        2  30.70    5.9        5       2910       2915          78.81          2.57

Dataframe excerpt.

>Solution :

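trajec.join(smooth) returns a new dataframe instead of modifying trajec in place, so the result of each iteration is discarded. You can create an empty dataframe outside the loop, concatenate the result of each iteration onto it, and join the combined results to the original dataframe once the loop has finished.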

v_ids = trajec.index.get_level_values('Vehicle_ID').unique().values
results = pd.DataFrame() # empty dataframe to store results

for id in v_ids:
    ewm_x = trajec.loc[(id,), 'Local_X'].ewm(span=T_pos/dt).mean()
    ewm_y = trajec.loc[(id,), 'Local_Y'].ewm(span=T_pos_x/dt).mean()

    # these two lines must stay inside the loop, otherwise only the last vehicle is kept
    smooth = pd.DataFrame({'Vehicle_ID': id, 'Frame_ID': ewm_y.index.values, 'ewm_y': ewm_y, 'ewm_x': ewm_x}).set_index(['Vehicle_ID', 'Frame_ID'])
    results = pd.concat([results, smooth]) # concatenate results from each iteration

# join the results to the original dataframe; join returns a new dataframe, so assign it back
trajec = trajec.join(results)
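
As an alternative to the explicit loop, the same per-vehicle smoothing can probably be written with groupby and transform, which keeps the original MultiIndex and avoids the join altogether. The sketch below is only an illustration: the toy trajec frame and the span_x and span_y values are made up, standing in for the real data and for T_pos/dt and T_pos_x/dt from the question.

```python
import pandas as pd

# Toy stand-in for trajec (values are illustrative, not the real dataset)
trajec = pd.DataFrame(
    {
        "Vehicle_ID": [1, 1, 1, 2, 2, 2],
        "Frame_ID": [12, 13, 14, 40, 41, 42],
        "Local_X": [16.884, 16.938, 16.991, 53.693, 53.719, 53.746],
        "Local_Y": [48.213, 49.463, 50.712, 1520.312, 1523.437, 1526.564],
    }
).set_index(["Vehicle_ID", "Frame_ID"])

span_x = 5.0  # hypothetical value, stands in for T_pos / dt
span_y = 5.0  # hypothetical value, stands in for T_pos_x / dt

# transform returns a Series aligned with the original index,
# so each vehicle is smoothed separately and no join is needed
ewm_x = trajec.groupby(level="Vehicle_ID")["Local_X"].transform(
    lambda s: s.ewm(span=span_x).mean()
)
ewm_y = trajec.groupby(level="Vehicle_ID")["Local_Y"].transform(
    lambda s: s.ewm(span=span_y).mean()
)

trajec["ewm_x"] = ewm_x
trajec["ewm_y"] = ewm_y
print(trajec)
```

Because the smoothing restarts for every Vehicle_ID, the first smoothed value of each vehicle equals its first raw value, which is a quick way to check that nothing leaks from one vehicle into the next.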
