With the following code I am able to calculate the maximum gaps of each operation:
import pandas as pd

data = [
{'order': 1, 'operation': 'milling', 'start': 0, 'end': 70},
{'order': 1, 'operation': 'milling', 'start': 200, 'end': 210},
{'order': 1, 'operation': 'milling', 'start': 500, 'end': 600},
{'order': 1, 'operation': 'grinding', 'start': 90, 'end': 150},
{'order': 2, 'operation': 'grinding', 'start': 150, 'end': 170},
{'order': 3, 'operation': 'grinding', 'start': 400, 'end': 420},
{'order': 3, 'operation': 'milling', 'start': 610, 'end': 660}
]
df = pd.DataFrame(data)
df['start'] = df['start'].shift(-1)
df = df.groupby('operation').apply(lambda x: x.loc[(x['start'] - x['end']).idxmax()])[['operation', 'end', 'start']].reset_index(drop=True)
df.columns = ['operation', 'start', 'end']
df['max_gap'] = df['end'] - df['start']
print(df)
Prints:
  operation  start    end  max_gap
0  grinding    170  400.0    230.0
1   milling    210  500.0    290.0
The problem is that when there is a new order with a new operation (e.g. "new_operation"), I get a key error (KeyError: nan), I guess because that operation only exists once.
data = [
{'order': 1, 'operation': 'milling', 'start': 0, 'end': 70},
{'order': 1, 'operation': 'milling', 'start': 200, 'end': 210},
{'order': 1, 'operation': 'milling', 'start': 500, 'end': 600},
{'order': 1, 'operation': 'grinding', 'start': 90, 'end': 150},
{'order': 2, 'operation': 'grinding', 'start': 150, 'end': 170},
{'order': 3, 'operation': 'grinding', 'start': 400, 'end': 420},
{'order': 3, 'operation': 'milling', 'start': 610, 'end': 660},
{'order': 3, 'operation': 'new_operation', 'start': 610, 'end': 660}
]
...
KeyError: nan
How can I avoid this in a clean way?
>Solution :
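After df["start"] = df["start"].shift(-1), the last row has no following row, so its start becomes NaN. The group for the new operation contains only that row, so its gap is NaN, idxmax returns NaN for that group, and .loc cannot look it up (hence KeyError: nan). Either fill the missing value with fillna (forward fill), or use the fill_value option of shift: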
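# Option 1: forward-fill the NaN left by the shift
# (.ffill() replaces fillna(method="ffill"), which newer pandas versions deprecate)
df["start"] = df["start"].shift(-1).ffill()
# Option 2: let shift fill the missing value directly
df["start"] = df["start"].shift(-1, fill_value=0)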
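For reference, here is a minimal end-to-end sketch of the fill_value variant applied to the data with "new_operation". It is not the exact code from the question: the helper columns next_start and gap are names made up for this illustration, and the per-group lookup uses groupby(...)["gap"].idxmax() instead of groupby(...).apply(...), on the assumption that this behaves the same for this data.

```python
import pandas as pd

data = [
    {'order': 1, 'operation': 'milling', 'start': 0, 'end': 70},
    {'order': 1, 'operation': 'milling', 'start': 200, 'end': 210},
    {'order': 1, 'operation': 'milling', 'start': 500, 'end': 600},
    {'order': 1, 'operation': 'grinding', 'start': 90, 'end': 150},
    {'order': 2, 'operation': 'grinding', 'start': 150, 'end': 170},
    {'order': 3, 'operation': 'grinding', 'start': 400, 'end': 420},
    {'order': 3, 'operation': 'milling', 'start': 610, 'end': 660},
    {'order': 3, 'operation': 'new_operation', 'start': 610, 'end': 660},
]
df = pd.DataFrame(data)

# Start of the following row; fill_value=0 keeps the last row from becoming NaN.
# "next_start" and "gap" are illustrative column names, not from the question.
df["next_start"] = df["start"].shift(-1, fill_value=0)
df["gap"] = df["next_start"] - df["end"]

# Index label of the row with the largest gap within each operation.
idx = df.groupby("operation")["gap"].idxmax()

# Same column layout as in the question: the gap runs from "end" to the next "start".
result = (
    df.loc[idx, ["operation", "end", "next_start", "gap"]]
    .rename(columns={"end": "start", "next_start": "end", "gap": "max_gap"})
    .reset_index(drop=True)
)
print(result)
```

Note that "new_operation" now appears in the result with a negative max_gap (0 - 660 = -660). That value is only an artifact of the fill value, since a single row has no real gap; such rows can be dropped afterwards, for example with result[result["max_gap"] > 0], if they are not wanted.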