I have a dataframe like the one below:
     time  power  speed  state
1   14.00     29      3      1
2   14.01     30      3      2
3   14.02     29      3      3
4   14.03     30      3      4
5   14.04     29      3      5
6   14.05     30      3      6
7   14.06     29      3      6
8   14.07     30      3      6
9   14.08     29      3      6
10  14.09     30      3      5
11  14.10     29      3      5
12  14.11     30      3      5
13  14.12     29      3      5
14  14.13     30      3      6
15  14.14     31      4      6
16  14.15     32      4      6
Each cycle starts at state 5 (row 10; a 5 only counts as a start if it comes after state 6) and ends just before state 6 returns (i.e. row 13). So cycle 1 spans rows 10 to 13.
This is a large dataset with multiple cycles. I want to extract each cycle as its own dataframe.
I tried iterating over the rows, but it didn't work.
charge_cycles = []
current_charge_start = None
current_drive_start = None
total_energy_consumed = 0
drive_data = []
for index, row in data.iterrows():
    if row['state'] == '6':
        if current_drive_start is not None:
            energy_during_drive = total_energy_consumed
            charge_cycles.append(energy_during_drive)
            drive_data.append(data.loc[current_drive_start:index])
            current_drive_start = None
            total_energy_consumed = 0
        current_charge_start = row['time']
    elif row['state'] == '5':
        if current_charge_start is not None and current_drive_start is None:
            current_drive_start = index
    if current_drive_start is not None:
        total_energy_consumed += row['power'] * (row['time'] - data.loc[current_drive_start, 'time'])
        current_drive_start = index
# Print the energy consumption during driving between each charge cycle
for i, energy in enumerate(charge_cycles, start=1):
    print(f"Charge Cycle {i}: Energy Consumed During Driving = {energy} units")
# Display the DataFrames for each driving cycle
for i, drive_df in enumerate(drive_data, start=1):
    print(f"Driving Cycle {i}:\n{drive_df}")
This gives me the whole dataframe as a result instead of the individual cycles. Can anyone please help me with the Python code for this problem?
>Solution :
IIUC, you can try:
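import pandas as pd

# example data: one state per row
df = pd.DataFrame(
    {
        "state": list(
            "6666665555555555555543555555512555666666666666666655555555412344666666666"
        )
    }
)
df["state"] = df["state"].astype(int)

# drop any leading rows before the first 6
df = df.loc[df["state"].eq(6).idxmax() :]

# True where state == 6; the group id increases each time the flag flips,
# so every run of consecutive 6 / non-6 rows becomes its own group
mask = df["state"].eq(6)
for _, g in df.groupby((mask != mask.shift()).cumsum()):
    # groups made of 6s contain no 5 and are skipped
    if (eq5 := g["state"].eq(5)).any():
        # the cycle starts at the first 5 within the group
        g = g.loc[eq5.idxmax() :]
        print(g)
        print("-" * 80)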
Prints:
state
6 5
7 5
8 5
9 5
10 5
11 5
12 5
13 5
14 5
15 5
16 5
17 5
18 5
19 5
20 4
21 3
22 5
23 5
24 5
25 5
26 5
27 5
28 5
29 1
30 2
31 5
32 5
33 5
--------------------------------------------------------------------------------
state
50 5
51 5
52 5
53 5
54 5
55 5
56 5
57 5
58 4
59 1
60 2
61 3
62 4
63 4
--------------------------------------------------------------------------------
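To get the cycles as separate dataframes instead of printing them, the same idea can be wrapped in a function that returns a list. This is only a sketch: `extract_cycles` and its parameters are names made up for illustration, the `data` frame is retyped from the sample in the question, and it assumes `state` holds integers and the index is unique.

```python
import pandas as pd


def extract_cycles(df, state_col="state", start_state=5, stop_state=6):
    """Hypothetical helper: return one DataFrame per cycle."""
    is_stop = df[state_col].eq(stop_state)
    if not is_stop.any():
        return []
    # Ignore everything before the first stop_state row.
    df = df.loc[is_stop.idxmax():]
    is_stop = is_stop.loc[df.index]
    # The run id increases each time the stop / non-stop flag flips.
    run_id = (is_stop != is_stop.shift()).cumsum()
    cycles = []
    # Only look at the runs between two blocks of stop_state rows.
    for _, g in df[~is_stop].groupby(run_id[~is_stop]):
        is_start = g[state_col].eq(start_state)
        if is_start.any():
            # The cycle begins at the first start_state row of the run.
            cycles.append(g.loc[is_start.idxmax():])
    return cycles


# Sample data from the question (rows 1 to 16).
data = pd.DataFrame(
    {
        "time": [round(14 + i / 100, 2) for i in range(16)],
        "power": [29, 30, 29, 30, 29, 30, 29, 30, 29, 30, 29, 30, 29, 30, 31, 32],
        "speed": [3] * 14 + [4, 4],
        "state": [1, 2, 3, 4, 5, 6, 6, 6, 6, 5, 5, 5, 5, 6, 6, 6],
    },
    index=range(1, 17),
)

cycles = extract_cycles(data)
for i, cycle in enumerate(cycles, start=1):
    print(f"Cycle {i}: rows {cycle.index[0]} to {cycle.index[-1]}")
# Cycle 1: rows 10 to 13
```

Note that if the data ends before state 6 comes back, the last (unfinished) run is still returned as a cycle; drop the final element if that is not what you want.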