Slicing and extracting from dataframe

I have a dataframe like the one below:

     time  power speed state
1   14.00  29    3     1
2   14.01  30    3     2
3   14.02  29    3     3
4   14.03  30    3     4
5   14.04  29    3     5
6   14.05  30    3     6
7   14.06  29    3     6
8   14.07  30    3     6
9   14.08  29    3     6
10  14.09  30    3     5
11  14.10  29    3     5
12  14.11  30    3     5
13  14.12  29    3     5
14  14.13  30    3     6
15  14.14  31    4     6 
16  14.15  32    4     6

Each cycle starts at state 5 (row 10; a 5 only counts when it follows state 6) and ends just before state 6 returns (i.e. row 13). So cycle 1 spans rows 10 to 13.

The data set is large and contains multiple cycles. I want to extract each cycle as a separate data frame.
I tried iterating over the rows, but it didn't work:

charge_cycles = []
current_charge_start = None
current_drive_start = None
total_energy_consumed = 0
drive_data = []

for index, row in data.iterrows():
    if row['state'] == '6':
        if current_drive_start is not None:
            energy_during_drive = total_energy_consumed
            charge_cycles.append(energy_during_drive)
            drive_data.append(data.loc[current_drive_start:index])
            current_drive_start = None
            total_energy_consumed = 0
        current_charge_start = row['time']
    elif row['state'] == '5':
        if current_charge_start is not None and current_drive_start is None:
            current_drive_start = index
        if current_drive_start is not None:
            total_energy_consumed += row['power'] * (row['time'] - data.loc[current_drive_start, 'time'])
            current_drive_start = index

# Print the energy consumption during driving between each charge cycle
for i, energy in enumerate(charge_cycles, start=1):
    print(f"Charge Cycle {i}: Energy Consumed During Driving = {energy} units")

# Display the DataFrames for each driving cycle
for i, drive_df in enumerate(drive_data, start=1):
    print(f"Driving Cycle {i}:\n{drive_df}")

This gives me the whole data frame as a result. Can anyone please help me with the Python code for this problem?

Solution:

IIUC, you can label consecutive runs of 6 / non-6 rows and group by that label:

import pandas as pd

df = pd.DataFrame(
    {
        "state": list(
            "6666665555555555555543555555512555666666666666666655555555412344666666666"
        )
    }
)
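df["state"] = df["state"].astype(int)

# drop the leading rows before the first 6
df = df.loc[df["state"].eq(6).idxmax() :]

# label consecutive runs of 6 / non-6 rows, then group by that label
mask = df["state"].eq(6)
for _, g in df.groupby((mask != mask.shift()).cumsum()):
    # keep only runs that contain a 5 (runs of 6 never do)
    if (eq5 := g["state"].eq(5)).any():
        # a cycle starts at the first 5 of the run
        g = g.loc[eq5.idxmax() :]
        print(g)
        print("-" * 80)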

Prints:

    state
6       5
7       5
8       5
9       5
10      5
11      5
12      5
13      5
14      5
15      5
16      5
17      5
18      5
19      5
20      4
21      3
22      5
23      5
24      5
25      5
26      5
27      5
28      5
29      1
30      2
31      5
32      5
33      5
--------------------------------------------------------------------------------
    state
50      5
51      5
52      5
53      5
54      5
55      5
56      5
57      5
58      4
59      1
60      2
61      3
62      4
63      4
--------------------------------------------------------------------------------
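
To get each cycle as its own DataFrame instead of printing it, the same idea can be wrapped in a small function. This is only a sketch under a few assumptions that are not in the original answer: `extract_cycles` and its parameter names are hypothetical, the sample frame is retyped from the question, the index is unique, and `state` holds integers (if it holds strings, convert it with `astype(int)` first).

```python
import pandas as pd

# Sample data retyped from the question (row labels 1..16).
df = pd.DataFrame(
    {
        "time": [14.00, 14.01, 14.02, 14.03, 14.04, 14.05, 14.06, 14.07,
                 14.08, 14.09, 14.10, 14.11, 14.12, 14.13, 14.14, 14.15],
        "power": [29, 30] * 7 + [31, 32],
        "speed": [3] * 14 + [4] * 2,
        "state": [1, 2, 3, 4, 5, 6, 6, 6, 6, 5, 5, 5, 5, 6, 6, 6],
    },
    index=range(1, 17),
)


def extract_cycles(frame, state_col="state", charge=6, start=5):
    """Hypothetical helper: return a list with one DataFrame per cycle."""
    is_charge = frame[state_col].eq(charge)
    if not is_charge.any():
        return []  # no charge state at all, so no cycle can start

    # Ignore everything before the first charge row.
    frame = frame.loc[is_charge.idxmax():]
    is_charge = is_charge.loc[frame.index]

    # Consecutive rows with the same is_charge value share one run id.
    run_id = (is_charge != is_charge.shift()).cumsum()

    cycles = []
    for _, run in frame.groupby(run_id):
        is_start = run[state_col].eq(start)
        if is_start.any():
            # The cycle begins at the first start-state row of the run.
            cycles.append(run.loc[is_start.idxmax():])
    return cycles


cycles = extract_cycles(df)
for i, cycle in enumerate(cycles, start=1):
    print(f"Cycle {i}: rows {cycle.index[0]} to {cycle.index[-1]}")
```

On the sample data this yields a single cycle covering rows 10 to 13, which matches the cycle described in the question. Note that if the data ends before state 6 returns, the last run is still included, so you may want to drop a trailing incomplete cycle.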