Get max value in previous rows for matching rows

Say I have a dataframe that records temperature measurements for various sensors:

import pandas as pd

df = pd.DataFrame({'sensor': ['A', 'C', 'A', 'C', 'B', 'B', 'C', 'A', 'A', 'A'],
                   'temperature': [4.8, 12.5, 25.1, 16.9, 20.4, 15.7, 7.7, 5.5, 27.4, 17.7]})

I would like to add a column max_prev_temp that shows, for each row, the maximum temperature recorded so far for that row's sensor (the current reading included). This works:

df["max_prev_temp"] = df.apply(
    lambda row: df[df["sensor"] == row["sensor"]].loc[: row.name, "temperature"].max(),
    axis=1,
)

It returns:

  sensor  temperature  max_prev_temp
0      A          4.8            4.8
1      C         12.5           12.5
2      A         25.1           25.1
3      C         16.9           16.9
4      B         20.4           20.4
5      B         15.7           20.4
6      C          7.7           16.9
7      A          5.5           25.1
8      A         27.4           27.4
9      A         17.7           27.4

The problem is that my actual data set contains over 2 million rows, so this is excruciatingly slow (it will probably take about 2 hours). I understand that rolling is a better method, but I don't see how to use it for this specific case.

Any hint would be appreciated.

Solution:

Use Series.expanding within each group, then remove the first index level (the group key) with Series.droplevel so that the result aligns with the original index:

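# Expanding (running) max per sensor; droplevel(0) drops the 'sensor' index level
# so that the assignment aligns on df's original row labels.
df["max_prev_temp"] = df.groupby('sensor')["temperature"].expanding().max().droplevel(0)
print(df)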
  sensor  temperature  max_prev_temp
0      A          4.8            4.8
1      C         12.5           12.5
2      A         25.1           25.1
3      C         16.9           16.9
4      B         20.4           20.4
5      B         15.7           20.4
6      C          7.7           16.9
7      A          5.5           25.1
8      A         27.4           27.4
9      A         17.7           27.4
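
As a possible alternative, the grouped cumulative maximum should give the same column without the extra index level, so no droplevel is needed. The sketch below rebuilds the sample frame from the question and compares the two approaches. The max_before column and the via_expanding name are illustrative additions, not part of the original question: max_before shows one way to get the maximum over strictly earlier rows, in case "previous" is meant to exclude the current reading.

```python
import pandas as pd

df = pd.DataFrame({
    "sensor": ["A", "C", "A", "C", "B", "B", "C", "A", "A", "A"],
    "temperature": [4.8, 12.5, 25.1, 16.9, 20.4, 15.7, 7.7, 5.5, 27.4, 17.7],
})

# Running maximum per sensor, current row included.
# The result keeps df's original index, so it can be assigned directly.
df["max_prev_temp"] = df.groupby("sensor")["temperature"].cummax()

# The expanding() version, kept here only for comparison.
via_expanding = df.groupby("sensor")["temperature"].expanding().max().droplevel(0)
print(via_expanding.sort_index().tolist() == df["max_prev_temp"].tolist())

# Illustrative variant (assumption about intent): maximum over strictly earlier
# rows of the same sensor. The first reading of each sensor has no earlier
# rows, so it stays NaN.
df["max_before"] = df.groupby("sensor")["temperature"].transform(
    lambda s: s.shift().cummax()
)
print(df)
```

On a frame with millions of rows, it is worth timing both versions on your own data before choosing one; both avoid the row-by-row apply from the question.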