How can I find the maximum value of a dynamic window and the minimum value below it?

This is my DataFrame:

import pandas as pd
df = pd.DataFrame(
    {
        'a': [3, 1, 2, 5, 10, 3, 13, 3, 2],
    }
)

The expected output adds two columns, a_max and a_min:

    a  a_max  a_min
0   3    NaN    NaN
1   1      3      1
2   2      3      1
3   5      3      1
4  10      3      1
5   3     10      3
6  13     10      3
7   3     13      3
8   2     13      2

Logic:

I'll explain the logic row by row. The df is scanned with a dynamic (expanding) window: the first window contains only the first row, the second window contains the first two rows, and so on, as shown below:

[Image: the first four expanding windows, each highlighted in a different color]

These are the first four windows. It expands accordingly.

For each window, I need to find the maximum value and then the minimum value among the rows BELOW that maximum (the rows that come after it in the window).

I'll start with the yellow window. Its maximum value is 3 and the minimum value BELOW it is 1, so a_max and a_min for this window are 3 and 1.

For the orange window the maximum value is 5, but it sits in the last row of the window, so there are no rows BELOW it. In that case the previous a_max and a_min are repeated.

The same logic continues for the rest of the rows.
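The row-by-row logic above can be written as a plain loop, which is handy as a reference to compare faster solutions against. This is only a minimal sketch: the helper name `window_max_and_min_below` is hypothetical, and it assumes that when the maximum appears more than once in a window, the first occurrence is the one that counts.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [3, 1, 2, 5, 10, 3, 13, 3, 2]})

def window_max_and_min_below(values):
    """Brute-force version of the row-by-row logic (hypothetical helper)."""
    a_max, a_min = [], []
    prev_max, prev_min = np.nan, np.nan
    for end in range(len(values)):
        window = values[: end + 1]      # expanding window: rows 0..end
        pos = int(np.argmax(window))    # first position of the window maximum (assumption for ties)
        below = window[pos + 1:]        # rows that come after (BELOW) the maximum
        if len(below) > 0:
            prev_max, prev_min = window[pos], below.min()
        # otherwise nothing is below the maximum, so the previous pair is repeated
        a_max.append(prev_max)
        a_min.append(prev_min)
    return a_max, a_min

a_max, a_min = window_max_and_min_below(df['a'].to_numpy())
check = df.assign(a_max=a_max, a_min=a_min)
print(check)
```

The loop rescans every window, so it is quadratic in the number of rows. It is meant for checking results on small inputs rather than for large frames.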

This is my attempt:

df['a_max'] = df.a.cummax()
df['a_min'] = df.a.cummin()

Solution:

This is a tricky one. I would use cummax + shift, then mask + ffill, to compute a_max. Since each distinct a_max value marks one contiguous run of rows, a_min is then the groupby.cummin of a within each group of identical a_max:

# compute the shifted cummax
cm = df['a'].cummax().shift()

# a_max is the cummax except if the current row is larger
df['a_max'] = cm.mask(df['a'].gt(cm)).ffill()

# a_min is the cummin of the current group of a_max
df['a_min'] = df.groupby('a_max')['a'].cummin()

Output:

    a  a_max  a_min
0   3    NaN    NaN
1   1    3.0    1.0
2   2    3.0    1.0
3   5    3.0    1.0
4  10    3.0    1.0
5   3   10.0    3.0
6  13   10.0    3.0
7   3   13.0    3.0
8   2   13.0    2.0

Intermediates:

    a  a_max  cummax  shift  mask  ffill  a_min
0   3    NaN       3    NaN   NaN    NaN    NaN
1   1    3.0       3    3.0   3.0    3.0    1.0
2   2    3.0       3    3.0   3.0    3.0    1.0
3   5    3.0       5    3.0   NaN    3.0    1.0
4  10    3.0      10    5.0   NaN    3.0    1.0
5   3   10.0      10   10.0  10.0   10.0    3.0
6  13   10.0      13   10.0   NaN   10.0    3.0
7   3   13.0      13   13.0  13.0   13.0    3.0
8   2   13.0      13   13.0  13.0   13.0    2.0
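
The intermediate columns can be rebuilt one step at a time, which makes it easier to see what each call contributes. This is a sketch that uses the column names from the table above; the `steps` frame and the local variable names are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [3, 1, 2, 5, 10, 3, 13, 3, 2]})

cummax = df['a'].cummax()               # running maximum, current row included
shift = cummax.shift()                  # running maximum of the rows above the current one
mask = shift.mask(df['a'].gt(shift))    # NaN where the current row sets a new maximum
ffill = mask.ffill()                    # carry the last valid maximum forward

steps = pd.DataFrame(
    {'a': df['a'], 'cummax': cummax, 'shift': shift, 'mask': mask, 'ffill': ffill}
)
# rows sharing the same carried-forward maximum form one group
steps['a_min'] = steps.groupby('ffill')['a'].cummin()
print(steps)
```

Because `gt` is a strict comparison, a row that merely equals the running maximum is not treated as a new maximum. The first row has no previous maximum, so its group key is NaN and groupby leaves its a_min as NaN.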