Pandas update previous records because future peeking is not possible

This is what I have so far:

import numpy as np
import pandas

df = pandas.DataFrame({"color": [None, None, 'blue', None, None, None, 'orange', None, None, None, None],
                       'bottom': [1, 2, 7, 5, 9, 9, 5, 4, 5, 5, 3],
                       'top': [5, 5, 11, 8, 10, 10, 9, 7, 10, 6, 7]})

print(df)

"""
     color  bottom  top
0     None       1    5
1     None       2    5
2     blue       7   11
3     None       5    8
4     None       9   10
5     None       9   10
6   orange       5    9
7     None       4    7
8     None       5   10
9     None       5    6
10    None       3    7
"""

# lookback period
N = 3

# Pivot each color to own column and shift
df2 = (df.pivot(columns='color', values=['top', 'bottom'])
         .drop(columns=np.nan, level=1)
         .ffill(limit=N-1).shift()
       )


# compare current top with bottom & top from color occurrence
out = df.join((df2['bottom'].le(df['top'], axis=0)
               & df2['top'].ge(df['top'], axis=0)).astype(int))
print(out)


"""
     color  bottom  top  blue  orange
0     None       1    5     0       0
1     None       2    5     0       0
2     blue       7   11     0       0
3     None       5    8     1       0
4     None       9   10     1       0
5     None       9   10     1       0
6   orange       5    9     0       0
7     None       4    7     0       1
8     None       5   10     0       0
9     None       5    6     0       1
10    None       3    7     0       0
"""

Question:

I only want to consume each color once. That means that for every blue or orange occurrence there can be only one 1 in the upcoming 3 rows.
(Two blues in a row will result in two 1s: one 1 for every blue.)

"""
     color  bottom  top  blue  orange
0     None       1    5     0       0
1     None       2    5     0       0
2     blue       7   11     0       0
3     None       5    8     1       0
4     None       9   10     1       0 --> this should be 0, blue already consumed on row 3
5     None       9   10     1       0 --> this should be 0, blue already consumed on row 3
6   orange       5    9     0       0
7     None       4    7     0       1
8     None       5   10     0       0
9     None       5    6     0       1 --> this should be 0, orange already consumed on row 7
10    None       3    7     0       0
"""

One constraint is that, for this to work correctly, I am not allowed to peek into the future. So I cannot use .shift(-3) or iloc[-1], for example.

That sort of kills my initial idea of keeping track of a consumed state with something like .rolling(-3).max() == 1.

Solution:

You can post-process the output to keep only the first 1 per group, where a new group starts at each non-null color:

# lookback period
N = 3

# Pivot each color to own column and shift
df2 = (df.pivot(columns='color', values=['top', 'bottom'])
         .drop(columns=np.nan, level=1)
         .ffill(limit=N-1).shift()
       )

# compare current top with bottom & top from color occurrence
out = df.join((df2['bottom'].le(df['top'], axis=0)
               & df2['top'].ge(df['top'], axis=0)).astype(int))

# post process the output to keep only the first 1
cols = list(df['color'].dropna().unique())

out[cols] = out[cols].mask(out[cols].ne(out.groupby(df['color'].notna().cumsum())[cols].cumsum()), 0)
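
To see why the mask keeps only the first 1, it may help to run the same idea on a single toy column. This is a minimal sketch: the names `color`, `flags`, `group_id`, `running` and `first_only` are made up for illustration and are not part of the answer's code. A cumulative sum only uses the current and earlier rows, so nothing here looks ahead.

```python
import pandas as pd

# Toy data (hypothetical): a color on rows 2 and 6 starts a new group.
color = pd.Series([None, None, "blue", None, None, None, "orange", None])
flags = pd.Series([0, 0, 0, 1, 1, 1, 0, 1])

# Group id increases by one at every non-null color.
group_id = color.notna().cumsum()

# Running count of 1s inside each group, built from past rows only.
running = flags.groupby(group_id).cumsum()

# A flag equals the running count only up to and including the first 1;
# everything after that differs and is reset to 0.
first_only = flags.mask(flags.ne(running), 0)

print(first_only.tolist())
```

Rows 4 and 5 have a running count of 2 and 3, so their flags no longer match and are set to 0.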

Or with a loop:

cols = list(df['color'].dropna().unique())

g = out.groupby(df['color'].notna().cumsum())
for c in cols:
    out[c] = np.where(out[c].eq(1) & df.index.isin(g[c].idxmax()), 1, 0)

Output:

     color  bottom  top  blue  orange
0     None       1    5     0       0
1     None       2    5     0       0
2     blue       7   11     0       0
3     None       5    8     1       0
4     None       9   10     0       0
5     None       9   10     0       0
6   orange       5    9     0       0
7     None       4    7     0       1
8     None       5   10     0       0
9     None       5    6     0       0
10    None       3    7     0       0
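
As an alternative that is not part of the answer above, the "consumed" state can also be tracked with a plain forward pass over the rows. This is only a sketch under assumptions: the helper name `consume_once` and the `pending` list are hypothetical, color names are assumed to be strings, and an occurrence on row r is assumed to be usable on rows r+1 through r+N, which mirrors `.ffill(limit=N-1).shift()`. Each row only reads occurrences that were recorded on earlier rows, so it never peeks into the future.

```python
import pandas as pd

df = pd.DataFrame({
    "color": [None, None, "blue", None, None, None, "orange", None, None, None, None],
    "bottom": [1, 2, 7, 5, 9, 9, 5, 4, 5, 5, 3],
    "top": [5, 5, 11, 8, 10, 10, 9, 7, 10, 6, 7],
})
N = 3  # lookback period


def consume_once(df, n):
    """Flag the first row within n rows whose top falls inside an earlier
    color occurrence's [bottom, top] range; each occurrence is used once."""
    colors = list(df["color"].dropna().unique())
    hits = {c: [0] * len(df) for c in colors}
    pending = []  # unconsumed occurrences: (row position, color, bottom, top)

    rows = zip(df["color"], df["bottom"], df["top"])
    for i, (color, bottom, top) in enumerate(rows):
        # Forget occurrences that have left the lookback window.
        pending = [p for p in pending if i - p[0] <= n]

        remaining = []
        matched = set()  # at most one occurrence per color is consumed per row
        for p in pending:
            _, c, lo, hi = p
            if c not in matched and lo <= top <= hi:
                hits[c][i] = 1
                matched.add(c)
            else:
                remaining.append(p)
        pending = remaining

        # Record the current row's color only after the comparison,
        # so an occurrence can never match its own row.
        if pd.notna(color):
            pending.append((i, color, bottom, top))

    return df.assign(**hits)


out = consume_once(df, N)
print(out)
```

Because every occurrence is stored separately in `pending`, two blues inside the same window are consumed independently, at the cost of a Python-level loop that will be slower than the vectorized version on large frames.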