Consecutive rows meeting a condition in pandas

July 22, 2022

I have a pandas DataFrame that can be created with the following code:

import pandas as pd

df = pd.DataFrame(
    {
        'col_name': [-1, -1, -3, 2, 1, -3, -2, 4, 3, 5]
    }
)

I want to find every row that meets all of the following conditions: the row itself and the x rows before it have positive values; the y rows before those x rows have negative values; and the last of those y rows (which, in the example below, is y rows before the current row) has the lowest value compared with the k rows before it.

So, for x=1, y=2 and k=2 the output is:

    col_name
4       1

(Index 8 is not in the output: although it and the row before it have positive values, and the two rows before those have negative values, the last negative row, index 6, does not have the lowest value compared with the two rows before it.)

I would also strongly prefer not to use any for-loops in the code.

Do you have any ideas?

>Solution :

Your explanation is not very clear, so here is a base solution that you can adapt to your needs. It should not be hard to adjust.

We can achieve this by shifting the series and applying a sequence of boolean masks.

First, create your shifts:

x, y, k = 1, 2, 2  # values from the question's example

m = df.assign(**{f'col_name_shift_{i}': df.col_name.shift(i)
                 for i in range(1, x+y+1)})

Note that the loop here is very short (only x + y = 3 iterations in this example). This gives:

   col_name  col_name_shift_1  col_name_shift_2  col_name_shift_3
0        -1               NaN               NaN               NaN
1        -1              -1.0               NaN               NaN
2        -3              -1.0              -1.0               NaN
3         2              -3.0              -1.0              -1.0
4         1               2.0              -3.0              -1.0
5        -3               1.0               2.0              -3.0
6        -2              -3.0               1.0               2.0
7         4              -2.0              -3.0               1.0
8         3               4.0              -2.0              -3.0
9         5               3.0               4.0              -2.0
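If even that short comprehension is unwanted, the same lagged columns can probably be built without shifting one lag at a time. The sketch below is one possible way, assuming NumPy 1.20 or later for `sliding_window_view`; the names `w`, `padded` and `lags` are only illustrative, and `col_name` comes out as float here because of the NaN padding.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

df = pd.DataFrame({'col_name': [-1, -1, -3, 2, 1, -3, -2, 4, 3, 5]})
x, y = 1, 2
w = x + y + 1  # the current row plus x + y lagged values

values = df['col_name'].to_numpy(dtype=float)
# Pad the front with NaN so that every row gets a full window.
padded = np.concatenate([np.full(w - 1, np.nan), values])
# Each window runs oldest -> newest; reverse it so column i means "i rows back".
lags = sliding_window_view(padded, w)[:, ::-1]

cols = ['col_name'] + [f'col_name_shift_{i}' for i in range(1, w)]
m = pd.DataFrame(lags, index=df.index, columns=cols)
print(m)
```

The resulting frame has the same layout as the table above, so the masks that follow can be applied to it unchanged.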

Now it is just a matter of checking, row by row, which rows meet your requirements.

For example, for the first condition (the row itself and the x rows before it have positive values):

m1 = m.iloc[:, range(x+1)] > 0

For the second condition (the y rows before those x rows have negative values):

m2 = m.iloc[:, range(x+1, x+y+1)] < 0

And for the third condition (the last of those y rows has the lowest value compared with the k rows before it):

m3 = m.iloc[:, range(y+1, y+k)].gt(m.iloc[:, y], axis=0)
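The comparison above checks column y against columns y+1 through y+k-1 of the shifted frame. If the intended reading is instead that the most recent negative row (x + 1 rows back) must be the minimum of itself and the k rows before it, a rolling minimum is one way to express that without any shifted columns. This is only a sketch under that assumption; `window_min`, `is_low` and `m3_alt` are illustrative names, and ties count as a minimum here.

```python
import pandas as pd

df = pd.DataFrame({'col_name': [-1, -1, -3, 2, 1, -3, -2, 4, 3, 5]})
x, k = 1, 2
s = df['col_name']

# Minimum over each row together with the k rows before it.
window_min = s.rolling(k + 1).min()
# A row is a "low" when it equals that minimum (rows without k predecessors are never lows).
is_low = s.eq(window_min)
# Move the flag forward so it lines up with the row x + 1 positions later.
m3_alt = is_low.shift(x + 1, fill_value=False)
print(m3_alt)
```

For the example data this flags index 4 but not index 8, which agrees with the explanation in the question. It also flags index 7, which the other two conditions would rule out.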

Then concatenate all the boolean masks,

mask = pd.concat([m1, m2, m3], axis=1)

and select the rows where every condition holds:

df.loc[mask.all(axis=1)]
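Putting the steps together, a self-contained version might look like the sketch below. It rests on one reading of the question, which may not be the one intended: the reference row is the most recent negative row (x + 1 rows back), and it must be strictly lower than each of the k rows before it. The function name `find_rows` and the `col` parameter are hypothetical, and the lags are extended to x + 1 + k so that all k earlier rows are available.

```python
import pandas as pd

def find_rows(df, x, y, k, col='col_name'):
    """Return rows matching the three conditions (under the stated reading)."""
    s = df[col]
    # Enough lags to cover both the negative run and the k rows before its last row.
    depth = max(x + y, x + 1 + k)
    # Column i holds the value i rows back; column 0 is the row itself.
    lags = pd.concat({i: s.shift(i) for i in range(depth + 1)}, axis=1)

    # Condition 1: the row itself and the x rows before it are positive.
    positive = (lags.iloc[:, :x + 1] > 0).all(axis=1)
    # Condition 2: the y rows before those are negative.
    negative = (lags.iloc[:, x + 1:x + y + 1] < 0).all(axis=1)
    # Condition 3: the most recent negative row is lower than the k rows before it.
    pivot = lags.iloc[:, x + 1]
    lowest = lags.iloc[:, x + 2:x + 2 + k].gt(pivot, axis=0).all(axis=1)

    # Comparisons against NaN are False, so rows without enough history drop out.
    return df.loc[positive & negative & lowest]

df = pd.DataFrame({'col_name': [-1, -1, -3, 2, 1, -3, -2, 4, 3, 5]})
result = find_rows(df, x=1, y=2, k=2)
print(result)
```

With x=1, y=2 and k=2 this returns only index 4, matching the expected output in the question.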


by MR

Published July 22, 2022
