Retrieve number of rows between the current row and the last/next positive value

I have the following dataframe:

   feature
0        1
1        0
2        0
3        0
4        0
5        1
6        0
7        1

I would like to create two columns containing the number of rows between the current row and the previous and next positive values.
The output dataframe should be something like this:

   feature    previous_feat        next_feat
0        1               NA                5
1        0                1                4
2        0                2                3
3        0                3                2
4        0                4                1
5        1                5                2
6        0                1                1
7        1                2               NA

I have already tried combinations of the shift and mask methods, but I couldn't make it work.
Note that the result can be either the number of rows or the index difference; it doesn't really matter to me. The same goes for the NA values: they can be NA or 0.

import pandas as pd

df = pd.DataFrame({"feature": [1, 0, 0, 0, 0, 1, 0, 1]})

# df["previous_feat"] = df.shift().mask(df["feature"] != 0)

>Solution :

You can use groupby.cumcount and boolean masks:

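# form groups:
# g1 counts the 1s from the bottom up, so each 1 is grouped with the 0s just above it
# g2 counts the 1s from the top down, so each 1 is grouped with the 0s just below it
g1 = df.loc[::-1, 'feature'].eq(1).cumsum()
g2 = df['feature'].eq(1).cumsum()

# mask the first 1 (it has no previous positive value) and the last 1 (it has no next one)
m1 = g2.eq(1) & df['feature'].eq(1)
m2 = g1.eq(1) & df['feature'].eq(1)

# compute cumcount: count rows within each group, top-down for previous_feat
# and bottom-up (on the reversed frame) for next_feat; results realign on the index
df['previous_feat'] = df.groupby(g1).cumcount().add(1).mask(m1)
df['next_feat'] = df[::-1].groupby(g2).cumcount().add(1).mask(m2)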

Output:

   feature  previous_feat  next_feat
0        1            NaN        5.0
1        0            1.0        4.0
2        0            2.0        3.0
3        0            3.0        2.0
4        0            4.0        1.0
5        1            5.0        2.0
6        0            1.0        1.0
7        1            2.0        NaN
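
Since the question says an index difference would also be acceptable, here is an alternative sketch that avoids groupby. It is not part of the answer above; it records the position of each 1, shifts those positions by one row so that a row holding a 1 looks at its neighbour rather than itself, then forward/backward fills and subtracts. The names pos, hits, prev_hit and next_hit are illustrative only, and the sketch assumes the distance should be measured in row positions rather than index labels.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"feature": [1, 0, 0, 0, 0, 1, 0, 1]})

# row position of every row (assumption: distances are measured in positions)
pos = pd.Series(np.arange(len(df)), index=df.index)

# position of each 1, NaN everywhere else
hits = pos.where(df["feature"].eq(1))

# shift by one row so a row containing a 1 does not match itself,
# then propagate the most recent / upcoming position
prev_hit = hits.shift().ffill()
next_hit = hits.shift(-1).bfill()

df["previous_feat"] = pos - prev_hit
df["next_feat"] = next_hit - pos

print(df)
```

On the sample data this gives the same two columns as the output above. Rows before the first 1 or after the last 1 come out as NaN in the corresponding column; append .fillna(0) if 0 is preferred.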