How can I find the first row after a number of duplicated rows?

My DataFrame is:

import pandas as pd
df = pd.DataFrame(
    {
        'x': ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'c'],
        'y': list(range(8)),
    }
)

And this is the expected output. I want to create column z:

   x  y    z
0  a  0  NaN
1  a  1  NaN
2  a  2  NaN
3  b  3  3.0
4  b  4  NaN
5  c  5  NaN
6  c  6  NaN
7  c  7  NaN

The logic is:

I want to find the first row after the first group of duplicated rows. For example, in column x, the value a is the first duplicated value. I want to find the row immediately after the run of a values ends, and put that row's y value in column z.

This is my attempt, which did not give the expected output:

m = (df.x.duplicated())
out = df[m]

Solution:

One option, using a custom mask:

# flag rows after the first group
m = df['x'].ne(df['x'].iat[0]).cummax()

# pick the first one
out = df[m & ~m.shift(fill_value=False)]

If your first value is always a and you want to find the first non-a row, you could also use:

m2 = df['x'].eq('a')

out = df[m2.shift(fill_value=False) & ~m2]
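These two masks agree on the question's data, but they can differ when a reappears later in the column. A small sketch of that difference, assuming a hypothetical frame df2 (not part of the question) in which a returns after the first group ends:

```python
import pandas as pd

# hypothetical data: 'a' reappears at position 3
df2 = pd.DataFrame({'x': ['a', 'a', 'b', 'a', 'c'], 'y': range(5)})

# first approach: cummax keeps the flag on once the first group has ended
m = df2['x'].ne(df2['x'].iat[0]).cummax()
first = df2[m & ~m.shift(fill_value=False)]

# second approach: flags every non-'a' row that directly follows an 'a'
m2 = df2['x'].eq('a')
each = df2[m2.shift(fill_value=False) & ~m2]
```

Here first holds only row 2, while each holds rows 2 and 4, so the cummax version is the one to use if only the first transition matters.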

Or, if you're sure there is at least one row after the leading a values:

out = df.loc[[df['x'].ne('a').idxmax()]]

Output:

   x  y
3  b  3
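To get the z column from the question rather than a one-row frame, the same mask can be passed to Series.where. A minimal sketch, assuming the df from the question; the name first_after is introduced here only for illustration:

```python
import pandas as pd

df = pd.DataFrame(
    {
        'x': ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'c'],
        'y': list(range(8)),
    }
)

# flag rows after the first group
m = df['x'].ne(df['x'].iat[0]).cummax()

# True only on the first flagged row (illustrative name)
first_after = m & ~m.shift(fill_value=False)

# keep y where the mask is True, NaN everywhere else
df['z'] = df['y'].where(first_after)
```

Because z now contains NaN, pandas stores it as a float column, so the value in row 3 is 3.0.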

Some intermediates (all approaches):

   x  y      m  ~m.shift(fill_value=False)     m2  m2.shift(fill_value=False)  df['x'].ne('a')
0  a  0  False                        True   True                       False            False
1  a  1  False                        True   True                        True            False
2  a  2  False                        True   True                        True            False
3  b  3   True                        True  False                        True             True
4  b  4   True                       False  False                       False             True
5  c  5   True                       False  False                       False             True
6  c  6   True                       False  False                       False             True
7  c  7   True                       False  False                       False             True
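
The caveat on the idxmax variant matters because idxmax returns the first label when no value in the mask is True. A sketch of that edge case, assuming a hypothetical frame only_a in which every x is a; the guarded variant is one possible workaround, not part of the original answer:

```python
import pandas as pd

# hypothetical frame with no row after the leading 'a' values
only_a = pd.DataFrame({'x': ['a', 'a', 'a'], 'y': [0, 1, 2]})

not_a = only_a['x'].ne('a')

# idxmax on an all-False mask returns the first label,
# so this selects row 0 even though it is an 'a' row
wrong = only_a.loc[[not_a.idxmax()]]

# the shift-based mask returns an empty frame instead
m2 = only_a['x'].eq('a')
safe = only_a[m2.shift(fill_value=False) & ~m2]

# guarded idxmax: only look up the label if some row qualifies
guarded = only_a.loc[[not_a.idxmax()]] if not_a.any() else only_a.iloc[0:0]
```

With this input, wrong contains row 0, while safe and guarded are both empty.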