How to find the last occurrence of a value meeting a condition in a column in Python

I have the following dataframe:

import numpy as np
import pandas as pd

df = pd.DataFrame({"A":['a','b','c','d','e','f','g','h','i','j','k'],
                    "B":[1,3,4,5,6,7,6,5,8,5,5]})
df

displayed as:

    A   B
0   a   1
1   b   3
2   c   4
3   d   5
4   e   6
5   f   7
6   g   6
7   h   5
8   i   8
9   j   5
10  k   5

I first want to find the letter in column "A" that corresponds to the first occurrence of a value in column "B" that is >= 6. Looking at this, we see that this would be row index 4, corresponding to a value of 6 and "e" in column "A".

I can identify the column "A" value we just got with this code:

#Find first occurrence >= threshold
threshold = 6
array = df.values
array[np.where(array[:,1] >= threshold)][0,0]

This code returns 'e', which is what I want.

This code is adapted from the Stack Overflow question "Python find first occurrence in Pandas dataframe column 2 below threshold and return column 1 value same row using NumPy".

What I am having trouble with is modifying this code to find the last occurrence of a value >= the threshold of 6. For the data frame above, I want to get 'i', because the row containing "i" in column "A" corresponds to a value of 8 in column "B", which is the last value >= 6. I want to preserve the row order, which is alphabetical by column "A".

I am guessing the fix involves changing the indexing in my code, specifically the array[:,1] component or the [0,0] component, but I am not sure how to ask for the last occurrence. How can I modify my code to return the value in column "A" for the last occurrence of a value >= 6 in column "B"?

Solution:

To get the first occurrence, you can use idxmax:

df.loc[df['B'].ge(6).idxmax()]

output:

A    e
B    6
Name: 4, dtype: object

For just the value in ‘A’:

df.loc[df['B'].ge(6).idxmax(), 'A']

output: 'e'
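
One caveat worth noting: idxmax on an all-False boolean Series still returns the first index label, so the one-liner above would return 'a' if no value met the threshold. The sketch below guards against that case. The helper name first_match and its None return value are illustrative choices, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({"A": list("abcdefghijk"),
                   "B": [1, 3, 4, 5, 6, 7, 6, 5, 8, 5, 5]})

def first_match(frame, threshold):
    """Return the 'A' value of the first row with B >= threshold, or None."""
    mask = frame["B"].ge(threshold)
    # Without this check, idxmax would return the first label
    # even though no row satisfies the condition.
    if not mask.any():
        return None
    return frame.loc[mask.idxmax(), "A"]

first_hit = first_match(df, 6)    # a row matches
no_hit = first_match(df, 100)     # no row matches
```

The same check applies to the reversed Series used for the last occurrence.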

For the last, do the same on the reversed Series:

df.loc[df.loc[::-1,'B'].ge(6).idxmax()]

output:

A    i
B    8
Name: 8, dtype: object

For just the value in ‘A’:

df.loc[df.loc[::-1, 'B'].ge(6).idxmax(), 'A']

output: 'i'
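
As an alternative to reversing the Series, here is a minimal sketch of two other routes to the last match: a boolean mask with iloc[-1], and the NumPy code from the question with [0,0] changed to [-1,0]. The variable names last_via_mask and last_via_numpy are made up for illustration, and both versions assume at least one row meets the threshold (an empty selection would raise an IndexError).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": list("abcdefghijk"),
                   "B": [1, 3, 4, 5, 6, 7, 6, 5, 8, 5, 5]})
threshold = 6

# pandas: keep only the rows meeting the condition, then take the last one.
matches = df.loc[df["B"] >= threshold, "A"]
last_via_mask = matches.iloc[-1]

# NumPy: same code as in the question, but index the last matching row
# (-1) instead of the first (0); column 0 holds the "A" values.
array = df.values
last_via_numpy = array[np.where(array[:, 1] >= threshold)][-1, 0]
```

Both approaches keep the original row order, so "last" means the matching row closest to the bottom of the data frame.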
