So I tried to get this working. I know how to sort, use nlargest, or use rank with method dense, but somehow I’m still lost.
Here is what I want to achieve:
- I always want the top 3 rows based on some value (1, 2, 3)
- But if multiple rows share the same value, I want more than 3 rows (e.g. 1, 2, 2, 3 should return 4 rows)
Let’s say I have this data
I want to select the top 3 rows based on the lowest score, and I want to get this
As I said, I want the top 3 rows, but if several rows share the same score, I should get 4 or more.
I got a bit lost in all the combinations I tried. Is there a simple way to do this?
Solution:
You can build an array of the 3 smallest distinct values in your Score column, and then use isin to filter your dataframe:
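# The 3 smallest distinct scores (unique() drops duplicates first)
some_vals = pd.Series(df['Score'].unique()).nsmallest(3).values
# Keep every row whose score is one of those values
df[df['Score'].isin(some_vals)]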
    Name  Score
0   John      1
1   Mark      2
2  Perry      2
3   Dion      3
This way, every row whose Score equals any of the 3 smallest distinct values is returned.
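Since the question mentions rank with method dense, here is a minimal sketch of that route as an alternative. It is not part of the original answer. The sample dataframe is an assumption: the first four rows match the output above, while the rows for "Alex" and "Sam" and the helper name top_n_with_ties are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: the first four rows match the output shown above;
# "Alex" and "Sam" are invented so that there is something to filter out.
df = pd.DataFrame({
    "Name": ["John", "Mark", "Perry", "Dion", "Alex", "Sam"],
    "Score": [1, 2, 2, 3, 4, 5],
})


def top_n_with_ties(frame, column, n=3):
    """Return all rows whose value is among the n smallest distinct values.

    Hypothetical helper, not a pandas function.
    """
    # Dense ranking gives tied values the same rank and leaves no gaps,
    # so ranks 1..n cover exactly the n smallest distinct values.
    dense_rank = frame[column].rank(method="dense")
    return frame[dense_rank <= n]


result = top_n_with_ties(df, "Score", n=3)
print(result)
```

On this sample data it should select the same rows as the isin approach. To rank from the highest score instead, pass ascending=False to rank.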