Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Rank a numeric column based on condition

I would like to rank a column called Number after grouping by the column ID but the ranking would be done only if Variable_1 is True. If Variable_1 is False, assign directly to the new column Rank the highest rank : 1

data={'ID':["A","A","B","C","D","D"], 'Variable_1':[True,True,False,False,True,True], 'Number': [10,20,5,6,100,90]}

df=pd.DataFrame(data)

I tried to do the following but did not work:

 df["Rank"] = np.where(df.Variable_1, df.groupby(["ID"])[["Number"]].rank(method='dense', ascending=False), 1)

This is the expected output:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

enter image description here

>Solution :

You can first set column Rank to 1 and use pandas.groupby and compute rank. At the end write rank for those rows that have Variable_1 == True with pandas.mask.

df['Rank'] = 1
res = df.groupby('ID')['Number'].rank(ascending=False)
df['Rank'] = df['Rank'].mask(df['Variable_1'], res)
print(df)

  ID  Variable_1  Number  Rank
0  A        True      10     2
1  A        True      20     1
2  B       False       5     1
3  C       False       6     1
4  D        True     100     1
5  D        True      90     2
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading