Largest Number from a text file using pandas

February 28, 2022

I have loaded a text file into a DataFrame with two columns, ID and SNR. For each ID I need the five largest SNR values. For example, for J07451689+2804046 I need the five largest values, and then the same for the next ID, J05062845+7149258, and so on. I'm trying to slice the DataFrame but I'm stuck here:

 import pandas as pd
 df = pd.read_table("Ident_new_test.txt",sep=",",header=None)
 print(df)
 df.shape
 for i in range(0,N):
     df1 = df.loc[:, "1":"1":2]
     print(df1)

      ID              SNR
   J07451689+2804046  200    
   J07451689+2804046  217   
   J07451689+2804046  257    
   J07451689+2804046  200    
   J07451689+2804046  181    
   J07451689+2804046  206    
   J07451689+2804046  198    
   J07451689+2804046  222    
   J05062845+7149258  281    
   J05062845+7149258  397    
   J15170588+7149258  431    
   J15170588+7149258  347    
   J15170588+7149258  411    
   J15170588+7149258  495    
   J18255915+6533486  257    
   J18255915+6533486  317    
   J18255915+6533486  349    
   J18255915+6533486  321    
   J18255915+6533486  403    
   J18255915+6533486  332    
   J19420540+5029382  328    
   J19420540+5029382  305    
   J19420540+5029382  721    
   J19420540+5029382  350

>Solution :

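If I understand correctly, you can group by ID and take the five largest SNR values in each group (an ID with fewer than five rows simply returns all of its rows):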

out = df.groupby('ID')['SNR'].nlargest(5).reset_index('ID')
print(out)

# Output
                   ID  SNR
9   J05062845+7149258  397
8   J05062845+7149258  281
2   J07451689+2804046  257
7   J07451689+2804046  222
1   J07451689+2804046  217
5   J07451689+2804046  206
0   J07451689+2804046  200
13  J15170588+7149258  495
10  J15170588+7149258  431
12  J15170588+7149258  411
11  J15170588+7149258  347
18  J18255915+6533486  403
16  J18255915+6533486  349
19  J18255915+6533486  332
17  J18255915+6533486  321
15  J18255915+6533486  317
22  J19420540+5029382  721
23  J19420540+5029382  350
20  J19420540+5029382  328
21  J19420540+5029382  305
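
The snippet above assumes the columns are already named ID and SNR. With header=None, as in the question, pandas labels them 0 and 1 instead, so groupby('ID') would raise a KeyError. One way around this is to pass the names when reading. The following is only a sketch: it assumes the file is comma-separated with no header row (as the read_table call in the question suggests), and it uses an in-memory string with a few rows from the sample as a stand-in for Ident_new_test.txt.

```python
import io
import pandas as pd

# Stand-in for Ident_new_test.txt (assumption: comma-separated, no header row).
raw = io.StringIO(
    "J07451689+2804046,200\n"
    "J07451689+2804046,217\n"
    "J07451689+2804046,257\n"
    "J07451689+2804046,200\n"
    "J07451689+2804046,181\n"
    "J07451689+2804046,206\n"
    "J07451689+2804046,198\n"
    "J07451689+2804046,222\n"
    "J05062845+7149258,281\n"
    "J05062845+7149258,397\n"
)

# names= supplies the column labels that header=None would otherwise leave as 0 and 1.
df = pd.read_csv(raw, header=None, names=["ID", "SNR"])

# Five largest SNR values per ID; reset_index('ID') turns the group key back into a column.
top5 = df.groupby("ID")["SNR"].nlargest(5).reset_index("ID")
print(top5)
```

For the real file, replacing raw with the path "Ident_new_test.txt" should behave the same way, provided the file really has that layout.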

Note: if you want the rows back in their original index order, append .sort_index() after reset_index('ID'); use .sort_index(ignore_index=True) instead if you also want the index renumbered from 0.
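
To illustrate the two sort_index variants from the note, here is a small sketch. The frame is built inline from the first two IDs of the sample data rather than read from the file, and the variable names by_position and renumbered are just illustrative.

```python
import pandas as pd

# Rows for the first two IDs of the sample data, in their original order.
df = pd.DataFrame({
    "ID": ["J07451689+2804046"] * 8 + ["J05062845+7149258"] * 2,
    "SNR": [200, 217, 257, 200, 181, 206, 198, 222, 281, 397],
})

out = df.groupby("ID")["SNR"].nlargest(5).reset_index("ID")

# Restore the original row order, keeping the original index labels.
by_position = out.sort_index()

# Same row order, but with the index renumbered 0..n-1.
renumbered = out.sort_index(ignore_index=True)

print(by_position)
print(renumbered)
```

Keeping the original labels is handy if you later want to look the rows up in df again; renumbering is tidier if the result is only going to be printed or saved.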