Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

I'm trying to find the closest value in a reference data frame but it is not outputting the correct row. Where am I going wrong?

I’m trying to find the closest value in a reference data frame but it is not outputting the correct row.

I am using the below data frame which is then used to find the relevant row corresponding to the closest value in column ‘P’ to a defined variable. For example if p = 0.22222 then the code should output row 2.

DF:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

      P   n1      n2      n3      n4      n5      n6      n7      n8      n9   
0   NaN  0.0  0.2000  0.4000  0.6000  0.8000  1.0000  1.2000  1.4000  1.6000
1   0.0  1.0  0.8039  0.6286  0.4855  0.3753  0.2929  0.2318  0.1863  0.1520   
2   0.2  1.0  0.7983  0.6201  0.4771  0.3683  0.2876  0.2279  0.1835  0.1500   
3   0.4  1.0  0.7789  0.5924  0.4508  0.3473  0.2720  0.2167  0.1754  0.1442   
4   0.6  1.0  0.7349  0.5377  0.4043  0.3124  0.2470  0.1989  0.1628  0.1351   
5   0.8  1.0  0.6301  0.4433  0.3368  0.2658  0.2147  0.1762  0.1465  0.1234   
6   1.0  0.5  0.3828  0.3105  0.2559  0.2130  0.1787  0.1510  0.1286  0.1102   
7   1.2  0.0  0.1544  0.1871  0.1795  0.1621  0.1433  0.1257  0.1103  0.0965   
8   1.4  0.0  0.0717  0.1101  0.1216  0.1197  0.1120  0.1024  0.0925  0.0831   
9   1.6  0.0  0.0400  0.0682  0.0829  0.0876  0.0865  0.0824  0.0765  0.0707   
10  1.8  0.0  0.0249  0.0449  0.0580  0.0647  0.0668  0.0659  0.0633  0.0597   
11  2.0  0.0  0.0168  0.0312  0.0418  0.0485  0.0519  0.0528  0.0520  0.0502   
12  3.0  0.0  0.0042  0.0082  0.0118  0.0149  0.0174  0.0193  0.0207  0.0216

The function I am using however outputs the incorrect value:

p = 0.2020202
closest_p = df.iloc[(df['P']-p).abs().argsort()[:1]]

Expected output:

      P   n1      n2      n3      n4      n5      n6      n7      n8      n9   
2   0.2  1.0  0.7983  0.6201  0.4771  0.3683  0.2876  0.2279  0.1835  0.1500

However it is only outputting the last row –

      P   n1      n2      n3      n4      n5      n6      n7      n8      n9   
12  3.0  0.0  0.0042  0.0082  0.0118  0.0149  0.0174  0.0193  0.0207  0.0216

Where am i going wrong????

>Solution :

You need to use idxmin:

closest_p = (df['P']-p).abs().idxmin()

Output: 2

For the row: df.loc[(df['P']-p).abs().idxmin()]

A fix of your approach would have been to use sort_values (but it’s less efficient):

closest_p = df.loc[(df['P']-p).abs().sort_values().index[0]]
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading