Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Obtain minimum value in dataframe column based on two filters

I have a dataframe and I am trying to obtain the shortest pairwise distance of two sample points for distinct categories.

For example, my data:

   GridID.1 GridID.2 Distance.m. GridID.1.binned_temp GridID.1.binned_precip GridID.1.binned_devperc GridID.2.binned_temp GridID.2.binned_precip GridID.2.binned_devperc GridID.2.binned_agriperc
1       A13      A15    11478.59                    1                      5                       1                    1                      5                       1                        5
2       A13      A17    21079.59                    1                      5                       1                    1                      5                       1                        4
3       A13       A8    24837.51                    1                      5                       1                    1                      2                       1                        5
4       A13      B15    11313.88                    1                      5                       1                    1                      5                       1                        2
5       A13   E9.avg    27312.13                    1                      5                       1                    2                      3                       3                        3
6       A13  F10.avg    27973.76                    1                      5                       1                    3                      3                       4                        3
7       A13      F12    23552.94                    1                      5                       1                    3                      4                       5                        1
8       A13      F16    28017.79                    1                      5                       1                    3                      4                       3                        2
9       A15      B15     4318.41                    1                      5                       1                    1                      5                       1                        2
10      A15       B9    31370.86                    1                      5                       1                    1                      3                       1                        5
11      A15      C20    26994.65                    1                      5                       1                    3                      4                       3                        1
12      A15      C21    30116.30                    1                      5                       1                    2                      5                       5                        2
13      A15   C7.avg    44382.24                    1                      5                       1                    1                      2                       1                        5

I need to filter the dataframe to obtain the shortest distance between two sample points for a category (i.e., GridID.1.binned_temp).

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

In that case the output would show the category I filtered for (calling it temp1 to indicate that GridID.1.binned_temp == 1), and then the GridID.1 site ID, followed by the GridID.2 site ID that is the shortest distance away from GridID.1 site. So something like:

Category  GridID.1 GridID.2  Distance
  temp1    A13        B15      11313.88
  temp1    A15        B15      4318.41

The code I’m using is terrible as I’m not sure how to change this code to filter by a category of my choosing, and then pick out the shortest distances using ids in GridID.1 to GridID.2.

My code which only gives me the shortest distance I want, and then I manually search for it in the table:

data <- read.csv("testsites.csv", header=TRUE)

df <- data %>%  filter(GridID.1.binned_temp == 1) %>% group_by(GridID.1) %>% 
  summarize(GridID.1 = min(Distance.m.))

The output is just a one column table such as below.

 GridID.1
1 11313.88
2 4318.41

>Solution :

You’re looking for the slice function, or in this particular case you can use slice_min.

df <- data %>%
  filter(GridID.1.binned_temp == 1) %>%
  group_by(GridID.1) %>% 
  slice_min(Distance.m.)

This will give you the complete rows with the minimum distance by group. You can select() whichever columns you want to keep and add %>% mutate(Category = "temp1") if you need to add a column named Category with all values set to the string "temp1".
I don’t get what you mean with

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading