Obtain minimum value in dataframe column based on two filters

Advertisements

I have a dataframe and I am trying to obtain the shortest pairwise distance of two sample points for distinct categories.

For example, my data:

   GridID.1 GridID.2 Distance.m. GridID.1.binned_temp GridID.1.binned_precip GridID.1.binned_devperc GridID.2.binned_temp GridID.2.binned_precip GridID.2.binned_devperc GridID.2.binned_agriperc
1       A13      A15    11478.59                    1                      5                       1                    1                      5                       1                        5
2       A13      A17    21079.59                    1                      5                       1                    1                      5                       1                        4
3       A13       A8    24837.51                    1                      5                       1                    1                      2                       1                        5
4       A13      B15    11313.88                    1                      5                       1                    1                      5                       1                        2
5       A13   E9.avg    27312.13                    1                      5                       1                    2                      3                       3                        3
6       A13  F10.avg    27973.76                    1                      5                       1                    3                      3                       4                        3
7       A13      F12    23552.94                    1                      5                       1                    3                      4                       5                        1
8       A13      F16    28017.79                    1                      5                       1                    3                      4                       3                        2
9       A15      B15     4318.41                    1                      5                       1                    1                      5                       1                        2
10      A15       B9    31370.86                    1                      5                       1                    1                      3                       1                        5
11      A15      C20    26994.65                    1                      5                       1                    3                      4                       3                        1
12      A15      C21    30116.30                    1                      5                       1                    2                      5                       5                        2
13      A15   C7.avg    44382.24                    1                      5                       1                    1                      2                       1                        5

I need to filter the dataframe to obtain the shortest distance between two sample points for a category (i.e., GridID.1.binned_temp).

In that case the output would show the category I filtered for (calling it temp1 to indicate that GridID.1.binned_temp == 1), and then the GridID.1 site ID, followed by the GridID.2 site ID that is the shortest distance away from GridID.1 site. So something like:

Category  GridID.1 GridID.2  Distance
  temp1    A13        B15      11313.88
  temp1    A15        B15      4318.41

The code I’m using is terrible as I’m not sure how to change this code to filter by a category of my choosing, and then pick out the shortest distances using ids in GridID.1 to GridID.2.

My code which only gives me the shortest distance I want, and then I manually search for it in the table:

data <- read.csv("testsites.csv", header=TRUE)

df <- data %>%  filter(GridID.1.binned_temp == 1) %>% group_by(GridID.1) %>% 
  summarize(GridID.1 = min(Distance.m.))

The output is just a one column table such as below.

 GridID.1
1 11313.88
2 4318.41

>Solution :

You’re looking for the slice function, or in this particular case you can use slice_min.

df <- data %>%
  filter(GridID.1.binned_temp == 1) %>%
  group_by(GridID.1) %>% 
  slice_min(Distance.m.)

This will give you the complete rows with the minimum distance by group. You can select() whichever columns you want to keep and add %>% mutate(Category = "temp1") if you need to add a column named Category with all values set to the string "temp1".
I don’t get what you mean with

Leave a ReplyCancel reply