Subtract one row from another row in an R data.frame

I have a fairly large data.frame that shows the results of a data analysis for two treatments (plus a control) for a range of tree species. I’d like to be able to create a new data.frame that shows the difference between the control and each treatment for each species.

Here’s some dummy data to show what I’m trying to do:

dat <- data.frame(species = rep (c("Oak", "Elm", "Ash"), each = 3), 
                  result = c(10, 7, 4, 13, 9, 2, 8, 5, 1), 
                  treatment = rep(c('Ctrl', 'Type_1', 'Type_2')))

  species result treatment
1     Oak     10      Ctrl
2     Oak      7    Type_1
3     Oak      4    Type_2
4     Elm     13      Ctrl
5     Elm      9    Type_1
6     Elm      2    Type_2
7     Ash      8      Ctrl
8     Ash      5    Type_1
9     Ash      1    Type_2

What I’d like to do is subtract the Type_1 and Type_2 treatment results for each species from the respective control result and generate a new data.frame containing the differences. It should look like this:

  species result treatment_diff
1     Oak      3         Type_1
2     Oak      6         Type_2
3     Elm      4         Type_1
4     Elm     11         Type_2
5     Ash      3         Type_1
6     Ash      7         Type_2

Happy to take a dplyr, tidyr, data.table or any other solution.

Thanks very much

>Solution :

One option is to group by species with group_by, subtract each result from the first value in its group (the control), and then filter out the rows where the result is 0, which are the control rows themselves:

dat <- data.frame(species = rep (c("Oak", "Elm", "Ash"), each = 3), 
                  result = c(10, 7, 4, 13, 9, 2, 8, 5, 1), 
                  treatment = rep(c('Ctrl', 'Type_1', 'Type_2')))

library(dplyr)
dat %>%
  group_by(species) %>%
  mutate(result = first(result) - result) %>%
  filter(result != 0)
#> # A tibble: 6 × 3
#> # Groups:   species [3]
#>   species result treatment
#>   <chr>    <dbl> <chr>    
#> 1 Oak          3 Type_1   
#> 2 Oak          6 Type_2   
#> 3 Elm          4 Type_1   
#> 4 Elm         11 Type_2   
#> 5 Ash          3 Type_1   
#> 6 Ash          7 Type_2

Created on 2022-07-29 by the reprex package (v2.0.1)
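
The approach above relies on the control being the first row of each species and on no treatment result being equal to its control (such a row would also be dropped by the filter on 0). If either of those might not hold in the real data, a variant that looks the control up by its label could be used instead. This is only a sketch, assuming each species has exactly one row with treatment "Ctrl"; the name `diffs` is made up for illustration.

```r
library(dplyr)

# Same dummy data as in the question
dat <- data.frame(species = rep(c("Oak", "Elm", "Ash"), each = 3),
                  result = c(10, 7, 4, 13, 9, 2, 8, 5, 1),
                  treatment = rep(c("Ctrl", "Type_1", "Type_2"), times = 3))

diffs <- dat %>%
  group_by(species) %>%
  # Control minus each row, finding the control by label rather than by position
  # (assumes exactly one "Ctrl" row per species)
  mutate(result = result[treatment == "Ctrl"] - result) %>%
  # Drop the control rows by label, so a treatment equal to its control is kept
  filter(treatment != "Ctrl") %>%
  ungroup()

diffs
```

On the dummy data this should give the same six rows as the answer above. The final ungroup() just returns a plain, ungrouped tibble so that later steps are not accidentally run per species.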
