Home How to assign categories to columns in R so that values within those categories can be summed for each row?

Questions

How to assign categories to columns in R so that values within those categories can be summed for each row?

March 28, 2023

I have a data frame in R which is organized with sampled trees in rows and different lichen species in columns. Count data for each lichen species per tree is listed in each column.

Tree_ID	Lichen_A	Lichen_B	Lichen_C
Tree1	1	2	3
Tree2	2	5	7

I would like to categorize the lichens and count the number seen per tree according to category. Ideally, the categories could be imported from another data frame which could look like this. Certain lichens will share the same category.

Lichen	Sensitivity
Lichen_A	Low
Lichen_B	Low
Lichen_C	Medium

Combining these to get the sum of the categories would look like this:

Tree_ID	Low	Medium
Tree1	3	3
Tree2	7	7

In reality I have about 40 different lichen species, two sensitivity categories and multiple datasets of trees, so I’d like to find a solution that means I don’t need to do everything manually.

I am not sure how to set up the code for this, though I have a feeling it should be possible, perhaps with a loop and/or dplyr mutate?

>Solution :

With dplyr and tidyr:

library(dplyr)
library(tidyr)
df1 %>% 
  pivot_longer(unique(df2$Lichen)) %>% 
  inner_join(df2, by = c("name" = "Lichen")) %>% 
  group_by(Tree_ID, Sensitivity) %>% 
  summarise(value = sum(value)) %>% 
  pivot_wider(names_from = "Sensitivity")

  Tree_ID   Low Medium
1 Tree1       3      3
2 Tree2       7      7

data

df1 <- read.table(header = T,text = "Tree_ID    Lichen_A    Lichen_B    Lichen_C
Tree1   1   2   3
Tree2   2   5   7")


df2 <- read.table(header = T,text= "
Lichen  Sensitivity
Lichen_A    Low
Lichen_B    Low
Lichen_C    Medium")