I have a data frame of sites, years, and max temperatures. I’d like to run a linear regression of temp on year for each site. Instead of doing this one site at a time, it’d be nice to write a for loop that fits the same linear model to each site individually and stores the result under the name of the site. I’ve made some dummy data below; the actual df has 25 sites.
data<- data.frame(site= c('alder','alder','alder','alder','alder','alder','alder','alder', 'oak','oak','oak','oak','oak','oak','oak','oak' ),
year= c('2008', '2009', '2010', '2011', '2012', '2013', '2014', '2015','2008', '2009', '2010', '2011', '2012', '2013', '2014', '2015'),
temp= c(0.5,3, 12, 42, 67, 8, 12, 22, 11, 4, 3, 6, 76, 1, 11, .9))
Here’s how I’ve tried to do it so far:
output<- vector("list", length(unique(data$site)))
sites<- unique(data$site)
for (i in sites) {
data %>% filter(site=i) =j
lm(formula = temp~year, data = j)=k
output[[i]]=k
}
I’m not sure what the best way is to make the for loop select the subset of rows that corresponds to one site. When I run this code, the error I get is
Error in data %>% filter(site = i) <- j :
could not find function "%>%<-"
I’ve already made sure tidyverse is loaded.
Thanks for your help!
>Solution :
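There are a couple of typos: the comparison inside filter() should be == rather than =, and the rightward assignments should use -> instead of = (R parses data %>% filter(site = i) = j as an assignment into the pipe expression, hence the "%>%<-" error). A third issue is the assignment to output[[i]]: here i is a site name rather than a position, so output needs to be named by site for the results to land in the pre-allocated slots.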
library(dplyr)
# name the pre-allocated list so output[[i]] fills the matching slot
names(output) <- sites
for (i in sites) {
  data %>% filter(site == i) -> j
  lm(formula = temp ~ year, data = j) -> k
  output[[i]] <- k
}
-output
> output
$alder
Call:
lm(formula = temp ~ year, data = j)
Coefficients:
(Intercept) year2009 year2010 year2011 year2012 year2013 year2014 year2015
0.5 2.5 11.5 41.5 66.5 7.5 11.5 21.5
$oak
Call:
lm(formula = temp ~ year, data = j)
Coefficients:
(Intercept) year2009 year2010 year2011 year2012 year2013 year2014 year2015
1.100e+01 -7.000e+00 -8.000e+00 -5.000e+00 6.500e+01 -1.000e+01 -3.263e-15 -1.010e+01
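Note that the coefficients above contain one term per year (year2009, year2010, ...). That happens because year is stored as character in the dummy data, so lm() treats it as a categorical variable. If a single slope per site is what is wanted, a minimal sketch, assuming year should be treated as a number and using base subsetting instead of filter(), could look like this:

```r
# Dummy data from the question (year stored as character)
data <- data.frame(
  site = rep(c("alder", "oak"), each = 8),
  year = rep(as.character(2008:2015), times = 2),
  temp = c(0.5, 3, 12, 42, 67, 8, 12, 22, 11, 4, 3, 6, 76, 1, 11, 0.9)
)

# Assumption: year is meant to be a continuous predictor
data$year <- as.numeric(data$year)

sites <- unique(data$site)
output <- vector("list", length(sites))
names(output) <- sites

for (i in sites) {
  # fit the model on the rows belonging to site i only
  output[[i]] <- lm(temp ~ year, data = data[data$site == i, ])
}

# one column per site, with rows "(Intercept)" and "year" (the slope)
sapply(output, coef)
```

Each model now has just an intercept and one year coefficient, which is the change in temp per year for that site.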
With tidyverse, there are a couple of ways to do this:
library(dplyr)
library(tidyr)
data %>%
  nest_by(site) %>%
  mutate(model = list(lm(temp ~ year, data = data))) %>%
  ungroup()
# A tibble: 2 × 3
site data model
<chr> <list<tibble[,2]>> <list>
1 alder [8 × 2] <lm>
2 oak [8 × 2] <lm>
Or use reframe (requires dplyr version >= 1.1.0):
data %>%
  reframe(model = list(lm(temp ~ year)), .by = site) %>%
  as_tibble()
-output
# A tibble: 2 × 2
site model
<chr> <list>
1 alder <lm>
2 oak <lm>
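Both tidyverse versions return the fitted lm objects in a list column. If a plain table of slopes is more useful, here is one possible sketch built on the nest_by() approach. It assumes year is converted to numeric first, so that each model has a single "year" coefficient; the names slopes and slope are just illustrative.

```r
library(dplyr)

# Dummy data from the question (year stored as character)
data <- data.frame(
  site = rep(c("alder", "oak"), each = 8),
  year = rep(as.character(2008:2015), times = 2),
  temp = c(0.5, 3, 12, 42, 67, 8, 12, 22, 11, 4, 3, 6, 76, 1, 11, 0.9)
)

slopes <- data %>%
  mutate(year = as.numeric(year)) %>%                      # assumption: year is continuous
  nest_by(site) %>%                                        # one row per site, rowwise
  mutate(model = list(lm(temp ~ year, data = data))) %>%   # fit per site
  mutate(slope = coef(model)[["year"]]) %>%                # pull the year coefficient
  ungroup() %>%
  select(site, slope)

slopes
```

Because nest_by() returns a rowwise data frame, model refers to the single lm object of the current row inside the second mutate(), so coef() can be called on it directly.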