Using a loop to build a codebook for selected tidycensus variables across years: trouble referring to the looped dataset

I’m not great with loops, but I’m trying to get better at working through them. I am using tidycensus to pull a few selected variables across several years (the dummy data in the example below is representative). For a given set of selected variables (dv_acs), I want to pull the matching rows from the comprehensive codebook that load_variables returns for every year, and then full_join them. In most cases the information should be the same from year to year, but I want the complete table so I can double-check it and note any discrepancies.

Here is the setup, which is working:

library(tidycensus)
library(dplyr)


#getting codebook for all ACS years for every single variable possible
for(x in c(2009:2020)) {
  filename <- paste0("v", x)
  assign(filename, (load_variables(x, "acs5", cache = TRUE)))
}


#selecting and recoding variables to pull in
dv_acs = c(
  hus          = "B25002_001", 
  husocc       = "B25002_002", 
  husvac       = "B25002_003"
)

This accomplishes what I want one year at a time, and I could build the full table by joining the pieces one by one:

#creating a codebook a year at a time for variables I'm interested in
codebook <- v2009 %>%
  filter(name %in% dv_acs) %>%
  mutate(id = names(dv_acs), .before = 1)

colnames(codebook) = c("id", "name", "label_2009", "concept_2009")  

codebook2 <- v2010 %>%
  filter(name %in% dv_acs) %>%
  mutate(id = names(dv_acs), .before = 1)

colnames(codebook2) = c("id", "name", "label_2010", "concept_2010")  

codebook <- full_join(codebook, codebook2, by=c("id", "name"))

And here is where I try, and fail, to write a loop that creates the codebook for my specific variables across all years in one go:

#creating a loop to pull in and join a codebook for all years
for(x in c(2009:2010)){
  codebook <- data.frame(matrix(ncol = 2, nrow = 0)) #create a master file I can join the files to as they load in through the loop
  colnames(codebook) <- c("id", "name") #giving right label names
  filename <- paste0("v", x) #this is where I'm starting to have trouble; this saves as a value, and I can't then use it to call the dataframe
  temp <- filename %>% (name %in% dv_acs) %>%
    mutate(id = names(dv_acs), .before = 1)
  colnames(temp) <- c("id", "name", paste0("label_", x), paste0("concept_", x))
  codebook <- full_join(codebook, temp, by=c("id", "name"))
}

Reported error is: "Error in name %in% dv_acs : object ‘name’ not found"

Solution:

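The loop fails for three reasons. First, filename is only a character string such as "v2009", not the data frame it names, so it cannot be piped into dplyr verbs. Second, the filter() call is missing around name %in% dv_acs, so R tries to evaluate name as an ordinary object and reports that it is not found. Third, codebook is re-created on every iteration, so earlier joins would be lost. It is also better not to create one object per year in the global environment; the tables could be stored in a list instead. Since the objects already exist here, their values can be retrieved by name with mget, which returns a named list.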
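As a small illustration of the difference between a name and the object it refers to, here is a sketch that uses two toy data frames (hypothetical contents, standing in for the real v2009 and v2010 tables) and base R only:

```r
# Toy stand-ins for the per-year tables; the contents are made up
v2009 <- data.frame(name = "B25002_001", label = "toy label")
v2010 <- data.frame(name = "B25002_001", label = "toy label")

filename <- paste0("v", 2009)

# filename is just a string; get() looks up the object with that name
is.character(filename)
is.data.frame(get(filename))

# mget() does the lookup for several names at once and
# returns a list named after the objects
tables <- mget(paste0("v", 2009:2010))
names(tables)
```

Because the list keeps the object names, the year can be recovered from each name later on.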

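library(stringr)
library(purrr)
library(dplyr)

# collect v2009 ... v2020 from the global environment into a named list
out <- mget(str_c("v", 2009:2020)) %>%
  imap(~ {
    # .y is the list name (e.g. "v2009"); build "label2009", "concept2009"
    nm <- str_c(c("label", "concept"), str_remove(.y, "v"))

    .x %>%
      # drop the geography column if a given year has one
      select(-any_of("geography")) %>%
      filter(name %in% dv_acs) %>%
      # look up each id by its code so the result does not depend on row order
      mutate(id = names(dv_acs)[match(name, dv_acs)], .before = 1) %>%
      rename_with(~ nm, c("label", "concept"))
  }) %>%
  reduce(full_join, by = c("id", "name"))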

Output:

> out
# A tibble: 3 × 26
  id    name  label…¹ conce…² label…³ conce…⁴ label…⁵ conce…⁶ label…⁷ conce…⁸ label…⁹ conce…˟ label…˟ conce…˟ label…˟ conce…˟ label…˟ conce…˟ label…˟ conce…˟ label…˟
  <chr> <chr> <chr>   <chr>   <chr>   <chr>   <chr>   <chr>   <chr>   <chr>   <chr>   <chr>   <chr>   <chr>   <chr>   <chr>   <chr>   <chr>   <chr>   <chr>   <chr>  
1 hus   B250… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima…
2 huso… B250… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima…
3 husv… B250… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima…
# … with 5 more variables: concept2018 <chr>, label2019 <chr>, concept2019 <chr>, label2020 <chr>, concept2020 <chr>, and abbreviated variable names ¹​label2009,
#   ²​concept2009, ³​label2010, ⁴​concept2010, ⁵​label2011, ⁶​concept2011, ⁷​label2012, ⁸​concept2012, ⁹​label2013, ˟​concept2013, ˟​label2014, ˟​concept2014, ˟​label2015,
#   ˟​concept2015, ˟​label2016, ˟​concept2016, ˟​label2017, ˟​concept2017, ˟​label2018
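
If a plain loop is preferred, the original attempt can also be repaired directly. The following is only a sketch: it assumes small toy tables with made-up labels in place of the real load_variables() results, so that it runs offline. It looks up each table with get(), restores the missing filter(), and creates the master table once, before the loop.

```r
library(dplyr)

# Toy stand-ins for load_variables() output; the labels are made up
v2009 <- tibble(
  name    = c("B25002_001", "B25002_002", "B25002_003", "B25003_001"),
  label   = c("Total", "Occupied", "Vacant", "Other"),
  concept = c("OCCUPANCY STATUS", "OCCUPANCY STATUS", "OCCUPANCY STATUS", "TENURE")
)
v2010 <- v2009[c(3, 1, 2, 4), ]  # same rows in a different order

dv_acs <- c(hus = "B25002_001", husocc = "B25002_002", husvac = "B25002_003")

codebook <- NULL  # created once, outside the loop
for (x in 2009:2010) {
  tbl <- get(paste0("v", x))  # fetch the data frame by its name

  temp <- tbl %>%
    filter(name %in% dv_acs) %>%
    # match() ties each id to its code regardless of row order
    mutate(id = names(dv_acs)[match(name, dv_acs)], .before = 1) %>%
    rename_with(function(col) paste0(col, "_", x), c(label, concept))

  # the first year starts the table; later years are joined onto it
  codebook <- if (is.null(codebook)) {
    temp
  } else {
    full_join(codebook, temp, by = c("id", "name"))
  }
}

codebook
```

The list-based approach above is still the tidier option, since it avoids growing an object inside a loop, but this version stays close to the code in the question.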