Creating new variable using piping in R

I’m trying to create a new variable, confirmed_delta_perc, in a chain of piped commands, but mutate() reports that the variable active_delta is not found. I have confirmed that active_delta is in the data frame, yet it is not being read, and the new variable is not added.

COVID %>%
  select(county, confirmed, confirmed_delta) %>%
  mutate(confirmed_delta_perc = active_delta/active * 100) %>%
  filter(confirmed_delta_perc == 32)

Error:

Error in `mutate()`:
! Problem while computing `confirmed_delta_perc =
  active_delta/active`.
Caused by error:
! object 'active_delta' not found

This is the full list of directions for what to include in the pipe:
Using piping, create a link of commands that selects the county, confirmed, and confirmed_delta variables. Create a new variable called confirmed_delta_perc using the mutate() function. The values in this column should be the percentage of active delta cases of all active cases. Filter for all observation(s) that have a confirmed_delta_perc value of 32. Print out all observation(s).

I’ve tried modifying the mutate() call and renaming the data frame so that it "redoes" the step and adds the new variable, but that doesn’t work either.

No observations actually equal 32, but the pipeline should still add the variable, and it does not.

Does anyone have any ideas?

dput(head(COVID))

structure(list(county = c("Washington", "Fountain", "Jay", "Wabash", 
"Fayette", "Washington"), confirmed = c(620L, 737L, 930L, 1530L, 
1336L, 675L), confirmed_delta = c(18L, 12L, 11L, 49L, 19L, 29L
), deaths = c(5L, 8L, 14L, 25L, 33L, 6L), deaths_delta = c(0L, 
1L, 0L, 1L, 0L, 1L), recovered = c(0L, 0L, 0L, 0L, 0L, 0L), recovered_delta = c(0L, 
0L, 0L, 0L, 0L, 0L), active = c(615L, 729L, 918L, 1512L, 1305L, 
669L), active_delta = c(18L, 11L, 11L, 49L, 19L, 28L), active_delta_perc = c(0.0292682926829268, 
0.0150891632373114, 0.0119825708061002, 0.0324074074074074, 0.0145593869731801, 
0.0418535127055306)), row.names = c(NA, 6L), class = "data.frame")

>Solution :

For most case counts, no portion of them comes out to exactly 32%. For instance, we would report 29 of 90 cases as "32%", but that’s really 32.222222%, which is not strictly equal to 32. So you will need to specify what range around 32 counts as a match. Here, I treat anything within 0.5 of 32 on either side, from 31.5 to 32.5, as close enough.

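# select() is omitted here: it kept only county, confirmed, and confirmed_delta,
# so active_delta and active were no longer available to mutate().
COVID <- COVID %>%
  mutate(confirmed_delta_perc = active_delta/active * 100) %>%
  filter(abs(confirmed_delta_perc - 32) <= 0.5)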
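If the select() step has to stay, as the assignment asks, one possible variant is to list active and active_delta in select() as well, so they survive into mutate(). The sketch below is only an illustration: it rebuilds a few columns from the dput() sample in the question, and the names with_perc and result are made up for this example. It also stores the output under new names rather than overwriting COVID.

```r
library(dplyr)

# Sample rows copied from the dput() output in the question (subset of columns)
COVID <- data.frame(
  county = c("Washington", "Fountain", "Jay", "Wabash", "Fayette", "Washington"),
  confirmed = c(620L, 737L, 930L, 1530L, 1336L, 675L),
  confirmed_delta = c(18L, 12L, 11L, 49L, 19L, 29L),
  active = c(615L, 729L, 918L, 1512L, 1305L, 669L),
  active_delta = c(18L, 11L, 11L, 49L, 19L, 28L)
)

# with_perc is a hypothetical name for the intermediate result
with_perc <- COVID %>%
  # keep active and active_delta too, otherwise mutate() cannot find them
  select(county, confirmed, confirmed_delta, active, active_delta) %>%
  mutate(confirmed_delta_perc = active_delta / active * 100)

# result is a hypothetical name; match values within 0.5 of 32, not exactly 32
result <- with_perc %>%
  filter(abs(confirmed_delta_perc - 32) <= 0.5)

print(with_perc)
print(result)
```

On these six sample rows the computed percentages all fall between 1 and 5, so the filter returns zero rows, but confirmed_delta_perc is still added to with_perc. The full data set may behave differently.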