dplyr if else without the else/ conditional mutate in one chunk

I am trying to add a column to my dataframe based on whether a string is detected in another column. I have done this in two chunks of code and then merged them together, but I am trying to streamline my code so that there is less to type out in the future. I also noticed I performed a join incorrectly on a dataset I’ve been working with for months, so the fewer joins, the better.

Here is what currently works for me, but feels unnecessarily long.

dtc_final2022 <- dtc_final1 %>%
  filter(str_detect(detection_timestamp_utc, "2022")) %>%
  mutate(Year = "2022")

dtc_final2021 <- dtc_final1 %>%
  filter(str_detect(detection_timestamp_utc, "2021")) %>%
  mutate(Year = "2021")

dtc_final2 <- full_join(dtc_final2021, dtc_final2022)

dtc_final1 is a dataset with timestamps from many years. I am only interested in adding a "year" to timestamps that contain 2021 and 2022. In the future, I will add 2023 and 2024.

This is what I would like to do, but when I do, the second ifelse() overwrites the 2021 values with NA. Is there a way to run an ifelse function without the ‘else’? Also, please remember that I can’t use the other year as the ‘else’, since in the future I will have 4 years to deal with, not just 2.

dtc_final2 <- dtc_final1 %>%
  mutate(Year = ifelse(str_detect(detection_timestamp_utc, "2021"), "2021", NA),
         Year = ifelse(str_detect(detection_timestamp_utc, "2022"), "2022", NA))

I try to do everything in dplyr, but if a for loop does the trick, then I guess I’ll buck up.

Thanks in advance!

>Solution :

You can use str_extract() rather than str_detect() here, with a regular expression that matches either of the years of interest. Rows from other years get NA, and the ^ anchor assumes each timestamp starts with the year:

mutate(dtc_final1, Year=str_extract(detection_timestamp_utc, "^202[12]"))
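
To see what this does, here is a minimal sketch on made-up data. The four timestamps below are hypothetical stand-ins for dtc_final1 (the real data is not shown in the question), and the sketch assumes each timestamp is text that begins with the four-digit year. It also shows two things that may be useful: dropping the NA rows to reproduce the filter() behaviour of the original two-chunk version, and a case_when() alternative, which is roughly "ifelse without the else" because rows that match no condition are left as NA.

```r
library(dplyr)
library(stringr)

# Hypothetical stand-in for dtc_final1; the real data is not shown in the question.
dtc_final1 <- tibble(
  detection_timestamp_utc = c(
    "2020-06-01 12:00:00",
    "2021-03-15 08:30:00",
    "2022-11-02 23:59:59",
    "2023-01-10 04:05:06"
  )
)

# str_extract() returns the matched text, or NA where the pattern does not match.
dtc_final2 <- dtc_final1 %>%
  mutate(Year = str_extract(detection_timestamp_utc, "^202[12]"))

# The original two-chunk version also dropped rows from other years;
# filtering out the NA values reproduces that.
dtc_final2_only <- dtc_final2 %>%
  filter(!is.na(Year))

# Alternative that keeps the conditional style: case_when() checks the
# conditions in order and leaves rows that match none of them as NA.
dtc_final2_cw <- dtc_final1 %>%
  mutate(Year = case_when(
    str_detect(detection_timestamp_utc, "^2021") ~ "2021",
    str_detect(detection_timestamp_utc, "^2022") ~ "2022"
  ))

print(dtc_final2)
```

When 2023 and 2024 are added later, the pattern could be widened to "^202[1-4]", or two more lines added to the case_when() call.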