Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Convert set of columns using conversion key

My dataframe has a set of columns that starts with "i10_dx". I would like to use the conversion key I have to create a new set of columns starting with "ccs" that corresponds to the former set of columns.

This is a snippet of dataframe

df <- data.frame(visitlink = c(1001976, 1001976, 1002247), 
                 i10_dx1 = c("C7931", "C7949", NA), 
                 i10_dx2 = c("G9519", "C7951", "C3490"), 
                 i10_dx3 = c("G936", "C7931", "C7931"), 
                 i10_dx4 = c("C7949", NA, "I10"))

This is the conversion key

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

key <- data.frame(icd10 = c("I10", "C3490", "C7931", "C7949", "C7951", "G936", "G9519"), 
                  ccs = c("CIR007", "NEO022", "NEO070", "NEO070", "NEO070", "NVS020", "NVS020"))

>Solution :

A dplyr solution with across() + match():

library(dplyr)

df %>%
  mutate(across(i10_dx1:i10_dx4, ~ key$ccs[match(.x, key$icd10)],
                .names = "{sub('i10_dx', 'ccs', .col)}"))

#   visitlink i10_dx1 i10_dx2 i10_dx3 i10_dx4   ccs1   ccs2   ccs3   ccs4
# 1   1001976   C7931   G9519    G936   C7949 NEO070 NVS020 NVS020 NEO070
# 2   1001976   C7949   C7951   C7931    <NA> NEO070 NEO070 NEO070   <NA>
# 3   1002247    <NA>   C3490   C7931     I10   <NA> NEO022 NEO070 CIR007

Alternatively, transform the data to long, left join it with the lookup table key, and transform back to wide.

library(tidyr)
library(dplyr)

df %>%
  mutate(rowid = row_number()) %>%
  pivot_longer(i10_dx1:i10_dx4, names_prefix = "i10_dx", values_to = "i10_dx") %>%
  left_join(key, by = join_by(i10_dx == icd10)) %>%
  pivot_wider(values_from = c(i10_dx, ccs), names_sep = '')

# # A tibble: 3 × 10
#   visitlink rowid i10_dx1 i10_dx2 i10_dx3 i10_dx4 ccs1   ccs2   ccs3   ccs4  
#       <dbl> <int> <chr>   <chr>   <chr>   <chr>   <chr>  <chr>  <chr>  <chr> 
# 1   1001976     1 C7931   G9519   G936    C7949   NEO070 NVS020 NVS020 NEO070
# 2   1001976     2 C7949   C7951   C7931   NA      NEO070 NEO070 NEO070 NA    
# 3   1002247     3 NA      C3490   C7931   I10     NA     NEO022 NEO070 CIR007
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading