How to filter a range of alphanumeric codes in R?

I need to create dummy variables from ICD-10 codes. For example, chapter 2 starts at C00 and ends at D48X. The data looks like this:

data <- data.frame(LINHAA1 = c("B342", "C000", "D450", "O985"),
                   LINHAA2 = c("U071", "C99", "D68X", "J061"),
                   LINHAA3 = c("D48X", "Y098", "X223", "D640"))

Then I need to create a column that is 1 if any code in the row falls within the C00-D48X range and 0 otherwise. The desired result:

LINHAA1   LINHAA2   LINHAA3  CHAPTER2
B342      U071      D48X         1
C000      C99       Y098         1
D450      D68X      X223         1
O985      J061      D640         0

It needs to go through LINHAA1 to LINHAA3. Thanks in advance!

Solution:

This should do it:

as.numeric(apply(apply(data, 1,
    function(x) { x >= "C00" & x <= "D48X" }), 2, any))
# [1] 1 1 1 0

A little explanation: because the codes are character strings, you can test whether one falls in the range using alphabetical order, which is what <= and >= compare on strings. The inner apply checks every element and produces a matrix of logical values with one column per row of data. The outer apply uses any to check whether any of the three logical values in each column is TRUE. as.numeric then converts the result from TRUE/FALSE to 1/0.
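
The snippet above only prints the vector. If you want it stored as the CHAPTER2 column, here is a minimal sketch of one way to do that. The helper name `in_range` and the `cols` vector are my own, not part of the question. It also assumes the fourth LINHAA1 value is "O985" as in the desired output, and that your locale sorts digits before letters, since string comparison in R follows the collation locale.

```r
data <- data.frame(LINHAA1 = c("B342", "C000", "D450", "O985"),
                   LINHAA2 = c("U071", "C99", "D68X", "J061"),
                   LINHAA3 = c("D48X", "Y098", "X223", "D640"))

# Hypothetical helper: TRUE where a code sorts between lower and upper (inclusive).
in_range <- function(x, lower = "C00", upper = "D48X") {
  x >= lower & x <= upper
}

# Columns to scan; adjust if there are more LINHAA columns.
cols <- c("LINHAA1", "LINHAA2", "LINHAA3")

# One logical vector per column, then combine them row-wise with OR.
hits <- lapply(data[cols], in_range)
data$CHAPTER2 <- as.integer(Reduce(`|`, hits))

data
```

Working column by column avoids the character-matrix conversion that apply performs on a data frame, and Reduce with `|` still works when the data frame has a single row.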
