R How To Set Up Cut Function

set.seed(1)
DATA = data.frame(X = sample(c(0:100), 1000, replace = TRUE))
DATA$CUT = with(DATA, cut(X, breaks = c(10,20,30,40,50,60,70,80,90), right = FALSE))

I want the groups 0-9, 10-19, 20-29, ..., 80-89, 90+, but no matter how I call cut() I do not get these breaks.

Solution:

You need to include the extreme bounds in breaks, because any value that falls outside the range of breaks is returned as NA. For example:

breaks <- c(0,10,20,30,40,50,60,70,80,90, Inf)
DATA <- transform(DATA, CUT=cut(X, breaks=breaks, right = FALSE))

which results in

table(DATA$CUT)
#   [0,10)  [10,20)  [20,30)  [30,40)  [40,50)  [50,60)  [60,70)  [70,80)  [80,90) [90,Inf) 
#     102       84       96      102       96      102       90       94       122      112 
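
To see why the extreme bounds matter, here is a minimal sketch on a small hand-picked vector (the names x, partial and full are illustrative and not part of the original code). It compares the breaks from the question with the extended breaks:

```r
# Small hand-picked vector covering both ends of the 0-100 range.
x <- c(0, 5, 10, 89, 90, 100)

# Breaks as in the question: nothing below 10 or from 90 upwards is covered.
partial <- cut(x, breaks = seq(10, 90, by = 10), right = FALSE)

# Same breaks with the extreme bounds 0 and Inf added.
full <- cut(x, breaks = c(seq(0, 90, by = 10), Inf), right = FALSE)

is.na(partial)  # TRUE TRUE FALSE FALSE TRUE TRUE
anyNA(full)     # FALSE
```

With the original breaks, 0, 5, 90 and 100 all come back as NA; with the extended breaks every value lands in a bin.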

cut() is designed for continuous values rather than counts, so it labels the bins with half-open interval notation. For integer data, [0,10) contains exactly the same values as [0,9], i.e. 0-9.
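
One way to check this for integer data is to look at the smallest and largest integer that ends up in each bin. This is only a sketch; the names groups and bin_ranges are illustrative:

```r
x <- 0:100
breaks <- c(seq(0, 90, by = 10), Inf)
groups <- cut(x, breaks = breaks, right = FALSE)

# One column per bin; row 1 is the smallest integer in the bin, row 2 the largest.
bin_ranges <- sapply(split(x, groups), range)
bin_ranges
```

The first column holds 0 and 9, and the last holds 90 and 100, so the half-open bins match the intended 0-9, ..., 90+ groups.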

If you want to set the labels yourself, you can do:

breaks <- c(0,10,20,30,40,50,60,70,80,90, Inf)
labels <- paste(head(breaks, -1), tail(breaks, -1)-1, sep="-")
DATA <- transform(DATA, CUT=cut(X, breaks=breaks, labels=labels, right = FALSE))

which now results in

table(DATA$CUT)
#    0-9  10-19  20-29  30-39  40-49  50-59  60-69  70-79  80-89 90-Inf 
#    102     84     96    102     96    102     90     94    122    112 
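
The last label comes out as 90-Inf, while the question asked for 90+. A possible variation (a sketch, not part of the original answer; lower and upper are illustrative names) builds the labels so that the open-ended bin gets a "+" suffix:

```r
breaks <- c(seq(0, 90, by = 10), Inf)
lower <- head(breaks, -1)
upper <- tail(breaks, -1) - 1  # inclusive upper end for integer data; Inf stays Inf

# Use "lower-upper" for closed bins and "lower+" for the open-ended one.
labels <- ifelse(is.finite(upper),
                 paste(lower, upper, sep = "-"),
                 paste0(lower, "+"))

x <- c(0, 9, 10, 89, 90, 100)
as.character(cut(x, breaks = breaks, labels = labels, right = FALSE))
# "0-9" "0-9" "10-19" "80-89" "90+" "90+"
```

The same labels vector can be passed to the transform() call above in place of the paste()-only version.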