Creating many factor variables from multiple numeric variables using cut(), with breaks drawn from a list

I have a dataset with multiple numeric variables that need to be converted into factors. The breaks come from a set of models that were run earlier and are stored in a named list built with lapply. The number of breaks differs from variable to variable, and there are a lot of variables. The model parameters don’t include 0 or the variable’s maximum, so I need to add those to the breaks I pass to cut(), otherwise the output appears to be garbage.

I’ve tried both mutate(across) and data.table with .SDcols and lapply, and I can’t get either working or figure out what I’m doing wrong. It works when I do it manually, column by column, so do I just need to bite the bullet and use a for loop?

Sample data:


V1 <- runif(100)
V2 <- runif(100)
V3 <- runif(100)
df <- data.frame(V1, V2, V3)
cutlist <- list(c(0.1), c(0.3, 0.4), c(0.1, 0.5, 0.6))
names(cutlist) <- c("V1","V2","V3")

Here’s what I’ve tried with dplyr and mutate:

df %>%
  mutate(
    across(.cols = names(cutlist),
           .fns = ~cut(.x, breaks = c(0, cutlist[which(names(cutlist) == names(.x))], max(.x, na.rm=T))),
           .names = "{.col}_cuts")
  )

The columns are created but they contain garbage.

Using data.table:

out_cols = paste0(names(cutlist),".cut")
df[, c(out_cols) := lapply(.SD, function(x){cut(x, 
                                                   breaks = c(0, cutlist[which(names(cutlist) == names(x))], 
                                                              max(x, na.rm=T)))}), 
      .SDcols = names(cutlist)]

Again, it runs, but the output is garbage.
Manually, this is what works:

df$V1.cut <- cut(df$V1,breaks=c(0, 
             cutlist$V1, max(df$V1, na.rm=T)))

So what am I missing?

Solution:

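Use cur_column() to get the name of the current column inside across(), and use that name to look up the matching breaks in cutlist. In the original attempts, names(.x) is NULL because each column is passed to the function as a bare vector, so the lookup selects nothing and cut() only receives 0 and the column maximum as breaks.

Note: use [[ instead of [ for the lookup. Single brackets return a one-element list, while double brackets return the numeric vector of breaks that cut() expects.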
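A small base R check, offered as a sketch, of why the original attempts run without error but put every value in the same bin. The vector x is a made-up stand-in for a single column, and nothing here depends on dplyr or data.table.

```r
cutlist <- list(V1 = 0.1, V2 = c(0.3, 0.4), V3 = c(0.1, 0.5, 0.6))
x <- c(0.05, 0.35, 0.45, 0.9)  # hypothetical values standing in for df$V2

# A column handed to the function is a bare vector, so it has no names
col_name <- names(x)
idx <- which(names(cutlist) == col_name)  # empty: nothing matches NULL

# The lookup selects nothing, so only 0 and max(x) are left as breaks
bad_breaks <- c(0, cutlist[idx], max(x))
bad <- cut(x, breaks = bad_breaks)

# Looking up by an explicit name with [[ ]] returns the numeric breaks
good_breaks <- c(0, cutlist[["V2"]], max(x))
good <- cut(x, breaks = good_breaks)

nlevels(bad)   # a single interval from 0 to max(x)
nlevels(good)  # one interval per gap between breaks
```

A single-bracket lookup by name, such as cutlist["V2"], would not help either: c() would then build a list with a length-2 element, which cut() cannot coerce to numeric breaks. The dplyr code below applies both fixes.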

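library(dplyr)

# Recreate the sample data from the question with a fixed seed
set.seed(123)
V1 <- runif(100)
V2 <- runif(100)
V3 <- runif(100)
df <- data.frame(V1, V2, V3)
cutlist <- list(V1 = c(0.1), V2 = c(0.3, 0.4), V3 = c(0.1, 0.5, 0.6))

df %>%
  mutate(
    across(
      .cols = names(cutlist),
      # cur_column() is the current column's name; [[ ]] extracts its breaks as a numeric vector
      .fns = ~ cut(.x, breaks = c(0, cutlist[[cur_column()]], max(.x, na.rm = TRUE))),
      .names = "{.col}_cuts"
    )
  )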
#>               V1         V2          V3     V1_cuts     V2_cuts     V3_cuts
#> 1   0.2875775201 0.59998896 0.238726027 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 2   0.7883051354 0.33282354 0.962358936 (0.1,0.994]   (0.3,0.4] (0.6,0.982]
#> 3   0.4089769218 0.48861303 0.601365726 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 4   0.8830174040 0.95447383 0.515029727 (0.1,0.994] (0.4,0.986]   (0.5,0.6]
#> 5   0.9404672843 0.48290240 0.402573342 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 6   0.0455564994 0.89035022 0.880246541     (0,0.1] (0.4,0.986] (0.6,0.982]
#> 7   0.5281054880 0.91443819 0.364091865 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 8   0.8924190444 0.60873498 0.288239281 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 9   0.5514350145 0.41068978 0.170645235 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 10  0.4566147353 0.14709469 0.172171746 (0.1,0.994]     (0,0.3]   (0.1,0.5]
#> 11  0.9568333453 0.93529980 0.482042606 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 12  0.4533341562 0.30122890 0.252964929 (0.1,0.994]   (0.3,0.4]   (0.1,0.5]
#> 13  0.6775706355 0.06072057 0.216254790 (0.1,0.994]     (0,0.3]   (0.1,0.5]
#> 14  0.5726334020 0.94772694 0.674376388 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 15  0.1029246827 0.72059627 0.047663627 (0.1,0.994] (0.4,0.986]     (0,0.1]
#> 16  0.8998249704 0.14229430 0.700853087 (0.1,0.994]     (0,0.3] (0.6,0.982]
#> 17  0.2460877344 0.54928466 0.351888638 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 18  0.0420595335 0.95409124 0.408943998     (0,0.1] (0.4,0.986]   (0.1,0.5]
#> 19  0.3279207193 0.58548335 0.820951324 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 20  0.9545036491 0.40451028 0.918857348 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 21  0.8895393161 0.64789348 0.282528330 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 22  0.6928034062 0.31982062 0.961104794 (0.1,0.994]   (0.3,0.4] (0.6,0.982]
#> 23  0.6405068138 0.30772001 0.728394428 (0.1,0.994]   (0.3,0.4] (0.6,0.982]
#> 24  0.9942697766 0.21976763 0.686375082 (0.1,0.994]     (0,0.3] (0.6,0.982]
#> 25  0.6557057991 0.36948887 0.052843943 (0.1,0.994]   (0.3,0.4]     (0,0.1]
#> 26  0.7085304682 0.98421920 0.395220135 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 27  0.5440660247 0.15420230 0.477845380 (0.1,0.994]     (0,0.3]   (0.1,0.5]
#> 28  0.5941420204 0.09104400 0.560253264 (0.1,0.994]     (0,0.3]   (0.5,0.6]
#> 29  0.2891597373 0.14190691 0.698261595 (0.1,0.994]     (0,0.3] (0.6,0.982]
#> 30  0.1471136473 0.69000710 0.915683538 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 31  0.9630242325 0.61925648 0.618351227 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 32  0.9022990451 0.89139412 0.428421509 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 33  0.6907052784 0.67299909 0.542080367 (0.1,0.994] (0.4,0.986]   (0.5,0.6]
#> 34  0.7954674177 0.73707774 0.058478489 (0.1,0.994] (0.4,0.986]     (0,0.1]
#> 35  0.0246136845 0.52113573 0.260856857     (0,0.1] (0.4,0.986]   (0.1,0.5]
#> 36  0.4777959711 0.65983845 0.397151953 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 37  0.7584595375 0.82180546 0.197744737 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 38  0.2164079358 0.78628155 0.831927563 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 39  0.3181810076 0.97982192 0.152887223 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 40  0.2316257854 0.43943154 0.803418542 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 41  0.1428000224 0.31170220 0.546826157 (0.1,0.994]   (0.3,0.4]   (0.5,0.6]
#> 42  0.4145463358 0.40947495 0.662317642 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 43  0.4137243263 0.01046711 0.171698494 (0.1,0.994]     (0,0.3]   (0.1,0.5]
#> 44  0.3688454509 0.18384952 0.633055360 (0.1,0.994]     (0,0.3] (0.6,0.982]
#> 45  0.1524447477 0.84272932 0.311869747 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 46  0.1388060634 0.23116178 0.724554346 (0.1,0.994]     (0,0.3] (0.6,0.982]
#> 47  0.2330340995 0.23909996 0.398939825 (0.1,0.994]     (0,0.3]   (0.1,0.5]
#> 48  0.4659624503 0.07669117 0.969356411 (0.1,0.994]     (0,0.3] (0.6,0.982]
#> 49  0.2659726404 0.24572368 0.967398371 (0.1,0.994]     (0,0.3] (0.6,0.982]
#> 50  0.8578277153 0.73213521 0.726702539 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 51  0.0458311667 0.84745317 0.257216746     (0,0.1] (0.4,0.986]   (0.1,0.5]
#> 52  0.4422000742 0.49752727 0.221787935 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 53  0.7989248456 0.38790903 0.593045652 (0.1,0.994]   (0.3,0.4]   (0.5,0.6]
#> 54  0.1218992600 0.24644899 0.267521432 (0.1,0.994]     (0,0.3]   (0.1,0.5]
#> 55  0.5609479838 0.11109646 0.531070399 (0.1,0.994]     (0,0.3]   (0.5,0.6]
#> 56  0.2065313896 0.38999444 0.785291671 (0.1,0.994]   (0.3,0.4] (0.6,0.982]
#> 57  0.1275316502 0.57193531 0.168060811 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 58  0.7533078643 0.21689276 0.404399181 (0.1,0.994]     (0,0.3]   (0.1,0.5]
#> 59  0.8950453592 0.44476800 0.471576278 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 60  0.3744627759 0.21799067 0.868106807 (0.1,0.994]     (0,0.3] (0.6,0.982]
#> 61  0.6651151946 0.50229956 0.925707956 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 62  0.0948406609 0.35390457 0.881977559     (0,0.1]   (0.3,0.4] (0.6,0.982]
#> 63  0.3839696378 0.64998516 0.674186843 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 64  0.2743836446 0.37471396 0.950166979 (0.1,0.994]   (0.3,0.4] (0.6,0.982]
#> 65  0.8146400389 0.35544538 0.516444894 (0.1,0.994]   (0.3,0.4]   (0.5,0.6]
#> 66  0.4485163414 0.53368795 0.576519021 (0.1,0.994] (0.4,0.986]   (0.5,0.6]
#> 67  0.8100643530 0.74033436 0.336331206 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 68  0.8123895095 0.22110294 0.347324631 (0.1,0.994]     (0,0.3]   (0.1,0.5]
#> 69  0.7943423211 0.41274612 0.020024301 (0.1,0.994] (0.4,0.986]     (0,0.1]
#> 70  0.4398316876 0.26568669 0.502813046 (0.1,0.994]     (0,0.3]   (0.5,0.6]
#> 71  0.7544751586 0.62997305 0.871043414 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 72  0.6292211316 0.18382849 0.006300784 (0.1,0.994]     (0,0.3]     (0,0.1]
#> 73  0.7101824014 0.86364411 0.072057124 (0.1,0.994] (0.4,0.986]     (0,0.1]
#> 74  0.0006247733 0.74656800 0.164211225     (0,0.1] (0.4,0.986]   (0.1,0.5]
#> 75  0.4753165741 0.66828465 0.770334074 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 76  0.2201188852 0.61801787 0.735184306 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 77  0.3798165377 0.37223806 0.971875636 (0.1,0.994]   (0.3,0.4] (0.6,0.982]
#> 78  0.6127710033 0.52983569 0.466472377 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 79  0.3517979092 0.87468234 0.074384513 (0.1,0.994] (0.4,0.986]     (0,0.1]
#> 80  0.1111354243 0.58175010 0.648818124 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 81  0.2436194727 0.83976776 0.758593170 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 82  0.6680555874 0.31244816 0.137106081 (0.1,0.994]   (0.3,0.4]   (0.1,0.5]
#> 83  0.4176467797 0.70829032 0.396584595 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 84  0.7881958340 0.26501781 0.224985329 (0.1,0.994]     (0,0.3]   (0.1,0.5]
#> 85  0.1028646443 0.59434319 0.057958561 (0.1,0.994] (0.4,0.986]     (0,0.1]
#> 86  0.4348927415 0.48128980 0.395892688 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 87  0.9849569800 0.26503273 0.064928300 (0.1,0.994]     (0,0.3]     (0,0.1]
#> 88  0.8930511144 0.56459043 0.225886433 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 89  0.8864690608 0.91318822 0.054629109 (0.1,0.994] (0.4,0.986]     (0,0.1]
#> 90  0.1750526503 0.90187439 0.670282040 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 91  0.1306956916 0.27416662 0.297741783 (0.1,0.994]     (0,0.3]   (0.1,0.5]
#> 92  0.6531019250 0.32148276 0.100721582 (0.1,0.994]   (0.3,0.4]   (0.1,0.5]
#> 93  0.3435164723 0.98564088 0.071904097 (0.1,0.994] (0.4,0.986]     (0,0.1]
#> 94  0.6567581280 0.61999331 0.880440569 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 95  0.3203732425 0.93731409 0.754247402 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 96  0.1876911193 0.46653270 0.816605888 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 97  0.7822943013 0.40683259 0.982140374 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 98  0.0935949867 0.65923032 0.103599645     (0,0.1] (0.4,0.986]   (0.1,0.5]
#> 99  0.4667790416 0.15234662 0.099041829 (0.1,0.994]     (0,0.3]     (0,0.1]
#> 100 0.5115054599 0.57286706 0.798831611 (0.1,0.994] (0.4,0.986] (0.6,0.982]
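If you would rather avoid a for loop without depending on dplyr, a base R sketch with Map() walks the columns and their breaks in parallel. The helper name cut_with_bounds and the "_cuts" suffix are illustrative choices, and the sketch assumes cutlist is in the same order as the selected columns, which indexing df by names(cutlist) guarantees.

```r
set.seed(123)
df <- data.frame(V1 = runif(100), V2 = runif(100), V3 = runif(100))
cutlist <- list(V1 = 0.1, V2 = c(0.3, 0.4), V3 = c(0.1, 0.5, 0.6))

# Hypothetical helper: pad the model breaks with 0 and the column maximum
cut_with_bounds <- function(x, inner) {
  cut(x, breaks = c(0, inner, max(x, na.rm = TRUE)))
}

cols <- names(cutlist)

# Map() pairs each selected column with the breaks at the same position
df[paste0(cols, "_cuts")] <- Map(cut_with_bounds, df[cols], cutlist)

str(df)
```

The same Map() call should also work as the right-hand side of := in data.table, with .SD in place of df[cols] and .SDcols = names(cutlist). Note that cut() uses right-closed intervals by default, so a value of exactly 0 would become NA unless include.lowest = TRUE is passed.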