Creating many factor variables from multiple numeric variables using cut(), with breaks drawn from a list

I have a dataset with multiple numeric variables that need to be converted into factors. The breaks come from a set of models that were run earlier and are stored in a named list (built with lapply). The number of breaks differs for each variable, and there are a lot of variables. The model parameters don’t include 0 or the variable’s maximum, so I need to pass those to cut() myself, otherwise the output appears to be garbage.

I’ve tried both mutate(across) and data.table with .SDcols and lapply, and I can’t get either working or figure out what I’m doing wrong. It works when I do it manually, column by column, so do I just need to bite the bullet and use a for loop?

Sample data:

V1 <- runif(100)
V2 <- runif(100)
V3 <- runif(100)
df <- data.frame(V1, V2, V3)
cutlist <- list(c(0.1), c(0.3, 0.4), c(0.1, 0.5, 0.6))
names(cutlist) <- c("V1","V2","V3")

Here’s what I’ve tried with dplyr and mutate:

df %>%
  mutate(
    across(.cols = names(cutlist),
           .fns = ~cut(.x, breaks = c(0, cutlist[which(names(cutlist) == names(.x))], max(.x, na.rm=T))),
           .names = "{.col}_cuts")
  )

The columns are created but they contain garbage.

Using data.table:

out_cols = paste0(names(cutlist),".cut")
df[, c(out_cols) := lapply(.SD, function(x){cut(x, 
                                                   breaks = c(0, cutlist[which(names(cutlist) == names(x))], 
                                                              max(x, na.rm=T)))}), 
      .SDcols = names(cutlist)]

Again, it runs, but the output is garbage.
Manually, this is what works:

df$V1.cut <- cut(df$V1,breaks=c(0, 
             cutlist$V1, max(df$V1, na.rm=T)))

So what am I missing?

Solution:

Use cur_column() to get the name of the current column inside across(). You could do:

Note: use [[ instead of [ so that the breaks are extracted as a numeric vector rather than as a one-element list.
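To see why the original lookup fails and why [[ matters, here is a small sketch in plain R. The vector x is a made-up stand-in for a single column as across() or lapply(.SD, ...) would pass it, not part of the question's data. A bare column vector carries no names, so names(x) is NULL and the which() lookup matches nothing.

```r
cutlist <- list(V1 = 0.1, V2 = c(0.3, 0.4), V3 = c(0.1, 0.5, 0.6))

# Hypothetical stand-in for one column as the function inside across() sees it
x <- c(0.05, 0.35, 0.9)

# A bare vector has no names, so there is nothing to match against
names(x)                                    # NULL
idx <- which(names(cutlist) == names(x))    # integer(0): no element selected
dropped <- cutlist[idx]                     # empty list, the model breaks are gone

# Single brackets return a list; double brackets return the numeric vector
with_single <- cutlist["V2"]                # list of length 1
with_double <- cutlist[["V2"]]              # numeric vector c(0.3, 0.4)

# Only the [[ version combines into a plain numeric vector of breaks
breaks <- c(0, with_double, max(x, na.rm = TRUE))
breaks
```

The across() call that follows applies both fixes.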

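library(dplyr)

# Set the seed before generating the sample data so the output below is reproducible
set.seed(123)
df <- data.frame(V1 = runif(100), V2 = runif(100), V3 = runif(100))
cutlist <- list(V1 = 0.1, V2 = c(0.3, 0.4), V3 = c(0.1, 0.5, 0.6))

df %>%
  mutate(
    across(
      .cols = names(cutlist),
      # look up this column's breaks by name, then add 0 and the column max
      .fns = ~ cut(.x, breaks = c(0, cutlist[[cur_column()]], max(.x, na.rm = TRUE))),
      .names = "{.col}_cuts"
    )
  )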
#>               V1         V2          V3     V1_cuts     V2_cuts     V3_cuts
#> 1   0.2875775201 0.59998896 0.238726027 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 2   0.7883051354 0.33282354 0.962358936 (0.1,0.994]   (0.3,0.4] (0.6,0.982]
#> 3   0.4089769218 0.48861303 0.601365726 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 4   0.8830174040 0.95447383 0.515029727 (0.1,0.994] (0.4,0.986]   (0.5,0.6]
#> 5   0.9404672843 0.48290240 0.402573342 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 6   0.0455564994 0.89035022 0.880246541     (0,0.1] (0.4,0.986] (0.6,0.982]
#> 7   0.5281054880 0.91443819 0.364091865 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 8   0.8924190444 0.60873498 0.288239281 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 9   0.5514350145 0.41068978 0.170645235 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 10  0.4566147353 0.14709469 0.172171746 (0.1,0.994]     (0,0.3]   (0.1,0.5]
#> 11  0.9568333453 0.93529980 0.482042606 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 12  0.4533341562 0.30122890 0.252964929 (0.1,0.994]   (0.3,0.4]   (0.1,0.5]
#> 13  0.6775706355 0.06072057 0.216254790 (0.1,0.994]     (0,0.3]   (0.1,0.5]
#> 14  0.5726334020 0.94772694 0.674376388 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 15  0.1029246827 0.72059627 0.047663627 (0.1,0.994] (0.4,0.986]     (0,0.1]
#> 16  0.8998249704 0.14229430 0.700853087 (0.1,0.994]     (0,0.3] (0.6,0.982]
#> 17  0.2460877344 0.54928466 0.351888638 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 18  0.0420595335 0.95409124 0.408943998     (0,0.1] (0.4,0.986]   (0.1,0.5]
#> 19  0.3279207193 0.58548335 0.820951324 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 20  0.9545036491 0.40451028 0.918857348 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 21  0.8895393161 0.64789348 0.282528330 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 22  0.6928034062 0.31982062 0.961104794 (0.1,0.994]   (0.3,0.4] (0.6,0.982]
#> 23  0.6405068138 0.30772001 0.728394428 (0.1,0.994]   (0.3,0.4] (0.6,0.982]
#> 24  0.9942697766 0.21976763 0.686375082 (0.1,0.994]     (0,0.3] (0.6,0.982]
#> 25  0.6557057991 0.36948887 0.052843943 (0.1,0.994]   (0.3,0.4]     (0,0.1]
#> 26  0.7085304682 0.98421920 0.395220135 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 27  0.5440660247 0.15420230 0.477845380 (0.1,0.994]     (0,0.3]   (0.1,0.5]
#> 28  0.5941420204 0.09104400 0.560253264 (0.1,0.994]     (0,0.3]   (0.5,0.6]
#> 29  0.2891597373 0.14190691 0.698261595 (0.1,0.994]     (0,0.3] (0.6,0.982]
#> 30  0.1471136473 0.69000710 0.915683538 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 31  0.9630242325 0.61925648 0.618351227 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 32  0.9022990451 0.89139412 0.428421509 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 33  0.6907052784 0.67299909 0.542080367 (0.1,0.994] (0.4,0.986]   (0.5,0.6]
#> 34  0.7954674177 0.73707774 0.058478489 (0.1,0.994] (0.4,0.986]     (0,0.1]
#> 35  0.0246136845 0.52113573 0.260856857     (0,0.1] (0.4,0.986]   (0.1,0.5]
#> 36  0.4777959711 0.65983845 0.397151953 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 37  0.7584595375 0.82180546 0.197744737 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 38  0.2164079358 0.78628155 0.831927563 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 39  0.3181810076 0.97982192 0.152887223 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 40  0.2316257854 0.43943154 0.803418542 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 41  0.1428000224 0.31170220 0.546826157 (0.1,0.994]   (0.3,0.4]   (0.5,0.6]
#> 42  0.4145463358 0.40947495 0.662317642 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 43  0.4137243263 0.01046711 0.171698494 (0.1,0.994]     (0,0.3]   (0.1,0.5]
#> 44  0.3688454509 0.18384952 0.633055360 (0.1,0.994]     (0,0.3] (0.6,0.982]
#> 45  0.1524447477 0.84272932 0.311869747 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 46  0.1388060634 0.23116178 0.724554346 (0.1,0.994]     (0,0.3] (0.6,0.982]
#> 47  0.2330340995 0.23909996 0.398939825 (0.1,0.994]     (0,0.3]   (0.1,0.5]
#> 48  0.4659624503 0.07669117 0.969356411 (0.1,0.994]     (0,0.3] (0.6,0.982]
#> 49  0.2659726404 0.24572368 0.967398371 (0.1,0.994]     (0,0.3] (0.6,0.982]
#> 50  0.8578277153 0.73213521 0.726702539 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 51  0.0458311667 0.84745317 0.257216746     (0,0.1] (0.4,0.986]   (0.1,0.5]
#> 52  0.4422000742 0.49752727 0.221787935 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 53  0.7989248456 0.38790903 0.593045652 (0.1,0.994]   (0.3,0.4]   (0.5,0.6]
#> 54  0.1218992600 0.24644899 0.267521432 (0.1,0.994]     (0,0.3]   (0.1,0.5]
#> 55  0.5609479838 0.11109646 0.531070399 (0.1,0.994]     (0,0.3]   (0.5,0.6]
#> 56  0.2065313896 0.38999444 0.785291671 (0.1,0.994]   (0.3,0.4] (0.6,0.982]
#> 57  0.1275316502 0.57193531 0.168060811 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 58  0.7533078643 0.21689276 0.404399181 (0.1,0.994]     (0,0.3]   (0.1,0.5]
#> 59  0.8950453592 0.44476800 0.471576278 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 60  0.3744627759 0.21799067 0.868106807 (0.1,0.994]     (0,0.3] (0.6,0.982]
#> 61  0.6651151946 0.50229956 0.925707956 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 62  0.0948406609 0.35390457 0.881977559     (0,0.1]   (0.3,0.4] (0.6,0.982]
#> 63  0.3839696378 0.64998516 0.674186843 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 64  0.2743836446 0.37471396 0.950166979 (0.1,0.994]   (0.3,0.4] (0.6,0.982]
#> 65  0.8146400389 0.35544538 0.516444894 (0.1,0.994]   (0.3,0.4]   (0.5,0.6]
#> 66  0.4485163414 0.53368795 0.576519021 (0.1,0.994] (0.4,0.986]   (0.5,0.6]
#> 67  0.8100643530 0.74033436 0.336331206 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 68  0.8123895095 0.22110294 0.347324631 (0.1,0.994]     (0,0.3]   (0.1,0.5]
#> 69  0.7943423211 0.41274612 0.020024301 (0.1,0.994] (0.4,0.986]     (0,0.1]
#> 70  0.4398316876 0.26568669 0.502813046 (0.1,0.994]     (0,0.3]   (0.5,0.6]
#> 71  0.7544751586 0.62997305 0.871043414 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 72  0.6292211316 0.18382849 0.006300784 (0.1,0.994]     (0,0.3]     (0,0.1]
#> 73  0.7101824014 0.86364411 0.072057124 (0.1,0.994] (0.4,0.986]     (0,0.1]
#> 74  0.0006247733 0.74656800 0.164211225     (0,0.1] (0.4,0.986]   (0.1,0.5]
#> 75  0.4753165741 0.66828465 0.770334074 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 76  0.2201188852 0.61801787 0.735184306 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 77  0.3798165377 0.37223806 0.971875636 (0.1,0.994]   (0.3,0.4] (0.6,0.982]
#> 78  0.6127710033 0.52983569 0.466472377 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 79  0.3517979092 0.87468234 0.074384513 (0.1,0.994] (0.4,0.986]     (0,0.1]
#> 80  0.1111354243 0.58175010 0.648818124 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 81  0.2436194727 0.83976776 0.758593170 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 82  0.6680555874 0.31244816 0.137106081 (0.1,0.994]   (0.3,0.4]   (0.1,0.5]
#> 83  0.4176467797 0.70829032 0.396584595 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 84  0.7881958340 0.26501781 0.224985329 (0.1,0.994]     (0,0.3]   (0.1,0.5]
#> 85  0.1028646443 0.59434319 0.057958561 (0.1,0.994] (0.4,0.986]     (0,0.1]
#> 86  0.4348927415 0.48128980 0.395892688 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 87  0.9849569800 0.26503273 0.064928300 (0.1,0.994]     (0,0.3]     (0,0.1]
#> 88  0.8930511144 0.56459043 0.225886433 (0.1,0.994] (0.4,0.986]   (0.1,0.5]
#> 89  0.8864690608 0.91318822 0.054629109 (0.1,0.994] (0.4,0.986]     (0,0.1]
#> 90  0.1750526503 0.90187439 0.670282040 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 91  0.1306956916 0.27416662 0.297741783 (0.1,0.994]     (0,0.3]   (0.1,0.5]
#> 92  0.6531019250 0.32148276 0.100721582 (0.1,0.994]   (0.3,0.4]   (0.1,0.5]
#> 93  0.3435164723 0.98564088 0.071904097 (0.1,0.994] (0.4,0.986]     (0,0.1]
#> 94  0.6567581280 0.61999331 0.880440569 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 95  0.3203732425 0.93731409 0.754247402 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 96  0.1876911193 0.46653270 0.816605888 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 97  0.7822943013 0.40683259 0.982140374 (0.1,0.994] (0.4,0.986] (0.6,0.982]
#> 98  0.0935949867 0.65923032 0.103599645     (0,0.1] (0.4,0.986]   (0.1,0.5]
#> 99  0.4667790416 0.15234662 0.099041829 (0.1,0.994]     (0,0.3]     (0,0.1]
#> 100 0.5115054599 0.57286706 0.798831611 (0.1,0.994] (0.4,0.986] (0.6,0.982]
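
On the for-loop question: a loop is not required outside dplyr either. As a minimal base R sketch (not something the answer above was tested against), Map() can walk the columns and the list of breaks in parallel, so no name lookup is needed inside the function. It assumes cutlist is named after the columns, as in the question, and reuses the question's ".cut" suffix for the new columns.

```r
set.seed(123)
df <- data.frame(V1 = runif(100), V2 = runif(100), V3 = runif(100))
cutlist <- list(V1 = 0.1, V2 = c(0.3, 0.4), V3 = c(0.1, 0.5, 0.6))

# Names for the new factor columns (suffix taken from the question)
out_cols <- paste0(names(cutlist), ".cut")

# Map() pairs each column with its own breaks; 0 and the column max are added
df[out_cols] <- Map(
  function(x, b) cut(x, breaks = c(0, b, max(x, na.rm = TRUE))),
  df[names(cutlist)],   # columns, in the same order as cutlist
  cutlist               # breaks for each column
)

# Number of levels per new column: one more than the number of model breaks
sapply(df[out_cols], nlevels)
```

The same Map() call can presumably sit on the right-hand side of := in data.table, with .SD as the first argument and .SDcols = names(cutlist), though that variant has not been run here.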
