Split dataframe into multiple by every nth column

I would prefer a tidyverse solution.
The question is related to this post.

Example data

df <- structure(list(A = c(79L, 42L, 74L, 49L, 82L, 22L, 88L, 13L, 
54L, 68L), B = c(41L, 22L, 1L, 40L, 96L, 48L, 40L, 56L, 19L, 
84L), C = c(20L, 10L, 1L, 27L, 34L, 27L, 35L, 3L, 78L, 36L), 
    D = c(40L, 92L, 76L, 81L, 73L, 30L, 10L, 57L, 19L, 18L), 
    G = c(50L, 74L, 37L, 60L, 23L, 42L, 22L, 94L, 28L, 68L), 
    H = c(54L, 62L, 92L, 61L, 91L, 76L, 51L, 60L, 89L, 36L), 
    J = c(64L, 59L, 1L, 99L, 36L, 26L, 15L, 16L, 83L, 39L), K = c(29L, 
    30L, 80L, 33L, 44L, 28L, 9L, 53L, 11L, 68L), L = c(42L, 29L, 
    10L, 75L, 24L, 68L, 56L, 77L, 23L, 92L), M = c(56L, 27L, 
    61L, 40L, 76L, 50L, 31L, 15L, 72L, 40L), N = c(45L, 33L, 
    37L, 32L, 5L, 20L, 45L, 38L, 25L, 32L), Z = c(52L, 88L, 74L, 
    91L, 86L, 43L, 4L, 6L, 61L, 69L), X = c(58L, 92L, 19L, 99L, 
    9L, 58L, 53L, 49L, 48L, 32L), Y = c(75L, 13L, 63L, 37L, 30L, 
    98L, 98L, 94L, 38L, 25L), S = c(99L, 64L, 27L, 30L, 100L, 
    40L, 76L, 2L, 10L, 57L), P = c(16L, 76L, 69L, 64L, 68L, 34L, 
    96L, 22L, 48L, 1L)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -10L))

# A tibble: 10 x 16
       A     B     C     D     G     H     J     K     L     M     N
   <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
 1    79    41    20    40    50    54    64    29    42    56    45
 2    42    22    10    92    74    62    59    30    29    27    33
 3    74     1     1    76    37    92     1    80    10    61    37
 4    49    40    27    81    60    61    99    33    75    40    32
 5    82    96    34    73    23    91    36    44    24    76     5
 6    22    48    27    30    42    76    26    28    68    50    20
 7    88    40    35    10    22    51    15     9    56    31    45
 8    13    56     3    57    94    60    16    53    77    15    38
 9    54    19    78    19    28    89    83    11    23    72    25
10    68    84    36    18    68    36    39    68    92    40    32
# ... with 5 more variables: Z <int>, X <int>, Y <int>, S <int>,
#   P <int>

Split the data frame into a list of smaller data frames of 4 columns each, so that one can loop or map over the list and export each piece individually.
Desired output:

[[1]]
# A tibble: 10 x 4
       A     B     C     D
   <int> <int> <int> <int>
 1    79    41    20    40
 2    42    22    10    92
 3    74     1     1    76
 4    49    40    27    81
 5    82    96    34    73
 6    22    48    27    30
 7    88    40    35    10
 8    13    56     3    57
 9    54    19    78    19
10    68    84    36    18

[[2]]
# A tibble: 10 x 4
       G     H     J     K
   <int> <int> <int> <int>
 1    50    54    64    29
 2    74    62    59    30
 3    37    92     1    80
 4    60    61    99    33
 5    23    91    36    44
 6    42    76    26    28
 7    22    51    15     9
 8    94    60    16    53
 9    28    89    83    11
10    68    36    39    68

And so on...

The intention is to export each of them to its own CSV file.

Solution:

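This is a straightforward one-liner in base R. Note that it assumes the number of columns is a multiple of 4; any leftover columns at the end would be dropped: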

lapply(seq(ncol(df)/4) - 1, function(x) df[4 * x + 1:4])
#> [[1]]
#> # A tibble: 10 x 4
#>        A     B     C     D
#>    <int> <int> <int> <int>
#>  1    79    41    20    40
#>  2    42    22    10    92
#>  3    74     1     1    76
#>  4    49    40    27    81
#>  5    82    96    34    73
#>  6    22    48    27    30
#>  7    88    40    35    10
#>  8    13    56     3    57
#>  9    54    19    78    19
#> 10    68    84    36    18
#> 
#> [[2]]
#> # A tibble: 10 x 4
#>        G     H     J     K
#>    <int> <int> <int> <int>
#>  1    50    54    64    29
#>  2    74    62    59    30
#>  3    37    92     1    80
#>  4    60    61    99    33
#>  5    23    91    36    44
#>  6    42    76    26    28
#>  7    22    51    15     9
#>  8    94    60    16    53
#>  9    28    89    83    11
#> 10    68    36    39    68
#> 
#> [[3]]
#> # A tibble: 10 x 4
#>        L     M     N     Z
#>    <int> <int> <int> <int>
#>  1    42    56    45    52
#>  2    29    27    33    88
#>  3    10    61    37    74
#>  4    75    40    32    91
#>  5    24    76     5    86
#>  6    68    50    20    43
#>  7    56    31    45     4
#>  8    77    15    38     6
#>  9    23    72    25    61
#> 10    92    40    32    69
#> 
#> [[4]]
#> # A tibble: 10 x 4
#>        X     Y     S     P
#>    <int> <int> <int> <int>
#>  1    58    75    99    16
#>  2    92    13    64    76
#>  3    19    63    27    69
#>  4    99    37    30    64
#>  5     9    30   100    68
#>  6    58    98    40    34
#>  7    53    98    76    96
#>  8    49    94     2    22
#>  9    48    38    10    48
#> 10    32    25    57     1

Though if for some reason you need a tidyverse solution, the equivalent would be:

purrr::map(seq(ncol(df)/4) - 1, ~ df[4 * .x + 1:4])
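If the number of columns might not divide evenly by 4, one possible variant is to build a group index per column and let split.default() do the chunking. This is only a sketch: split_cols and the toy data frame are hypothetical names made up for illustration, not part of the original answer.

```r
# Hypothetical helper (not from the original answer): split a data frame
# into chunks of `n` columns, keeping a shorter final chunk if needed.
split_cols <- function(df, n = 4) {
  # Group index for each column: 1, 1, 1, 1, 2, 2, 2, 2, ...
  groups <- ceiling(seq_along(df) / n)
  # split.default() treats the data frame as a list of columns and
  # subsets it with `[`, so each piece is itself a data frame
  unname(split.default(df, groups))
}

# Small stand-in data frame with 6 columns (not a multiple of 4)
toy <- data.frame(a = 1:3, b = 4:6, c = 7:9, d = 10:12, e = 13:15, f = 16:18)
chunks <- split_cols(toy, 4)

# Number of columns in each chunk
lengths(chunks)
```

With the 16-column example data this should give the same four pieces as the one-liners above; the difference only shows up when there are leftover columns, which end up in a final, narrower data frame instead of being dropped.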

Created on 2022-06-29 by the reprex package (v2.0.1)
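Since the stated goal is one CSV per piece, the export step could look roughly like the following. This is a minimal sketch in base R: the `pieces` list is a stand-in for the list returned by lapply() or purrr::map(), and the output directory and the part_1.csv, part_2.csv naming scheme are assumptions chosen for illustration.

```r
# Stand-in list of data frames; in practice this would be the list
# produced by the lapply() / purrr::map() call
pieces <- list(
  data.frame(A = 1:2, B = 3:4),
  data.frame(G = 5:6, H = 7:8)
)

# Assumed output location and file naming: part_1.csv, part_2.csv, ...
out_dir <- tempdir()
paths <- file.path(out_dir, sprintf("part_%d.csv", seq_along(pieces)))

# Write each piece to its own file; invisible() hides the list of NULLs
# that Map() would otherwise print
invisible(Map(
  function(piece, path) write.csv(piece, path, row.names = FALSE),
  pieces, paths
))
```

For a tidyverse flavour, purrr::walk2(pieces, paths, readr::write_csv) should do the same job, assuming purrr and readr are installed.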
