Select every other nth row of data frame and add to a list of data frames in R

March 10, 2022

I have the data frame below and want to build a list of 5 data frames, each containing every 5th row of the original df. Is there a way to select every 5th row, starting from each of the first five rows in turn, and add each selection to a list as a new data frame, using either a for loop or lapply?

df
X1 X2 X3       X4 X5
 1  0  0 1.501990  0
 2  0  0 1.883904  0
 3  0  0 1.333195  0
 4  0  0 0.000000  0
 5  0  0 2.136760  0
 6  0  0 2.186790  0
 7  0  0 1.269592  0
 8  0  0 1.458405  0
 9  0  0 1.816493  0
10  0  0 0.000000  0
11  0  0 2.190029  0
12  0  0 0.000000  0
13  0  0 1.460534  0
14  0  0 1.470776  0
15  0  0 1.675406  0
16  0  0 1.842470  0
17  0  0 1.937999  0
18  0  0 0.000000  0
19  0  0 1.649926  0
20  0  0 2.067902  0

For example, the first data frame would consist of the 1st, 6th, 11th, and 16th rows, the next would start with the 2nd row, and so on down the rows of df.

Solution:

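Use split with 1:5 as the grouping vector. R recycles 1:5 down the rows, so rows 1, 6, 11, and 16 fall into group 1, rows 2, 7, 12, and 17 into group 2, and so on. The result is a list of 5 data frames, each holding every 5th row.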

split(df, 1:5)

output

$`1`
   X1 X2 X3       X4 X5
1   1  0  0 1.501990  0
6   6  0  0 2.186790  0
11 11  0  0 2.190029  0
16 16  0  0 1.842470  0

$`2`
   X1 X2 X3       X4 X5
2   2  0  0 1.883904  0
7   7  0  0 1.269592  0
12 12  0  0 0.000000  0
17 17  0  0 1.937999  0

$`3`
   X1 X2 X3       X4 X5
3   3  0  0 1.333195  0
8   8  0  0 1.458405  0
13 13  0  0 1.460534  0
18 18  0  0 0.000000  0

$`4`
   X1 X2 X3       X4 X5
4   4  0  0 0.000000  0
9   9  0  0 1.816493  0
14 14  0  0 1.470776  0
19 19  0  0 1.649926  0

$`5`
   X1 X2 X3       X4 X5
5   5  0  0 2.136760  0
10 10  0  0 0.000000  0
15 15  0  0 1.675406  0
20 20  0  0 2.067902  0
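
The question also asks about lapply. A minimal sketch of that route is below, assuming a small stand-in data frame (the columns here are made up for illustration; the real df is given under "data") and a hypothetical step variable n. It builds the row indices explicitly with seq, which may be easier to adapt than split if the offsets need to change. It assumes df has at least n rows.

```r
# Stand-in data frame with 20 rows (hypothetical; not the df from the question)
df <- data.frame(X1 = 1:20, X4 = seq(0.5, 10, by = 0.5))

n <- 5  # step between selected rows (assumed name)

# For each starting offset i, take rows i, i + n, i + 2n, ...
parts <- lapply(seq_len(n), function(i) {
  df[seq(i, nrow(df), by = n), , drop = FALSE]
})
names(parts) <- seq_len(n)

parts[[1]]$X1  # row positions 1, 6, 11, 16
```

The same body works inside a for loop by assigning to parts[[i]] on each pass; lapply just avoids pre-allocating the list.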

An alternative with dplyr::group_split is:

dplyr::group_split(df, rep(1:5, length.out = nrow(df)), .keep = FALSE)
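
Both approaches rely on the row count being a multiple of 5. When it is not, one option is to build the grouping vector explicitly with rep_len so it always matches nrow. The sketch below uses a hypothetical 22-row data frame (df22 and its id column are assumptions, not part of the question) to show how the leftover rows are distributed.

```r
# Hypothetical 22-row data frame, so the row count is not a multiple of 5
df22 <- data.frame(id = 1:22)

# rep_len() recycles 1:5 to exactly nrow(df22) labels, cutting the last cycle short
groups <- rep_len(1:5, nrow(df22))
parts22 <- split(df22, groups)

# Number of rows in each piece: the first two groups get one extra row
sapply(parts22, nrow)
```

Here groups 1 and 2 end up with 5 rows each and groups 3 to 5 with 4 rows each, since rows 21 and 22 start a sixth, incomplete cycle.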

data

df <- structure(list(X1 = 1:20, X2 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), X3 = c(0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L), X4 = c(1.50199, 1.883904, 1.333195, 0, 2.13676, 
2.18679, 1.269592, 1.458405, 1.816493, 0, 2.190029, 0, 1.460534, 
1.470776, 1.675406, 1.84247, 1.937999, 0, 1.649926, 2.067902), 
    X5 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L)), class = "data.frame", row.names = c(NA, 
-20L))