Looping in a data frame

September 18, 2022

I’m pretty new to R and I would like to know why my loop stops after the first iteration instead of going through the whole data frame.

I have a dataframe filled with the following data (take note that it only has one column):

1,230.1,37.8,69.2,22.1 <-- this is treated as a column like "1,230.1,37.8,69.2,22.1"
2,44.5,39.3,45.1,10.4
3,17.2,45.9,69.3,9.3
4,151.5,41.3,58.5,18.5
5,180.8,10.8,58.4,12.9

For that, I’m creating an empty data frame as the target:

my.df <- data.frame(matrix(ncol = 5, nrow = 200))
colnames(my.df) <- c('i', 'a', 'b', 'c', 'label')

I’m trying to parse every string into the target data frame, but the loop doesn’t fill it completely, only the first row. I believe the problem is with the line temp <- strsplit(a, ",")[[1]]: I expected it to split one string per iteration and keep going, but the loop stops after the first iteration.

helper <- 1
for(a in DF){
  temp <- strsplit(a, ",")[[1]]
  print(temp)
  my.df[helper, 1] <- temp[1]
  my.df[helper, 2] <- temp[2]
  my.df[helper, 3] <- temp[3]
  my.df[helper, 4] <- temp[4]
  my.df[helper, 5] <- temp[5]
  helper <- helper + 1
}

At the end I only have this output:

     i    a    b          c   label
1    1 230.1  37.8      69.2  22.1
2 <NA>  <NA>  <NA>      <NA>  <NA>
3 <NA>  <NA>  <NA>      <NA>  <NA>
4 <NA>  <NA>  <NA>      <NA>  <NA>
5 <NA>  <NA>  <NA>      <NA>  <NA>
6 <NA>  <NA>  <NA>      <NA>  <NA>

Any ideas on how to accomplish this task, or can someone explain why the loop dies after the first iteration?
Thank you.

>Solution :

Instead of splitting the string, we could directly parse it with read.csv

read.csv(text = DF, header = FALSE, 
     col.names = c('i', 'a', 'b', 'c', 'label'), fill = TRUE)

-output

  i     a    b    c label
1 1 230.1 37.8 69.2  22.1
2 2  44.5 39.3 45.1  10.4
3 3  17.2 45.9 69.3   9.3
4 4 151.5 41.3 58.5  18.5
5 5 180.8 10.8 58.4  12.9

If it is a column in the dataset, extract the column and use read.csv

read.csv(text = DF1$col1, header = FALSE, 
     col.names = c('i', 'a', 'b', 'c', 'label'), fill = TRUE)
  i     a    b    c label
1 1 230.1 37.8 69.2  22.1
2 2  44.5 39.3 45.1  10.4
3 3  17.2 45.9 69.3   9.3
4 4 151.5 41.3 58.5  18.5
5 5 180.8 10.8 58.4  12.9
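One side effect of the read.csv route is that the columns come back as numbers rather than strings. The sketch below checks this on a two-row stand-in for DF1 (the name `parsed` is only illustrative and not part of the original answer):

```r
# Two-row stand-in for DF1 from the data section (note the trailing space in row 1)
DF1 <- data.frame(col1 = c("1,230.1,37.8,69.2,22.1 ", "2,44.5,39.3,45.1,10.4"),
                  stringsAsFactors = FALSE)

# Each element of the character vector is read as one line of CSV text
parsed <- read.csv(text = DF1$col1, header = FALSE,
                   col.names = c('i', 'a', 'b', 'c', 'label'))

# Inspect the class of every column
sapply(parsed, class)
```

By contrast, values produced by strsplit stay character strings, so a data frame filled that way would still need a conversion step before doing arithmetic on it.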

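Regarding the for loop issue: for(a in DF) iterates over the columns of a data frame, not its rows. With a single column the body runs only once, a holds the whole column, and strsplit(a, ",")[[1]] keeps only the pieces of the first string, so helper <- helper + 1 doesn’t have any effect on the iteration. Instead, loop over the row indices: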

for(i in seq_along(DF1[[1]])) my.df[i,] <- strsplit(DF1[[1]][i], ",")[[1]]
> head(my.df)
     i     a    b    c label
1    1 230.1 37.8 69.2 22.1 
2    2  44.5 39.3 45.1  10.4
3    3  17.2 45.9 69.3   9.3
4    4 151.5 41.3 58.5  18.5
5    5 180.8 10.8 58.4  12.9
6 <NA>  <NA> <NA> <NA>  <NA>
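To see why the original loop filled only one row, here is a minimal check on a three-row stand-in for DF1. The counter `n_iter` and the variable `first_only` are illustrative names, not part of the original code:

```r
# Three-row stand-in with the same shape as DF1: a single character column
DF1 <- data.frame(col1 = c("1,230.1,37.8,69.2,22.1",
                           "2,44.5,39.3,45.1,10.4",
                           "3,17.2,45.9,69.3,9.3"),
                  stringsAsFactors = FALSE)

n_iter <- 0
for (a in DF1) {
  # A data frame is a list of columns, so each pass hands over one whole column
  n_iter <- n_iter + 1
  # strsplit returns one list element per string; [[1]] keeps only the first
  first_only <- strsplit(a, ",")[[1]]
}

n_iter      # one pass per column
length(a)   # a is the full column, not a single row
first_only  # pieces of the first string only
```

With one column the loop body runs once, which matches the single filled row in the question's output.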

data

DF <- "1,230.1,37.8,69.2,22.1 
2,44.5,39.3,45.1,10.4
3,17.2,45.9,69.3,9.3
4,151.5,41.3,58.5,18.5
5,180.8,10.8,58.4,12.9"
DF1 <- structure(list(col1 = c("1,230.1,37.8,69.2,22.1 ", "2,44.5,39.3,45.1,10.4", 
"3,17.2,45.9,69.3,9.3", "4,151.5,41.3,58.5,18.5", "5,180.8,10.8,58.4,12.9"
)), class = "data.frame", row.names = c(NA, -5L))