Looping in a data frame

I’m pretty new to R and I would like to know why this loop doesn’t keep iterating through a data frame.

I have a dataframe filled with the following data (take note that it only has one column):

1,230.1,37.8,69.2,22.1 <-- this is treated as a column like "1,230.1,37.8,69.2,22.1"
2,44.5,39.3,45.1,10.4
3,17.2,45.9,69.3,9.3
4,151.5,41.3,58.5,18.5
5,180.8,10.8,58.4,12.9

For that, I’m creating an empty data frame as the target:

my.df <- data.frame(matrix(ncol = 5, nrow = 200))
colnames(my.df) <- c('i', 'a', 'b', 'c', 'label')

Then I’m trying to parse every string into the target data frame, but the loop doesn’t fill it completely, only the first row. I believe the problem is with the line temp <- strsplit(a, ",")[[1]]: I expected it to split one value per iteration and keep going, but the loop stops after the first iteration.

helper <- 1
for(a in DF){
  temp <- strsplit(a, ",")[[1]]
  print(temp)
  my.df[helper, 1] <- temp[1]
  my.df[helper, 2] <- temp[2]
  my.df[helper, 3] <- temp[3]
  my.df[helper, 4] <- temp[4]
  my.df[helper, 5] <- temp[5]
  helper <- helper + 1
}

At the end I only have this output:

     i    a    b          c   label
1    1 230.1  37.8      69.2  22.1
2 <NA>  <NA>  <NA>      <NA>  <NA>
3 <NA>  <NA>  <NA>      <NA>  <NA>
4 <NA>  <NA>  <NA>      <NA>  <NA>
5 <NA>  <NA>  <NA>      <NA>  <NA>
6 <NA>  <NA>  <NA>      <NA>  <NA>

Any ideas on how to accomplish this task, or can someone explain to me why the loop dies in the first iteration?
Thank you.

Solution:

Instead of splitting the string, we could directly parse it with read.csv

read.csv(text = DF, header = FALSE, 
     col.names = c('i', 'a', 'b', 'c', 'label'), fill = TRUE)

Output:

  i     a    b    c label
1 1 230.1 37.8 69.2  22.1
2 2  44.5 39.3 45.1  10.4
3 3  17.2 45.9 69.3   9.3
4 4 151.5 41.3 58.5  18.5
5 5 180.8 10.8 58.4  12.9

If it is a column in the dataset, extract the column and use read.csv

read.csv(text = DF1$col1, header = FALSE, 
     col.names = c('i', 'a', 'b', 'c', 'label'), fill = TRUE)
  i     a    b    c label
1 1 230.1 37.8 69.2  22.1
2 2  44.5 39.3 45.1  10.4
3 3  17.2 45.9 69.3   9.3
4 4 151.5 41.3 58.5  18.5
5 5 180.8 10.8 58.4  12.9
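One practical difference from the strsplit approach is that read.csv infers column types, so the parsed values can be used in arithmetic right away. The sketch below is only an illustration: lines_in, parsed, col_classes and total_a are hypothetical names, and the input is a shortened copy of the question's data without trailing spaces.

```r
# Hypothetical input: a character vector with one comma-separated record per element.
lines_in <- c("1,230.1,37.8,69.2,22.1",
              "2,44.5,39.3,45.1,10.4")

# read.csv() reads each element of `text` as one line of CSV input.
parsed <- read.csv(text = lines_in, header = FALSE,
                   col.names = c('i', 'a', 'b', 'c', 'label'))

# Inspect the inferred type of every column.
col_classes <- sapply(parsed, class)
print(col_classes)

# Numeric columns can be summed directly, with no as.numeric() step.
total_a <- sum(parsed$a)
print(total_a)
```

Here i comes back as integer and the remaining columns as numeric, whereas values produced by strsplit stay character until they are converted.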

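Regarding the for loop issue: in the question, DF is a data frame, and for(a in DF) iterates over its columns, not its rows. With a single column the body runs only once, a holds the whole column, and strsplit(a, ",")[[1]] keeps only the pieces of the first string, so helper <- helper + 1 never has any effect on the iteration. Instead, loop over the row indices: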

for(i in seq_along(DF1[[1]])) my.df[i,] <- strsplit(DF1[[1]][i], ",")[[1]]
> head(my.df)
     i     a    b    c label
1    1 230.1 37.8 69.2 22.1 
2    2  44.5 39.3 45.1  10.4
3    3  17.2 45.9 69.3   9.3
4    4 151.5 41.3 58.5  18.5
5    5 180.8 10.8 58.4  12.9
6 <NA>  <NA> <NA> <NA>  <NA>
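To check the behaviour behind the original loop, the following sketch counts how often each style of loop runs. It assumes a hypothetical one-column data frame df_in that mirrors the question's input; the counter and helper names are illustrative only.

```r
# Hypothetical one-column data frame mirroring the question's input.
df_in <- data.frame(col1 = c("1,230.1,37.8,69.2,22.1",
                             "2,44.5,39.3,45.1,10.4",
                             "3,17.2,45.9,69.3,9.3"),
                    stringsAsFactors = FALSE)

# for() over a data frame visits its columns, so this body runs once.
n_col_iter <- 0
for (a in df_in) {
  n_col_iter <- n_col_iter + 1
  col_len <- length(a)          # `a` is the whole column, not a single string
  pieces <- strsplit(a, ",")    # list with one character vector per row
  first_row <- pieces[[1]]      # [[1]] keeps only the first row's fields
}

# Looping over row indices visits every row instead.
n_row_iter <- 0
for (i in seq_len(nrow(df_in))) {
  n_row_iter <- n_row_iter + 1
}

print(c(columns = n_col_iter, rows = n_row_iter))
```

The column loop runs once even though strsplit has already split all three strings; the other rows are simply discarded by the [[1]] index.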

data

DF <- "1,230.1,37.8,69.2,22.1 
2,44.5,39.3,45.1,10.4
3,17.2,45.9,69.3,9.3
4,151.5,41.3,58.5,18.5
5,180.8,10.8,58.4,12.9"
DF1 <- structure(list(col1 = c("1,230.1,37.8,69.2,22.1 ", "2,44.5,39.3,45.1,10.4", 
"3,17.2,45.9,69.3,9.3", "4,151.5,41.3,58.5,18.5", "5,180.8,10.8,58.4,12.9"
)), class = "data.frame", row.names = c(NA, -5L))