I'm trying to create several data frames from one, based on category, in R. I'm using a loop, but only the last iterations are kept

I'm trying to divide my df (of companies) into new data frames based on sector.
Say I have three sectors: Industry, IT, Healthcare. I have created a column containing the sectors as such:
df[, ncol(df) + 1] <- x$sector

Then I tried looping the df to get new frames as such:

IT = NULL
Industry = NULL
Health = NULL 

for (i in 1:nrow(df)) {
  if (df[i, ncol(df)] == "IT") {
      IT<- df[i, ]
  }
  else if (df[i, ncol(df)] == "Industry") {
      Industry <- df[i, ]
  }
  else if (df[i, ncol(df)] == "Healthcare") {
        Health <- df[i, ]
  }
}

However, the new data frames (IT, Industry and Health) each contain only the last matching row, so each one holds a single observation. What am I missing here? Is it possible to achieve this with a loop, or should I be using another method?

>Solution :

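You need to accumulate the results across the loop. An assignment like IT <- df[i, ] overwrites whatever IT held before, so only the last matching row survives; rbind() appends the row instead. Something like this would be in keeping with your original code: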

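# Start with empty placeholders; rbind() ignores NULL arguments
IT <- NULL
Industry <- NULL
Health <- NULL

for (i in seq_len(nrow(df))) {
  sector <- df[i, ncol(df)]  # the sector is stored in the last column
  if (sector == "IT") {
    IT <- rbind(IT, df[i, ])  # append the row instead of overwriting
  } else if (sector == "Industry") {
    Industry <- rbind(Industry, df[i, ])
  } else if (sector == "Healthcare") {
    Health <- rbind(Health, df[i, ])
  }
}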

That said, you could accomplish this much more easily with:

dats <- split(df, x$sector)
dat1 <- dats[[1]]
dat2 <- dats[[2]]
dat3 <- dats[[3]]

You should be able to identify which category belongs to which dataset by looking at names(dats).
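
As a rough illustration of the split() approach, here is a minimal sketch on made-up data. The company and sector columns and their values are assumptions for the example, not your actual data. It shows that the list returned by split() is named after the sector values, so each piece can be pulled out by name instead of by position:

```r
# Toy data; the column names and values are invented for illustration.
df <- data.frame(
  company = c("A", "B", "C", "D", "E"),
  sector  = c("IT", "Industry", "Healthcare", "IT", "Industry"),
  stringsAsFactors = FALSE
)

# split() returns a named list with one data frame per sector.
dats <- split(df, df$sector)

# The list names are the sector labels.
names(dats)

# Extract each data frame by name rather than by position.
IT       <- dats[["IT"]]
Industry <- dats[["Industry"]]
Health   <- dats[["Healthcare"]]
```

Indexing by name avoids relying on the order of the list elements, which follows the sorted factor levels and may not match the order you expect.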
