Say I have the list of dataframes:
#Example data frame columns
Image <- c("001", "001", "001", "001", "001", "001", "001", "001", "001", "001", "001", "001")
Size <- c("Big", "Small", "Medium", "Tiny", "Big", "Small", "Medium", "Tiny", "Big", "Small", "Medium", "Tiny")
n <- c(111778, 56, 7099, 3, 3682081, 88, 9078, 7, 198346, 422, 30077, 8)
#make example data frame
data <- data.frame(Image, Size, n)
#Split dataframe into a list of dataframes
df <- split(data, f = data$Image)
df
How could I add an empty column (named new) to each of the dataframes contained in this list?
I have tried
df$new <- NA
But nothing seems to happen, and no error is raised.
There are much more complicated answers to this question, based on specific conditions etc., but I get lost trying to simplify them!
Solution:
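Two problems:
- Your `n` has an extra comma (`,,`); remove it.
- Your `df` is a `list`, not a frame, so you cannot just reference a column in it with the `$` special operator. On a list, `df$new <- NA` silently adds a list element named `new` instead of a column, which is why you see no error. You can either add the `new` column before splitting it, or add it specially to each split frame.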
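To see why the original attempt raises no error, it may help to inspect the list before and after the assignment. This is a minimal sketch on a shortened copy of the question's data (only four rows, chosen here for brevity):

```r
# Shortened version of the example data (assumption: 4 rows are enough to illustrate)
data <- data.frame(
  Image = rep("001", 4),
  Size  = c("Big", "Small", "Medium", "Tiny"),
  n     = c(111778, 56, 7099, 3)
)
df <- split(data, f = data$Image)

class(df)   # "list"
names(df)   # "001"

# `$<-` on a list creates a new list element; it does not touch the frames inside
df$new <- NA
names(df)            # "001" "new"
names(df[["001"]])   # "Image" "Size" "n"
```

So the assignment does succeed, but it lands at the list level, next to the `001` frame, rather than inside it.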
First option:
data$new <- NA
df <- split(data, f = data$Image)
df
# $`001`
#    Image   Size       n new
# 1    001    Big  111778  NA
# 2    001  Small      56  NA
# 3    001 Medium    7099  NA
# 4    001   Tiny       3  NA
# 5    001    Big 3682081  NA
# 6    001  Small      88  NA
# 7    001 Medium    9078  NA
# 8    001   Tiny       7  NA
# 9    001    Big  198346  NA
# 10   001  Small     422  NA
# 11   001 Medium   30077  NA
# 12   001   Tiny       8  NA
Second option, add to each frame in the list:
### original data, without `new`
df <- split(data, f = data$Image)
df <- lapply(df, transform, new = NA)
df
# $`001`
#    Image   Size       n new
# 1    001    Big  111778  NA
# 2    001  Small      56  NA
# 3    001 Medium    7099  NA
# 4    001   Tiny       3  NA
# 5    001    Big 3682081  NA
# 6    001  Small      88  NA
# 7    001 Medium    9078  NA
# 8    001   Tiny       7  NA
# 9    001    Big  198346  NA
# 10   001  Small     422  NA
# 11   001 Medium   30077  NA
# 12   001   Tiny       8  NA
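If you later need something more involved than a constant column, the second option can also be written with an explicit anonymous function. The sketch below is only an illustration: the second image `"002"` and the result name `df2` are made up for this example, and `NA_real_` is used on the assumption that the column will eventually hold numbers (plain `NA` gives a logical column).

```r
# Hypothetical data with two images, so the list holds more than one frame
data <- data.frame(
  Image = c("001", "001", "002", "002"),
  Size  = c("Big", "Small", "Big", "Small"),
  n     = c(111778, 56, 3682081, 88)
)
df <- split(data, f = data$Image)

# Modify a copy of each frame inside the function and return it
df2 <- lapply(df, function(x) {
  x$new <- NA_real_   # typed NA; plain NA would create a logical column
  x
})

# Check that every frame in the list now has the column
sapply(df2, function(x) "new" %in% names(x))
```

Because `lapply` returns a new list, the original `df` is left untouched unless you assign the result back to it.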
Data
data <- structure(list(Image = c("001", "001", "001", "001", "001", "001", "001", "001", "001", "001", "001", "001"), Size = c("Big", "Small", "Medium", "Tiny", "Big", "Small", "Medium", "Tiny", "Big", "Small", "Medium", "Tiny"), n = c(111778, 56, 7099, 3, 3682081, 88, 9078, 7, 198346, 422, 30077, 8)), row.names = c(NA, -12L), class = "data.frame")