Is it possible to update multiple datasets using lapply in R?

EDIT: I decided to not complicate things and just work with one dataset.

I might get downvoted, but I started learning R a week ago and have been searching the web for hours.

I am currently trying to update multiple datasets by adding a new column to each of them.

I did read the solution in "Adding new column to multiple data frames at the same time".

However, running

lapply(list(annual_2022_v2, bottom_2022_v2, q1_2022_v2, q2_2022_v2, q3_2022_v2, q4_2022_v2, top_2022_v2), transform, start_hour = hour(started_at))

only printed the correct output; it didn't update my original datasets or add the new column to them.

To test it on an individual dataset, I ran

lapply(list(q1_2022_v2), transform, start_hour = hour(started_at))

Although it did print the correct dataset with the new column, it didn’t update it.

I am trying to figure out the "optimal" way to be able to write some sort of loop, rather than hard-coding 8 different datasets, such as

q1_2022_v2$start_hour <- hour(q1_2022_v2$started_at)
q2_2022_v2$start_hour <- hour(q2_2022_v2$started_at)
q3_2022_v2$start_hour <- hour(q3_2022_v2$started_at)
q4_2022_v2$start_hour <- hour(q4_2022_v2$started_at)

I have also seen solutions using Map() and cbind(), but I am confused about how they work.

Thank you

Solution:

If you don't assign it, lapply's return value is lost. lapply is not a for loop; it is a functional: it calls the function on each element of the list and returns a new list, leaving its inputs unchanged. What you see printed is that return value.
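To make the difference concrete, here is a minimal sketch on toy data. The data frame q1_demo and its values are made up for illustration, and base R's as.POSIXlt(...)$hour stands in for hour() so the snippet needs no packages.

```r
# Toy data frame standing in for one of the real datasets (hypothetical values).
q1_demo <- data.frame(
  started_at = as.POSIXct(c("2022-01-01 08:15:00", "2022-01-01 17:40:00"), tz = "UTC")
)

# Not assigned: the result is printed at the console and then discarded.
lapply(list(q1_demo), transform,
       start_hour = as.POSIXlt(started_at, tz = "UTC")$hour)
"start_hour" %in% names(q1_demo)   # the original is unchanged

# Assigned: the modified copies are kept in `result`.
result <- lapply(list(q1 = q1_demo), transform,
                 start_hour = as.POSIXlt(started_at, tz = "UTC")$hour)
result$q1$start_hour
```

The first call behaves exactly like the one in the question: the new column exists only in the printed return value.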

Start by putting these datasets into a list. I strongly suspect they all have the same structure, which means they should never have been separate objects; ideally, put them into a list when they are created or imported.

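# Collect every object whose name ends in "_2022_v2" into a named list
all_2022_v2 <- mget(ls(pattern = glob2rx("*_2022_v2")))

# hour() is assumed to come from lubridate; assigning the result keeps the new column
all_2022_v2 <- lapply(all_2022_v2, transform, start_hour = hour(started_at))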
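After this, the updated data frames live in the list, not in the original variables. A rough sketch of how you might use them follows, again on made-up toy objects (q1_demo_v2 and q2_demo_v2 are hypothetical names) and with a base R stand-in for hour(). Copying back with list2env() is optional and only needed if you insist on keeping the separate objects.

```r
# Two toy data frames standing in for the real datasets (hypothetical names and values).
q1_demo_v2 <- data.frame(started_at = as.POSIXct("2022-01-01 08:15:00", tz = "UTC"))
q2_demo_v2 <- data.frame(started_at = as.POSIXct("2022-04-01 17:40:00", tz = "UTC"))

# Gather them into a named list and add the column to every element.
all_demo <- mget(c("q1_demo_v2", "q2_demo_v2"))
all_demo <- lapply(all_demo, transform,
                   start_hour = as.POSIXlt(started_at, tz = "UTC")$hour)

# Work with the list directly ...
all_demo$q1_demo_v2$start_hour

# ... or, if the separate objects must be updated, copy the elements back
# over the variables of the same name in the current environment.
invisible(list2env(all_demo, envir = environment()))
q2_demo_v2$start_hour
```

Keeping everything in the list is usually the more convenient choice, since any later step can be applied to all datasets with another lapply call.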

You should probably rbind the four quarterly datasets into a single data frame and add a column that identifies the quarter, to use as a grouping variable.
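One possible way to do that is sketched below, assuming the quarterly data frames share the same columns. The list `quarters`, the column name `quarter`, and the values are illustrative only. It also shows what Map() does: it calls the function once per pair of arguments, here a data frame and its name.

```r
# Toy quarterly data frames in a named list (hypothetical values).
quarters <- list(
  q1 = data.frame(started_at = as.POSIXct("2022-02-03 09:00:00", tz = "UTC")),
  q2 = data.frame(started_at = as.POSIXct("2022-05-10 18:30:00", tz = "UTC"))
)

# Map() pairs each data frame with its name and returns a list of results.
tagged <- Map(function(d, q) {
  d$quarter <- q   # record which quarter each row came from
  d
}, quarters, names(quarters))

# Stack everything into one data frame and drop the generated row names.
combined <- do.call(rbind, tagged)
rownames(combined) <- NULL

# The new column now only has to be added once (base R stand-in for hour()).
combined$start_hour <- as.POSIXlt(combined$started_at, tz = "UTC")$hour
combined
```

With a single data frame, per-quarter summaries become a matter of grouping by the quarter column instead of looping over separate objects.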
