Is it possible to update multiple datasets using lapply in R?

EDIT: I decided to not complicate things and just work with one dataset.

I might get downvoted, but I started learning R a week ago and have been searching the web for hours.

I am currently trying to update multiple datasets by adding a new column to each of them.

I did read the solution in "Adding new column to multiple data frames at the same time".

However, running

lapply(list(annual_2022_v2, bottom_2022_v2, q1_2022_v2, q2_2022_v2, q3_2022_v2, q4_2022_v2, top_2022_v2), transform, start_hour = hour(started_at))

only printed the correct output; it didn't update my original datasets or add the new column to them.

To test it on an individual dataset, I ran

lapply(list(q1_2022_v2), transform, start_hour = hour(started_at))

Although it did print the correct dataset with the new column, it didn’t update it.

I am looking for the "optimal" way to write some sort of loop, rather than hard-coding 8 different datasets, such as

q1_2022_v2$start_hour <- hour(q1_2022_v2$started_at)
q2_2022_v2$start_hour <- hour(q2_2022_v2$started_at)
q3_2022_v2$start_hour <- hour(q3_2022_v2$started_at)
q4_2022_v2$start_hour <- hour(q4_2022_v2$started_at)

I have also seen solutions using Map() and cbind(), but I am confused about how they work.

Thank you

Solution:

If you don't assign it, lapply's return value is lost. lapply is not a for loop; it is a functional that builds and returns a new list and never modifies its inputs. What you see printed is that return value.
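To make the difference concrete, here is a minimal sketch with a toy data frame. The data values are made up, and `as.integer(format(started_at, "%H"))` is used as a base-R stand-in for `hour()` so the snippet needs no packages; both are assumptions for illustration only.

```r
# Toy data frame standing in for one of the real datasets (hypothetical values)
q1 <- data.frame(
  started_at = as.POSIXct(c("2022-01-05 08:15:00", "2022-02-10 17:40:00"), tz = "UTC")
)

# Not assigned: the result is computed, printed, and then discarded
lapply(list(q1), transform, start_hour = as.integer(format(started_at, "%H")))
names(q1)  # q1 itself still has only the started_at column

# Assigned: the returned list holds the modified copy
res <- lapply(list(q1), transform, start_hour = as.integer(format(started_at, "%H")))
names(res[[1]])  # the copy inside res has both columns
```

The original `q1` is unchanged in both cases; only the object you assign the result to carries the new column.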

Start by putting these datasets into a list. I strongly suspect they all have the same structure, which means they should never have been separate, i.e. put them into the list when they are created/imported.

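library(lubridate)  # assumption: hour() comes from lubridate
all_2022_v2 <- mget(ls(pattern = glob2rx("*_2022_v2")))  # named list of every matching object

all_2022_v2 <- lapply(all_2022_v2, transform, start_hour = hour(started_at))  # assign the result back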
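If you really do need the individual variables updated as well, one option is to copy the list elements back with `list2env()`. The following is only a sketch on two toy data frames (made-up values), again with a base-R stand-in for `hour()`; it lists the dataset names explicitly instead of using a pattern.

```r
# Two toy datasets standing in for the real ones (hypothetical values)
q1_2022_v2 <- data.frame(started_at = as.POSIXct("2022-01-05 08:15:00", tz = "UTC"))
q2_2022_v2 <- data.frame(started_at = as.POSIXct("2022-04-12 13:05:00", tz = "UTC"))

# Collect the datasets by name into one named list
dataset_names <- c("q1_2022_v2", "q2_2022_v2")
all_2022_v2 <- mget(dataset_names)

# Add the column to every element and keep the result
all_2022_v2 <- lapply(all_2022_v2, transform,
                      start_hour = as.integer(format(started_at, "%H")))

# Work with the list directly ...
all_2022_v2$q1_2022_v2$start_hour

# ... or overwrite the original variables with the updated copies
list2env(all_2022_v2, envir = environment())
names(q1_2022_v2)
```

Naming the datasets explicitly avoids one pitfall of the pattern approach: the name `all_2022_v2` itself matches `*_2022_v2`, so re-running the `ls()` line after the list exists would pick up the list as well.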

You should probably rbind the four quarterly datasets into a single data frame and keep the quarter as a grouping column.
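A rough sketch of that idea, which also shows what Map() and cbind() do here: Map() walks over the data frames and their names in parallel, and cbind() attaches the name as a new column before everything is stacked with rbind(). The toy data and the column name `quarter` are assumptions for illustration.

```r
# Toy quarterly datasets (hypothetical values)
q1_2022_v2 <- data.frame(started_at = as.POSIXct("2022-01-05 08:15:00", tz = "UTC"))
q2_2022_v2 <- data.frame(started_at = as.POSIXct("2022-04-12 13:05:00", tz = "UTC"))

# Named list: the names become the values of the grouping column
quarters <- list(q1 = q1_2022_v2, q2 = q2_2022_v2)

# Map() calls the function once per (data frame, name) pair;
# cbind() adds the name as a constant column to that data frame
tagged <- Map(function(df, q) cbind(df, quarter = q), quarters, names(quarters))

# Stack all tagged data frames into one and drop the generated row names
combined <- do.call(rbind, tagged)
rownames(combined) <- NULL

# The new column now only has to be added once
combined$start_hour <- as.integer(format(combined$started_at, "%H"))
combined
```

With a single data frame, per-quarter work becomes a matter of subsetting or grouping on `quarter` rather than looping over separate objects.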
