How to subtract columns from two datasets by column name

Suppose you have two data frames with the same columns and the same number of rows; only the order in which the columns appear is different.

Dataset a contains the predicted values from my model, while dataset b contains the real values of the same variables.

I want a new dataset containing a$i - b$i for each column i, for example FYFF from dataset a minus FYFF from dataset b (a$FYFF - b$FYFF).

I don’t know how to write a loop that matches the column names and subtracts them, or whether I need one at all.

Thanks in advance!

Data:

> dput(a)
structure(list(FYFF = c(5.62481291704216, 5.77021269533357, 5.80660266666805, 
5.89556030216938, 5.81687106929874, 5.89645562124814, 5.88639911374851, 
5.90687872475339, 5.95506281594889, 6.05004047596607, 6.11439503144994, 
6.2045773479442), IP = c(0.00550691992815247, 0.00592967603768478, 
0.00496743469475157, 0.00439395197656857, 0.00436417085033269, 
0.00368796833846484, 0.00375828785751239, 0.00379577545756551, 
0.00347980689447873, 0.00416191362799741, 0.00400028831069191, 
0.0039837438592708), PUNEW = c(0.00248906763444025, 0.00289206479346909, 
0.00356897184657621, 0.00315713460136047, 0.00374885320757934, 
0.00320757113077844, 0.00308236691113797, 0.00322111379093545, 
0.00330962741169567, 0.00332405808527479, 0.00345482092419552, 
0.00361550086806829)), class = "data.frame", row.names = c(NA, 
-12L))
> dput(b)
structure(list(IP = c(-0.0019063187, -0.0010909588, 0.0055955858, 
0.0050583338, 0.0041930195, -0.0029113572, -0.0058143629, 0.01612572, 
0.0074449866, 0.0042460103, 0.011474407, 0.021971466), PUNEW = c(0, 
0.0025031302, 0.0024968802, 0.0049751346, 0.0049505052, 0.0024660925, 
0.0024600258, 0.002453989, 0.0024479816, 0.0024420037, 0.0024360548, 
0.0024301349), FYFF = c(3.72, 3.71, 4.15, 4.63, 4.91, 5.31, 5.57, 
5.55, 5.2, 4.91, 4.14, 3.5)), row.names = c(NA, -12L), class = "data.frame")

Solution:

No loop is needed. A simple solution is to reorder the columns of b to match the column order of a and then subtract the two data frames:

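# dplyr: select b's columns in the order given by names(a)
new_data <- a - dplyr::select(b, dplyr::all_of(names(a)))
# Base R: index b by a's column names to reorder its columns
new_data <- a - b[names(a)]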
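
Subtracting two data frames works by column position, not by column name, which is why b has to be reordered first. A minimal sketch of the base R approach with a safety check added is shown here. The two small data frames use made-up values purely for illustration (they are not the data from the question), and the `stopifnot()` guard is an addition that assumes both data frames are expected to have exactly the same column names and row count.

```r
# Small illustrative data frames (hypothetical values, not the question's data)
a <- data.frame(FYFF = c(5.5, 6.0), IP = c(0.25, 0.5), PUNEW = c(1, 2))
b <- data.frame(IP = c(0.25, 0.25), PUNEW = c(0.5, 1), FYFF = c(3.5, 4.0))

# Guard: stop early if the column names or the row counts do not match
stopifnot(setequal(names(a), names(b)), nrow(a) == nrow(b))

# Reorder b's columns to follow a's column order, then subtract element-wise
new_data <- a - b[names(a)]
new_data
```

The result keeps the column order of a. If b is missing one of a's columns, the guard fails with an error before any subtraction happens.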