Find the row indices of a subset data frame within its original data frame in R

I have a data frame in R, called a1 in this toy example, and a row subset of it, called a2.
I want to recover the original row indices of the subset, here (2, 4).

I tried which and match, but did not succeed.

 set.seed(123)
 a1 <- data.frame(x1 = rnorm(5), x2 = runif(5), x3 = runif(5))
 a2 <- a1[c(2, 4), ]
 # placeholder for the row indices to recover
 a2index <- rep(NA, nrow(a2))

Here is my a1 data frame:

    a1
           x1        x2         x3
1 -0.56047565 0.9568333 0.89982497
2 -0.23017749 0.4533342 0.24608773
3  1.55870831 0.6775706 0.04205953
4  0.07050839 0.5726334 0.32792072
5  0.12928774 0.1029247 0.95450365

a2 is a row subset of a1:

 a2
           x1        x2        x3
2 -0.23017749 0.4533342 0.2460877
4  0.07050839 0.5726334 0.3279207

I managed to obtain the indices using a double loop, but it is too slow. Is there a way to speed it up?

Thanks for your help.

for (i in seq_len(nrow(a2))) {
  for (j in seq_len(nrow(a1))) {
    # compare row i of a2 with row j of a1, column by column
    if (all(a2[i, ] == a1[j, ])) {
      a2index[i] <- j
      break
    }
  }
}

# return the index vector (2,4)
a2index

Solution:

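You can use match after transposing your data frames. Transposing turns each row into a column, and as.data.frame() turns those columns into list elements, so match compares whole rows rather than the original columns.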

match(as.data.frame(t(a2)), as.data.frame(t(a1)))
[1] 2 4
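
As a possible alternative to the transpose approach, here is a minimal sketch of two other ways to recover the indices. It rebuilds the toy data from the question. The names key1, key2, idx_by_value and idx_by_name are made up for illustration. The first option assumes no two distinct rows agree once their numbers are formatted as text. The second assumes a2 came from plain row subsetting, so that it still carries a1's row names.

```r
# Rebuild the toy data from the question.
set.seed(123)
a1 <- data.frame(x1 = rnorm(5), x2 = runif(5), x3 = runif(5))
a2 <- a1[c(2, 4), ]

# Option 1: collapse each row into one string key, then match the keys.
# paste() formats numbers as text (about 15 significant digits), so rows
# that differ only beyond that precision would get the same key.
key1 <- do.call(paste, c(a1, sep = "\r"))
key2 <- do.call(paste, c(a2, sep = "\r"))
idx_by_value <- match(key2, key1)

# Option 2: rely on the row names that row subsetting preserves.
# This only works if a2's row names were not reset or changed.
idx_by_name <- match(rownames(a2), rownames(a1))

idx_by_value
idx_by_name
```

Both vectors should come out as 2 4 for this example. The key-based option avoids the transpose, which may matter on wide or mixed-type data frames, while the row-name option is the cheapest but the most fragile.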