Find unique values in R but annotate them based on another column

I have a data frame like this:

summary(deg)
      L2FC            Gene                    diffexp    comp   
 Min.   :-3.825   Length:926         Downregulated:210   A:195  
 1st Qu.: 1.010   Class :character   Upregulated  :716   B:731  
 Median : 1.163   Mode  :character                              
 Mean   : 0.860                                                 
 3rd Qu.: 1.431                                                 
 Max.   : 6.505    

head(deg)
       L2FC    Gene       diffexp comp
1 -2.754236 SLC13A2 Downregulated    A
2  3.161623   SNAI2   Upregulated    A
3 -2.821350   STYK1 Downregulated    A
4 -1.798022    CD84 Downregulated    A
5 -1.293536    TLE6 Downregulated    A
6 -1.011016   P2RX1 Downregulated    A

What I want is the unique gene symbols, each annotated by whether it appears in A only, in B only, or in both. The desired output looks like this:

   Gene comp
1 GENE1    1
2 GENE2    0
3 GENE3   -1

Here the Gene column contains only the unique values from deg, and comp is +1 for genes found only in A, -1 for genes found only in B, and 0 for genes found in both.

Thanks!

>Solution :

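Try the following. Note that the `.by` argument needs dplyr 1.1.0 or later, and the native pipe `|>` needs R 4.1 or later.

library(tidyverse)

deg |>
  summarize(
    # any() returns TRUE/FALSE; subtracting coerces them to 1/0, so:
    #   A only -> 1 - 0 =  1
    #   B only -> 0 - 1 = -1
    #   both   -> 1 - 1 =  0
    comp = any(comp == "A") - any(comp == "B"),
    # one output row per unique gene
    .by = Gene
  )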
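
To see why the subtraction gives the three codes, here is a minimal sketch of the same logic on toy data. It swaps in base R's `tapply()` for the dplyr call so it runs without any packages; `deg_toy` and its gene names are hypothetical stand-ins for the real `deg`, not data from the question.

```r
# Toy stand-in for deg (hypothetical gene names)
deg_toy <- data.frame(
  Gene = c("GENE1", "GENE2", "GENE2", "GENE3", "GENE1"),
  comp = factor(c("A", "A", "B", "B", "A"))
)

# Per gene: does it appear in A at all? in B at all?
in_a <- tapply(deg_toy$comp == "A", deg_toy$Gene, any)
in_b <- tapply(deg_toy$comp == "B", deg_toy$Gene, any)

# TRUE/FALSE become 1/0, so the difference is 1 (A only),
# -1 (B only), or 0 (both)
result <- data.frame(
  Gene = names(in_a),
  comp = as.integer(in_a) - as.integer(in_b),
  row.names = NULL
)

print(result)
```

Both `tapply()` calls group on the same column, so `in_a` and `in_b` come back in the same gene order and can be subtracted element by element. Duplicate rows for a gene within one group (GENE1 appears twice in A here) do not change the result, since `any()` only checks for presence.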