How to color scatter plot points that meet 2 or more conditions in different columns in R

Here is part of a data frame that I have:

value1  value2  condition1  condition2  condition3
2.3     0.1     FALSE       FALSE       TRUE
3.5     2.6     FALSE       FALSE       TRUE
3.1     2.5     TRUE        TRUE        TRUE
3.2     2.3     FALSE       TRUE        TRUE
2.4     1.1     TRUE        TRUE        FALSE
2.7     2.2     FALSE       TRUE        FALSE
2.5     3       TRUE        FALSE       TRUE
2.9     2       TRUE        TRUE        TRUE
4.2     1       FALSE       FALSE       TRUE
2.2     1.5     FALSE       TRUE        TRUE

I would like to draw a scatter plot of value1 vs value2 and color the points for which 2 or more of the conditions are TRUE.

Do you have any suggestions on how to do this (using ggplot2 and the tidyverse)? Thank you for your time and help.

I have tried to group the conditions with group_by but I have not been successful.

Solution:

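We can derive the colour with rowSums on the condition columns (TRUE counts as 1 and FALSE as 0, so a row sum greater than 1 means two or more conditions are TRUE) and then use plot from base R: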

# Columns 3 to 5 are the logical condition columns; red when 2 or more are TRUE
colr <- ifelse(rowSums(df1[3:5]) > 1, "red", "black")
plot(df1$value1, df1$value2, col = colr, xlab = "value1", ylab = "value2")
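
Before plotting, it can help to check which rows the rule actually selects. The following is a minimal sketch that rebuilds the sample data inline so it runs on its own; the names `n_true` and `highlight` are illustrative and not part of the original answer.

```r
# Sample data from the question, rebuilt inline so the snippet is self-contained
df1 <- data.frame(
  value1 = c(2.3, 3.5, 3.1, 3.2, 2.4, 2.7, 2.5, 2.9, 4.2, 2.2),
  value2 = c(0.1, 2.6, 2.5, 2.3, 1.1, 2.2, 3, 2, 1, 1.5),
  condition1 = c(FALSE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, FALSE, FALSE),
  condition2 = c(FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, TRUE),
  condition3 = c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE)
)

# Logical values are summed as 0/1, so this is the number of TRUE conditions per row
n_true <- rowSums(df1[3:5])

# Rows with two or more TRUE conditions (illustrative name)
highlight <- n_true >= 2

# Row indices that would be drawn in red
which(highlight)
```

For the sample data this selects rows 3, 4, 5, 7, 8 and 10, which is the same set that `rowSums(df1[3:5]) > 1` colours red.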

Or, using the tidyverse, create a colour column from rowSums on the logical columns, map it to the colour aesthetic for geom_point, and add scale_colour_identity() so that ggplot2 uses the colour names stored in that column as they are instead of mapping them to its default palette:

library(dplyr)
library(ggplot2)

df1 %>%
  mutate(colr = case_when(
    rowSums(across(starts_with("condition"))) > 1 ~ "red",
    TRUE ~ "black"
  )) %>%
  ggplot(aes(value1, value2, colour = colr)) +
  geom_point() +
  scale_colour_identity()
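
scale_colour_identity() draws no legend by default. If a legend is wanted, one possible variation (a sketch, not part of the original answer) is to store a logical flag instead of colour names and map it to colours with scale_colour_manual(). The column name `two_plus`, the object name `p` and the legend title are assumptions chosen for illustration; the sample data is rebuilt inline so the snippet runs on its own.

```r
library(dplyr)
library(ggplot2)

# Sample data from the question, rebuilt inline so the snippet is self-contained
df1 <- data.frame(
  value1 = c(2.3, 3.5, 3.1, 3.2, 2.4, 2.7, 2.5, 2.9, 4.2, 2.2),
  value2 = c(0.1, 2.6, 2.5, 2.3, 1.1, 2.2, 3, 2, 1, 1.5),
  condition1 = c(FALSE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, FALSE, FALSE),
  condition2 = c(FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, TRUE),
  condition3 = c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE)
)

p <- df1 %>%
  # Flag rows where at least two condition columns are TRUE (illustrative column name)
  mutate(two_plus = rowSums(across(starts_with("condition"))) >= 2) %>%
  ggplot(aes(value1, value2, colour = two_plus)) +
  geom_point() +
  # Map the logical flag to explicit colours and give the legend a title
  scale_colour_manual(
    values = c(`TRUE` = "red", `FALSE` = "black"),
    name = "2+ conditions TRUE"
  )

p
```

The points are coloured the same way as in the identity-scale version; the difference is that the flag stays a logical column, which can be reused for filtering or faceting.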

data

df1 <- structure(list(value1 = c(2.3, 3.5, 3.1, 3.2, 2.4, 2.7, 2.5, 
2.9, 4.2, 2.2), value2 = c(0.1, 2.6, 2.5, 2.3, 1.1, 2.2, 3, 2, 
1, 1.5), condition1 = c(FALSE, FALSE, TRUE, FALSE, TRUE, FALSE, 
TRUE, TRUE, FALSE, FALSE), condition2 = c(FALSE, FALSE, TRUE, 
TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, TRUE), condition3 = c(TRUE, 
TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE)),
 class = "data.frame", row.names = c(NA, 
-10L))