Subsetting under multiple conditions

I want to return the number of Transmitter codes that have been seen in both seasons, Winter1 AND Winter2. The answer should be 6 (6 different codes were seen in both Winter1 and Winter2), but the following command returns 0:

length(unique(Dispersion[(Dispersion$Season == "Winter1") & (Dispersion$Season == "Winter2"),]$Transmitter))

What command is appropriate for my problem?

structure(list(Transmitter = c("A69-1602-59814", "A69-1602-59814", 
"A69-1602-59815", "A69-1602-59815", "A69-1602-59819", "A69-1602-59820", 
"A69-1602-59821", "A69-1602-59822", "A69-1602-59823", "A69-1602-59824", 
"A69-1602-59825", "A69-1602-59826", "A69-1602-59826", "A69-1602-59827", 
"A69-1602-59828", "A69-1602-59828", "A69-1602-59830", "A69-1602-59831", 
"A69-1602-59831", "A69-1602-59832", "A69-1602-59833", "A69-1602-59834", 
"A69-1602-59835", "A69-1602-59835", "A69-1602-59836"), Batch.location = c("Lemmer", 
"Lemmer", "Lemmer", "Lemmer", "Lemmer", "Lemmer", "Lemmer", "Lemmer", 
"Lemmer", "Lemmer", "Lemmer", "Lemmer", "Lemmer", "Lemmer", "Lemmer", 
"Lemmer", "Lemmer", "Lemmer", "Lemmer", "Lemmer", "Lemmer", "Lemmer", 
"Lemmer", "Lemmer", "Lemmer"), Location.Dispersion = c("Lemmer", 
"Lemmer", "Lemmer", "Lemmer", "Lemmer", "Lemmer", "Lemmer", "Lemmer", 
"Lemmer", "Lemmer", "Lemmer", "Lemmer", "Lemmer", "Lemmer", "Lemmer", 
"Lemmer", "Lemmer", "Lemmer", "Lemmer", "Lemmer", "Lemmer", "Lemmer", 
"Lemmer", "Lemmer", "Lemmer"), Season = c("Winter1", "Winter2", 
"Winter1", "Winter2", "Winter1", "Winter1", "Winter1", "Winter1", 
"Winter1", "Winter1", "Winter1", "Winter1", "Winter2", "Winter1", 
"Winter1", "Winter2", "Winter1", "Winter1", "Winter2", "Winter1", 
"Winter1", "Winter1", "Winter1", "Winter2", "Winter1"), Freq = c(1961L, 
2075L, 310L, 1L, 2880L, 305L, 366L, 834L, 19L, 2580L, 564L, 997L, 
3475L, 6447L, 988L, 2991L, 355L, 3147L, 6155L, 903L, 484L, 321L, 
76L, 1921L, 3329L)), row.names = c(NA, -25L), groups = structure(list(
    Transmitter = c("A69-1602-59814", "A69-1602-59815", "A69-1602-59819", 
    "A69-1602-59820", "A69-1602-59821", "A69-1602-59822", "A69-1602-59823", 
    "A69-1602-59824", "A69-1602-59825", "A69-1602-59826", "A69-1602-59827", 
    "A69-1602-59828", "A69-1602-59830", "A69-1602-59831", "A69-1602-59832", 
    "A69-1602-59833", "A69-1602-59834", "A69-1602-59835", "A69-1602-59836"
    ), Batch.location = c("Lemmer", "Lemmer", "Lemmer", "Lemmer", 
    "Lemmer", "Lemmer", "Lemmer", "Lemmer", "Lemmer", "Lemmer", 
    "Lemmer", "Lemmer", "Lemmer", "Lemmer", "Lemmer", "Lemmer", 
    "Lemmer", "Lemmer", "Lemmer"), Location.Dispersion = c("Lemmer", 
    "Lemmer", "Lemmer", "Lemmer", "Lemmer", "Lemmer", "Lemmer", 
    "Lemmer", "Lemmer", "Lemmer", "Lemmer", "Lemmer", "Lemmer", 
    "Lemmer", "Lemmer", "Lemmer", "Lemmer", "Lemmer", "Lemmer"
    ), .rows = structure(list(1:2, 3:4, 5L, 6L, 7L, 8L, 9L, 10L, 
        11L, 12:13, 14L, 15:16, 17L, 18:19, 20L, 21L, 22L, 23:24, 
        25L), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list"))), row.names = c(NA, -19L), class = c("tbl_df", 
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"))

Solution:

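Your condition is evaluated row by row, and no single row can have Season equal to both "Winter1" and "Winter2", so it never matches. Instead, group by Transmitter (missing from your attempt) and keep only the groups whose Season values include both. In the code that follows, dat is the question's Dispersion data frame.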
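
To see why the original condition matches nothing, here is a minimal sketch on a small made-up data frame. The name toy and its values are hypothetical, not the question's data. The intersect() approach is an alternative I'm adding for comparison; it is not one of the methods in this answer.

```r
# Hypothetical toy data: "a" and "c" appear in both seasons, "b" only in Winter1.
toy <- data.frame(
  Transmitter = c("a", "a", "b", "c", "c"),
  Season      = c("Winter1", "Winter2", "Winter1", "Winter1", "Winter2"),
  stringsAsFactors = FALSE
)

# Row-wise test: a single row never holds both values, so every element is FALSE.
both_in_one_row <- toy$Season == "Winter1" & toy$Season == "Winter2"
sum(both_in_one_row)
# [1] 0

# Set-based test: transmitters seen in Winter1 that were also seen in Winter2.
in_w1 <- unique(toy$Transmitter[toy$Season == "Winter1"])
in_w2 <- unique(toy$Transmitter[toy$Season == "Winter2"])
seen_in_both <- intersect(in_w1, in_w2)
length(seen_in_both)
# [1] 2
```

The set-based version only works cleanly for a fixed, small number of values; the grouped approaches extend more naturally when more columns or conditions are involved.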

dplyr

library(dplyr)
out <- dat %>%
  group_by(Transmitter) %>%
  filter(all(c("Winter1", "Winter2") %in% Season)) %>%
  ungroup()
out
# # A tibble: 12 x 5
#    Transmitter    Batch.location Location.Dispersion Season   Freq
#    <chr>          <chr>          <chr>               <chr>   <int>
#  1 A69-1602-59814 Lemmer         Lemmer              Winter1  1961
#  2 A69-1602-59814 Lemmer         Lemmer              Winter2  2075
#  3 A69-1602-59815 Lemmer         Lemmer              Winter1   310
#  4 A69-1602-59815 Lemmer         Lemmer              Winter2     1
#  5 A69-1602-59826 Lemmer         Lemmer              Winter1   997
#  6 A69-1602-59826 Lemmer         Lemmer              Winter2  3475
#  7 A69-1602-59828 Lemmer         Lemmer              Winter1   988
#  8 A69-1602-59828 Lemmer         Lemmer              Winter2  2991
#  9 A69-1602-59831 Lemmer         Lemmer              Winter1  3147
# 10 A69-1602-59831 Lemmer         Lemmer              Winter2  6155
# 11 A69-1602-59835 Lemmer         Lemmer              Winter1    76
# 12 A69-1602-59835 Lemmer         Lemmer              Winter2  1921

From here, you can count the unique Transmitter values with n_distinct:

summarize(out, n = n_distinct(Transmitter))
# # A tibble: 1 x 1
#       n
#   <int>
# 1     6

or just

length(unique(out$Transmitter))
# [1] 6

base R, option 1

ind <- ave(dat$Season, dat$Transmitter,
           FUN = function(z) all(c("Winter1", "Winter2") %in% z)) == "TRUE"
ind
#  [1]  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE  TRUE  TRUE FALSE  TRUE  TRUE FALSE
# [21] FALSE FALSE  TRUE  TRUE FALSE
dat[ind,]
# ...

length(unique(dat[ind, "Transmitter"]))
# [1] 6

The comparison with the character string "TRUE" is needed because ave returns a vector of the same class as its first argument, here the character vector dat$Season. FUN returns a logical value for each group, but that value is coerced to a string when it is assigned back into the result. (Run the ave(...) call without the == "TRUE" to see this in action.)
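
The coercion is easy to check on a few made-up values. The sketch below uses hypothetical vectors season and id (not the question's data). It also shows one possible way to sidestep the string comparison by giving ave a logical first argument; that variant is an assumption of this example, not part of the answer above.

```r
# Hypothetical toy vectors: id "a" has both seasons, id "b" only Winter1.
season <- c("Winter1", "Winter2", "Winter1")
id     <- c("a", "a", "b")

# FUN returns a logical per group, but ave() writes it back into a copy of
# its first argument (a character vector), so the result is character.
res <- ave(season, id, FUN = function(z) all(c("Winter1", "Winter2") %in% z))
class(res)
# [1] "character"
res
# [1] "TRUE"  "TRUE"  "FALSE"

# With a logical first argument the result stays logical.
has_w1 <- ave(season == "Winter1", id, FUN = any)
has_w2 <- ave(season == "Winter2", id, FUN = any)
ind <- has_w1 & has_w2
ind
# [1]  TRUE  TRUE FALSE
```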

base R, option 2

sum(aggregate(Season ~ Transmitter, data = dat,
              FUN = function(z) all(c("Winter1", "Winter2") %in% z))$Season)
# [1] 6