I have a following DF (demo). I would like to subset the data only for those IDs present on same Date with different City.
| ID | DATE | CITY |
|---|---|---|
| ABC | 2022-07-08 | AAA |
| ABC | 2022-07-22 | BBB |
| XYZ | 2022-07-09 | CCC |
| XYZ | 2022-07-09 | YYY |
| PQR | 2022-09-22 | FFF |
| PQR | 2022-09-26 | EEE |
| EFG | 2022-10-03 | AAA |
| EFG | 2022-10-16 | KKK |
| EFG | 2022-10-16 | PPP |
| EFG | 2022-10-16 | QQQ |
Desired Output:
| ID | DATE | CITY |
|---|---|---|
| XYZ | 2022-07-09 | CCC |
| XYZ | 2022-07-09 | YYY |
| EFG | 2022-10-16 | KKK |
| EFG | 2022-10-16 | PPP |
| EFG | 2022-10-16 | QQQ |
>Solution :
You can group_by ID and DATE and keep groups that have more than 1 n_distinct CITY:
library(dplyr)
df %>%
group_by(ID, DATE) %>%
filter(n_distinct(CITY) > 1)
output
# A tibble: 5 × 3
# Groups: ID, DATE [2]
ID DATE CITY
<chr> <chr> <chr>
1 XYZ 2022-07-09 CCC
2 XYZ 2022-07-09 YYY
3 EFG 2022-10-16 KKK
4 EFG 2022-10-16 PPP
5 EFG 2022-10-16 QQQ