Home Remove distinct values based on condition

Questions

Remove distinct values based on condition

July 22, 2022

I have a dataset I am trying to remove duplicate values on but need to retain the rows where a condition is met. It looks like,

col1 col2
a    NA
a    1
b    1
c    1
d    1
d    2

If I just run the normal distinct functions I retain just the first value/row of the duplicates

col1 col2
a    NA
b    1
c    1
d    1

BUT – I need to retain

col1 col2
a    1
b    1
c    1
d    1

I have tried

df <- df %>% 
  group_by(col1) %>%
  top_n(1, col2)

But it seems to be removing extra rows within a larger dataset that does not represent duplicates from col1. It is somehow running it’s own condition on col2 and removing extra beyond the duplicates.

In my real example col1 are serial #’s and col2 are dates. I am trying to remove NA’s from col2 while also trying to preserve any that have the max date of potentially two date values (an older date and a newer date)

>Solution :

We could group arrange and slice:

library(dplyr)

df %>% 
  group_by(col1) %>% 
  arrange(col2, .by_group = TRUE) %>% 
  slice(1)

This (for this example!!!) gives the same result using add_count:

library(dplyr)
df %>%
  add_count(col2) %>% 
  filter(n!=1) %>%
  select(-n)

 col1   col2
  <chr> <int>
1 a         1
2 b         1
3 c         1
4 d         1

dplyr

byMR

Published July 22, 2022

Add a comment

How to update all records in a table with Eloquent?

byMR

July 22, 2022

Questions

pandas math operation with where condition

byMR

July 22, 2022

Questions

No Docker config.json in Amazon Linux EC2

byMR

July 22, 2022

Questions

Convert datetime to include utc timezone php

byMR

July 22, 2022

Questions

How to pass data between views and forms using form_valid

byMR

July 22, 2022

Questions

Conditionally display block of text with code inside string section

byMR

July 22, 2022

Remove distinct values based on condition

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Like this:

Leave a ReplyCancel reply

Read more

How to update all records in a table with Eloquent?

pandas math operation with where condition

No Docker config.json in Amazon Linux EC2

Convert datetime to include utc timezone php

How to pass data between views and forms using form_valid

Conditionally display block of text with code inside string section

Keep Up to Date with the Most Important News

Remove distinct values based on condition

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Share this:

Like this:

Leave a ReplyCancel reply

Keep Up to Date with the Most Important News

Read more

How to update all records in a table with Eloquent?

pandas math operation with where condition

No Docker config.json in Amazon Linux EC2

Convert datetime to include utc timezone php

How to pass data between views and forms using form_valid

Conditionally display block of text with code inside string section

Discover more from Dev solutions