Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to add column with altered ID-names for duplicates within same day

I have a data frame with Days, ID, Date collected and a count value (number of hatched eggs) for several samples each day. The ID stems from the replicate (mother) from which the sample (a number of eggs) was taken, so it requires the information from the "Date collected" column in order to distinguish them as separate samples in for instance a plot.

I want to add a new column called sampleID in which I give each unique sample its own ID.

Example data:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

d1 <- as.Date.POSIXct(2021-06-07)
d2 <- as.Date.POSIXct(2021-06-08)
d3 <- as.Date.POSIXct(2021-06-09)
df <- data.frame(Days = c(1,1,2,2,2,2,3,3,3,3,3),
ID = c(2,5,2,2,5,9,2,2,5,5,9),
Collected =c(d1,d1,d2,d1,d1,d2,d1,d2,d1,d3,d2))

I would like an output to look like:

Days ID Collected SampleID Count
1 2 2021-06-07 2-1 3
1 5 2021-06-07 5-1 5
2 2 2021-06-08 2-1 4
2 2 2021-06-07 2-2 1
2 5 2021-06-07 5-1 7
2 9 2021-06-08 9-1 2
3 2 2021-06-07 2-1 8
3 2 2021-06-08 2-2 5
3 5 2021-06-07 5-1 7
3 5 2021-06-09 5-2 2
3 9 2021-06-08 9-1 2

and I have been trying something like:

df <- df %>% 
group_by(Days) %>% 
mutate(ReplicateID = case_when(ID == ID & Collected != Collected ~ paste(as.character(ID)+"-1")))

Which doesn’t work, but even if it did it would not be able to add -2 or -3 to ID’s repeated more than once within the same day.. So I am kind of lost and would appreciate some help!

>Solution :

Maybe something like this?

library(dplyr)
d1 <- as.Date('2021-06-07')
d2 <- as.Date('2021-06-08')
d3 <- as.Date('2021-06-09')
df <- data.frame(Days = c(1,1,2,2,2,2,3,3,3,3,3),
                 ID = c(2,5,2,2,5,9,2,2,5,5,9),
                 Collected =c(d1,d1,d2,d1,d1,d2,d1,d2,d1,d3,d2))

df |>
  arrange(Days,ID,Collected) |>
  group_by(Days,ID) |>
  mutate(SampleID = paste(ID,row_number(),sep = '-'))
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading