How to remove duplicates from a list of strings based on timestamp

I have the following list:

ls = ["2022-07-17 16:00:02 txt xyz", "2022-07-17 15:00:02 txt xyz", "2022-07-17 16:00:02 txt abc"]

I want to keep only one entry per unique text (xyz and abc), namely the one with the newest timestamp. This is my expected outcome:

ls = ["2022-07-17 16:00:02 txt xyz", "2022-07-17 16:00:02 txt abc"]

My approach was to use a dictionary sorted by value, but then I still don’t know how to remove the older timestamp.

import datetime
import re

keep_message = {}
for entry in ls:
    # split each entry into its timestamp and its text
    timestamp_str = re.search(r"^(.*?) txt", entry).group(1)
    timestamp = datetime.datetime.strptime(timestamp_str, "%Y-%m-%d %H:%M:%S")
    text = re.search(r"txt (.*?)$", entry).group(1)
    keep_message[text + "_" + timestamp_str] = timestamp

# sort by timestamp, oldest first
keep_message_sorted = dict(sorted(keep_message.items(), key=lambda item: item[1]))

Is there a better solution?

Solution:

Use a dictionary to keep track of the most recent date per text:

d = {}
for x in ls:
    # split off the text after ' txt ' (NB. you can also use a regex)
    ts, txt = x.split(' txt ', 1)
    # keep the entry with the most recent timestamp for each text
    if txt not in d or x > d[txt]:
        d[txt] = x

out = list(d.values())

NB. I used a simple split to get the txt, and performed the comparison on the full string: the date comes first and is in a format (YYYY-MM-DD HH:MM:SS) that sorts chronologically when compared as a string. However, you can use another extraction method (regex) and perform the comparison only on the datetime part.

Output:

['2022-07-17 16:00:02 txt xyz', '2022-07-17 16:00:02 txt abc']
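
The alternative mentioned in the note (regex extraction, comparing only the datetime part) could look roughly like the sketch below. It is not part of the original answer: the function name `latest_per_text`, the compiled `pattern`, and the choice to skip lines that do not match are all assumptions made for illustration, and it presumes every entry follows the layout "<YYYY-MM-DD HH:MM:SS> txt <text>".

```python
import re
from datetime import datetime

ls = [
    "2022-07-17 16:00:02 txt xyz",
    "2022-07-17 15:00:02 txt xyz",
    "2022-07-17 16:00:02 txt abc",
]

# Assumed line layout: "<YYYY-MM-DD HH:MM:SS> txt <text>"
pattern = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) txt (.*)$")


def latest_per_text(entries):
    """Return one entry per text, keeping the one with the newest timestamp."""
    newest = {}  # text -> (parsed timestamp, original entry)
    for entry in entries:
        m = pattern.match(entry)
        if m is None:
            continue  # assumption: silently skip lines that do not match
        ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        text = m.group(2)
        # compare parsed datetimes only, not the full strings
        if text not in newest or ts > newest[text][0]:
            newest[text] = (ts, entry)
    return [entry for _, entry in newest.values()]


out = latest_per_text(ls)
print(out)
```

Parsing the timestamp costs a little more than comparing raw strings, but it keeps working if the date format is later changed to one that does not sort correctly as text (only the regex and the `strptime` format would need updating).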