Remove duplicates inside a sequence of records in a group with SAS

Is it possible to remove consecutive duplicate records within a specific group and output only the last of them (based on date) with 4GL (SAS)? I have data like:

data example;
input obs id dt value WANT_TO_SELECT;
cards;
1 10 1 500 0
2 10 2 750 1
3 10 3 750 1
4 10 4 750 0
5 10 5 500 0
6 20 1 150 1
7 20 2 150 0
8 20 3 370 0
9 20 4 150 0
;
run;

As you can see, for id=10 I would like to keep only one (the last) record with value 750, because those records directly follow one another, while value 500 can appear twice because its records are separated. I tried using last/first, but I am not sure how to sort the data.

Solution:

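Looks like a use case for the NOTSORTED option of the BY statement. This lets you use VALUE as a BY variable even though the data is not actually sorted by VALUE: each run of consecutive records with the same ID and VALUE is treated as its own BY group. That way the LAST.VALUE flag can be used to keep only the final record of each run.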

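data want;
  set example;
  /* NOTSORTED: each consecutive run of equal ID/VALUE forms a BY group */
  by id value notsorted;
  /* subsetting IF: keep only the last record of each run */
  if last.value;
run;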

Results:

                                   WANT_TO_
Obs    obs    id    dt    value     SELECT

 1      1     10     1     500         0
 2      4     10     4     750         0
 3      5     10     5     500         0
 4      7     20     2     150         0
 5      8     20     3     370         0
 6      9     20     4     150         0
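
The step above relies on the records already being in ID and DT order, as they are in the example data. If that is not guaranteed, a minimal sketch along the following lines may help: it sorts by ID and DT first, then copies the automatic FIRST.VALUE and LAST.VALUE flags into ordinary variables so they can be inspected. The names example_sorted, flags, first_value, and last_value are hypothetical and chosen only for illustration.

```sas
/* Assumption: "last" means latest DT within each ID, so order by ID and DT. */
proc sort data=example out=example_sorted;
  by id dt;
run;

data flags;
  set example_sorted;
  by id value notsorted;
  /* FIRST./LAST. variables are temporary; copy them to keep them in the output */
  first_value = first.value;
  last_value  = last.value;
run;

/* Rows with last_value = 1 are the ones the subsetting IF would keep */
proc print data=flags noobs;
  var obs id dt value first_value last_value;
run;
```

Printing the flags this way is only a debugging aid; once the runs look right, the subsetting IF on LAST.VALUE can be applied to the sorted data set instead.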