Subset records based on the occurrence of one observation but also additional ones

suppose to have the following:

  ID       Rep    Year     
 0001      exA    2015
 0001      exA    2015
 0002      exA    2015
 0002      exB    2015
 0002      exA    2015
 0002      exC    2015 
 0003      exC    2016
 0003      exA    2016
 0003      exD    2016
 0003      exA    2016 
.....     ...    ....

is there a way to subset IDs based on the appearance of exA in Rep column but also other ex*? This, only for 2015. ID 0001 that has only exA should not be extracted. The criteria is the presence of exA (mandatory) but also other ex*

Desired output:

  ID       Rep    Year           
 0002      exA    2015       
 0002      exB    2015
 0002      exA    2015
 0002      exC    2015

>Solution :

Try this

data have;
input ID $ Rep $ Year;
datalines;
0001 exA 2015 
0001 exA 2015 
0002 exA 2015 
0002 exB 2015 
0002 exA 2015 
0002 exC 2015 
0003 exC 2016 
0003 exA 2016 
0003 exD 2016 
0003 exA 2016 
;

proc sql;
   create table want as
   select * from have
   where year = 2015
   group by ID
   having sum(Rep = 'exA') > 0
      and count(distinct Rep) > 1
   ;
quit;

Related

Leave a ReplyCancel reply