I have a data set that captures a persons ‘route’ by listing the nodes. The data looks like this:
data have;
input id node $;
datalines;
1 A
1 B
1 C
2 B
2 C
3 A
3 C
4 A
4 B
4 C
;
run;
I am trying to capture all different routes the IDs have taken, and then count the number of times that appears. Example output:
Route Count
ABC 2
BC 1
AC 1
Is the only way to achieve this with a transpose statement to measure stop 1,2,3,etc?
>Solution :
Probably easiest to just build up a string for each "path".
data want;
do until(last.id);
set have;
by id;
length route $30 ;
route=cats(route,node);
end;
drop node;
run;
proc freq ;
tables route;
run;
If the nodes are more complex than single letters then you might want to use CATX() instead to insert a delimiter between then in the generate ROUTE string to avoid confusion caused by node ‘AB’ looking like node ‘A’ followed by node ‘B’.