I’m trying to do simple calculations but I’m new and SAS is not intuitive to me.
Suppose I have this table.
data money;
infile datalines delimiter=",";
input name $ return $ invested;
datalines;
Joe,10,100
Bob,7,50
Mary,80,1000
;
Which creates this
/* name | return | invested */
/* _________________________ */
/* Joe | 10 | 100 */
/* Bob | 7 | 50 */
/* Mary | 80 | 50 */
I have three things I would like to do for my job that just switched over to SAS.
-
I need to make sure columns
return
andinvested
are numeric. When I run the code above,return
column ends up being a CHAR column and I don’t know why. -
Now I want to create a new column and calculate the share of the total return they each got. In this case, the sum of
return
=97. This is the result I want.
/* name | return | invested | share_of_return */
/* ____________________________________________ */
/* Joe | 10 | 100 | 10.30% */
/* Bob | 7 | 50 | 7.22% */
/* Mary | 80 | 50 | 82.47% */
- Next I want to find their ROI. Which is (return-investment) / investment * 100. This is the result I am looking for
/* Find ROI */
/* name | return | invested | share_of_return | ROI */
/* ___________________________________________________ */
/* Joe | 10 | 100 | 10.30% | -90% */
/* Bob | 7 | 50 | 7.22% | -86% */
/* Mary | 80 | 50 | 82.47% | 60% */
I appreciate your explanations and guidance in advanced. This is for a work project and we just switched over to SAS
>Solution :
1 & 3 are easy, 2 is slightly more difficult.
- Remove $ in INPUT statement. $ indicates character. In your data you may need to convert it using the input function instead though.
Fix for example:
input name $ return invested;
Fix for actual data using input function. Note that you cannot convert types in a data step to the same name so I rename it while reading it in using the rename data set option.
data money2;
set money (rename = return = return_char);
return = input(return_char, best.);
drop return_char;
run;
- Add total value to data step, SQL is fastest here:
proc sql;
create table money3 as
select *, sum(return) as return_total, return/calculated return_total as return_percentage f=percent12.1
from money2;
quit;
I outline two different methods of doing this here
- Within a data step, add your calculation. It’s probably most efficient if it can be done in first step.
Since a data step loops automatically you write the formula pretty much as shown. In this case I’ve also applied a format so it shows as a percentage but that requires you to not multiply it by 100. Depending on what you’re doing next it may be best to leave it as numeric.
data money2;
set money (rename = return = return_char);
return = input(return_char, best.);
ROI = (return - investment)/investment;
format ROI percent12.1;
run;
drop return_char;
run;