Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to divide all the observations based on a sum of a column

I’m trying to do simple calculations but I’m new and SAS is not intuitive to me.

Suppose I have this table.

data money;
infile datalines delimiter=",";
input name $ return $ invested;
datalines;
Joe,10,100
Bob,7,50
Mary,80,1000
;

Which creates this

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

/* name | return | invested */
/* _________________________ */
/* Joe  | 10     | 100 */
/* Bob  | 7      | 50 */
/* Mary | 80     | 50 */

I have three things I would like to do for my job that just switched over to SAS.

  1. I need to make sure columns return and invested are numeric. When I run the code above, return column ends up being a CHAR column and I don’t know why.

  2. Now I want to create a new column and calculate the share of the total return they each got. In this case, the sum of return=97. This is the result I want.

/* name     | return | invested | share_of_return */
/* ____________________________________________ */
/* Joe      |  10   | 100       | 10.30% */
/* Bob      |   7   | 50        | 7.22% */
/* Mary     | 80    | 50        | 82.47% */
  1. Next I want to find their ROI. Which is (return-investment) / investment * 100. This is the result I am looking for
/* Find ROI */

/* name | return | invested | share_of_return | ROI */
/* ___________________________________________________ */
/* Joe  | 10     | 100      | 10.30%          | -90% */
/* Bob  | 7      | 50       | 7.22%           | -86% */
/* Mary | 80     | 50       | 82.47%          | 60% */

I appreciate your explanations and guidance in advanced. This is for a work project and we just switched over to SAS

>Solution :

1 & 3 are easy, 2 is slightly more difficult.

  1. Remove $ in INPUT statement. $ indicates character. In your data you may need to convert it using the input function instead though.

Fix for example:

input name $ return  invested;

Fix for actual data using input function. Note that you cannot convert types in a data step to the same name so I rename it while reading it in using the rename data set option.

data money2;
set money (rename = return = return_char);
return = input(return_char, best.);
drop return_char;
run;
  1. Add total value to data step, SQL is fastest here:
proc sql;
create table money3 as
select *, sum(return) as return_total, return/calculated return_total as return_percentage f=percent12.1
from money2;
quit;

I outline two different methods of doing this here

  1. Within a data step, add your calculation. It’s probably most efficient if it can be done in first step.
    Since a data step loops automatically you write the formula pretty much as shown. In this case I’ve also applied a format so it shows as a percentage but that requires you to not multiply it by 100. Depending on what you’re doing next it may be best to leave it as numeric.
data money2;
set money (rename = return = return_char);
return = input(return_char, best.);
ROI = (return - investment)/investment;
format ROI percent12.1;
run;
drop return_char;
run;
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading