EPIB 698C Lecture 3. Instructor: Raul Cruz-Cano. 9/19/2012.
Creating and Redefining Variables
• You can create and redefine variables with assignment statements of the form: variable = expression;
Type of expression     Example
Numeric constant       Age = 10;
Character constant     Gender = 'Female';
An existing variable   Age = age_at_baseline;
Addition               Age = age_at_baseline + 10;
Home gardener's data
• Gardeners were asked to estimate the pounds they harvested of four crops: tomatoes, zucchini, peas and grapes. Here is the data:
Gregor  10  2  40    0
Molly   15  5  10 1000
Luther  50 10  15   50
Susan   20  0   .   20
• Task:
(1) add a new variable group with a value of 14
(2) add a variable type to indicate home gardener
(3) create a new variable zucchini_1 which equals zucchini*10
(4) derive the total pounds of crops for each gardener
(5) derive the % of tomatoes for each gardener
Home gardener's data
DATA homegarden;
INFILE 'F:\SAS\lecture4\Garden.txt';
INPUT Name $ 1-7 Tomato Zucchini Peas grapes;
group = 14;
Type = 'home';
Zucchini_1= Zucchini * 10;
Total=tomato + zucchini_1 + peas + grapes;
PerTom = (Tomato / Total) * 100;
Run;
Home gardener's data
• Check the log window. It contains the note: "Missing values were generated as a result of performing an operation on missing values."
• The last gardener (Susan) has a missing value for peas, so Total and PerTom, which are both calculated from peas, are set to missing for her.
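If a total that skips missing values is preferred, one possible variation (not part of the lecture's program) is the SUM function, which ignores missing arguments. The sketch below reads the same four records inline with DATALINES instead of the external file. The data set name garden_sum and the variable Total_nm are made up for illustration.

```sas
DATA garden_sum;
   INPUT Name $ 1-7 Tomato Zucchini Peas Grapes;
   Zucchini_1 = Zucchini * 10;
   /* The + operator propagates missing values, so Total is missing for Susan */
   Total    = Tomato + Zucchini_1 + Peas + Grapes;
   /* SUM ignores missing arguments, so Susan gets 20 + 0 + 20 = 40 */
   Total_nm = SUM(Tomato, Zucchini_1, Peas, Grapes);
   DATALINES;
Gregor  10  2  40    0
Molly   15  5  10 1000
Luther  50 10  15   50
Susan   20  0   .   20
;
RUN;

PROC PRINT DATA=garden_sum;
RUN;
```

Whether a partial total is meaningful depends on the analysis. Treating an unreported crop as zero is an assumption, not something the data states.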
SAS functions
• SAS has over 400 functions, with the following general form: function-name(argument, argument, ...)
• All functions must have parentheses, even if they do not require any arguments
• Examples:
X = INT(LOG(10));
Mean_score = MEAN(score1, score2, score3);
• The MEAN function returns the mean of its non-missing arguments. This differs from adding the arguments and dividing by their number, which returns a missing value if any argument is missing.
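The difference between MEAN and a hand-computed average can be seen in a small sketch. The data set name mean_demo and the score values are invented for illustration.

```sas
DATA mean_demo;
   score1 = 80;
   score2 = .;                                   /* missing score */
   score3 = 90;
   mean_fn  = MEAN(score1, score2, score3);      /* (80 + 90) / 2 = 85 */
   mean_raw = (score1 + score2 + score3) / 3;    /* missing, because score2 is missing */
RUN;

PROC PRINT DATA=mean_demo;
RUN;
```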
Common Functions And Operators
• Functions:
ABS          absolute value
EXP          exponential
LOG          natural logarithm
MAX and MIN  maximum and minimum
SQRT         square root
SUM          sum of variables, e.g. SUM(OF x1-x10, x21)
• Arithmetic operators: +, -, *, /, ** (exponentiation; not ^)
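A short sketch of these functions and operators in one DATA step follows. The variable names and values are arbitrary, and the variable list is shortened to x1-x3 to keep the example small.

```sas
DATA ops_demo;
   x1 = 1; x2 = 2; x3 = 3; x21 = 10;
   total   = SUM(OF x1-x3, x21);   /* 1 + 2 + 3 + 10 = 16 */
   cube    = 2 ** 3;               /* exponentiation with **: 8 */
   root    = SQRT(16);             /* 4 */
   biggest = MAX(x1, x2, x3);      /* 3 */
   dist    = ABS(x1 - x21);        /* 9 */
RUN;

PROC PRINT DATA=ops_demo;
RUN;
```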
Example: pumpkin carving contest data
• This data contains each contestant's name, age, type of pumpkin (carved or decorated), date of entry and the scores from 5 judges.
Alicia Grossman  13 c 10-28-2003 7.8 6.5 7.2 8.0 7.9
Matthew Lee       9 D 10-30-2003 6.5 5.9 6.8 6.0 8.1
Elizabeth Garcia 10 C 10-29-2003 8.9 7.9 8.5 9.0 8.8
Lori Newcombe     6 D 10-30-2003 6.7 5.6 4.9 5.2 6.1
Jose Martinez     7 d 10-31-2003 8.9 9.510.0 9.7 9.0
Brian Williams   11 C 10-29-2003 7.8 8.4 8.5 7.9 8.0
• We will derive the mean score using the MEAN function
• Transform the values of Type to upper case
• Get the day of the month from the SAS date
Example: pumpkin carving contest data
DATA contest;
INFILE 'F:\SAS\lecture4\Pumpkin.txt';
INPUT Name $16. Age Type $
@23 Date MMDDYY10.
(Scr1 Scr2 Scr3 Scr4 Scr5) (4.1);
AvgScore= MEAN(Scr1,Scr2,Scr3,Scr4, Scr5);
DayEntered = DAY(Date);
Type = UPCASE(Type);
run;
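More SAS functions
Function   Example                                  Result
MAX        Y = MAX(1, 3, 5);                        Y = 5
ROUND      Y = ROUND(1.236, 0.01);                  Y = 1.24
SUM        Y = SUM(1, 3, 5);                        Y = 9
LENGTH     a = 'my cat'; Y = LENGTH(a);             Y = 6
TRIM       a = 'my '; b = 'cat'; Y = TRIM(a)||b;    Y = 'mycat'
• Note: the second argument of ROUND is the rounding unit, not the number of decimal places, so rounding to two decimals uses 0.01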
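The functions in this table can be tried in a single DATA step. In ROUND, the second argument is a rounding unit rather than a count of decimal places, which the sketch below illustrates with two different units. The data set and variable names are made up.

```sas
DATA func_demo;
   r_unit = ROUND(1.236, 0.01);   /* 1.24: nearest multiple of 0.01 */
   r_two  = ROUND(1.236, 2);      /* 2: nearest multiple of 2 */
   a = 'my ';
   b = 'cat';
   len_a  = LENGTH('my cat');     /* 6 */
   joined = TRIM(a) || b;         /* 'mycat': TRIM removes the trailing blank of a */
RUN;

PROC PRINT DATA=func_demo;
RUN;
```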
Selected date functions
Function   Description                                             Example            Result
TODAY      Returns the current date                                X = TODAY();       Today's date
QTR        Returns the quarter of the year from a SAS date value   X = QTR(366);      1
MONTH      Returns the month from a SAS date value                 X = MONTH(366);    1
DAY        Returns the day of the month from a SAS date value      X = DAY(369);      4
MDY        Returns a SAS date value from month, day and year       X = MDY(1,1,60);   0
Working with SAS Date
• A SAS date is a numeric value equal to the number of days since Jan. 1, 1960. For example:
Date            SAS date value
Jan. 1, 1959    -365
Jan. 1, 1960    0
Jan. 1, 1961    366
Jan. 1, 2003    15706
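The values in this table can be reproduced with MDY and the date functions from the previous slide. The date literal ('01JAN2003'd) and the DATE9. format are standard SAS features that the slides do not cover; they appear here only to make the numbers readable. All data set and variable names are illustrative.

```sas
DATA date_demo;
   d0    = MDY(1, 1, 1960);    /* 0 */
   d1961 = MDY(1, 1, 1961);    /* 366, since 1960 was a leap year */
   d2003 = '01JAN2003'd;       /* date literal, equal to 15706 */
   q     = QTR(d1961);         /* 1 */
   dd    = DAY(369);           /* 4, i.e. Jan. 4, 1961 */
   shown = d2003;
   FORMAT shown DATE9.;        /* prints 15706 as 01JAN2003 */
RUN;

PROC PRINT DATA=date_demo;
RUN;
```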
Using IF-THEN statement
• The IF-THEN statement is used for conditional processing. Example: you want to derive mean test scores for female students but not for male students, so the mean is computed only when gender = 'F'.
• Syntax: IF condition THEN action;
E.g.: IF gender = 'F' THEN mean_score = MEAN(scr1, scr2);
Using IF-THEN statement
List of logical comparison operators
Logical comparison         Mnemonic   Symbol
Equal to                   EQ         =
Not equal to               NE         ^= or ~=
Less than                  LT         <
Less than or equal to      LE         <=
Greater than               GT         >
Greater than or equal to   GE         >=
Equal to one in a list     IN
• Note: A missing numeric value is treated as smaller than any number, that is, as the most negative value you can reference on your computer
Using IF-THEN statement
• Example: We have data containing the following information on subjects: Age Gender Midterm Quiz FinalExam
21 M 80 B- 82
20 F 90 A 93
35 M 87 B+ 85
48 F 80 C 76
59 F 95 A+ 97
15 M 88 C 93
• Task: Group the students by age (<20, [20-40), [40-60), >=60)
data conditional;
input Age Gender $ Midterm Quiz $2. FinalExam;
datalines;
21 M 80 B- 82
20 F 90 A 93
35 M 87 B+ 85
48 F 80 C 76
59 F 95 A+ 97
15 M 88 C 93
;
run;
data new1;
set conditional;
if Age < 20 then AgeGroup = 1;
if 20 <= Age < 40 then AgeGroup = 2;
if 40 <= Age < 60 then AgeGroup = 3;
if Age >= 60 then AgeGroup = 4;
Run;
Multiple conditions with AND and OR
• Syntax: IF condition1 AND condition2 THEN action;
• E.g.:
IF age < 40 AND gender = 'F' THEN group = 1;
IF age < 40 OR gender = 'F' THEN group = 2;
IF-THEN statement, multiple conditions
• Example: We use the same data on subjects (Age, Gender, Midterm, Quiz, FinalExam), stored in the data set conditional
• Task: Group the students by age (<40, >=40) and gender
data new1;
   set conditional;
   if age < 40 and gender = 'F' then group = 1;
   if age >= 40 and gender = 'F' then group = 2;
   if age < 40 and gender = 'M' then group = 3;
   if age >= 40 and gender = 'M' then group = 4;
run;
• Note: A missing numeric value is treated as the most negative value you can reference on your computer, so a missing age satisfies age < 40
• Example: grouping ages into age groups when some ages are missing
21 M 80 B- 82
20 F 90 A 93
. M 87 B+ 85
48 F 80 C 76
59 F 95 A+ 97
. M 88 C 93
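Because a missing Age compares as lower than any number, a test such as Age < 20 is also true for the two subjects with missing ages. One possible guard, sketched below, is to require Age to be non-missing in the first comparison. The data set names with_missing and grouped are hypothetical, and this is only one of several ways to handle the problem.

```sas
DATA with_missing;
   INPUT Age Gender $ Midterm Quiz $ FinalExam;
   DATALINES;
21 M 80 B- 82
20 F 90 A 93
. M 87 B+ 85
48 F 80 C 76
59 F 95 A+ 97
. M 88 C 93
;
RUN;

DATA grouped;
   SET with_missing;
   /* Age NE . keeps missing ages out of group 1 */
   IF Age NE . AND Age < 20 THEN AgeGroup = 1;
   /* A missing Age fails 20 <= Age, so these tests need no extra guard */
   IF 20 <= Age < 40 THEN AgeGroup = 2;
   IF 40 <= Age < 60 THEN AgeGroup = 3;
   IF Age >= 60 THEN AgeGroup = 4;
RUN;

PROC PRINT DATA=grouped;
RUN;
```

With this guard, the two subjects with missing ages keep a missing AgeGroup instead of being placed in group 1.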
IF-THEN statement, with multiple actions
• Example: We use the same data on subjects (Age, Gender, Midterm, Quiz, FinalExam), stored in the data set conditional
• Task: Group the students by age, and assign a test date based on the age group
Multiple actions with DO, END
• Syntax:
IF condition THEN DO;
   action1;
   action2;
END;
• E.g.:
IF age <= 20 THEN DO;
   group = 1;
   exam_date = "Monday";
END;
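A complete DATA step built around this DO group might look like the sketch below. The second group, its exam day, the data set name sched and the three inline records are hypothetical additions. The LENGTH statement is included because a character variable otherwise takes its length from the first value assigned to it, which would truncate 'Wednesday' to six characters.

```sas
DATA sched;
   LENGTH exam_date $ 9;          /* long enough for 'Wednesday' */
   INPUT Age Gender $;
   IF Age <= 20 THEN DO;
      group = 1;
      exam_date = 'Monday';
   END;
   IF Age > 20 THEN DO;
      group = 2;                  /* hypothetical second group */
      exam_date = 'Wednesday';    /* hypothetical exam day */
   END;
   DATALINES;
21 M
20 F
15 M
;
RUN;

PROC PRINT DATA=sched;
RUN;
```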
IF-THEN/ELSE statement
• Syntax:
IF condition1 THEN action1;
ELSE IF condition2 THEN action2;
ELSE IF condition3 THEN action3;
• The IF-THEN/ELSE statement has two advantages over a series of IF-THEN statements:
(1) It is more efficient and uses less computing time, because once a condition is true the remaining ELSE conditions are skipped
(2) The ELSE logic ensures that your groups are mutually exclusive, so that no observation is put into more than one group
IF-THEN/ELSE statement
data new1;
   set conditional;
   if Age < 20 then AgeGroup = 1;
   else if Age >= 20 and Age < 40 then AgeGroup = 2;
   else if Age >= 40 and Age < 60 then AgeGroup = 3;
   else if Age >= 60 then AgeGroup = 4;
run;
Subsetting your data
• You can subset your data using an IF statement in a DATA step
• Example: the following two DATA steps both keep only the observations with gender = 'F'
Data new1;
   Set new;
   If gender = 'F';
run;
Data new1;
   Set new;
   If gender ^= 'F' then delete;
run;
The IN operator
• If you want to test whether a value is one of several possible choices, you can combine multiple comparisons with OR, like this:
IF grade = 'A' or grade = 'B' or grade = 'C' then PASS = 'yes';
• An alternative is to use the IN operator. The values in the list can be separated by spaces or by commas:
IF grade in ('A' 'B' 'C') then PASS = 'yes';
IF grade in ('A', 'B', 'C') then PASS = 'yes';
The iterative DO loop
• An iterative DO loop is used to execute a group of SAS statements multiple times
• One form of the iterative DO statement follows:
DO index-variable = start TO stop BY increment;
   SAS statements;
END;
• If BY increment is omitted, the increment defaults to 1
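As a minimal sketch of the BY clause, the loop below steps the index from 0 to 10 in increments of 5. The data set name steps and the variable double are invented for the example.

```sas
DATA steps;
   DO i = 0 TO 10 BY 5;     /* i takes the values 0, 5 and 10 */
      double = 2 * i;
      OUTPUT;               /* write one observation per iteration */
   END;
RUN;

PROC PRINT DATA=steps;
RUN;
```

The resulting data set has three observations. Dropping "BY 5" would give eleven, one for each integer from 0 to 10.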
• Example: You want to compute the total amount of money you will have if you start with $100 and invest it at a 3.75% interest rate for 3 years.
data compound;
   Interest = .0375;
   Total = 100;
   do Year = 1 to 3;
      Total + Interest*Total;   /* sum statement: adds the year's interest to Total */
      output;
   end;
   format Total dollar10.2;
run;
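The loop result can be cross-checked against the closed-form expression Start * (1 + Interest) ** Year. The sketch below is an independent check, not part of the lecture's program. The names compound_check and Start are made up.

```sas
DATA compound_check;
   Interest = .0375;
   Start    = 100;
   DO Year = 1 TO 3;
      /* Closed form of yearly compounding */
      Total = Start * (1 + Interest) ** Year;
      OUTPUT;
   END;
   FORMAT Total DOLLAR10.2;
RUN;

PROC PRINT DATA=compound_check;
RUN;
```

Both versions should print $103.75, $107.64 and $111.68 for years 1 to 3.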
• Example: suppose you want to generate a table of the integers from 1 to 10, along with their squares and square roots:
data table;
   do n = 1 to 10;
      Square = n*n;
      SquareRoot = sqrt(n);
      output;
   end;
run;
Simplifying programs with Arrays
• A SAS array is a collection of elements (usually SAS variables) that lets you write SAS statements referring to the whole group of variables.
• Arrays are defined with the ARRAY statement: ARRAY name (n) variable-list;
name: the name you give to the array
n: the number of variables in the array
E.g.: ARRAY store (4) macys sears target costco;
store(1) is the variable macys
store(2) is the variable sears
Simplifying programs with Arrays
• A radio station is conducting a survey asking people to rate 10 songs. The rating is on a scale of 1 to 5, with 1 = do not like the song and 5 = like the song
• If a listener does not want to rate a song, he or she enters a 9 to indicate a missing value
• Here is the data with location, listener's age and ratings for the 10 songs
Albany          54 4 3 5 9 9 2 1 4 4 9
Richmond        33 5 2 4 3 9 2 9 3 3 3
Oakland         27 1 3 2 9 9 9 3 4 2 3
Richmond        41 4 3 5 5 5 2 9 4 5 5
Berkeley        18 3 4 9 1 4 9 3 9 3 2
• We want to change each 9 to a missing value (.)
Simplifying programs with Arrays
DATA songs;
   INFILE 'F:\SAS\lecture4\radio.txt';
   INPUT City $ 1-15 Age domk wj hwow simbh kt aomm libm tr filp ttr;
   ARRAY song (10) domk wj hwow simbh kt aomm libm tr filp ttr;
   DO i = 1 TO 10;
      IF song(i) = 9 THEN song(i) = .;
   END;
run;
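A self-contained variation of this program is sketched below. It reads two of the records inline, uses DIM(song) so the loop bound follows the array size, and counts the remaining ratings with N. DIM, N and DROP are standard SAS features not covered in the slides. The names songs_inline and n_rated are made up.

```sas
DATA songs_inline;
   INPUT City $ 1-15 Age domk wj hwow simbh kt aomm libm tr filp ttr;
   ARRAY song (10) domk wj hwow simbh kt aomm libm tr filp ttr;
   DO i = 1 TO DIM(song);            /* DIM returns the number of elements, here 10 */
      IF song(i) = 9 THEN song(i) = .;
   END;
   n_rated = N(OF song(*));          /* number of non-missing ratings */
   DROP i;                           /* the loop index is not needed in the output */
   DATALINES;
Albany          54 4 3 5 9 9 2 1 4 4 9
Richmond        33 5 2 4 3 9 2 9 3 3 3
;
RUN;

PROC PRINT DATA=songs_inline;
RUN;
```

For these two records n_rated is 7 for Albany and 8 for Richmond, since they contain three and two 9s respectively.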