25
Introduction to Where Expressions Mark Tabladillo, Ph.D. Software Developer, MarkTab Consulting Associate Faculty University of Phoenix Associate Faculty, University of Phoenix January 30, 2007

Introduction to SAS System Where Expressions

Embed Size (px)

DESCRIPTION

The where expression allows for declaring filters on SAS System datasets. This presentation illustrates some uses in the data step and SAS Macro Language.

Citation preview

Page 1: Introduction to SAS System Where Expressions

Introduction to Where ExpressionsMark Tabladillo, Ph.D.

Software Developer, MarkTab ConsultingAssociate Faculty University of PhoenixAssociate Faculty, University of Phoenix

January 30, 2007

Page 2: Introduction to SAS System Where Expressions

IntroductionIntroduction

• WHERE expressions allow for processingWHERE expressions allow for processing subsets of observations

• WHERE expressions can be used in theWHERE expressions can be used in the DATA step or with PROC (procedure) statements

• This presentation will contain a series of features and examples of the WHERE pexpression

• We end with some intensive macros

Page 3: Introduction to SAS System Where Expressions

WHERE-expression ProcessingWHERE expression Processing

• Enables us to conditionally select a subsetEnables us to conditionally select a subset of observations, so that SAS processes only the observations that meet a set ofonly the observations that meet a set of specified conditions.

http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999253.htm

Page 4: Introduction to SAS System Where Expressions

Work Sales DatasetWork.Sales Datasetdata work.sales (drop=i randomState);

length state $2 sales 8 randomState 3;length state $2 sales 8 randomState 3;do i = 1 to 2500;

randomState = round(rand('gaussian',3,1)+0.5);if randomState in (1,2,3,4,5) then do;( )

select(randomState);when(1) state='TN';when(2) state='AL';when(3) state='GA';when(3) state= GA ;when(4) state='FL';when(5) state='MS';

end;sales = int(rand('gaussian',1000000,500000));output work.sales;

end;end;end;run;

Page 5: Introduction to SAS System Where Expressions

Data Set Option or StatementData Set Option or Statementdata work.highSales;

set work.sales (where=(sales>1500000));run;

data work highSales;data work.highSales;set work.sales;where sales>1500000;run;

proc means data=work.sales;where sales>1500000;run;;

Page 6: Introduction to SAS System Where Expressions

Data Set Option or StatementData Set Option or Statementdata work.lowSales;

set work.sales (where=(sales<0));run;

data work lowSales;data work.lowSales;set work.sales;where sales<0;run;

proc means data=work.sales (where=(sales<0));run;

Page 7: Introduction to SAS System Where Expressions

Multiple ComparisonsMultiple Comparisonsdata work.highFloridaSales;

set work.sales (where=(sales>1500000 and state = 'FL'));run;

data work highFloridaSales;data work.highFloridaSales;set work.sales;where sales>1500000 and state = 'FL';run;

proc freq data=work.sales;tables state;where sales>1500000 and state = 'FL';;run;

Page 8: Introduction to SAS System Where Expressions

SAS FunctionsSAS Functionsdata work.highFloridaSales;

set work.sales (where=(sales>1500000 and substr(state,1,1) = 'F'));run;

data work highFloridaSales;data work.highFloridaSales;set work.sales;where sales>1500000 and substr(state,1,1) = 'F';run;

proc means data=work.sales;where sales>1500000 and substr(state,1,1) = 'F';run;;

Page 9: Introduction to SAS System Where Expressions

Comparison OperatorsComparison OperatorsPriority Order of

EvaluationSymbols Mnemonic

EquivalentEvaluation EquivalentGroup I right to left **

++-

ˆ ¬ ~ NOT>< MIN<> MAX

http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000780367.htm

Page 10: Introduction to SAS System Where Expressions

Comparison OperatorsComparison OperatorsPriority Order of

EvaluationSymbols Mnemonic

EquivalentEvaluation EquivalentGroup II left to right *

//Group

IIIleft to right +

-Group

IVleft to right || ¦¦ !!

IV

http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000780367.htm

Page 11: Introduction to SAS System Where Expressions

Comparison OperatorsComparison OperatorsPriority Order of

EvaluationSymbols Mnemonic

EquivalentEvaluation EquivalentGroup

Vleft to right < LT

<= LE= EQ

¬= NE>= GE> GT

IN

http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000780367.htm

Page 12: Introduction to SAS System Where Expressions

Comparison OperatorsComparison OperatorsPriority Order of

EvaluationSymbols Mnemonic

EquivalentEvaluation EquivalentGroup

VIleft to right & AND

Group VII

left to right | ¦ ! OR

http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000780367.htm

Page 13: Introduction to SAS System Where Expressions

Comparison OperatorsComparison Operatorsdata work.extremeNonGeorgia;

set work.sales (where=((sales<0 | sales>1500000) and state in ('TN','AL','FL','MS')));

run;

data work.extremeNonGeorgia;set work.sales;where (sales<0 | sales>1500000) and state in ('TN','AL','FL','MS');run;

data work.extremeNonGeorgia;set work.sales;;where ^ (0 <= sales <= 1500000) & state ne 'GA';run;

http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999255.htm

Page 14: Introduction to SAS System Where Expressions

“Between And”Between Anddata work.boundedNonGeorgia;

set work.sales (where=((sales between 1000000 and 1500000) & state in ('TN','AL','FL','MS')));

run;

data work.boundedNonGeorgia;set work.sales;where (sales between 1000000 and 1500000) &

t t i ('TN' 'AL' 'FL' 'MS')state in ('TN','AL','FL','MS');run;

http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999255.htm

Page 15: Introduction to SAS System Where Expressions

Contains ?Contains ?data work.LStates;

set work.sales (where=(state contains 'L'));run;

data work LStates;data work.LStates;set work.sales;where state contains 'L';run;

data work.LStates;set work.sales;where state ? 'L';;run;

http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999255.htm

Page 16: Introduction to SAS System Where Expressions

Is Null/Is MissingIs Null/Is Missingdata work.nullStates;

set work.sales (where=(state is null));run;

data work.missingStates;set work.sales (where=(state is missing));se o sa es ( e e (s a e s ss g));run;

data work.nullSales;set work sales;set work.sales;where sales is missing;run;

data work.nonNullSales;set work.sales;where sales is not missing;run;run;

http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999255.htm

Page 17: Introduction to SAS System Where Expressions

LikeLikedata work.likeL;

set work sales (where=(state like '%L'));set work.sales (where=(state like '%L'));run;

data work.likeL;set work.sales (where=(state like "%L"));run;

data work likeL;data work.likeL;set work.sales (where=(state like "%%L"));run;

data work.notLikeG;set work.sales;where state not like 'G_';run;run;

http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999255.htm

Page 18: Introduction to SAS System Where Expressions

Sounds Like (Soundex)Sounds Like (Soundex)data work.soundsLikeFill;

set work.sales (where=(state =* 'fill'));run;

data work notSoundsLikeTin;data work.notSoundsLikeTin;set work.sales;where state not =* 'tin';run;

http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999255.htm

Page 19: Introduction to SAS System Where Expressions

“Same And”Same Anddata work.boundedNonGeorgia;

set work.sales (where=((sales between 1000000 and 1500000) & state in ('TN','AL','FL','MS')));

run;

data work.boundedNonGeorgia;set work.sales;where (sales between 1000000 and 1500000);

h d t t i ('TN' 'AL' 'FL' 'MS')where same and state in ('TN','AL','FL','MS');run;

data work.boundedNonGeorgia;g ;set work.sales;where same and (sales between 1000000 and 1500000); where same and state in ('TN','AL','FL','MS');run;run;

http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a000999255.htm

Page 20: Introduction to SAS System Where Expressions

WHERE vs Subsetting IFWHERE vs. Subsetting IFTask Method

Make the selection in a procedure without using a WHERE expressionMake the selection in a procedure without using a preceding DATA step

WHERE expression

Take advantage of the efficiency available with an indexed data set

WHERE expression

Use one of a group of special operators, such as BETWEEN-AND, CONTAINS, IS MISSING or IS NULL, LIKE, SAME-AND, and Sounds-Like

WHERE expression

B th l ti thi th th i bl l b tti IFBase the selection on anything other than a variable value that already exists in a SAS data set. For example, you can select a value that is read from raw data, or a value that is calculated or assigned during the course f th DATA t

subsetting IF

of the DATA stepMake the selection at some point during a DATA step

rather than at the beginningsubsetting IF

Execute the selection conditionally subsetting IFExecute the selection conditionally subsetting IF

http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a001000521.htm

Page 21: Introduction to SAS System Where Expressions

Intensive Dataset GenerationIntensive Dataset Generation%macro OurCentury();%local year interest;y ;%do year = 2001 %to 2100;

%let interest = %sysfunc(compound(1,.,0.05,%eval(&year.-2001)));data work.sales&year. (drop=i randomState index=(state sales));

length state $2 stateName $20 sales 8 randomState 3;g ;do i = 1 to 2500;

randomState = round(56*rand('uniform')+0.5);if randomState <= 56 and randomState not in (3,7,14,43,52) then do;

state = fipstate(randomState);p ( )stateName = fipnameL(randomState);sales = int(rand('gaussian',1000000*&interest.,500000*&interest.));output work.sales&year.;

end;end;run;

%end;%mend OurCentury;y%OurCentury;

Page 22: Introduction to SAS System Where Expressions

Year/State DatasetsYear/State Datasets%macro SalesByYearState();%local year stateCode state;%do year = 2001 %to 2100;

%do stateCode = 1 %to 56;%if &stateCode ne 3 & &stateCode ne 7 & &stateCode ne 14 &%if &stateCode. ne 3 & &stateCode. ne 7 & &stateCode. ne 14 &

&stateCode. ne 43 & &stateCode. ne 52 %then %do;%let state = %sysfunc(fipstate(&stateCode.));data work.sales&year.&state.;

t k l &set work.sales&year.;where state = "&state.";run;

%end;;%end;

%end;%mend SalesByYearState;%SalesByYearState;%SalesByYearState;

Page 23: Introduction to SAS System Where Expressions

Year/State High Sales DatasetsYear/State High Sales Datasets%macro HighSalesByYearState();%local year stateCode state interest keepDataset;%do year = 2001 %to 2100;

%let interest = %sysfunc(compound(1,.,0.05,%eval(&year.-2001)));%do stateCode = 1 %to 56;

%if &stateCode. ne 3 & &stateCode. ne 7 & &stateCode. ne 14 & &stateCode. ne 43 &&stateCode. ne 52 %then %do;&stateCode. ne 52 %then %do;%let state = %sysfunc(fipstate(&stateCode.));%let keepDataset = 0;data work.sales&year.&state.high;

set work.sales&year.;h t t "& t t " d l > 2000000*&i t twhere state = "&state." and sales > 2000000*&interest.;

call symput('keepDataset',left('1'));run;

%if not(&keepDataset.) %then %do;proc datasets lib=work nolist;p

delete sales&year.&state.high;run; quit;

%end;%end;

%end;%end;%end;%mend HighSalesByYearState;%HighSalesByYearState;

Page 24: Introduction to SAS System Where Expressions

ConclusionConclusion

• The WHERE expression allows forThe WHERE expression allows for efficient observation processing in the DATA step and the PROC statementsDATA step and the PROC statements

• The SAS System Documentation provides specific details on the syntaxspecific details on the syntax

• Using macros increases the processing f WHERE ipower of WHERE expressions

Page 25: Introduction to SAS System Where Expressions

Contact InformationContact Information

• Mark TabladilloMark TabladilloMarkTab Consultinghttp://www marktab com/http://www.marktab.com/