Introduction to SAS Essentials Mastering SAS for Data Analytics Alan Elliott and Wayne Woodward SAS Essentials - Elliott & Woodward 1

Chapter 1: Getting Started

LEARNING OBJECTIVES

To be able to use the SAS® software program in a Windows environment.
To understand the basic information about getting data into SAS and running a SAS program.
To be able to run a simple SAS program.

The Statistics Software Landscape

SAS – largest, highly used in corporate and university research settings, has several interfaces
SPSS – IBM bought them recently – widely used in social sciences (and now more in business) – quick to learn in menu mode
STATA – relatively new, command/code oriented
JMP – a SAS product that is highly visual and menu driven
R – a user-supported programming language, free and expansive, but with a steeper learning curve
EXCEL – has some statistical functions and procedures
WINKS – simple, low-cost general-use statistics program, with a special version for time series
MINITAB – used in a number of intro stat courses

Introducing SAS

SAS is a programming language that specializes in data analytics.
There are several ways to use SAS. This book teaches you how to use SAS code, which provides the most flexibility.
We’ll be using SAS in a Windows environment, although SAS is available on most platforms.
A brief tutorial on the University Edition is in the book’s appendix.
Also, information on using the Citrix SAS (Apps) version is included in the text.

Getting Started – Download Example Files

Getting the data needed for the course: data files are available at http://www.alanelliott.com/sas

Choose one of two options to download files:
OPTION 1 – Download self-extracting files (copies SAS files to your hard disk; the default location is C:\SASDATA). Recommended.
OPTION 2 – Download a zipped version of the data files. (You'll need a zip extractor such as WINZIP to extract files using this method.)

1.1 Using SAS in a Windows Environment

SAS runs on a number of computer platforms (operating systems), including mainframes and personal computers whose operating systems are UNIX, Mac OS X, Linux, or Windows.

This book is based on using SAS in a Windows environment where you have the software installed on your local computer. The vast majority of the content in this book will apply to any SAS computer environment.

Creating a Folder for Storing Your SAS Files

Install the data and SAS programming files in your (Windows) C:\SASDATA folder.

[Figure: the C:\SASDATA folder on your hard disk, accessed by the SAS program]

We’ll be accessing SAS files from the C:\SASDATA folder on your computer.

If you are using the University Edition, consult Appendix E regarding the location of the data files. For other versions of SAS, see Table 1.1 on page 4 for information on accessing files.
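
One way to point SAS at this folder from code is a LIBNAME statement. The following is only a sketch: it assumes the example files were installed in the default C:\SASDATA location, and the library name MYLIB is a hypothetical name chosen for illustration.

```sas
/* Assumption: the example files are installed in C:\SASDATA.      */
/* MYLIB is a hypothetical library name used only in this sketch.  */
LIBNAME MYLIB "C:\SASDATA";

/* List the SAS data sets stored in that folder (names only). */
PROC CONTENTS DATA=MYLIB._ALL_ NODS;
RUN;
```

After the LIBNAME statement runs, the library should also be visible in the Explorer tab of the SAS window.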

Understanding the SAS Interface

[Figure: the main SAS window. Note the tabs, the Log window, the Program Editor, and the Explorer and Results tabs.]

SAS Editor

Also called the Enhanced Editor or Windows Programming Editor (WPGM), this is the area where you write SAS code. It is like a simple word processor. When you open a previously saved SAS program, its contents will appear in this window.

SAS code is stored in plain ASCII text, so files saved in the ASCII format from any other editor or word processor may be easily opened in this editor.

You can also copy (or cut) text from another editor or word processor and paste it into the Editor window.

SAS Log

When you run a SAS program, a report detailing how (and if) the program ran appears in the LOG window. Typically, when you run a SAS program, you will first look at the contents of the LOG window to see if there were any errors in the program. Errors appear in red. You should also look for warnings and other notes in the LOG window that tell you that some part of your program may not have run correctly.

SAS Explorer/Results Window

This window appears at the left of the screen and contains two tabs:
The Results tab displays a tree listing of your output, making it easy to quickly scroll to any place in your output listing.
The Explorer tab displays the available SAS libraries. These libraries are where SAS data files are stored.

Graph Window

If your SAS program creates graphic output, SAS will display a Graph window that contains the SAS graph. It is usually displayed automatically. If it does not appear, click on the Graph tab to display this window.

Results Viewer

Beginning with SAS 9.3, SAS displays output from your SAS analysis in HTML format in the Results Viewer. We’ll discuss more about how to control this output in the section on the Output Delivery System (ODS).

Oops… What if you close one of these windows?

Reopen it by going to the View pull-down menu and selecting the window you want to reopen.

1.2 Your First SAS Analysis

In SAS, choose File/Open Program… and open the program named C:\SASDATA\First.SAS

Note Sections in Code

This section defines the data for the analysis

This section defines an analysis

Defining a Data Set

We’ll discuss this in more detail later, but notice that this code defines a data set named EMPLOYEES that consists of two variables, GENDER and AGE.
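
As a rough sketch of what such a data-definition section can look like, the DATA step below creates a data set named EMPLOYEES with the two variables GENDER and AGE. The data values are made up for illustration and are not the contents of First.SAS.

```sas
/* Illustrative values only; this is not the actual content of First.SAS. */
DATA EMPLOYEES;
   INPUT GENDER $ AGE;   /* GENDER is character ($), AGE is numeric */
   DATALINES;
M 41
F 36
M 58
F 29
;
RUN;
```

The dollar sign after GENDER tells SAS to read that variable as text, and the lone semicolon after the last data line marks the end of the data.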

Requesting an Analysis

The second section of the code requests an analysis. In this case, the requested analysis is called “MEANS” (which indicates simple statistics), and the CLASS (classification or grouping) of the data is requested by GENDER.
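
A request of this kind might look like the sketch below. So that it runs on its own, it assumes SASHELP.CLASS, a sample data set shipped with SAS, in place of EMPLOYEES; its grouping variable is named SEX rather than GENDER.

```sas
/* Assumption: SASHELP.CLASS (a sample data set shipped with SAS)  */
/* stands in for EMPLOYEES so this sketch is self-contained.       */
PROC MEANS DATA=SASHELP.CLASS;
   CLASS SEX;   /* report statistics separately for each value of SEX */
   VAR AGE;     /* analyze the numeric variable AGE */
RUN;
```

With the EMPLOYEES data set, the same pattern would use DATA=EMPLOYEES and CLASS GENDER.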

Run the code

To tell SAS to run (Submit) this code, click on the “Running Man” icon or select Run/Submit.

Output Created, Results Window

The output created by this simple program displays simple statistics for the numeric variable (AGE) in the data set, by GENDER. This type of output is HTML.

What to do with output

You can print from either window.
Save the output.
Copy and paste the results into Word, PowerPoint, etc.
Later, we’ll see how to automatically output the analyses to .DOC, .PDF, etc.

Do Hands-On Example (p. 9)

Make sure you are in the Editor window.
Open Second.SAS.
Add a line to the program.
Run the program and observe the output.

Results

1.3 How SAS Works

1. Define a data set in SAS using the DATA step (which begins with the keyword DATA). In this case, the data values are a part of the code (although this is not always the case). The data values to be used in this analysis follow the keyword DATALINES.

2. Once you have a data set of values, you can tell SAS what analysis to perform using a procedure (PROC) statement. In this case, the keywords PROC MEANS initiate the "MEANS" procedure.

3. Run the program and observe the output (in the Results Viewer).

1.4 Tips and Tricks for Running SAS

Within a SAS program, each statement begins with an identifying keyword (DATA, PROC, INPUT, DATALINES, RUN, etc.) and ends with a semicolon “;”. For example:

DATA TEMP;
PROC PRINT DATA=TEMP;
RUN;

All three lines start with a SAS keyword…

SAS Tips continued…

Statements can begin anywhere and end anywhere.
A statement can continue over several lines; it ends with a semicolon.
Several statements may be on the same line.
Blanks (as many as you want, but at least one) separate the components (words) in a SAS program statement.
Case (lower and upper) doesn’t matter in most SAS statements.
Case does make a difference in data and quoted information (such as M or m for “MALE” or “male”).
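
The layout and case rules above can be tried out with a small sketch like the following. The data set name TEMP and the data values are illustrative assumptions. The three steps use deliberately different layouts and letter case, and SAS should accept all of them.

```sas
/* Several statements on one line, keywords in lowercase. */
data temp; input gender $ age; datalines;
M 34
F 29
;
run;

/* One statement spread over several lines, with extra blanks. */
PROC
   PRINT
      DATA = temp ;
RUN ;

/* Case does matter inside data and quoted text. */
proc print data=TEMP;
   where gender = "M";   /* matches M but would not match m */
run;
```

The first two layouts are legal but hard to read, which is why a consistent, indented style is usually preferred.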

SAS Tips Continued…

The most common error in SAS programming is a misplaced (or missing) semicolon.
A second common error is a missing RUN; statement.
A third common error in a SAS program is the presence of unbalanced quotation marks.
Look for errors in a program log from the top down.
If program errors cause problems that result in SAS “freezing up” or not completing the steps in your program, a way to stop SAS from continuing to run is to press CTRL-Break and select the “Cancel Submitted Statements” option.
If you cannot resolve a problem within SAS, save your files, exit the SAS program, and restart.
Make the structure of your SAS programs easy to read.
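
As an illustration of these tips, the sketch below lays out a short program so that the three common trouble spots are easy to check by eye. The data set name, data values, and title text are assumptions made for this example.

```sas
/* Illustrative data; the names and values are made up for this sketch. */
DATA TEMP;
   INPUT GENDER $ AGE;
   DATALINES;
M 34
F 29
;
RUN;

TITLE "Listing of TEMP";   /* opening and closing quotation marks balance */
PROC PRINT DATA=TEMP;      /* every statement ends with a semicolon */
RUN;                       /* RUN; closes the step so that SAS executes it */
```

Putting one statement per line and indenting the statements inside each step makes a missing semicolon or RUN; statement much easier to spot.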

Enhanced Editor

Green - Comments appear in green.
Dark Blue - Major SAS commands (also called “step-boundaries”) begin with the keyword in dark blue.
Blue - Keywords that have special meaning as SAS commands appear in blue.
Yellow highlight - Data are highlighted in yellow.
Boundary Line - A section boundary line separates each step.

SAS Function Keys (Defined)

Function Key   SAS command                        Result
F2             RESHOW                             reshows window interrupted by system command
F3             END; /*GSUBMIT BUFFER=DEFAULT*/    submits SAS statements in clipboard
F4             RECALL                             recalls current SAS code to editor
F5             PROGRAM (PGM)                      displays SAS program editor window
F6             LOG                                displays SAS log window
F7             OUTPUT                             displays SAS output window
F8             ZOOM OFF;SUBMIT                    submits (runs) the current SAS program
F9             KEYS                               displays keys window
F10            NOT DEFINED
F11            COMMAND BAR                        moves cursor to command bar
F12            NOT DEFINED

Define a new F12 Key (page 13)

Step 1: Select Tools, Options, and Keys.

Step 2: Next to the blank F12 option enter this code:

CLEAR LOG; ODSRESULTS; CLEAR; WPGM;

(Press the Enter key to lock in the new command.)

Step 3: Exit the Key Window and try out this function key by re-running one of the previous examples. With the output displayed on the screen, press F12. The output will be cleared, the log file information will be cleared, and the editor window will be displayed, still containing the current program code. Thus, this command allows you to quickly clear the log and output windows while keeping the program code.

SAS Menus (Change depending on Window)

File Menu - Used for opening and saving files, and for printing.
Edit Menu - Used to copy, cut and paste text, as well as to find and replace text.
View Menu - Allows you to go back and forth between viewing SAS editor, log and output windows.
Tools - Allows you to open programs for graphic, image, and text editing, along with other options available to customize the program to your preferences.
Run - Allows you to run (submit) a SAS program and also contains options for accessing remote SAS options.
Solutions - Contains links to SAS programs that allow you to interactively design, select, and perform analyses.
Window - Contains options found in most Windows programs that allow you to choose display strategies for opened windows such as tiled, cascade, etc. Also allows you to select a particular window to open such as Log, Output, etc.
Help - Contains options for the SAS Help System as well as online documentation.

Common File Extensions

SAS Code File (filename.sas): This is an ASCII text file and may be edited using the SAS Editor, Notepad, or any text editor that can read an ASCII file.

SAS Log File (filename.log): This ASCII text file contains information such as errors, warnings, and data set details for a submitted SAS program.

SAS Results File (filename.mht or filename.html): This file contains the web-formatted output such as that displayed in the Results Viewer. HTML stands for Hyper-Text Markup Language and is the common language of Internet web files.

MHT is short for MHTML and stands for Microsoft (or MIME) Hypertext Archive file. It is a type of HTML file that contains the entire HTML-coded information in a single file (whereas HTML files may access external files for some components such as graphs).

More extensions

SAS Data File (filename.sas7bdat): This file contains a SAS data set that includes variable names, labels, and the results of calculations you may have performed in a DATA step. You cannot open or read this file except in SAS or in a few other programs that can read SAS data files.

Raw Data Files (filename.dat or filename.txt or filename.csv): These ASCII (text) files contain raw data that can be imported into SAS or edited in an editor such as Notepad.

Excel File (filename.xls or filename.xlsx): The data in a Microsoft Excel file (when properly formatted into a table of columnar data) can be imported into SAS. (We’ll discuss data file types that can be imported into SAS in Chapter 3.)
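
To make the idea of a raw data file concrete, here is a small self-contained sketch. It first writes a tiny comma-separated file to a temporary location and then reads it back with a DATA step. The fileref RAWDEMO, the data set name FROMCSV, and the data values are all hypothetical. With a real file you would skip the first step and point the fileref at the file's path.

```sas
/* Write a tiny comma-separated file to a temporary location */
/* (illustrative values only).                               */
FILENAME RAWDEMO TEMP;
DATA _NULL_;
   FILE RAWDEMO;
   PUT "GENDER,AGE";
   PUT "M,34";
   PUT "F,29";
RUN;

/* Read the raw text file into a SAS data set. */
DATA FROMCSV;
   INFILE RAWDEMO DSD FIRSTOBS=2;   /* DSD: comma-delimited; start at line 2 to skip the header */
   INPUT GENDER $ AGE;
RUN;

PROC PRINT DATA=FROMCSV;
RUN;
```

This is only one way to read a text file; it is shown here simply to illustrate what "raw data" looks like to SAS.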

Help -> SAS Help and Documentation

1.5 Summary, Chapter 1

This chapter provided an overview of SAS and examples of how to run an existing SAS program. In the following chapters, we will discuss the components of a SAS program, including how to enter data, how to request analyses, and how to format and read output.

Continue to Chapter 2: Getting Data Into SAS.