Upload
dodan
View
216
Download
0
Embed Size (px)
Citation preview
Click to add text
Easy Come, Easy Go —
Interactions between the
DATA Step and External Files
Andrew T. Kuligowski,
HSN
For the good folks in Toronto –
June, 2015
Interactions Between the DATA Step and
External Files
2
Interactions … External Files
Introduction
• Introduction
• Defining the external data source
• Describing the appearance of the
data
• Practical Application
• Conclusion
Interactions Between the DATA Step and
External Files
3
Interactions … External Files
Introduction
This presentation will provide basic
answers to two questions:
• How do I get my data into SAS for analysis?
• How do I get my data out of SAS?
Interactions Between the DATA Step and
External Files
4
Interactions … External Files
Introduction
Process of moving data from an external
file to SAS, and from SAS to an external
source, are mirror images of each other.
(Almost)
Interactions Between the DATA Step and
External Files
5
Interactions … External Files
Introduction
You need to know the answer to two
questions in order to code properly.
• Where is this external source /
destination? (and, is there anything
significant about this file?)
• What does the data look like coming in
/ going out?
Interactions Between the DATA Step and
External Files
6
Interactions … External Files
Introduction
"The time has come," the Walrus said,"To talk of many things:Of shoes--and ships--and sealing-wax--Of cabbages--and kings--And why the sea is boiling hot--And whether pigs have wings."
Lewis Carroll
"I am the Walrus."
John Lennon
Interactions Between the DATA Step and
External Files
7
Interactions … External Files
Defining the External Data Source
Two statements that will identify an
external file to DATA step (and possibly
provide a little information about its
form).
• INFILE statement
• FILE statement
Interactions Between the DATA Step and
External Files
8
Interactions … External Files
Defining the External Data Source
Most straightforward usage – hardcode
the file name right in the statement.
DATA SAMPLE ;
/* full file name - under Windows */
INFILE 'c:\sasconf\sasconf.dat';
Self-documenting code!
Requires coding change to point
to a different dataset.
PLUS
MINUS
Interactions Between the DATA Step and
External Files
9
Interactions … External Files
Defining the External Data Source
Variation – use a macro variable.
%LET FILENM = c:\sasconf\sasconf.dat;
DATA SAMPLE ;
/* macro file name - under Windows */
INFILE “&FILENM”;
Of course, this isn’t much more flexible.
However, with a little creative coding,
this approach can be made
very flexible.
Interactions Between the DATA Step and
External Files
10
Interactions … External Files
Defining the External Data Source
More flexibility with FILEVAR statement.
%LET FileNm = %STR(SASCONF);
DATA SAMPLE ;
RETAIN fn “&FileNm" ;
File_in = TRIM( fn ) || ".dat" ;
File_out = TRIM( fn ) || ".out" ;
INFILE dummy FILEVAR=File_in ;
FILE dummy2 FILEVAR=File_out ;
Can be hardcoded, or it can
change from obs to obs.
Interactions Between the DATA Step and
External Files
11
Interactions … External Files
Defining the External Data Source
More flexibility with FILEVAR statement.
%LET FileNm = %STR(SASCONF);
DATA SAMPLE ;
RETAIN fn “&FileNm" ;
File_in = TRIM( fn ) || ".dat" ;
File_out = TRIM( fn ) || ".out" ;
INFILE dummy FILEVAR=File_in ;
FILE dummy2 FILEVAR=File_out ;
Can be hardcoded, or it can
change from obs to obs.File Reference
Set to “dummy” to
show these are not
actually used by the
routine when FILEVAR
used, but different from
each other to show they
must be unique.
Interactions Between the DATA Step and
External Files
12
Interactions … External Files
Defining the External Data Source
Defining a File Reference
(1) In the Operating System:
//SAMPDATA DD DSN=SAS.GLOBAL.FORUM.SAMPDAT,
// DISP=SHR
Example uses MVS – shows why File
References are also called
“DDNAMES”.
Interactions Between the DATA Step and
External Files
13
Interactions … External Files
Defining the External Data Source
Defining a File Reference
(2) Using the FILENAME statement:
FILENAME confdata 'c:\sasconf\sasconf.dat';
Example uses Windows.
Interactions Between the DATA Step and
External Files
14
Interactions … External Files
Defining the External Data Source
Defining a File Reference
(2) Using the FILENAME statement
(2a) Special options for FILENAME:
FILENAME <fileref> CLEAR;
Remove / release specified File Ref.
FILENAME <fileref> LIST;
Print a list of active File Refs to SASLOG.
Interactions Between the DATA Step and
External Files
15
Interactions … External Files
Defining the External Data Source
Defining a File Reference
(2) Using the FILENAME statement
(2a) Special options for FILENAME:
(2ai) Special pseudo File Reference:
FILENAME _ALL_ LIST;
FILENAME _ALL_ CLEAR;
Perform action on all active File Refs
Interactions Between the DATA Step and
External Files
16
Interactions … External Files
Defining the External Data Source
Defining a File Reference
(3) Special “Aggregate File Reference”
MVS: Partition Dataset
FILENAME confdata
'userid.sasconf.data';
DATA SAS_Conf07;
INFILE confdata(sascon07);
Interactions Between the DATA Step and
External Files
17
Interactions … External Files
Defining the External Data Source
Defining a File Reference
(3) Special “Aggregate File Reference”
MVS: Partition Dataset
Unix: Directory
Windows: DirectoryFILENAME confdata 'c:\sasconf';
DATA SAS_Conf07;
INFILE confdata(sascon07);
Interactions Between the DATA Step and
External Files
18
Interactions … External Files
Defining the External Data Source
Other related functions …
FILEREFRetCd2 = FILEREF( "confdat2" );
Return
Code Meaning0 File Reference is assigned
>0 (+) File Reference not currently assigned
<0 (-) File Reference assigned, but file it
refers to does not (currently) exist.
Interactions Between the DATA Step and
External Files
19
Interactions … External Files
Defining the External Data Source
Other related functions …
FILEEXISTIF FILEEXIST("c:\sasconf\sasconf2.dat")
Return
Code Meaning0 External File does not exist
1 External File does exist
Interactions Between the DATA Step and
External Files
20
Interactions … External Files
Defining the External Data Source
Other related functions …
FEXISTIF FEXIST("confdat2" ) THEN DO;
Return
Code Meaning0 External File does not exist
1 External File does exist
Interactions Between the DATA Step and
External Files
21
Interactions … External Files
Defining the External Data Source
Specialized File Ref. – Input Data Only
DATALINES
• Data can be found instream at end of
DATA step, denoted by DATALINES
statement.
• Instream data concluded by semicolon.
Interactions Between the DATA Step and
External Files
22
Interactions … External Files
Defining the External Data Source
Specialized File Ref. – Input Data Only
DATALINESDATA SAS_Conf07;
INFILE DATALINES; /* optional for DATALINES */
INPUT @ 1 Start_Dt MMDDYY8.
@ 10 End_Dt MMDDYY8.
@ 19 ConfName $CHAR16.
@ 36 ConfLoc $CHAR16. ;
DATALINES;
04/16/07 04/19/07 SAS Global Forum Orlando FL
09/16/07 09/18/07 PNWSUG Seattle WA
09/30/07 10/02/07 SCSUG Austin TX
11/04/07 11/06/07 SESUG Hilton Head SC
;
Interactions Between the DATA Step and
External Files
23
Interactions … External Files
Defining the External Data Source
Specialized File Ref. – Input Data Only
DATALINES
• Assumed to becard-image (80 bytes)
• Can override with systemoption NOCARDIMAGE.
Interactions Between the DATA Step and
External Files
24
Interactions … External Files
Defining the External Data Source
Specialized File Ref. – Input Data Only
DATALINES
• Alias = CARDS.
• Can also use DATALINES4(or CARDS4) if data mightcontain semicolon.
Interactions Between the DATA Step and
External Files
25
Interactions … External Files
Defining the External Data Source
Specialized File Ref. – Input Data OnlyDATALINES vs. CARDS
18 DATA _NULL_;
19 SET SAS_Conf07;
20 FILE CARDS;
NOTE: The file CARDS cannot be opened
for UPDATE processing.
21 PUT _N_= ConfName ;
22 RUN;
NOTE: The SAS System stopped processing
this step because of errors.
Interactions Between the DATA Step and
External Files
26
Interactions … External Files
Defining the External Data Source
Specialized File Ref. – Input Data OnlyDATALINES vs. CARDS
62 DATA _NULL_;
63 SET SAS_Conf07;
64 FILE DATALINES;
65 PUT _N_= ConfName ;
66 RUN;
_N_=1 SAS Global Forum
… … … …
_N_=8 NESUG
NOTE: There were 8 observations read
from the data set WORK.SAS_CONF07.
Where’s the
error message?
Interactions Between the DATA Step and
External Files
27
Interactions … External Files
Defining the Appearance of the Data
Two questions (from early in this talk):
Where is the data / where is it going?
What does the data look like?
5 types of INPUT and PUT, although SAS
only counts 4 of them as separate types.
• LIST input
• FORMATTED input
• COLUMN input
• NAMED input
• NULL input
Interactions Between the DATA Step and
External Files
28
Interactions … External Files
Defining the Appearance of the Data
•Two questions (from early in this talk):
Where is the data / where is it going?
• What does the data look like?
•5 types of INPUT and PUT, although SAS
only counts 4 of them as separate types.
• LIST input
• FORMATTED input
• COLUMN input
• NAMED input
• NULL input
Interactions Between the DATA Step and
External Files
29
Interactions … External Files
Defining the Appearance of the Data
LIST Input (and Output)
• Simply insert variables in
statement.
• No descriptive information
necessary (except “$”
denoting character vars.)
• Blank space separates values.
Interactions Between the DATA Step and
External Files
30
Interactions … External Files
Defining the Appearance of the Data
LIST Input (and Output)
Example of non-working codeDATA SAS_Conf07;
INFILE DATALINES ;
INPUT Start_Dt
End_Dt
ConfName $
ConfLoc $ ;
DATALINES;
04/16/07 04/19/07 SAS Global Forum Orlando FL
06/03/07 06/06/07 PharmaSUG Denver CO
09/16/07 09/18/07 PNWSUG Seattle WA
09/30/07 10/02/07 SCSUG Austin TX
10/17/07 10/19/07 WUSS San Francisco CA
10/28/07 10/30/07 MWSUG Des Moines IA
11/04/07 11/06/07 SESUG Hilton Head SC
11/11/07 11/14/07 NESUG Baltimore MD
;
PROBLEM: Dates have
specialized formats.
Interactions Between the DATA Step and
External Files
31
Interactions … External Files
Defining the Appearance of the Data
LIST Input (and Output)Example of non-working codeDATA SAS_Conf07;
INFORMAT Start_Dt End_Dt mmddyy8. ;
INFILE DATALINES ;
INPUT Start_Dt
End_Dt
ConfName $
ConfLoc $ ;
DATALINES;
04/16/07 04/19/07 SAS Global Forum Orlando FL
06/03/07 06/06/07 PharmaSUG Denver CO
09/16/07 09/18/07 PNWSUG Seattle WA
09/30/07 10/02/07 SCSUG Austin TX
10/17/07 10/19/07 WUSS San Francisco CA
10/28/07 10/30/07 MWSUG Des Moines IA
11/04/07 11/06/07 SESUG Hilton Head SC
11/11/07 11/14/07 NESUG Baltimore MD
;
SOLUTION: Use INFORMAT statement.
Interactions Between the DATA Step and
External Files
32
Interactions … External Files
Defining the Appearance of the Data
LIST Input (and Output)Example of non-working codeDATA SAS_Conf07;
INFORMAT Start_Dt End_Dt mmddyy8. ;
INFILE DATALINES ;
INPUT Start_Dt
End_Dt
ConfName $
ConfLoc $ ;
DATALINES;
04/16/07 04/19/07 SAS Global Forum Orlando FL
06/03/07 06/06/07 PharmaSUG Denver CO
09/16/07 09/18/07 PNWSUG Seattle WA
09/30/07 10/02/07 SCSUG Austin TX
10/17/07 10/19/07 WUSS San Francisco CA
10/28/07 10/30/07 MWSUG Des Moines IA
11/04/07 11/06/07 SESUG Hilton Head SC
11/11/07 11/14/07 NESUG Baltimore MD
;
PROBLEM: Values longer
than 8 characters.
Interactions Between the DATA Step and
External Files
33
Interactions … External Files
Defining the Appearance of the Data
LIST Input (and Output)Example of non-working codeDATA SAS_Conf07;
INFORMAT Start_Dt End_Dt mmddyy8. ;
LENGTH ConfName $ 16 ConfLoc $ 16. ;
INFILE DATALINES ;
INPUT Start_Dt
End_Dt
ConfName $
ConfLoc $ ;
DATALINES;
04/16/07 04/19/07 SAS Global Forum Orlando FL
06/03/07 06/06/07 PharmaSUG Denver CO
09/16/07 09/18/07 PNWSUG Seattle WA
09/30/07 10/02/07 SCSUG Austin TX
10/17/07 10/19/07 WUSS San Francisco CA
10/28/07 10/30/07 MWSUG Des Moines IA
11/04/07 11/06/07 SESUG Hilton Head SC
11/11/07 11/14/07 NESUG Baltimore MD
;
SOLUTION: UseLENGTH statement.
Interactions Between the DATA Step and
External Files
34
Interactions … External Files
Defining the Appearance of the Data
LIST Input (and Output)Example of non-working codeDATA SAS_Conf07;
INFORMAT Start_Dt End_Dt mmddyy8. ;
LENGTH ConfName $ 16 ConfLoc $ 16. ;
INFILE DATALINES ;
INPUT Start_Dt
End_Dt
ConfName $
ConfLoc $ ;
DATALINES;
04/16/07 04/19/07 SAS Global Forum Orlando FL
06/03/07 06/06/07 PharmaSUG Denver CO
09/16/07 09/18/07 PNWSUG Seattle WA
09/30/07 10/02/07 SCSUG Austin TX
10/17/07 10/19/07 WUSS San Francisco CA
10/28/07 10/30/07 MWSUG Des Moines IA
11/04/07 11/06/07 SESUG Hilton Head SC
11/11/07 11/14/07 NESUG Baltimore MD
;
PROBLEM: Embedded Blanks
in character strings.
Interactions Between the DATA Step and
External Files
35
Interactions … External Files
Defining the Appearance of the Data
LIST Input (and Output)Example of non-working codeDATA SAS_Conf07;
INFORMAT Start_Dt End_Dt mmddyy8. ;
LENGTH ConfName $ 16 ConfLoc $ 16. ;
INFILE DATALINES ;
INPUT Start_Dt
End_Dt
ConfName $ &
ConfLoc $ & ;
DATALINES;
04/16/07 04/19/07 SAS Global Forum Orlando FL
06/03/07 06/06/07 PharmaSUG Denver CO
09/16/07 09/18/07 PNWSUG Seattle WA
09/30/07 10/02/07 SCSUG Austin TX
10/17/07 10/19/07 WUSS San Francisco CA
10/28/07 10/30/07 MWSUG Des Moines IA
11/04/07 11/06/07 SESUG Hilton Head SC
11/11/07 11/14/07 NESUG Baltimore MD
;
SOLUTION: Use “&” format
modifiers.
Interactions Between the DATA Step and
External Files
36
LIST Input (and Output)
QUICK ASIDE – Format Modifiers
& Ampersand Permit single embedded blanks in character variables.
: Colon Ignore its default 8-character maximum on character variables.
? Question Mark
Bypass error processing should invalid data be expected andacceptable to the application - for example, attempting to reada character field into a numeric variable. Suppress the“Invalid Data” message, allowing the other indications of anerror to be dealt with normally.
?? Double Question Mark
Bypass error processing AND conceal every indication of anerror. The “Invalid Data” message and the echoing of theinput line containing the bad data are omitted from theSASLOG, and the _ERROR_ variable is not set to “true”(numeric 1).
~ Tilde Read delimiters within quoted character values as characters, and retains the quotation marks.
Interactions … External Files
Defining the Appearance of the Data
Interactions Between the DATA Step and
External Files
37
Interactions … External Files
Defining the Appearance of the Data
LIST Input (and Output)Example of non-working codeDATA SAS_Conf07;
INFORMAT Start_Dt End_Dt mmddyy8. ;
LENGTH ConfName $ 16 ConfLoc $ 16. ;
INFILE DATALINES ;
INPUT Start_Dt
End_Dt
ConfName $ &
ConfLoc $ & ;
DATALINES;
04/16/07 04/19/07 SAS Global Forum Orlando FL
06/03/07 06/06/07 PharmaSUG Denver CO
09/16/07 09/18/07 PNWSUG Seattle WA
09/30/07 10/02/07 SCSUG Austin TX
10/17/07 10/19/07 WUSS San Francisco CA
10/28/07 10/30/07 MWSUG Des Moines IA
11/04/07 11/06/07 SESUG Hilton Head SC
11/11/07 11/14/07 NESUG Baltimore MD
;
SOLUTION: Use “&” format
modifiers.SOLUTION: Move CONFLOC
variable 1 position to the right.
Interactions Between the DATA Step and
External Files
38
Interactions … External Files
Defining the Appearance of the Data
LIST Input (and Output)Example of non-working codeDATA SAS_Conf07;
INFORMAT Start_Dt End_Dt mmddyy8. ;
LENGTH ConfName $ 16 ConfLoc $ 16. ;
INFILE DATALINES ;
INPUT Start_Dt
End_Dt
ConfName $ &
ConfLoc $ & ;
ConfName = TRANSLATE( ConfName, ' ', '-' );
ConfLoc = TRANSLATE( ConfLoc, ' ', '-' );
DATALINES;
04/16/07 04/19/07 SAS Global Forum Orlando-------FL
06/03/07 06/06/07 PharmaSUG Denver--------CO
09/16/07 09/18/07 PNWSUG Seattle-------WA
09/30/07 10/02/07 SCSUG Austin--------TX
10/17/07 10/19/07 WUSS San Francisco-CA
10/28/07 10/30/07 MWSUG Des Moines----IA
11/04/07 11/06/07 SESUG Hilton Head---SC
SOLUTION: Use “&” format
modifiers.SOLUTION: Move CONFLOC
variable 1 position to the right.SOLUTION: Insert dashes to
separate city from state,
change them back to blanks.
Interactions Between the DATA Step and
External Files
39
Interactions … External Files
Defining the Appearance of the Data
LIST Input (and Output)Example of non-working codeDATA SAS_Conf07;
INFORMAT Start_Dt End_Dt mmddyy8. ;
LENGTH ConfName $ 16 ConfLoc $ 16. ;
INFILE DATALINES ;
INPUT Start_Dt
End_Dt
ConfName $ &
ConfLoc $ & ;
ConfName = TRANSLATE( ConfName, ' ', '-' );
ConfLoc = TRANSLATE( ConfLoc, ' ', '-' );
DATALINES;
04/16/07 04/19/07 SAS Global Forum Orlando-------FL
06/03/07 06/06/07 PharmaSUG Denver--------CO
09/16/07 09/18/07 PNWSUG Seattle-------WA
09/30/07 10/02/07 SCSUG Austin--------TX
10/17/07 10/19/07 WUSS San Francisco-CA
10/28/07 10/30/07 MWSUG Des Moines----IA
11/04/07 11/06/07 SESUG Hilton Head---SC
SOLUTION: Use “&” format
modifiers.SOLUTION: Move CONFLOC
variable 1 position to the right.SOLUTION: Insert dashes to
separate city from state,
change them back to blanks.
Interactions Between the DATA Step and
External Files
40
Interactions … External Files
Defining the Appearance of the Data
Formatted Input (and Output)
• Can specify informat / format for
variables.(What does each variable look like?)
• Can specify starting position in record.(Where is each variable on each record?)
Interactions Between the DATA Step and
External Files
41
Interactions … External Files
Defining the Appearance of the Data
Formatted Input (and Output)
Informats and Formats
Handles length and
appearance of data
Interactions Between the DATA Step and
External Files
42
Interactions … External Files
Defining the Appearance of the Data
Formatted Input (and Output)
Informats and Formats
5 types of Informat / Format:
• Character
• Numeric
• Date / Time
• Column Binary
• User-Defined
Not covered in
this presentation.Not covered in
this presentation.
PROC FORMAT
Interactions Between the DATA Step and
External Files
43
Interactions … External Files
Defining the Appearance of the Data
Formatted Input (and Output)
Informats and Formats
Formats can be assigned with:
• INFORMAT statement
(or FORMAT statement)
• ATTRIB statement, using INFORMAT=
FORMAT=
• Specifying on INPUT / PUT
Interactions Between the DATA Step and
External Files
44
Interactions … External Files
Defining the Appearance of the Data
Formatted Input (and Output)
Column Pointers
Two types of Column Pointers
• Relative Column Pointers +
• Absolute Column Pointers @
Interactions Between the DATA Step and
External Files
45
Interactions … External Files
Defining the Appearance of the Data
Formatted Input (and Output)
Column Pointers
File: 'c:\sasconf\sascon07.dat'
04/16/07 04/19/07 SAS Global Forum Orlando FL
06/03/07 06/06/07 PharmaSUG Denver CO
09/16/07 09/18/07 PNWSUG Seattle WA
09/30/07 10/02/07 SCSUG Austin TX
10/17/07 10/19/07 WUSS San Francisco CA
10/28/07 10/30/07 MWSUG Des Moines IA
11/04/07 11/06/07 SESUG Hilton Head SC
11/11/07 11/14/07 NESUG Baltimore MD
Interactions Between the DATA Step and
External Files
46
Interactions … External Files
Defining the Appearance of the Data
Formatted Input (and Output)
Column Pointers – Relative
DATA SAS_Conf07;
RETAIN pt_loc 18 ;
INFILE 'c:\sasconf\sascon07.dat';
INPUT Start_Dt mmddyy8. +1
End_Dt mmddyy8. +1
ConfName $char16. +1
ConfLoc $char16. ;
FILE 'c:\sasconf\sascon07a.dat';
PUT Start_Dt date7. +10
ConfName $char16. +(-25)
End_Dt date7. +(pt_loc)
ConfLoc $char16. ;
RUN;
Basic example
on INPUT
statement.
More advanced
example on PUT
statement.
Interactions Between the DATA Step and
External Files
47
Interactions … External Files
Defining the Appearance of the Data
Formatted Input (and Output)
Column Pointers – Absolute
DATA SAS_Conf07;
RETAIN pt_loc 36 ;
INFILE 'c:\sasconf\sascon07.dat';
INPUT @ 1 Start_Dt mmddyy8.
@ 10 End_Dt mmddyy8.
@ 19 ConfName $char16.
@ 36 ConfLoc $char16. ;
FILE 'c:\sasconf\sascon07b.dat';
PUT @ 1 Start_Dt mmddyy8.
@ 19 ConfName $char16.
@ 10 End_Dt mmddyy8.
@ pt_loc ConfLoc $char16.;
RUN;
Basic example
on INPUT
statement.
More advanced
example on PUT
statement.
Interactions Between the DATA Step and
External Files
48
Interactions … External Files
Defining the Appearance of the Data
Formatted Input (and Output)
Column Pointers – Absolute – Character
DATA SAS_Conf07;
RETAIN pt_loc "LOC:" ;
INFILE DATALINES ;
INPUT @ 1 Start_Dt mmddyy8.
@ '07 ' ConfName $char16.
@ 10 End_Dt mmddyy8.
@ pt_loc ConfLoc $char16. ;
DATALINES;
04/16/07 04/19/07 SAS Global Forum LOC:Orlando FL
06/03/07 06/06/07 PharmaSUG LOC:Denver CO
09/16/07 09/18/07 PNWSUG LOC:Seattle WA
09/30/07 10/02/07 SCSUG LOC:Austin TX
11/04/07 11/06/07 SESUG LOC:Hilton Head SC
;
Even more
advanced
example
Interactions Between the DATA Step and
External Files
49
Interactions … External Files
Defining the Appearance of the Data
Formatted Input (and Output)
Column Pointers – Absolute – Character
Only use a pointer with a character string
on INPUT. Try it with PUT, and you get:ERROR: The @'CHARACTER_STRING' or
@CHARACTER_VARIABLE specification is
valid on the PUT statement only in
conjunction with the FILE ODS option.
The execution of the DATA STEP is
being terminated.
Interactions Between the DATA Step and
External Files
50
Interactions … External Files
Defining the Appearance of the Data
Formatted Input (and Output)
Line Pointers
Two types of Column Pointers
• Relative Column Pointers /
• Absolute Column Pointers #
Interactions Between the DATA Step and
External Files
51
Interactions … External Files
Defining the Appearance of the Data
Formatted Input (and Output)
Line Pointers – Absolute
DATA SAS_Conf07;
RETAIN line_loc 2 ;
INFILE DATALINES ;
INPUT #1 Start_Dt mmddyy8.
#line_loc End_Dt mmddyy8. +1
ConfName $char16.
#(line_loc+1) ConfLoc $char16. ;
… … …
DATALINES;
04/16/07
04/19/07 SAS Global Forum
Orlando FL
09/16/07
09/18/07 PNWSUG
… … …
Interactions Between the DATA Step and
External Files
52
Interactions … External Files
Defining the Appearance of the Data
Formatted Input (and Output)
Line Pointers – Relative
DATA SAS_Conf07;
… … …
FILE 'c:\sasconf\conftbl.dat' ;
PUT @ 1 ConfName $char16.
@ 21 Start_Dt date7.
" to“
/ @ 1 ConfLoc $char16.
@ 21 End_Dt date7.
/ 28*'-';
DATALINES;
… … …
;
SAS Global Forum 16APR07 to
Orlando FL 19APR07
----------------------------
PNWSUG 16SEP07 to
Seattle WA 18SEP07
----------------------------
SCSUG 30SEP07 to
Austin TX 02OCT07
----------------------------
SESUG 04NOV07 to
Hilton Head SC 06NOV07
----------------------------
Interactions Between the DATA Step and
External Files
53
Interactions … External Files
Defining the Appearance of the Data
Column Input (and Output)
• List each variable with its start and end
position.
Interactions Between the DATA Step and
External Files
54
Interactions … External Files
Defining the Appearance of the Data
Column Input (and Output)
• List each variable with its start and end
position.
• No flexibility to use variables to specify
start and end column.
• Cannot specify formats.
Interactions Between the DATA Step and
External Files
55
Interactions … External Files
Defining the Appearance of the Data
Column Input (and Output)
Working Code
1 DATA SAS_Conf07;
2 INFILE 'c:\sasconf\sascon07.dat' ;
3 INPUT Start_DtC $ 1 - 8
4 End_DtC $ 10 – 17
5 ConfName $ 19 – 34
6 ConfLoc $ 36 - 51 ;
7 Start_Dt = INPUT( Start_DtC, mmddyy8. ) ;
8 End_Dt = INPUT( End_DtC, mmddyy8. ) ;
9 RUN;
NOTE: The data set WORK.SAS_CONF07 has 8 observations
and 6 variables.
Interactions Between the DATA Step and
External Files
56
Interactions … External Files
Defining the Appearance of the Data
Column Input (and Output)
Non-Working Code – Flexible Columns
40 DATA SAS_Conf07;
41 RETAIN NamePos 19 ;
42 INFILE 'c:\sasconf\sascon07.dat' ;
43 INPUT Start_DtC $ 1 - 8
44 End_DtC $ 10 – 17
45 ConfName $ NamePos – 34
--
22
ERROR 22-322: Expecting a name.
46 ConfLoc $ 36 - 51 ;
ERROR: Missing numeric suffix on a numbered variable
list (NamePos-NAME).
…
Interactions Between the DATA Step and
External Files
57
Interactions … External Files
Defining the Appearance of the Data
Column Input (and Output)
Non-Working Code – Flexible Formats
61 DATA SAS_Conf07;
62 INFILE 'c:\sasconf\sascon07.dat' ;
63 INPUT Start_Dt mmddyy8. $ 1 - 8
-
22
ERROR 22-322: Expecting a name.
64 End_DtC $ 10 – 17
65 ConfName $ NamePos – 34
66 ConfLoc $ 36 - 51 ;
…
Interactions Between the DATA Step and
External Files
58
Interactions … External Files
Defining the Appearance of the Data
Column Input (and Output)
Non-Working Code – Flexible Formats
81 DATA SAS_Conf07;
82 INFILE 'c:\sasconf\sascon07.dat' ;
83 INPUT Start_Dt $ 1 - 8 mmddyy8.
--------
22
ERROR 22-322: Expecting a name.
84 End_DtC $ 10 – 17
85 ConfName $ NamePos – 34
86 ConfLoc $ 36 - 51 ;
…
Interactions Between the DATA Step and
External Files
59
Interactions … External Files
Defining the Appearance of the Data
Column Input (and Output)
Non-Working Code – Flexible Formats
174 FORMAT Start_Dt End_Dt DATE7. ;
175 FILE LOG ;
176 PUT Start_Dt 1 - 8
177 End_Dt 10 – 17
178 ConfName 19 – 34
179 ConfLoc $ 36 - 51 ;
180 RUN;
17272 17275 SAS Global Forum Orlando FL
17320 17323 PharmaSUG Denver CO
17425 17427 PNWSUG Seattle WA
17439 17441 SCSUG Austin TX
17456 17458 WUSS San Francisco CA
17467 17469 MWSUG Des Moines IA
17474 17476 SESUG Hilton Head SC
Interactions Between the DATA Step and
External Files
60
Interactions … External Files
Defining the Appearance of the Data
Named Input (and Output)
• Input must be in form:
Fieldname=value
• INPUT statement can
only contain Named
input if Named input is
requested.
Interactions Between the DATA Step and
External Files
61
Interactions … External Files
Defining the Appearance of the Data
Named Input (and Output)
• Input must be in form:
Fieldname=value
• INPUT statement can
only contain Named
input once Named input
usage has begun
in that statement.
Named input is
more flexible /
traditional when
using the PUT
statement.
Interactions Between the DATA Step and
External Files
62
Interactions … External Files
Defining the Appearance of the Data
Named Input (and Output)
• Input must be in form:
Fieldname=value
• Variables do not have to be found in the
same order as on the INPUT statement.
Interactions Between the DATA Step and
External Files
63
Interactions … External Files
Defining the Appearance of the Data
Named Input (and Output)
• Input must be in form:
Fieldname=value
• Variables do not even have to be on the
INPUT statement! (Variables defined
with LENGTH or ATTRIB statement will
be identified and processed,
as well.)
Interactions Between the DATA Step and
External Files
64
Interactions … External Files
Defining the Appearance of the Data
Named Input (and Output)
Example using INPUT statement
File: 'c:\sasconf\param.dat‘
85 DATA Params;
86 INFORMAT ProcessDt MMDDYY8.;
87 INFILE 'c:\sasconf\param.dat';
88 INPUT @ 1 RunType $CHAR4.
89 Analyst= $CHAR3. ;
90 RUN;
PROD ProcessDt=12/02/06 Analyst=ATK
Interactions Between the DATA Step and
External Files
65
Interactions … External Files
Defining the Appearance of the Data
Named Input (and Output)
Example using PUT statement
92 DATA _NULL_;
93 SET Params;
94 FILE LOG;
95 PUT RunType=
96 ProcessDt= DATE9. Analyst= ;
97 RUN;
RunType=PROD ProcessDt=02DEC2006 Analyst=ATK
Interactions Between the DATA Step and
External Files
66
Interactions … External Files
Defining the Appearance of the Data
Named Input (and Output)
Example using PUT statement
99 DATA _NULL_;
100 SET Params;
101 FILE LOG;
102 PUT _ALL_ ;
103 RUN;
ProcessDt=17137 RunType=PROD Analyst=ATK
_ERROR_=0 _N_=1
Interactions Between the DATA Step and
External Files
67
Interactions … External Files
Defining the Appearance of the Data
Null Input (and Output)
• The 5th type of input?
• No line pointers,
no column pointers,
no formats,
not even any variables!
Interactions Between the DATA Step and
External Files
68
"As I was going up the stair, I met a man who was not there, He wasn't there again today, I wish that man would go away."
Hugh Mearns
“… He must be with the CIA."
MAD Magazine
Interactions … External Files
Defining the Appearance of the Data
Interactions Between the DATA Step and
External Files
69
Interactions … External Files
Defining the Appearance of the Data
Null Input (and Output)
Do you
remember this
slide from
earlier in the
presentation?
Interactions Between the DATA Step and
External Files
70
Interactions … External Files
Defining the Appearance of the Data
Null Input (and Output)
There is another form of column pointer
known as the “trailing @”.
As expected, it is found at the end of a
line on an INPUT or PUT statement.
Interactions Between the DATA Step and
External Files
71
Interactions … External Files
Defining the Appearance of the Data
Null Input (and Output)
Trailing @.
Prevents line and column pointers from
automatically moving to the start of next
line after completion of INPUT statement.
Line pointer is not moved until next
INPUT statement, or next iteration of
DATA step.
Interactions Between the DATA Step and
External Files
72
Interactions … External Files
Defining the Appearance of the Data
Null Input (and Output)
Double Trailing @.
Prevents line and column pointers from
automatically moving to the next line
after completion of INPUT statement.
Line pointer is not moved until next
INPUT statement, or next iteration of
DATA step.
Interactions Between the DATA Step and
External Files
73
Interactions … External Files
Defining the Appearance of the Data
Null Input (and Output)
So … how do you restore the default
behavior of the column and line pointer?
You guessed it …
a Null INPUT / PUT statement!
Interactions Between the DATA Step and
External Files
74
Interactions … External Files
Defining the Appearance of the Data
Null Input (and Output)
DATA SAS_Conf07;
ARRAY ConfDt (2) Start_Dt End_Dt ;
INFILE 'c:\sasconf\sascon07.dat' ;
INPUT @ 19 ConfName $char16.
@ 36 ConfLoc $char16.
@ 1 @ ;
DO i = 1 TO 2 ;
INPUT ConfDt(i) mmddyy8. +1 @ ;
END ;
INPUT ;
… … …
Interactions Between the DATA Step and
External Files
75
Interactions … External Files
Defining the Appearance of the Data
Null Input (and Output)
DATA SAS_Conf07;
… … …
PUT @ 1 ConfName $CHAR16.
@ 18 Start_Dt mmddyy8.
@ 27 End_Dt mmddyy8. @ ;
IF MONTH( Start_Dt) ^= MONTH( End_Dt ) THEN
PUT ' Span Month' ;
ELSE PUT ;
RUN ;
Interactions Between the DATA Step and
External Files
76
Interactions … External Files
Practical application
ASIDE: A little trick to quickly transition
from input to output …
193 DATA _NULL_;
194 INFILE DATALINES ;
195 INPUT ;
196 FILE PRINT ;
197 PUT _INFILE_;
198 DATALINES;
NOTE: 8 lines were written to file PRINT.
No variables
listed on
INPUT
statement.
Special
variable
_INFILE_
listed on
PUT
statement.Add decision logic for
conditional output.
Interactions Between the DATA Step and
External Files
77
Interactions … External Files
Practical application
Q: So, which technique is best?
A: That depends on the data.
(You knew I was going to say that.)
Interactions Between the DATA Step and
External Files
78
Reading Sequential files:
usually Formatted input.
Reading parameters:
sometimes Named input,
sometimes List input
sometimes Formatted input
Interactions … External Files
Practical application
Interactions Between the DATA Step and
External Files
79
Writing Sequential files:
usually Formatted input.
Writing parameters and debugs:
often Named input,
sometimes List input
sometimes Formatted input
Interactions … External Files
Practical application
?
Interactions Between the DATA Step and
External Files
80
Reading / writing directly to / from spreadsheet:
• Can use DDE, which is outside the scope of
this presentation.
Using a text file as an interface:
• Spreadsheet can read and write to CSV files.
• SAS can read and write CSV files, too!
CSV = “Comma Separated Value
Interactions … External Files
Practical application
Interactions Between the DATA Step and
External Files
81
Reading / writing directly to / from spreadsheet:
• Use DSD option on INFILE statement.
– Two or more consecutive delimiters denote
missing values. (Normally, extra delimiters
are ignored.)
– Default delimiter is comma. (Normally,
default delimiter is a space.) This can still be overridden with DELIMITER= option.
DSD = “Delimiter Separated Data”
Interactions … External Files
Practical application
Interactions Between the DATA Step and
External Files
82
Reading / writing directly to / from spreadsheet:
• Use DSD option on INFILE statement.
– Character values are assumed to be
enclosed in double-quotes. (Use ~ format
modifier if you need to keep the double-
quotes as part of the data.)
DSD = “Delimiter Separated Data”
Interactions … External Files
Practical application
Interactions Between the DATA Step and
External Files
83
Interactions … External Files
Defining the Appearance of the Data
04/16/07,04/19/07,"SAS Global Forum","Orlando FL"
06/03/07,06/06/07,"PharmaSUG","Denver CO"
09/16/07,09/18/07,"PNWSUG","Seattle WA"
09/30/07,10/02/07,"SCSUG","Austin TX"
10/17/07,10/19/07,"WUSS","San Francisco CA"
10/28/07,10/30/07,"MWSUG","Des Moines IA"
11/04/07,11/06/07,"SESUG","Hilton Head SC"
11/11/07,11/14/07,"NESUG","Baltimore MD"
DSD – Sample Data
File: 'c:\sasconf\sascon07.csv'
Interactions Between the DATA Step and
External Files
84
Interactions … External Files
Defining the Appearance of the Data
DSD – Wrong Way
17 DATA SAS_Conf07;
18 INFILE 'c:\sasconf\sascon07.csv' DSD;
19 INPUT Start_Dt mmddyy8.
20 End_Dt mmddyy8.
21 ConfName $CHAR16.
22 ConfLoc $CHAR16.;
23 FILE 'c:\sasconf\testout2.csv' DSD;
24 PUT ConfName $CHAR16.
25 ConfLoc $CHAR16.
26 Start_Dt date7.
27 End_Dt date7.;
28 RUN;
Interactions Between the DATA Step and
External Files
85
Interactions … External Files
Defining the Appearance of the Data
DSD – Wrong Way
NOTE: Invalid data for End_Dt in line 1 9-16.
RULE: ----+----1----+----2----+----3----+---4—
04/16/07,04/19/07,"SAS Global Forum”,O …
Start_Dt=16APR2007 End_Dt=.
ConfName=7,"SAS Global Fo
ConfLoc=rum","Orlando FL _ERROR_=1 _N_=1
NOTE: Invalid data for End_Dt in line 2 9-16.
06/03/07,06/06/07,"PharmaSUG","Denver CO“
Start_Dt=03JUN2007 End_Dt=. ConfName=7,"PharmaSUG","D
ConfLoc=enver CO" _ERROR_=1 _N_=2
etc. etc. etc.
Interactions Between the DATA Step and
External Files
86
DSD – Wrong Way
'c:\sasconf\testout1.csv'
7,"SAS Global Forum","Orlando FL16APR07 .
7,"PharmaSUG","Denver CO" 03JUN07 .
7,"PNWSUG","Seattle WA" 16SEP07 .
7,"SCSUG","Austin TX" 30SEP07 .
7,"WUSS","San Francisco CA" 17OCT07 .
7,"MWSUG","Des Moines IA" 28OCT07 .
7,"SESUG","Hilton Head SC" 04NOV07 .
7,"NESUG","Baltimore MD" 11NOV07 .
Interactions … External Files
Defining the Appearance of the Data
Interactions Between the DATA Step and
External Files
87
Interactions … External Files
Defining the Appearance of the Data
DSD – Right Way
44 DATA SAS_Conf07;
45 INFILE 'c:\sasconf\sascon07.csv' DSD;
46 INFORMAT Start_Dt End_Dt mmddyy8.
47 ConfName ConfLoc $CHAR16.;
48 FORMAT Start_Dt End_Dt date7. ;
49 INPUT Start_Dt
50 End_Dt
51 ConfName
52 ConfLoc ;
53 FILE 'c:\sasconf\testout2.csv' DSD;
54 PUT ConfName
55 ConfLoc
56 Start_Dt
57 End_Dt ;
58 RUN;
Interactions Between the DATA Step and
External Files
88
DSD – Right Way
'c:\sasconf\testout2.csv'
SAS Global Forum,Orlando FL,16APR07,19APR07
PharmaSUG,Denver CO,03JUN07,06JUN07
PNWSUG,Seattle WA,16SEP07,18SEP07
SCSUG,Austin TX,30SEP07,02OCT07
WUSS,San Francisco CA,17OCT07,19OCT07
MWSUG,Des Moines IA,28OCT07,30OCT07
SESUG,Hilton Head SC,04NOV07,06NOV07
NESUG,Baltimore MD,11NOV07,14NOV07
Interactions … External Files
Defining the Appearance of the Data
Interactions Between the DATA Step and
External Files
89
Interactions … External Files
Practical application
Q: What will the system do if
you ask for more variables
than you actually have on
your input data?
A: Whatever you tell it to do.
(You knew I was going to say that, too.)
Interactions Between the DATA Step and
External Files
90
INFILE options – End of Input Line
FLOWOVER
Allows INPUT to read next data record.
NOTE: This is the default option / behavior.
Watch for following in SASLOG:NOTE: SAS went to a new line when INPUT
statement reached past the end of
a line.
Interactions … External Files
Practical application
Interactions Between the DATA Step and
External Files
91
INFILE options – End of Input Line
MISSOVER
Stops INPUT from reading new data record,
remaining variables set to missing.
Watch for following in SASLOG:(nothing – no extra NOTEs, WARNINGs,
or ERRORs will be produced)
Interactions … External Files
Practical application
Interactions Between the DATA Step and
External Files
92
INFILE options – End of Input Line
TRUNCOVER
Stops INPUT from reading new data record,
remaining variables set to missing.
Watch for following in SASLOG:(nothing – no extra NOTEs, WARNINGs,
or ERRORs will be produced)
Interactions … External Files
Practical application
Interactions Between the DATA Step and
External Files
93
INFILE options – End of Input Line
MISSOVER vs. TRUNCOVER
The difference between these two options is
how they handle values that are in the
middle of being read at the end of the line.
MISSOVER sets it to a missing value.
TRUNCOVER “settles for what it can get”,
and stores as much as is present.
Interactions … External Files
Practical application
Interactions Between the DATA Step and
External Files
94
INFILE options – End of Input Line
STOPOVER
Stops INPUT from reading new data record,
stops execution of DATA step, sets
_ERROR_ = 1.
Watch for following in SASLOG:ERROR: INPUT statement exceeded record
length. INFILE <infile> OPTION
STOPOVER specified.
Interactions … External Files
Practical application
Interactions Between the DATA Step and
External Files
95
INFILE options – End of Input Line
SCANOVER
Searches next line for contents of the
“@<string>” value.
Watch for following in SASLOG:NOTE: SAS went to a new line when INPUT
@'CHARACTER_STRING' scanned past
the end of a line.
Interactions … External Files
Practical application
Interactions Between the DATA Step and
External Files
96
INFILE options – End of Input Line
TURNOVER
• Meeting planners term for moving one
group out of a room and another group in.
• Catering term for a type of pastry.
Interactions … External Files
Practical application
Interactions Between the DATA Step and
External Files
97
Interactions … External Files
Conclusion
• 50 minute presentation can only
scratch the surface of getting data into
SAS and putting data out of SAS.
• You can learn more from a few minutes
of independent trial and error than from
sitting and listening to me for 50
minutes!
Interactions Between the DATA Step and
External Files
98
Interactions … External Files
Conclusion
For further information …
Interactions Between the DATA Step and
External Files
99
Interactions … External Files
Conclusion
Special thanks to the late John Tenniel
(1820-1914) for providing the graphics
used in this presentation.
Interactions Between the DATA Step and
External Files
100
Interactions … External Files
Conclusion
SAS and
all other SAS Institute Inc. product
or service names are registered trademarks
or trademarks of SAS Institute Inc. in the USA
and other countries.
indicates USA registration.
My lawyer