SAS Connect vs SAS Access


Getting connected with your DATA: Using SAS/CONNECT and SAS/ACCESS to work with data housed in a remote environment

Kevin Delaney, New York State Office of Mental Health, Albany, New York

Abstract

This paper will provide an overview of SAS/CONNECT and SAS/ACCESS software for accessing and manipulating databases located in a remote environment. Using the example of an Oracle database on a remotely located Unix server, the author will demonstrate many of the main features of SAS/Connect and SAS/Access. SAS/Connect topics to be covered include: connecting to the remote server, submitting SAS code remotely, and moving data back and forth between the server and client. SAS/Access topics to be covered include: interfacing with the data using both the libname statement with SAS/Access specific options and running a query against the data using the SQL pass through facility. Issues of efficiency and practicality will also be discussed.

Introduction

At this company, many of our large data sets are housed in an ORACLE relational database, on a Unix server. In order to access these data from our local area network, and work with them in SAS, we had to become familiar with the SAS/CONNECT and SAS/ACCESS software packages. Like most SAS software, within these packages there are many different ways to reach the same goal. Once you become familiar with several of these methods, the only challenge is to figure out which method is most appropriate for a given circumstance. This paper will attempt to walk through several of the more common utilities of SAS/CONNECT and SAS/ACCESS software, and hopefully clarify which methods are most efficient and most practical.

Throughout this paper I will stick with what I know: I will use examples involving a SAS/CONNECT session with a Unix server, and the SAS/ACCESS interface with ORACLE relational databases. For those of you who know more about other operating systems, or other database management systems, I hope you will find that my examples are adaptable to your host/entity of choice.

SAS/CONNECT software

The introduction to the SAS/CONNECT User's Guide tells us that SAS/CONNECT is a "SAS-to-SAS client/server toolkit." What exactly does this mean? SAS/CONNECT software can be used to connect to a SAS session running on a remote server, to transfer data between environments, and to process data on the remote server. I will attempt to address the multitude of methods that SAS/CONNECT provides for accomplishing these tasks within this paper. SAS/CONNECT also supports SCL commands and SAS/AF applications that allow for remote messaging, linking of objects on different platforms, and running of scheduled applications for routine updates from one server to another, but I will not cover these topics here.

Getting Connected

There are several methods within SAS/CONNECT that can be used to actually connect to a remote environment. In the Windowing environment you can use the SIGNON window to connect to the remote host.

Select RUN from the toolbar, then SIGNON

Figure 1: Select Run and SIGNON from the Display Manager pull-down menus.

This gets you the following SIGNON menu:



Figure 2: SAS/CONNECT SIGNON window

SAS/CONNECT ships with a number of script files that establish the connection between SAS on the local host and SAS on the remote host. These are specific to the remote host, but can be modified from their standard form. You can also write your own script file; instructions for doing so are included in the SAS/CONNECT User's Guide, Version 8. By default these script files are stored in the !SASROOT\CONNECT\SASLINK folder in Windows. SAS will also look for script files in your SASUSER folder. An example of the default TCPUNIX.scr that ships with SAS is attached to this paper, as well as my modified TCPUNIX2.scr; see if you can recognize the changes. As you might guess, you can have a lot of fun with the script files, if you are so inclined.

SAS/CONNECT supports several different access methods that are operating system dependent. All of my examples will involve the TCP/IP access method for communication between Unix and Windows. I will not say a whole lot about access methods other than to mention that you need to use one that is supported by both the local and remote hosts. For a more in-depth discussion see: Communications Access Methods for SAS/CONNECT and SAS/SHARE Software, Version 8.

With this information, and the name of the remote host to which you would like to connect, you can then sign on to your remote SAS session using the SAS/CONNECT SIGNON window pictured in Figure 2. You would place the name of the script file you would like to use in the first line, the remote session's name in the second and your communications access method in the third.

For my example I am using a script called tcpunix2.scr to connect to the remote host Unixdata, using the TCP/IP access method. These are the only three lines that need to be filled in; as the NOTE on the bottom of the window states, leaving a field blank will default to the current setting. The only other item you might want to change is whether or not remotely submitted commands execute synchronously, but we will discuss this more fully in a minute.

If you prefer a more programmatic approach when signing on to the remote host, the syntax is equally easy to grasp. In SAS Version 8, you need only associate the fileref RLINK with your script file and then issue the SIGNON command. For my example:

filename rlink 'tcpunix2.scr';
signon unixdata;
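A slightly fuller sketch of the programmatic sign-on is shown here. It assumes the TCP/IP access method and the session name unixdata from the example above. The COMAMID= and REMOTE= system options are set explicitly only for illustration; your site defaults may already cover them.

```sas
/* Sketch only: assumes tcpunix2.scr can be found in the current      */
/* directory or in SASUSER, and that UNIXDATA resolves to the host.   */
options comamid=tcp       /* communications access method: TCP/IP     */
        remote=unixdata;  /* default remote session for this session  */

filename rlink 'tcpunix2.scr';  /* script that logs in and starts SAS */
signon unixdata;                /* start the remote SAS session       */

/* ... remotely submitted work goes here ... */

signoff unixdata;               /* end the remote session when done   */
```

Ending the session with SIGNOFF when you are finished releases the remote SAS process rather than leaving it idle on the server.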

Passing SAS statements to the remote host

Now that we are signed on to SAS "up" on Unix, let's send some SAS commands through and see how it works. To send SAS statements to a remote host you need only bracket your normal SAS code with an RSUBMIT; - ENDRSUBMIT; block. For instance:

rsubmit;
libname myunix '/home/myunixdir';
endrsubmit;

This creates the library MYUNIX within the session of SAS executing on Unix, and then returns the log from this remote session to your local SAS Log. (If we had done something that produced output, the output would also be directed down to the local output window.)
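To see the output behavior described in the parenthetical, a remotely submitted step that produces listing output can be sketched as follows. The library path is the one used above; the choice of PROC CONTENTS is only an illustration.

```sas
rsubmit;
  libname myunix '/home/myunixdir';

  /* Runs on Unix; the directory listing it produces is returned */
  /* to the local OUTPUT window, and the log to the local LOG.   */
  proc contents data=myunix._all_ nods;
  run;
endrsubmit;
```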

Remote Library Services

SAS/CONNECT also offers the ability to create a local library that refers to files on the remote session using the REMOTE engine. This is useful if you wish to use the Explorer window to look at the SAS data sets housed in your remote directories. The syntax to create a LOCAL libref to the same directory as our MYUNIX LIBRARY "up there" would be:

libname mylocux '/home/myunixdir' server=unixdata;



Once you have set up this remote libref you can then manipulate data on the remote host without wrapping it in an RSUBMIT; - ENDRSUBMIT; block. For example:

proc contents data=mylocux.set1 out=mylocux.set1contents;
run;
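As an alternative to repeating the physical path, the REMOTE engine can point at a libref that already exists in the remote session through the SLIBREF= option. The sketch below assumes the MYUNIX libref was assigned remotely as shown earlier; the local libref name MYLOCUX2 is hypothetical.

```sas
/* Assumes MYUNIX has already been assigned inside an RSUBMIT block. */
/* MYLOCUX2 is a made-up local name for the same remote library.     */
libname mylocux2 slibref=myunix server=unixdata;

/* Used exactly like the path-based remote libref */
proc contents data=mylocux2.set1;
run;
```

This keeps the directory name in one place, so a change to the remote LIBNAME statement does not have to be repeated locally.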

If you happen to know the directory you have been assigned on the remote host this works well, but what about viewing the work directory? You can use the SASHELP.VMEMBER data set view on your remote host to set up a local libref to your remote WORK library:

rsubmit;
data findwork;
  x=1;
run;
data find2(keep=path);
  set sashelp.vmember;
  if upcase(memname)='FINDWORK';
run;
proc download data=find2 out=finduxwork;
run;
endrsubmit;

data _null_;
  set finduxwork;
  call symput("workdir",trim(path));
run;
%put &workdir; *to make sure it worked;

libname unixwork "&workdir" server=unixdata;

Notice we are looking for the Unix WORK library so we need to SET SASHELP.VMEMBER from Unix, by using an RSUBMIT with our data set. For those of you who have not used the VMEMBER data set view in the past, it contains the attributes of all the data sets currently referenced in your SAS session. By creating a data set in the WORK library and then selecting the variable path for that data set, we obtain the full path of our current WORK library.
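As an aside, the same path can probably be obtained with less code. The sketch below is not the method used in this paper: it assumes the PATHNAME function and the SAS/CONNECT %SYSRPUT statement are available in your release, and that the remote session is named unixdata.

```sas
rsubmit;
  /* Runs on Unix: look up the physical path of the remote WORK     */
  /* library and copy it into the LOCAL macro variable WORKDIR.     */
  %sysrput workdir=%sysfunc(pathname(work));
endrsubmit;

%put &workdir;  /* check the value that came back */

libname unixwork "&workdir" server=unixdata;
```

The VMEMBER approach has the advantage of working for any library whose members you can name, while this shortcut only needs the libref.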

The WORK library example above also introduces a new SAS/CONNECT procedure. PROC DOWNLOAD, and its partner in crime PROC UPLOAD, are SAS/CONNECT procedures that perform data transfer. The syntax for the procedures really is as easy as it looks. For PROC DOWNLOAD, DATA=data set name refers to the data on the remote host which you wish to push to your local SAS session. OUT=data set name is the name of the data set that will reside in the local session. In this case the procedure copies the data set FIND2 from Unix down into the data set FINDUXWORK on our local SAS session. This data set is then used to create the MACRO variable WORKDIR, and a remote library reference to the directory stored in WORKDIR is established. This seems like a lot of work, but it actually executes in tenths or hundredths of seconds, and then allows you to use the local EXPLORER WINDOW to look at data sets on the remote server, rename them interactively, and even move them to other referenced libraries on either host.
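The two transfer procedures can be sketched side by side as follows. The data set names SET1 and LOCALDATA and the variable X are hypothetical; the WHERE= data set option is used only to show that a subset can be taken before anything crosses the network.

```sas
rsubmit;
  /* Remote to local: DATA= is read on Unix, OUT= is written locally. */
  /* Only the rows that satisfy the WHERE= condition are transferred. */
  proc download data=myunix.set1(where=(x > 0)) out=work.set1_subset;
  run;

  /* Local to remote: DATA= is read locally, OUT= is written on Unix. */
  proc upload data=work.localdata out=myunix.localdata;
  run;
endrsubmit;
```

Note that both procedures are submitted inside the RSUBMIT block; it is the procedure name, not the placement of the code, that determines the direction of the transfer.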

You can use the remote library reference as you would any other library reference: you can SET data on the remote host and use it to create a local data set, you can use PROC PRINT to print data from the remote host, and, well, you get the point. However, this is often not the most efficient way to use the SAS/CONNECT product. For example, let's look at the following code:

HEAT # 1

data work.test;
  set unixwork.smallset;
run;

VS.

rsubmit;
proc download data=work.smallset out=work.test2;
run;
endrsubmit;

HEAT # 2

data unixwork.test;
  set localref.smallset;
run;

VS.

rsubmit;
proc upload data=localref.smallset out=work.test2;
run;
endrsubmit;



HEAT # 3

proc format library=work cntlout=unixwork.fmts;
run;

rsubmit;
proc format library=work cntlin=work.fmts;
run;
endrsubmit;

VS.

proc format library=work cntlout=work.fmts;
run;

rsubmit;
proc upload data=work.fmts out=work.fmts2;
run;
proc format cntlin=fmts2;
run;
endrsubmit;

I am not sure where the word HEAT comes from, but definition 10a in my dictionary does state "One round of many in a sporting competition, such as a race."

This example pits remote library services against PROC DOWNLOAD/UPLOAD in a little contest to see who is faster. With relatively small numbers of observations, and particularly with small numbers of variables, these two methods come pretty close. However, PROC DOWNLOAD/UPLOAD definitely wins both HEAT # 1 and HEAT # 2. The advantage to using this procedure over the remote library option grows wider as you add more variables and observations to the data sets you are moving between hosts. Of course if you are cleaning up for the night and interactively moving data from your Unix work directory to a permanent library it might be easier to click and drag in the EXPLORER WINDOW, but for long programs that need to be duplicable and/or completely automated, PROC DOWNLOAD/UPLOAD seems to make more sense.
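If you want to run a heat of your own, the real time reported in the SAS log for each step is one measure. A rough alternative, sketched below for HEAT # 1, is to bracket each method with a local timestamp. The macro variable T0 is an assumption of this sketch, and the elapsed times include network and sign-on overhead, so treat them as approximate.

```sas
/* Remote library services: read the remote data set through the libref */
%let t0 = %sysfunc(datetime());
data work.test;
  set unixwork.smallset;
run;
%put NOTE: RLS copy took %sysevalf(%sysfunc(datetime()) - &t0) seconds.;

/* PROC DOWNLOAD: same data set, moved by the transfer procedure. */
/* The RSUBMIT is synchronous, so the local %PUT waits for it.    */
%let t0 = %sysfunc(datetime());
rsubmit;
  proc download data=work.smallset out=work.test2;
  run;
endrsubmit;
%put NOTE: PROC DOWNLOAD took %sysevalf(%sysfunc(datetime()) - &t0) seconds.;
```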

HEAT # 3 is much closer, because there is an extra step needed to use PROC UPLOAD to move the data. Also, unless you have a HUGE FORMAT CATALOG, I don't know that the FMTS data set will ever be big enough to see a real difference in efficiency.

What would be neat (this is directed to those SAS people who make this stuff happen) is if

options fmtsearch=(work.formats unixwork.formats library.formats);

actually worked. Unfortunately, as it stands now, if you try to use formats located in the unixwork library or any other remote library using the OPTIONS FMTSEARCH=() option and a remote library reference, you won't get an error, but when you try to assign a format from a remote FORMAT catalog to a local session variable it won't work. This is because "You cannot open a catalog through a server because access to catalogs is not supported when the user machine and server machine have different data representations." (If you want to see this "NOTE" yourself, double click on the FORMATS catalog as it appears in the UNIXWORK folder of the EXPLORER WINDOW.)
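Until that works, one workaround is to run HEAT # 3 in the opposite direction: turn the remote format catalog into a data set, move the data set, and rebuild the formats locally. This is a sketch under the assumption that the formats live in the remote WORK library; the data set name FMTS is reused from the heat above.

```sas
rsubmit;
  /* On Unix: write the remote format definitions to a data set */
  proc format library=work cntlout=work.fmts;
  run;

  /* Move the control data set down to the local WORK library */
  proc download data=work.fmts out=work.fmts;
  run;
endrsubmit;

/* Locally: rebuild the formats in the local WORK.FORMATS catalog, */
/* which FMTSEARCH= can then find without crossing hosts.          */
proc format library=work cntlin=work.fmts;
run;
```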

Are we having fun yet? The best attributes of SAS/CONNECT software are still ahead of us. Not only can SAS/CONNECT talk back and forth with a remote host, but it can also do so asynchronously. To this point we have not made use of the WAIT=NO option in any of our RSUBMIT statements. This option tells SAS to send the SAS statements in the RSUBMIT; - ENDRSUBMIT; block through to the host server, but to immediately return control to the local SAS session. We haven't used this option thus far because we haven't needed it; all of the code we have submitted executed and returned results faster than we could blink. This would not be true if we were trying to pull records out of a database with a couple million records, or to perform an SQL query that combines ten tables from a relational database. In my mind the best reason for using SAS/CONNECT is to be able to send large, memory intensive tasks such as these to another server, and let the processing take place on the remote host, allowing you to be free to do other things locally. This is especially true if you store your data remotely so as not to bog down your local server. We will look more closely at the uses of the SAS/CONNECT WAIT=NO option and other statements that work with it as we turn our attention to another important piece of SAS software.
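The basic shape of an asynchronous submission can be sketched as follows. The data set names BIGSET, BIGSORTED and LOCALSET and the variable ID are hypothetical, and the WAITFOR statement, which is not otherwise covered in this paper, is assumed to be available in your release of SAS/CONNECT.

```sas
/* Send a long-running step to Unix and get control back right away */
rsubmit unixdata wait=no;
  proc sort data=myunix.bigset out=myunix.bigsorted;
    by id;
  run;
endrsubmit;

/* This local step starts immediately, while the sort runs remotely */
proc means data=work.localset;
run;

/* Pause the local session until the remote task has finished */
waitfor unixdata;
```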



SAS/ACCESS software

If your data is stored in a format other than a SAS data set on the remote server you are CONNECTed to, how do you ACCESS it?

In effect SAS/ACCESS software provides a SAS-to-NON-SAS database management software connection in the same way that SAS/CONNECT is a SAS-to-SAS connection. SAS/ACCESS allows you to read in and modify data housed in a NON-SAS data storage package, and then write that modified data back out to the database. From the data analyst's perspective, I don't have a need to write data back out to the database; in fact, in my job, I don't have the privilege of doing so. My focus will therefore be on the various ways to 'access' data stored in a relational database, using SAS/ACCESS, rather than on the way to write these data back out (PROC DBLOAD). Again, the examples in this paper discuss accessing ORACLE tables on a Unix server; if you are using a different DBMS, see the SAS/ACCESS User's Guide specific to your product for modifications that you might need to run these examples on your system.

SAS/ACCESS software provides three main methods for accessing a relational database: the ACCESS procedure, a DBMS-specific LIBNAME statement, and the SQL Pass-Through facility. I will compare and contrast the three.

The ACCESS Procedure

This procedure is the most code intensive method of accessing a DBMS (those of you deathly afraid of SQL will note that I didn't say 'of using DBMS data'), although none of the code is particularly difficult to grasp. The ACCESS procedure for relational databases consists of two distinct components, the ACCESS descriptor and the VIEW descriptor.

The ACCESS descriptor is a set of statements that tells SAS how to access a DBMS table. For example:

proc access dbms=oracle;
  create work.mytest.access;
  user=kpd;
  orapw=mypassword;
  table=category_service;
  path='prda';
  assign=yes;
  rename catsrv_code=CATCODE
         catsrv_label=Service;
  list all;
run;

This is the access descriptor for an ORACLE table called CATEGORY_SERVICE within the ORACLE instance 'prda'. The access descriptor contains the information SAS/ACCESS will need to read this table when it is called upon to do so, including my userid (USER=) and password (ORAPW=). ASSIGN=YES tells SAS that all attributes of data sets created from this ACCESS descriptor must conform to what is described here. For example, I have renamed the ORACLE field catsrv_code to be CATCODE. Any SAS data sets created using this descriptor will contain the variable CATCODE, and I will not be able to rename it in the VIEW descriptor. In addition to RENAME you can also use such familiar SAS options as FORMAT and DROP within the ACCESS descriptor.

A VIEW descriptor uses the information contained in its reference ACCESS descriptor to access the database, then CREATE VIEW to "take a picture" of the data. When you create a view, you actually set up a query of the data, which can later be called by any SAS procedure or DATA step. You can also create a SAS data set from the ACCESS procedure, but it must occur after the initialization of a VIEW descriptor. In other words, while we would like:

rsubmit;
proc access dbms=oracle;
  create work.mytest.access;
  user=kpd;
  orapw=noturpassword;
  table=category_service;
  path='prda';
  assign=yes;
  rename catsrv_code=CATCODE
         catsrv_label=Service;
  list all;

  create work.myview.view out=outputdataset;
  select catsrv_code catsrv_label;
  subset where catsrv_code ='96';
run;
endrsubmit;

We instead need to use a second PROC ACCESS statement to create the data set:

rsubmit;
proc access dbms=oracle;
  create work.mytest.access;
  user=coevkpd;
  orapw=urnosey;
  table=category_service;
  path='prda';
  assign=yes;
  rename catsrv_code=CATCODE
         catsrv_label=Service;
  list all;

  create work.myview.view;
  select catsrv_code catsrv_label;
  subset where catsrv_code ='96';
run;
proc access viewdesc=work.myview out=oratable1;
run;
endrsubmit;

Notice that I submitted this code to my SAS session running remotely. This is, even in the case of a data set with only two variables and one observation, the most efficient way of using PROC ACCESS. There are two reasons for this. First, the Unix server is far less bogged down with everyday traffic than my Windows server. Even if I had a copy of this ORACLE database available locally, SAS could probably read it faster "up there." Second, since I don't actually have a copy of the data to access locally, it is much faster to access and manipulate it up where it lives than to pull the data through my network connection to Unix (which is what would happen if I submitted the code without the RSUBMIT).
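Once the descriptors exist in the remote session, later steps can use them without repeating the connection details. The following sketch assumes the view descriptor WORK.MYVIEW and the data set ORATABLE1 created above are still present in the remote WORK library.

```sas
rsubmit;
  /* The view descriptor can be read like any SAS data view; */
  /* ORACLE is queried each time the view is referenced.     */
  proc print data=work.myview;
  run;

  /* Bring the small extracted data set down to the local session */
  proc download data=oratable1 out=work.oratable1;
  run;
endrsubmit;
```

Only the finished subset travels over the network; the ORACLE read itself stays on the Unix side.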

The LIBNAME statement

The next option available to me is to reference the ORACLE instance where my data is stored using a LIBNAME statement.

libname dwh1 oracle user=kpd password=stopasking path='dwh1' schema=cpeom;

libname dwh1 oracle dbprompt=yes schema=cpeom;

rsubmit;
libname dwh2 oracle user=kpd password=iwonttell path='dwh1' schema=cpeom;
endrsubmit;

The first piece of code represents a local LIBRARY reference to the remotely stored ORACLE data. The second demonstrates the DBPROMPT= option discussed below. The main reason I can think of to set up the local LIBREF is the same as the reason we used the SERVER= option earlier. It provides a way to look at and move the smaller data tables interactively.

The third example shows the preferred method, remotely submitting the library reference to create the ORACLE library as close to the data as possible. Like remote submitting PROC ACCESS in the previous section, we are trying to avoid pulling data through the network until absolutely necessary, i.e., when we have a small enough subset of our data to use PROC DOWNLOAD or REMOTE LIBRARY SERVICES. In case you are wondering, the SERVER= option presented in the SAS/CONNECT portion of the paper applies to the REMOTE library engine, while ORACLE in your LIBREF here calls the ORACLE library engine, so we can't combine the two to get a local copy of a remote ORACLE library. Nice thought though.
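Putting the pieces together, the "subset remotely, then download" pattern might look like the sketch below. It assumes, purely for illustration, that a table named CATEGORY_SERVICE with a column CATSRV_CODE is visible through the CPEOM schema; the password shown is a placeholder.

```sas
rsubmit;
  /* ORACLE libref assigned on Unix, next to the data */
  libname dwh2 oracle user=kpd password=XXXXXXXX path='dwh1' schema=cpeom;

  /* Subset on the remote host so that only a few rows are kept */
  data work.subset;
    set dwh2.category_service(where=(catsrv_code='96'));
  run;

  /* Only now does any data cross the network */
  proc download data=work.subset out=work.subset;
  run;
endrsubmit;
```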

The LIBNAME statement with options for the SAS/ACCESS to ORACLE engine has two distinct advantages over PROC ACCESS. First, by referencing the ORACLE instance (an instance is ORACLE's way of saying LIBRARY) you set up a reference to an entire group of tables at once, rather than having to create a descriptor for each table. Second, by using the DBPROMPT= option you can tell SAS to prompt you for your username, password and path rather than leaving them lying out in open code. (Note: this obviously will not work in BATCH SAS code, nor will it work for a remotely submitted library reference, since you won't have access to the resulting prompt locally.)

SQL Pass-Through Facility

For those of you familiar with SQL, the code for PROC ACCESS probably looked familiar. That is because SQL queries underlie most of what SAS does with SAS/ACCESS for relational databases. (SAS/ACCESS software for other types of database management systems that do not use SQL to operate on the data stored within them works differently.) If you do not use SQL, don't know how to use SQL, and have no interest in learning SQL, then the SQL Pass-Through facility is not for you. You can do pretty much everything you want to do with your DBMS data using PROC ACCESS or the LIBNAME statement, and never have to write any "real SQL code." But if you are going to be working with data with large numbers of observations, or many (50, 100, 250, etc.) related tables, you might want to start playing with SQL. Here is an example of what looks to be a complicated SQL query (it's really not that bad, but I am not teaching SQL today so you will have to take my word for it) that combines information from three different tables in a relational database with over 1 million total records. It produces a count of the total number of clients served per year by county:

rsubmit wait=no;
proc sql;
  connect to oracle (user=kpd orapw=mylipsrsealed
                     path='pwh1' schema=snp);

  create table querytable as
  select * from connection to oracle
    (select dates.year, counties.ctyofres,
            count(distinct services.recipient) as tot_served
     from snp.dates dates, snp.services services,
          snp.counties counties
     where dates.datekey=services.datekey
       and counties.countykey=services.countykey
     group by dates.year, counties.ctyofres);

  disconnect from oracle;
quit;
endrsubmit;

*rdisplay unixdata;
/*Pick one of us not both*/
*rget unixdata;

There are several key points. First, to toot SQL's horn a little, notice that it did not require sorting the database to perform BY-group processing, that it produced a frequency count for me, and that it also essentially produced a report data set of total clients served by county and year.

Second, what you may not have noticed is probably the most exciting part of this SQL code, the CONNECT TO ORACLE and SELECT * (SQL for ALL) FROM CONNECTION TO ORACLE statements. These statements are used to leave SAS entirely and run this query from within the ORACLE database itself. SAS then is passed the results of this ORACLE SQL query, which it uses to make the data set QUERYTABLE. This is by far the most efficient way of running a query against a database this large and complicated. It lets ORACLE do the work it was designed to do, and then lets SAS do the rest. This could have been submitted on Unix using a LIBRARY reference for ORACLE such as the DWH2 from my LIBNAME example, but this would have been slower than the query that uses the SQL Pass-Through facility. The query could also have been run using the local LIBREF DWH1, but this would have been by far the slowest option (in the case of queries against HUGE data sets the slowest by HOURS).
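For comparison, the LIBNAME-engine version of the same query, submitted on Unix, might look like the following sketch. The libref SNPLIB and the output name QUERYTABLE2 are hypothetical, the password is a placeholder, and the table and column names are taken from the pass-through query above.

```sas
rsubmit;
  /* ORACLE libref pointing at the same schema as the pass-through query */
  libname snplib oracle user=kpd password=XXXXXXXX path='pwh1' schema=snp;

  proc sql;
    /* Here PROC SQL, not ORACLE, is handed the query, so SAS may */
    /* have to pull rows out of ORACLE in order to join and count */
    create table querytable2 as
    select d.year, c.ctyofres,
           count(distinct s.recipient) as tot_served
    from snplib.dates d, snplib.services s, snplib.counties c
    where d.datekey=s.datekey
      and c.countykey=s.countykey
    group by d.year, c.ctyofres;
  quit;
endrsubmit;
```

The code is nearly identical; the difference is where the join and the counting are carried out.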

Also, since the pass-through query was submitted remotely with the WAIT=NO option, you can run other SAS procedures locally while it is running on your remote SAS session. The last two lines of that code bring us back to SAS/CONNECT. The RDISPLAY and RGET commands are used with the WAIT=NO option to go up to the remote server at a later point in time and pull down the SAS LOG and output printed to the LISTING OUTPUT destination. RGET puts these results into your local LOG and OUTPUT windows respectively, while RDISPLAY opens two more windows to display this output separately. Of these two, I prefer RGET. The reason for this preference is that you can use RGET with PROC PRINTTO to save a local copy of the remote SAS session's LOG and OUTPUT, separate from your local SAS session log.

proc printto log='remote.log' print='remote.lst' new;
run;
rget unixdata;
proc printto;
run;

I haven't figured out a good way to do this with RDISPLAY output, other than to interactively copy the LOG or OUTPUT and then paste it into some other text file for later.

Conclusion

This paper was intended to present just some of the many ways to use SAS/CONNECT and SAS/ACCESS software, and, within the ways presented, to describe their pros and cons. Hopefully the suggestions CONNECTed with you, and they will serve to make these two valuable packages more ACCESSible to you.

References

SAS Institute Inc. (1999), Communications Access Methods for SAS/CONNECT and SAS/SHARE Software, Version 8, Cary, NC: SAS Institute Inc.

SAS Institute Inc. (1999), SAS/CONNECT User's Guide, Version 8, Cary, NC: SAS Institute Inc.

SAS Institute Inc. (1999), SAS OnlineDoc, Version 8, Cary, NC: SAS Institute Inc.

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.

Other brand and product names are registered trademarks or trademarks of their respective companies.

Contact Information

Please send questions, comments and suggestions to:

Kevin Delaney
NYS Office of Mental Health
44 Holland Ave
Albany, NY 12229
(518) [email protected]
