Thursday, March 10, 2011

What the Heck is a Data Step?

It is a basic building block of SAS programming. It creates the data sets that are used in a SAS program’s analysis and reporting procedures.

The Data Step is represented by three components:

1. The input or raw data.

2. The DATA step statements: the DATA statement, the INPUT statement and others.

3. The output or SAS data set.

Temporary versus Permanent SAS Data Sets

The following is an example of a DATA step that creates the temporary data set WEIGHT_CLUB.

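data weight_club;
/* Name and Team are read from fixed columns; the numbers use list input. */
input IdNumber Name $ 6-20 Team $ 22-27 StartWeight EndWeight;
/* Assumption: Loss is computed here because the TABULATE example uses it. */
Loss = StartWeight - EndWeight;
/* A period (.) marks a missing numeric value, as in the last record. */
datalines;
1023 David Shaw      red    189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red    210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red    127 118
1221 Jim Brown       yellow 220 .
;
run;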
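
For contrast, a permanent data set is written with a two-level name whose first part is a library reference. The following is only a sketch: the library reference health and the folder path are made up for illustration, so swap in a folder that exists on your machine.

```sas
/* Hypothetical libref and folder; point the path at a directory that exists. */
libname health '/home/me/sasdata';

/* A one-level name such as weight_club lives in the temporary WORK library
   and is deleted when the SAS session ends. A two-level name is kept on disk. */
data health.weight_club;
   set weight_club;   /* copy the temporary data set created above */
run;
```

After this runs, later sessions can read health.weight_club again once the same libname statement has been submitted.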

Wednesday, March 9, 2011

Kiss My SAS ... part two

Traditional SAS Output
  • Data Set
  • SAS log
  • A Report or Simple Listing
  • Other SAS Files Such as Catalogs
  • External Files or Entries in Other Databases

Data Set

Just what you think it is: it stores the data values along with descriptive information, such as the names and arrangement of the variables. A data set can be temporary or permanent.

SAS Log

A log file ... a record of the SAS statements you entered and of messages about the execution of your program.

A Report or Simple Listing

Grouping and summaries of data ... can display statistics. This can be similar to the example output in the first blog posting.

Other SAS Files Such As Catalogs

These hold information that can't be represented as tables of data values. Examples include function key settings, letters produced by SAS/FSP software, and displays produced by SAS/GRAPH software.

External Files or Entries in Other Databases

Using SAS/ACCESS software, you can create and update files stored in databases such as Oracle.

Procedures

The Proc Statement

Two examples:

proc tabulate data=weight_club;
class team;
var StartWeight EndWeight Loss;
table team, mean*(StartWeight EndWeight Loss);
title 'Mean Starting Weight, Ending Weight,';
title2 'and Weight Loss';
run;

This produces a table with the calculations performed.
So you see, the proc statement starts the TABULATE procedure, and data=weight_club tells it which data set to read. It does not name the procedure.

The second example ... :

proc print data=weight_club;
title 'Health Club Data';
run;

This displays all of the data in the weight_club data set.

So really, you just need a listing of all SAS procedures in order to get anything knocked out.

A Listing of SAS Procedures

Learning Edition Task | SAS PROC | SAS Product
Mixed Models | MIXED | SAS/STAT
Generalized Linear Models | GENMOD | SAS/STAT
Split Column | TRANSPOSE | Base SAS
Stack Column | TRANSPOSE | Base SAS
Utility Procedure | APPEND | Base SAS
Utility Procedure | COMPARE | Base SAS
Utility Procedure | CONTENTS | Base SAS
Utility Procedure | COPY | Base SAS
Correlations | CORR | Base SAS
Utility Procedure | DATASETS | Base SAS
Create Formats | FORMAT | Base SAS
Utility Procedure | FORMAT | Base SAS
One-Way Frequencies | FREQ | Base SAS
Table Analysis | FREQ | Base SAS
Calculate Summary Statistics | MEANS | Base SAS
Utility Procedure | OPTIONS | Base SAS
Utility Procedure | OPTLOAD | Base SAS
Utility Procedure | OPTSAVE | Base SAS
Utility Procedure | PLOT | Base SAS
Utility Procedure | DELETE | Base SAS
List Data | PRINT | Base SAS
Utility Procedure | PRTDEF | Base SAS
Rank | RANK | Base SAS
Utility Procedure | REGISTRY | Base SAS
Utility Procedure | REPORT | Base SAS
Utility Procedure | SORT | Base SAS
Append Table | SQL | Base SAS
Query Builder | SQL | Base SAS
Standardize variables in a SAS data set | STANDARD | Base SAS
Utility Procedure | SUMMARY | Base SAS
Summary Tables | TABULATE | Base SAS
Transpose | TRANSPOSE | Base SAS
Distribution Analysis | UNIVARIATE | Base SAS
ARIMA Modeling and Forecasting | ARIMA | SAS/ETS
Regression Analysis with Autoregressive Errors | AUTOREG | SAS/ETS
Prepare Time Series Data | EXPAND | SAS/ETS
Basic Forecasting | FORECAST | SAS/ETS
Utility Procedure | PDLREG | SAS/ETS
Regression Analysis of Panel Data | TSCSREG | SAS/ETS
Scatter (3D) | G3D | SAS/GRAPH
Surface | G3D | SAS/GRAPH
Utility Procedure | G3GRID | SAS/GRAPH
Utility Procedure | GANNO | SAS/GRAPH
Bar | GCHART | SAS/GRAPH
Donut | GCHART | SAS/GRAPH
Pie | GCHART | SAS/GRAPH
Contour | GCONTOUR | SAS/GRAPH
Utility Procedure | GOPTIONS | SAS/GRAPH
Area | GPLOT | SAS/GRAPH
Bubble | GPLOT | SAS/GRAPH
Finance | GPLOT | SAS/GRAPH
Line | GPLOT | SAS/GRAPH
Line-multi interpolation support | GPLOT | SAS/GRAPH
Scatter (2D) | GPLOT | SAS/GRAPH
Radar | GRADAR | SAS/GRAPH
Capability Analysis | CAPABILITY | SAS/QC
cdf Plot | CAPABILITY | SAS/QC
Comparative Histogram | CAPABILITY | SAS/QC
Histogram | CAPABILITY | SAS/QC
P-P Plot | CAPABILITY | SAS/QC
Probability | CAPABILITY | SAS/QC
Q-Q Plot | CAPABILITY | SAS/QC
Pareto | PARETO | SAS/QC
Box Chart | SHEWHART | SAS/QC
c Chart | SHEWHART | SAS/QC
Control Charts | SHEWHART | SAS/QC
Individual Measures | SHEWHART | SAS/QC
Means and Ranges | SHEWHART | SAS/QC
Means and Standards | SHEWHART | SAS/QC
np Chart | SHEWHART | SAS/QC
p Chart | SHEWHART | SAS/QC
u Chart | SHEWHART | SAS/QC
One-Way ANOVA | ANOVA | SAS/STAT
Utility Procedure | BOXPLOT | SAS/STAT
Canonical Correlation | CANCORR | SAS/STAT
Cluster Analysis | CLUSTER | SAS/STAT
Discriminant Function | DISCRIM | SAS/STAT
Factor Analysis | FACTOR | SAS/STAT
Utility Procedure | FASTCLUS | SAS/STAT
Linear Models | GLM | SAS/STAT
Life Tables | LIFETEST | SAS/STAT
Logistic Regression | LOGISTIC | SAS/STAT
Nonlinear Regression | NLIN | SAS/STAT
Non-Parametric One-Way ANOVA | NPAR1WAY | SAS/STAT
Proportional Hazards | PHREG | SAS/STAT
Principal Components | PRINCOMP | SAS/STAT
Linear Regression | REG | SAS/STAT
Sampling | SURVEYSELECT | SAS/STAT
Utility Procedure | TREE | SAS/STAT
T Test | TTEST | SAS/STAT





Tuesday, March 8, 2011

Get Your SAS On or ... Get Your SAS in Gear

Rules for SAS

SAS statements end with a semicolon (;), just like in C#.
Case does not matter in SAS keywords and variable names, unlike in C#.
A statement can continue onto the next line, as long as you don't split a word.
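
The rules above can be sketched with one step written two ways. This is just an illustration using the WEIGHT_CLUB data set from this blog; SAS should treat both versions the same.

```sas
/* Version 1: upper case, all statements on one line. */
PROC PRINT DATA=weight_club; TITLE 'Health Club Data'; RUN;

/* Version 2: lower case, with the proc statement continued
   across two lines. The break falls between words, not inside one. */
proc print
   data=weight_club;
title 'Health Club Data';
run;
```

Text inside quotes, like the title, is the one place where case is kept exactly as typed.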

Some Sample SAS

This lil' procedure produces a report that displays the values of the variables in the SAS data set WEIGHT_CLUB.

proc print data=weight_club;
title 'Health Club Data';
run;

The first line, the proc print statement, tells SAS to display the variables in a simple, organized form. The title statement slaps a title on the top of the report and run ... well ... that tells SAS to execute the step.

Here is the sample output:

Health Club Data

Obs  Id Number  Name           Team    Start Weight  End Weight  Loss
1    1023       David Duke     red     189           179         10
2    1049       Dan Quayle     yellow  155           144         11
3    1219       Ronnie Reagan  red     171           170         1

To do some calculations, you can use the TABULATE procedure. The MEANS procedure is another option.

proc tabulate data=weight_club;
class team;
var StartWeight EndWeight Loss;
table team, mean*(StartWeight EndWeight Loss);
title 'Mean Starting Weight, Ending Weight,';
title2 'and Weight Loss';
run;

And the results ...

                  Mean
Team    StartWeight  EndWeight  Loss
red     175.33       158.33     17
yellow  169.5        150.5      19


You can see that a class variable, team, is declared in this procedure, along with three analysis variables: StartWeight, EndWeight and Loss.

Notice how it is not necessary to put commas between the variable names.

The next statement, table, requests the mean (or average) calculation for each team ... mean*(StartWeight EndWeight Loss)

Finally, we get some more titles and the run statement.
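
The MEANS procedure can produce the same kind of averages. The following is a rough sketch rather than a drop-in replacement: it assumes the same WEIGHT_CLUB data set with a Loss variable, and its report layout will differ from the TABULATE table.

```sas
/* The mean option limits the output to averages. */
proc means data=weight_club mean;
   class team;                        /* one row group per team */
   var StartWeight EndWeight Loss;    /* variables to average   */
   title 'Mean Starting Weight, Ending Weight, and Weight Loss';
run;
```

TABULATE gives more control over how the table is laid out, while MEANS is shorter to write when a plain summary is enough.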