
Introduction to SAS



  1. Introduction to SAS Tamara Arenovich Tony Panzarella

  2. I. OBJECTIVES This session is intended to introduce you to SAS – what it is, how it works, and how you will use it. The focus of this session is on the SAS programming essentials needed to help get you started on your SAS session. We will also cover some basic descriptive statistics. II. WHAT IS SAS? The acronym SAS stands for Statistical Analysis System. Simply put, it is a software program that allows you to analyze lots of data quite rapidly. It works by having you tell it what to do through a sequence of steps (or commands). Through this sequence of steps, there are four major tasks (in general) that are often performed: • Data access • Data management • Data analysis • Data presentation

  3. III. EXPLORING THE SAS ENVIRONMENT (IN WINDOWS) • A. Enhanced Editor [I] Syntax Rules: SAS programs must be written following syntax rules • SAS statements usually begin with a keyword • SAS statements always end with a semicolon • SAS statements are not case-sensitive, except inside quotation marks: sex = 'm'; is the same as SEX = 'm'; but sex = 'm'; is not the same as sex = 'M';

  4. [II] Steps in a SAS Program • There are two types of steps in a SAS program: • DATA steps • PROC steps • A SAS program can contain any number of DATA and PROC steps • Examples:
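
The examples from this slide are not reproduced in the transcript. As a minimal sketch of the two step types, assuming the SASHELP.CLASS sample dataset that ships with SAS is available (the name WORK.STUDENTS is illustrative):

```sas
/* DATA step: create a temporary dataset WORK.STUDENTS from SASHELP.CLASS */
data work.students;
    set sashelp.class;
run;

/* PROC step: print the first five observations of the new dataset */
proc print data=work.students (obs=5);
run;
```

Each step ends with a RUN statement, and a program may chain as many DATA and PROC steps as needed.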

  5. [III] Running (Submitting) SAS Programs • Running your SAS program: • Until you’re sure your SAS program is completely error-free, submitting only sections of your SAS code at a time is preferred

  6. B. SAS Log • After submitting a SAS program, the SAS log contains information about the processing of the SAS program, including any error or warning messages • In the SAS log, you should always see • The SAS statements • NOTES • You might also see • WARNINGS • ERRORS • Note: Always read your SAS log after running a SAS program.

  7. C. SAS Results Viewer / Output • As a general rule, PROC steps generate output, DATA steps do not. • The results of your PROC steps can be viewed in the Results Viewer window • Earlier versions of SAS (e.g. 9.1, 9.2) print results to the Output window by default. In those versions, ODS commands are required to make the output look nicer and to generate graphics. • Your practicum sites may be working with earlier versions of SAS

  8. D. SAS Files • All of your SAS programs, SAS datasets, SAS log, and SAS output can be saved:

  9. E. Results Window • Lists all the reports that appear in the Output window. • You can also use the Results window to jump through your output for easy navigation. F. Explorer Window • The Explorer Window allows you to browse SAS libraries and SAS datasets.

  10. IV. THE FOUR MAJOR TASKS

  11. Task 1: Data Access (Reading your data into SAS) • [I] SAS Libraries • All SAS files follow a 2-part naming system: libref.fileref (here, "fileref" means the name of the SAS file, i.e. the dataset name) • A libref is a reference to a directory or other physical location on your computer • To define the libref component, we use a 'libname' statement (code) OR the New Library window (point & click) • Example: • libname lunl7 'C:\Users\projects\Data';

  12. When a library is defined through the New Library window (point & click): • There is no record of this library reference in your program or log file • SAS will remember this library designation across sessions

  13. Permanent SAS Libraries: • Libraries that are created by you are known as permanent SAS libraries • You may create as many permanent SAS libraries as you wish • Rules for naming your permanent library (the LIBREF): must be 1 to 8 characters; must begin with a letter or underscore. Temporary SAS Libraries: • Known as the WORK library • If no libref is specified, WORK is assumed
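
A short sketch of the WORK default, assuming the SASHELP.CLASS sample dataset is available and that the USER libref has not been assigned (the dataset name STUDENTS is illustrative):

```sas
/* One-level name: the dataset is stored in the temporary WORK library */
data students;
    set sashelp.class;
run;

/* Two-level name: refers to the same dataset created above */
proc print data=work.students;
run;
```

Datasets in WORK are deleted when the SAS session ends, which is why data you want to keep should be written to a permanent library.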

  14. [II] Reading a SAS Dataset into SAS • Assign a library reference that refers to the directory on your computer where the SAS dataset is saved. • No CARDS statement is required because the data are already in SAS format; the SET statement is used instead
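
The two bullets above can be sketched as follows. The libref MYLIB and the dataset name BASELINE are hypothetical, and the path is the example directory used earlier in these slides; substitute your own.

```sas
/* Hypothetical libref: point it at the folder that holds your SAS dataset */
libname mylib 'C:\Users\projects\Data';

/* Read the permanent dataset MYLIB.BASELINE (hypothetical name)
   into a temporary working copy in the WORK library */
data work.baseline_copy;
    set mylib.baseline;
run;
```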

  15. [III] Import/Export Wizard • The Import/Export wizard guides you through the importing or exporting process • You can import your data from a variety of data sources (e.g. Excel, Access, SPSS, Stata), but make sure your data is structured appropriately prior to importing

  16. Task 2: Data Management • In Task 2, you are modifying the current SAS dataset and turning it into a new SAS dataset that is appropriate for analysis. • All of this cleaning is performed in the DATA step. • [I] Naming a SAS Dataset • All SAS datasets have two-level names: libref.fileref. • The fileref (the dataset name) can be up to 32 characters long • Not case-sensitive • Must begin with a letter or underscore. Subsequent characters can be letters, underscores or numbers. • Special characters (e.g., #) are not used.

  17. Examples of valid SAS dataset names: baseline, baseline1, _baseline. Examples of invalid SAS dataset names: base line (cannot have spaces), baseline#1 (# is not a valid character), 1baseline (cannot begin with a number).

  18. [II] Viewing Contents of SAS Dataset • Use PROC CONTENTS to view descriptive information about your dataset (variable names, types and lengths) • Use PROC PRINT to view the actual data • Within PROC PRINT, use the VAR statement to specify the variables to be displayed • Use the WHERE statement to specify the observations to be displayed
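
A minimal sketch of both procedures, assuming the SASHELP.CLASS sample dataset (with variables NAME, SEX and AGE) is available:

```sas
/* Descriptive information: variable names, types, lengths, number of observations */
proc contents data=sashelp.class;
run;

/* Actual data: only three variables, only observations with age 14 or older */
proc print data=sashelp.class;
    var name sex age;
    where age >= 14;
run;
```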

  19. [IV] Types of SAS Variables • Character variables: may contain letters, numbers, special characters and blanks. Length is 1 to 32,767 bytes [default length is 8 characters]. When creating new variables, the LENGTH statement precedes the SET statement. • Numeric variables: 8 bytes of storage by default, which provides space for 16 to 17 significant digits. SEX as a character variable might be coded as: a). sex = '1' or sex = '2' b). sex = 'Male' or sex = 'Female' SEX as a numeric variable might be coded as: a). sex = 1 or sex = 2 The way variable values are stored affects what you can do with the variables.
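
As an illustration of the two variable types, here is a sketch that assumes the SASHELP.CLASS sample dataset, in which SEX is a character variable coded 'F' or 'M'. The new variable names INITIAL and FEMALE are made up for this example.

```sas
data work.class_types;
    length initial $ 1;             /* declare a 1-byte character variable before SET */
    set sashelp.class;
    initial = substr(name, 1, 1);   /* character: first letter of the name */
    female  = (sex = 'F');          /* numeric: 1 if sex is 'F', otherwise 0 */
run;

/* Check the resulting types and lengths */
proc contents data=work.class_types;
run;
```

A numeric 0/1 version of a variable can be summarized with PROC MEANS, whereas the character version cannot, which is one practical consequence of how values are stored.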

  20. [V] Creating New SAS Datasets • Within the DATA step, use the DATA and SET statements to create a new SAS dataset. • In the DATA statement, specify the new SAS dataset that you are about to create. • In the SET statement, specify the SAS dataset that you are reading from.

  21. [VI] Common Data Management Activities Performed in the DATA Step • Keep or Remove Observations • Use the WHERE statement or the IF statement.
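
Both approaches can be sketched as follows, assuming the SASHELP.CLASS sample dataset; the output dataset names are illustrative.

```sas
/* WHERE: only observations meeting the condition are read in */
data work.girls;
    set sashelp.class;
    where sex = 'F';
run;

/* Subsetting IF: observations failing the condition are not written out */
data work.teens;
    set sashelp.class;
    if age >= 13;
run;

/* IF ... THEN DELETE: explicitly remove observations */
data work.no_young;
    set sashelp.class;
    if age < 13 then delete;
run;
```

One practical difference: WHERE can only refer to variables that already exist in the input dataset, while IF can also test variables created earlier in the same DATA step.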

  22. Comparison Operators that Can be Used with a WHERE or IF statement:
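
The slide's table is not reproduced in this transcript. The standard SAS comparison operators are summarized in the comment below, followed by two small usage examples that assume the SASHELP.CLASS sample dataset.

```sas
/* Symbol and mnemonic forms are interchangeable:
     =   or EQ   equal to
     ^=  or NE   not equal to
     >   or GT   greater than
     <   or LT   less than
     >=  or GE   greater than or equal to
     <=  or LE   less than or equal to
     IN          equal to one of a list of values            */

proc print data=sashelp.class;
    where age ge 15;          /* same as: where age >= 15; */
run;

proc print data=sashelp.class;
    where age in (11, 12);    /* age equals 11 or 12 */
run;
```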

  23. Logical Operators that Can Be Used with a WHERE or IF statement: So, assuming sex is coded 0 or 1 with 1 = female, any of the WHERE statements below could have been used to restrict the dataset to female participants only: where sex = 1; where sex eq 1; where sex ^in (0); where sex not in (0); The following WHERE statement could be used to restrict your dataset to female participants age 65 and older only: where sex = 1 AND age GE 65;
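
A short sketch of the logical operators AND, OR and NOT, assuming the SASHELP.CLASS sample dataset (where SEX is a character variable coded 'F' or 'M', unlike the numeric coding used on the slide):

```sas
/* AND: both conditions must hold */
proc print data=sashelp.class;
    where sex = 'F' and age ge 14;
run;

/* OR: at least one condition must hold */
proc print data=sashelp.class;
    where age lt 12 or age gt 15;
run;

/* NOT: negates the condition that follows */
proc print data=sashelp.class;
    where age not in (13, 14);
run;
```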

  24. Keep or Drop Variables • There are three different ways to do this, and all three methods are done in the DATA step. • Method 1: Use the KEEP (or DROP) statement in the DATA step. • Method 2: Use the KEEP= (or DROP=) dataset option in the DATA statement. • Method 3: Use the KEEP= (or DROP=) dataset option in the SET statement.

  25. Method 1: Use the KEEP (or DROP) statement in the DATA step.
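
A minimal sketch of Method 1, assuming the SASHELP.CLASS sample dataset; the output dataset name is illustrative.

```sas
data work.class_small;
    set sashelp.class;
    keep name sex age;    /* only these variables are written to the new dataset */
run;
```

Replacing the KEEP statement with `drop height weight;` gives the same result for this dataset.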

  26. Method 2: Use the KEEP= (or DROP=) dataset option in the DATA statement.
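
A minimal sketch of Method 2, assuming the SASHELP.CLASS sample dataset; the output dataset name is illustrative.

```sas
/* KEEP= on the output dataset: all variables are available during the step,
   but only NAME, SEX and AGE are written to WORK.CLASS_SMALL */
data work.class_small (keep=name sex age);
    set sashelp.class;
run;
```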

  27. Method 3: Use the KEEP= (or DROP=) dataset option in the SET statement.
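
A minimal sketch of Method 3, assuming the SASHELP.CLASS sample dataset; the output dataset name is illustrative.

```sas
/* KEEP= on the input dataset: only NAME, SEX and AGE are read in,
   so HEIGHT and WEIGHT cannot be used anywhere in this DATA step */
data work.class_small;
    set sashelp.class (keep=name sex age);
run;
```

Because unwanted variables are never read, this form can be the more efficient choice for wide datasets.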

  28. Creating a new variable • There are many ways to create new variables. I will show you two ways here: • Using equations • Using conditional (if-then-else) logic.
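
Both ways can be sketched in one DATA step, assuming the SASHELP.CLASS sample dataset. The new variable names and the age cut-points are made up for illustration.

```sas
data work.class_new;
    set sashelp.class;

    /* 1. Equation: new numeric variable computed from existing ones */
    wt_ht_ratio = weight / height;

    /* 2. Conditional logic: new character variable built with IF-THEN-ELSE.
          LENGTH is set first so that longer values are not truncated. */
    length agegroup $ 8;
    if age < 13 then agegroup = 'Under 13';
    else if age < 16 then agegroup = '13 to 15';
    else agegroup = '16 plus';
run;
```

Note that SAS treats a missing numeric value as smaller than any number, so with real data a missing age would fall into the first group unless it is handled explicitly.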

  29. Renaming a variable: Use the RENAME statement in the DATA step.
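
A minimal sketch, assuming the SASHELP.CLASS sample dataset; the new name GENDER is illustrative.

```sas
data work.class_renamed;
    set sashelp.class;
    rename sex = gender;    /* old-name = new-name */
run;
```

Within this DATA step the variable is still referred to by its old name; the new name takes effect in the output dataset.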

  30. Create descriptive labels for variable names: Use the LABEL statement in the DATA step.
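
A minimal sketch, assuming the SASHELP.CLASS sample dataset; the label text is illustrative.

```sas
data work.class_labeled;
    set sashelp.class;
    label age = 'Age in years'
          sex = 'Sex of student';
run;

/* The LABEL option asks PROC PRINT to show labels instead of variable names */
proc print data=work.class_labeled label;
run;
```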

  31. Format the Values of a Variable • Create and apply formats when you wish to change the appearance of variable values. • Creating and applying user-defined formats involves two steps. First, you must create the formats using the PROC FORMAT step. Then, you must apply the format in the DATA step.
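
The two steps can be sketched as follows, assuming the SASHELP.CLASS sample dataset, in which SEX is a character variable coded 'F' or 'M'. The format name $SEXFMT is made up for this example.

```sas
/* Step 1: create the format (character formats start with $) */
proc format;
    value $sexfmt 'F' = 'Female'
                  'M' = 'Male';
run;

/* Step 2: apply the format in the DATA step */
data work.class_fmt;
    set sashelp.class;
    format sex $sexfmt.;
run;

proc print data=work.class_fmt;
run;
```

The stored values remain 'F' and 'M'; only their displayed appearance changes.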

  32. Merging Files • The MERGE statement can be used to combine two or more SAS datasets • Ensure a unique identifier is present in each file and that every file is sorted by that identifier • One-to-one and one-to-many (or many-to-one) merges work as expected; many-to-many merges DO NOT produce every combination of matching records and should be avoided.
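
A sketch of a one-to-many merge. The two small datasets, their variable names and their values are invented purely for illustration.

```sas
/* One record per participant */
data work.demog;
    input id sex $;
    datalines;
1 F
2 M
3 F
;
run;

/* One or more records per participant */
data work.visits;
    input id visit sbp;
    datalines;
1 1 120
1 2 118
2 1 135
3 1 110
;
run;

/* Both files must be sorted by the identifier before merging */
proc sort data=work.demog;  by id; run;
proc sort data=work.visits; by id; run;

/* Match-merge on ID: each visit record picks up the participant's sex */
data work.combined;
    merge work.demog work.visits;
    by id;
run;
```

The BY statement is what makes this a match-merge; without it, SAS pairs records by position rather than by identifier.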

  33. Task 3: Data Analysis • Two very common SAS procedures: PROC FREQ and PROC MEANS.

  34. [I] PROC FREQ • To produce frequency counts and cross tabular frequency tables. • Can be used with either numerical or character variables. • In the PROC FREQ statement, specify the name of the SAS dataset you wish to analyze. • In the TABLES statement, list the variables you want frequencies of. • For a cross tabular frequency table, use an asterisk (*) symbol in the TABLES statement to cross variables.
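
A minimal sketch, assuming the SASHELP.CLASS sample dataset:

```sas
proc freq data=sashelp.class;
    tables sex age;     /* one-way frequency table for each variable */
    tables sex*age;     /* cross-tabulation: sex in rows, age in columns */
run;
```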

  35. [II] PROC MEANS • To display simple descriptive statistics for variables in a SAS dataset. • Numerical variables only. • In the PROC MEANS statement, specify the name of the SAS dataset to be analyzed. • In the VAR statement, list the variables to be analyzed. • An optional CLASS statement may be used to produce the statistics separately for each subgroup.
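
A minimal sketch, assuming the SASHELP.CLASS sample dataset:

```sas
proc means data=sashelp.class;
    var height weight;    /* numeric analysis variables */
    class sex;            /* optional: statistics computed separately by sex */
run;
```

By default PROC MEANS reports N, mean, standard deviation, minimum and maximum; other statistics can be requested by naming them in the PROC MEANS statement (for example `proc means data=sashelp.class n mean median;`).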

  36. [III] Useful Statements That Can Be Used in Most PROC steps • BY Statement • This statement may be used with PROC FREQ. • It allows you to perform subgroup analysis, working similarly to the CLASS statement in PROC MEANS. • Before using the BY statement in any procedure, you must first sort your data on the BY variable.
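
A minimal sketch of the sort-then-BY pattern, assuming the SASHELP.CLASS sample dataset; the sorted copy's name is illustrative.

```sas
/* Sort first; OUT= writes the sorted copy to WORK and leaves the original untouched */
proc sort data=sashelp.class out=work.class_sorted;
    by sex;
run;

/* One frequency table of AGE for each value of SEX */
proc freq data=work.class_sorted;
    by sex;
    tables age;
run;
```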

  37. WHERE Statement • This statement may be used with both PROC FREQ and PROC MEANS (and others!)
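
A minimal sketch, assuming the SASHELP.CLASS sample dataset; the age cut-off is arbitrary.

```sas
/* Restrict the analysis to a subset without creating a new dataset */
proc means data=sashelp.class;
    where age >= 13;
    var height weight;
run;
```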

  38. FORMAT Statement • This statement may be used in the PROC FREQ procedure on the analysis variable, or it may be used in the PROC MEANS procedure on the class variable. • Use this statement if you did not assign the format of interest in your DATA step, but wish to assign it for a specific procedure only.
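
A sketch of a procedure-specific format, assuming the SASHELP.CLASS sample dataset. The format name AGEFMT and its cut-points are made up for this example.

```sas
/* Define a numeric format that groups ages into two bands */
proc format;
    value agefmt low-12  = '12 and under'
                 13-high = '13 and over';
run;

/* Apply it for this procedure only; PROC FREQ tabulates the formatted values */
proc freq data=sashelp.class;
    tables age;
    format age agefmt.;
run;
```

The dataset itself is unchanged, so other procedures still see the individual ages.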

  39. LABEL Statement • This statement may be used in both PROC FREQ and PROC MEANS. • In PROC MEANS, it may be used with both the analysis variable and the class variable. • Use this statement if you did not assign the descriptive label of interest in your DATA step, but wish to assign it for a specific procedure only.
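
A minimal sketch, assuming the SASHELP.CLASS sample dataset; the label text is illustrative.

```sas
proc means data=sashelp.class;
    class sex;
    var height;
    label sex    = 'Sex of student'
          height = 'Height of student';   /* labels apply to this procedure only */
run;
```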

  40. Task 4: Data Presentation • Here we will show you three methods: • 1). SAS System Options; • 2). Adding titles and/or footnotes; • 3). Saving your output as a PDF or EXCEL file.
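
Methods 2 and 3 can be sketched together. This assumes the SASHELP.CLASS sample dataset; the output path and the title and footnote text are hypothetical.

```sas
/* Open a PDF destination (hypothetical path: change to a folder you can write to) */
ods pdf file='C:\Users\projects\class_report.pdf';

title 'Descriptive statistics for height and weight';
footnote 'Source: SASHELP.CLASS sample dataset';

proc means data=sashelp.class;
    var height weight;
run;

/* Close the destination to finish writing the file */
ods pdf close;

/* Clear the title and footnote so they do not carry over to later output */
title;
footnote;
```

Titles and footnotes stay in effect until they are changed or cleared, which is why the last two statements are worth including.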

  41. [I] SAS System Options • You may use the OPTIONS statement to change SAS system options.

  42. Commonly used options:
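
The slide's own list of options is not reproduced in this transcript. As an illustration only, the following OPTIONS statement sets a few standard system options that mainly affect traditional listing output:

```sas
options nodate       /* do not print the date and time at the top of each page */
        nonumber     /* do not print page numbers */
        linesize=80  /* maximum line width of the output */
        pagesize=60; /* number of lines per page */
```

An OPTIONS statement can appear anywhere in a program, outside of DATA and PROC steps, and its settings remain in effect for the rest of the session or until they are changed.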
