1 / 30

Going Beyond SAS Default Formats

Going Beyond SAS Default Formats. September 12th, 2007 Fall RTSUG Meeting. Objectives. Easy to create multiple user-defined formats Easy to maintain these formats Customize these formats Create your own SAS-Catalog of formats. Efficiency. Lets say your data is like this -.

Télécharger la présentation

Going Beyond SAS Default Formats

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Going Beyond SAS Default Formats September 12th, 2007 Fall RTSUG Meeting

  2. Objectives • Easy to create multiple user-defined formats • Easy to maintain these formats • Customize these formats • Create your own SAS-Catalog of formats. • Efficiency

  3. Lets say your data is like this - CUS_ID VAR1 VAR2 VAR3 VAR4 VAR5 VAR6 VAR7 VAR8 VAR9 VAR11 VAR10 VAR12 VAR13 VAR14 VAR15 VAR16---VAR80 1 UG 3 01 A 8050 A 01 999 01 R003 N N 134 2 UG 3 01 A 5020 A 01 999 01 B099 N N 152 3 UG 3 01 A 7999 A 01 528 01 C035 N N 505 4 UG 3 01 A 00 3670 A 01 999 01 N N 119 5 OH 3 01 A 02 00 8060 A 01 014 01 B010 N N 144 6 OH 3 01 A 02 00 8060 A 01 014 01 B010 N N 144 7 OH 3 01 A 01 00 8060 A 01 014 01 B010 N N 144 8 UG 3 01 A 00 8060 A 01 014 01 B010 N N 144 9 UG 1 01 A 8060 A 01 014 01 B010 N N 144 10 UG 3 01 R 01 8060 A 01 014 01 B010 N N 144 11 OH 1 01 A 01 0002 F 02 014 01 B010 N N 144 12 OH 3 01 A 2510 A 01 528 01 C027 N N 505 13 OH 1 01 A 2510 A 01 528 01 C027 N N 505 14 UG 3 01 A 2510 A 01 528 01 C027 N N 505 15 OH 1 01 A 2510 A 01 528 01 C027 N N 505 16 OH 3 01 A 02 00 2510 A 01 528 01 C027 N N 505 17 UG 3 01 A 02 00 2510 A 01 528 01 C027 N N 505 18 UG 3 01 A 01 00 2510 A 01 528 01 C027 N N 505 19 UG 3 01 A 5812 A 02 192 01 C010 N N 138 20 UG 3 01 A 00 8060 A 01 016 01 B099 N N 107 21 OH 3 01 A 0724 002 A 01 598 01 B001 N N 511 22 OH 3 01 A 00 0724 002 A 01 598 01 B001 N N 511 23 UG 3 01 A 6512 A 01 320 01 N N 153 24 UG 1 01 A 7310 A 01 320 01 N N 153 25 UG 3 01 A 0001 A 01 320 01 B099 N N 153 26 OH 3 01 A 00 5140 A 01 320 01 B099 N N 153 27 UG 3 01 A 00 5410 A 01 542 01 B011 N N 505 28 UG 3 01 A 6512 A 01 076 01 N N 153 29 OH 3 01 A 00 8090 A 01 999 01 B012 N N 505 30 OH 3 01 A 00 8050 A 01 062 01 R004 N N 141 31 OH 3 01 A 00 2030 005 A 01 269 01 B099 N N 155 32 UG 3 01 A 2030 005 A 01 269 01 B099 N N 155

  4. An Example : Cus_Num Acct_Cls 1 02 2 01 3 03 4 03 5 02 6 05 7 05 8 08 9 08 10 01 11 08 . . …more data...

  5. -Continued- Lets assume your groups of customers are classified like this : ACCT_CLS CODE_DESCRIPTION 01 CROWN 02 DIAMOND 03 PLATINUM 04 RUBY 05 EMERALD 06 SAPPHIRE 07 PEARL 08 GOLD 09 SILVER 10 BRONZE

  6. -continued- Desired report/result for business folks, who usually don’t know the raw-data very well….  * Produce a summary of our customers by all account types …. ?

  7. -continued- Outputs in 2 ways: • 1st way The FREQ Procedure ACCT_CLS Cumulative Cumulative ACCT_CLS Frequency Percent Frequency Percent --------------------------------------------------------------- 01 1201772 83.44 1201772 83.44 02 234502 16.28 1436274 99.72 03 5 0.00 1436279 99.72 05 27 0.00 1436306 99.73 06 435 0.03 1436741 99.76 08 3507 0.24 1440248 100.00 Frequency Missing = 1

  8. -continued- • 2nd way (preferable) The FREQ Procedure ACCT_CLS Cumulative Cumulative ACCT_CLS Frequency Percent Frequency Percent ------------------------------------------------------------------ 01: CROWN 1201772 83.44 1201772 83.44 02: DIAMOND 234502 16.28 1436274 99.72 03: PLATINUM 5 0.00 1436279 99.72 05: EMERALD 27 0.00 1436306 99.73 06: SAPPHIRE 435 0.03 1436741 99.76 08: GOLD 3507 0.24 1440248 100.00 Frequency Missing = 1

  9. ….SAS code.… The following code can easily create such results using ‘decodes’ as the labels ….. PROC FORMAT; VALUE $ACCT ’01’ = ’01: CROWN’ ’02’ = ’02: DIAMOND’ ’03’ = ’03: PLATINUM ’04’ = ’04: RUBY’ ’05’ = ’05: EMERALD’ ’06’ = ’06: SAPPHIRE’ ’07’ = ’07: PEARL’ ’08’ = ’08: GOLD’ ’09’ = ’09: SILVER’ ’10’ = ’10: BRONZE’ ; RUN; PROC FREQ DATA = ACCOUNTS; TABLES ACCT_CLS ; FORMAT ACCT_CLS $ACCT. ; RUN;

  10. Problem …. ? ? What if …

  11. Problem …. ? ? • Need formatted reports for 10 or more coded variables, and very quickly….say in an hour….!!!! • Amount of time required to create individual formats….!!!! • How do we know which ‘format’ to use for a desired variable if there are lots of coded variables…!!!!

  12. An attempt to find a solution for this. Lets say your dataset with the decodes look like this …..

  13. NAME CODE DECODE VAR1 01 Residential VAR1 02 Commercial VAR1 03 Industrial VAR1 04 Government VAR2 A On leave VAR2 B On Vacation VAR2 C In Hospital VAR2 D Working From Home VAR2 E Retired VAR2 F On Contract VAR2 G Resigned VAR2 H Laid Off VAR3 101 Salary < $20,000 VAR3 102 Salary between $20,000 and $40,000 VAR3 103 Salary between $40,001 and $60,000 VAR3 104 Salary between $60,001 and $80,000 VAR3 105 Salary between $80,001 and $100,000 VAR3 106 Salary > $100,000 VAR4 A001 Family of 1 member VAR4 A002 Family of 2 members VAR4 A003 Family of 3 members VAR4 A004 Family of 4 members VAR4 A005 Family of 5 members VAR4 A006 Family of 6 members VAR4 A007 Family of 7 or more members VAR5 00A1 In Batch # - 1 VAR5 00A2 In Batch # - 2 VAR5 00A3 In Batch # - 3 VAR5 00A4 In Batch # - 4 VAR5 00A5 In Batch # - 5

  14. NAME CODE DECODE VAR5 00A6 In Batch # - 6 VAR5 00A7 In Batch # - 7 VAR5 00A8 In Batch # - 8 VAR5 00A9 In Batch # - 9 VAR5 00A10 In Batch # - 10 VAR5 00A11 In Batch # - 11 VAR5 00A12 In Batch # - 12 VAR5 00A13 In Batch # - 13 VAR5 00A14 In Batch # - 14 VAR5 00A15 In Batch # - 15 VAR5 00A16 In Batch # - 16 VAR5 00A17 In Batch # - 17 VAR5 00A18 In Batch # - 18 VAR5 00A19 In Batch # - 19 VAR5 00A20 In Batch # - 20 VAR5 00A21 In Batch # - 21 VAR5 00A22 In Batch # - 22 VAR5 00A23 In Batch # - 23 VAR5 00A24 In Batch # - 24 VAR5 00A25 In Batch # - 25 VAR5 00A26 In Batch # - 26 VAR6 SS01 Service Territory Area 1 VAR6 SS02 Service Territory Area 2 VAR6 SS03 Service Territory Area 3 VAR6 SS04 Service Territory Area 4 VAR6 SS05 Service Territory Area 5 VAR6 SS06 Service Territory Area 6 VAR6 SS07 Service Territory Area . . . . . . . so on . . . .

  15. SAS code DATA ONE (KEEP = NAME CODE DECODE2); SET Codes_decodes_data; DECODE2 = trim(CODE)||": "||DECODE; RUN; /* TO CLEAN DATA */ PROC SORT DATA=ONE DUPOUT=DUPS NODUPKEY; BY NAME CODE; RUN; /* TO SEE JUST THE FIELD-NAMES */ PROC FREQ NOPRINT DATA = ONE; TABLES NAME / OUT=FREQOUT (DROP=PERCENT); RUN;

  16. -continued- /* CREATING FORMATS */ DATA FMT; RETAIN N 0; RETAIN FMTNAME; SET ONE; BY NAME; IF FIRST.NAME THEN DO; N+1; FMTNAME='C'||PUT(N,Z3.)||'T'; END; START=CODE; LABEL=DECODE2; TYPE='C'; RUN; PROC FORMAT LIBRARY = WORK.My_Formats CNTLIN=FMT FMTLIB; RUN; QUIT;

  17. ‘FMT’ dataset looks like this…. N FORMAT NAME CODE DECODE2 START LABEL TYPE 1 C001T VAR1 01 01: Residential 01 01: Residential C 1 C001T VAR1 02 02: Commercial 02 02: Commercial C 1 C001T VAR1 03 03: Industrial 03 03: Industrial C 1 C001T VAR1 04 04: Government 04 04: Government C 2 C002T VAR10 M M: Sent 3 Notices M M: Sent 3 Notices C 2 C002T VAR10 N N: Sent 2 Notices N N: Sent 2 Notices C 2 C002T VAR10 Y Y: Sent 1 Notice Y Y: Sent 1 Notice C 3 C003T VAR11 1A 1A: Sent 4 Notices 1A 1A: Sent 4 Notices C 3 C003T VAR11 1B 1B: Sent 5 Notices 1B 1B: Sent 5 Notices C 3 C003T VAR11 1C 1C: Sent 6 Notices 1C 1C: Sent 6 Notices C 3 C003T VAR11 1D 1D: Sent 7 or more Notices 1D 1D: Sent 7 or more Notices C 4 C004T VAR12 SSSS201 SSSS201: Credit Value - 100 SSSS201 SSSS201: Credit Value - 100 C 4 C004T VAR12 SSSS202 SSSS202: Credit Value - 75 SSSS202 SSSS202: Credit Value - 75 C 4 C004T VAR12 SSSS203 SSSS203: Credit Value - 50 SSSS203 SSSS203: Credit Value - 50 C 4 C004T VAR12 SSSS204 SSSS204: Credit Value - 25 SSSS204 SSSS204: Credit Value - 25 C 4 C004T VAR12 SSSS205 SSSS205: Credit Value - 0 SSSS205 SSSS205: Credit Value - 0 C 4 C004T VAR12 SSSS206 SSSS206: Credit Value - Negative SSSS206 SSSS206: Credit Value - Negative C 5 C005T VAR13 1 1: Rejected 1 1: Rejected C 5 C005T VAR13 2 2: Selected 2 2: Selected C 5 C005T VAR13 3 3: Missing 3 3: Missing C 6 C006T VAR14 N N: No - Not On Schedule N N: No - Not On Schedule C 6 C006T VAR14 Y Y: Yes - On Schedule Y Y: Yes - On Schedule C 7 C007T VAR15 N N: No - Customer is not an Employee N N: No - Customer is not an Employee C 7 C007T VAR15 Y Y: Yes - Customer is an Employee Y Y: Yes - Customer is an Employee C 8 C008T VAR16 M M: Batch Missing M M: Batch Missing C 8 C008T VAR16 N N: Batch Not Present N N: Batch Not Present C 8 C008T VAR16 Y Y: Batch Present Y Y: Batch Present C 9 C009T VAR17 M M: Data Unavailable M M: Data Unavailable C 9 C009T VAR17 N N: Payment Not Done N N: Payment Not Done C 9 C009T VAR17 Y Y: Payment Done Y Y: Payment Done C …. SO ON….

  18. APPLICATION OPTIONS FMTSEARCH = (WORK.MY_Formats); PROC FREQ DATA = EXAMPLE; TABLES VAR1 VAR2 VAR3 VAR4 VAR5; FORMAT VAR1 $C002T. VAR2 $C013T. VAR3 $C015T. VAR4 $C016T. VAR5 $C017T. ; RUN;

  19. Output : Cumulative Cumulative VAR1 Frequency Percent Frequency Percent ----------------------------------------------------------------------- 01: Residential 12 50.00 12 50.00 02: Commercial 4 16.67 16 66.67 03: Industrial 4 16.67 20 83.33 04: Government 4 16.67 24 100.00 Cumulative Cumulative VAR2 Frequency Percent Frequency Percent ------------------------------------------------------------------------- A: On leave 12 50.00 12 50.00 B: On Vacation 4 16.67 16 66.67 C: In Hospital 4 16.67 20 83.33 D: Working From Home 4 16.67 24 100.00 Cumulative Cumulative VAR3 Frequency Percent Frequency Percent ----------------------------------------------------------------------------------------- 101: Salary < $20,000 4 16.67 4 16.67 102: Salary between $20,000 and $40,000 4 16.67 8 33.33 103: Salary between $40,001 and $60,000 4 16.67 12 50.00 104: Salary between $60,001 and $80,000 4 16.67 16 66.67 105: Salary between $80,001 and $100,000 8 33.33 24 100.00

  20. Output continued… Cumulative Cumulative VAR4 Frequency Percent Frequency Percent -------------------------------------------------------------------------------------- A001: Family of 1 member 17 70.83 17 70.83 A002: Family of 2 members 1 4.17 18 75.00 A003: Family of 3 members 3 12.50 21 87.50 A004: Family of 4 members 1 4.17 22 91.67 A005: Family of 5 members 2 8.33 24 100.00 Cumulative Cumulative VAR5 Frequency Percent Frequency Percent ----------------------------------------------------------------------------------------------------------------------------------------------- 00A1: In Batch # - 1 2 8.33 2 8.33 00A10: In Batch # - 10 18 75.00 20 83.33 00A11: In Batch # - 11 1 4.17 21 87.50 00A12: In Batch # - 12 1 4.17 22 91.67 00A2: In Batch # - 2 1 4.17 23 95.83 00A4: In Batch # - 4 1 4.17 24 100.00

  21. References For Formats • SAS Dataset as a reference Data My_Fmts_Reference (Rename = (Start = Code Label = Decode)); Set Fmt (Keep = Fmtname Name Start Label); Run; (Input dataset on next slide)

  22. Used thisFMTdataset to create reference table N FORMAT NAME CODE DECODE2 START LABEL TYPE 1 C001T VAR1 01 01: Residential 01 01: Residential C 1 C001T VAR1 02 02: Commercial 02 02: Commercial C 1 C001T VAR1 03 03: Industrial 03 03: Industrial C 1 C001T VAR1 04 04: Government 04 04: Government C 2 C002T VAR10 M M: Sent 3 Notices M M: Sent 3 Notices C 2 C002T VAR10 N N: Sent 2 Notices N N: Sent 2 Notices C 2 C002T VAR10 Y Y: Sent 1 Notice Y Y: Sent 1 Notice C 3 C003T VAR11 1A 1A: Sent 4 Notices 1A 1A: Sent 4 Notices C 3 C003T VAR11 1B 1B: Sent 5 Notices 1B 1B: Sent 5 Notices C 3 C003T VAR11 1C 1C: Sent 6 Notices 1C 1C: Sent 6 Notices C 3 C003T VAR11 1D 1D: Sent 7 or more Notices 1D 1D: Sent 7 or more Notices C 4 C004T VAR12 SSSS201 SSSS201: Credit Value - 100 SSSS201 SSSS201: Credit Value - 100 C 4 C004T VAR12 SSSS202 SSSS202: Credit Value - 75 SSSS202 SSSS202: Credit Value - 75 C 4 C004T VAR12 SSSS203 SSSS203: Credit Value - 50 SSSS203 SSSS203: Credit Value - 50 C 4 C004T VAR12 SSSS204 SSSS204: Credit Value - 25 SSSS204 SSSS204: Credit Value - 25 C 4 C004T VAR12 SSSS205 SSSS205: Credit Value - 0 SSSS205 SSSS205: Credit Value - 0 C 4 C004T VAR12 SSSS206 SSSS206: Credit Value - Negative SSSS206 SSSS206: Credit Value - Negative C 5 C005T VAR13 1 1: Rejected 1 1: Rejected C 5 C005T VAR13 2 2: Selected 2 2: Selected C 5 C005T VAR13 3 3: Missing 3 3: Missing C 6 C006T VAR14 N N: No - Not On Schedule N N: No - Not On Schedule C 6 C006T VAR14 Y Y: Yes - On Schedule Y Y: Yes - On Schedule C 7 C007T VAR15 N N: No - Customer is not an Employee N N: No - Customer is not an Employee C 7 C007T VAR15 Y Y: Yes - Customer is an Employee Y Y: Yes - Customer is an Employee C 8 C008T VAR16 M M: Batch Missing M M: Batch Missing C 8 C008T VAR16 N N: Batch Not Present N N: Batch Not Present C 8 C008T VAR16 Y Y: Batch Present Y Y: Batch Present C 9 C009T VAR17 M M: Data Unavailable M M: Data Unavailable C 9 C009T VAR17 N N: Payment Not Done N N: Payment Not Done C 9 C009T VAR17 Y Y: Payment Done Y Y: Payment Done C …. SO ON….

  23. Dataset as a reference FMTNAME NAME CODE DECODE C001T VAR1 01 01: Residential C001T VAR1 02 02: Commercial C001T VAR1 03 03: Industrial C001T VAR1 04 04: Government C002T VAR10 M M: Sent 3 Notices C002T VAR10 N N: Sent 2 Notices C002T VAR10 Y Y: Sent 1 Notice C003T VAR11 1A 1A: Sent 4 Notices C003T VAR11 1B 1B: Sent 5 Notices C003T VAR11 1C 1C: Sent 6 Notices C003T VAR11 1D 1D: Sent 7 or more Notices C004T VAR12 SSSS201 SSSS201: Credit Value - 100 C004T VAR12 SSSS202 SSSS202: Credit Value - 75 C004T VAR12 SSSS203 SSSS203: Credit Value - 50 C004T VAR12 SSSS204 SSSS204: Credit Value - 25 C004T VAR12 SSSS205 SSSS205: Credit Value - 0 C004T VAR12 SSSS206 SSSS206: Credit Value - Negative C005T VAR13 1 1: Rejected C005T VAR13 2 2: Selected C005T VAR13 3 3: Missing C006T VAR14 N N: No - Not On Schedule C006T VAR14 Y Y: Yes - On Schedule

  24. References For Formats • SAS Catalogs as a reference

  25. SAS CATALOG as a reference

  26. -Continued- /* SAS CODE TO CREATE FORMAT-CATALOG */ DATA LABELS; SET My_fmts_reference; KEEP FMTNAME NAME; RUN; PROC SORT DATA = LABELS NODUPKEY; BY NAME; RUN; %MACRO CATMAC(FMTNAME,NAME); PROC CATALOG CAT = MY_Formats; MODIFY &FMTNAME (DESC=&NAME) / ET=FORMATC; CONTENTS OUT = CATOUT; RUN; QUIT; %MEND; DATA _NULL_; SET LABELS; CALL EXECUTE("%CATMAC(" || FMTNAME || ",'" || NAME || "')"); RUN;

  27. Efficiency Option 1: This SAS code (autoexec.sas) opens-up the catalog window automatically as soon as the SAS session starts, provided these formats were saved in a “Permanent SAS-Library”. It’s saved here (Ask your SAS technical support where to save): <C:\Program Files\SAS\SAS 9.1> DM 'CATALOG Paper.My_Formats';

  28. OPTIONS FMTSEARCH = (WORK.MY_Formats); Run this … Run this… Run this again … Run this again again…. Run this again again again….

  29. Option 2: This code should be added in SASV9.txt (SAS Configuration Information). It loads all the formats automatically in the memory as soon as the SAS session starts, provided these formats were saved in a “Permanent SAS-Library”. It’s saved here (Ask your SAS technical support where to save): <C:\Program Files\SAS\SAS 9.1\nls\en\SASV9.txt> -FMTSEARCH (Paper.My_Formats) No need to run OPTIONS FMTSEARCH = () every time to use the formats !!!!!

  30. For Questions… Contact: Divya Jain Divya.Jain@pgnmail.com (919)-546-5129 Progress Energy Company

More Related