Chapter 16 Processing Variables with Arrays

This chapter focuses on grouping variables into arrays, performing actions on array elements, creating new variables with the ARRAY statement, assigning initial values to array elements, and creating temporary elements with an ARRAY statement.

valtina

Presentation Transcript


  1. Chapter 16: Processing Variables with Arrays
     Objectives
     • Group variables into one- and two-dimensional arrays
     • Perform an action on array elements
     • Create new variables with an ARRAY statement
     • Assign initial values to array elements
     • Create temporary elements with an ARRAY statement

  2. Array Processing
     You can use arrays to simplify programs that
     • perform repetitive calculations
     • create many variables with the same attributes
     • read data
     • rotate SAS data sets by making variables into observations or observations into variables
     • compare variables
     • perform a table lookup.

  3. Performing Repetitive Calculations
     Employees contribute an amount to charity every quarter. The SAS data set mylib.donate contains contribution data for each employee. The employer supplements each contribution by 25 percent. Calculate each employee's quarterly contribution including the company supplement.

     Partial Listing of mylib.donate
        ID      Qtr1  Qtr2  Qtr3  Qtr4
        E00224    12    33    22     .
        E00367    35    48    40    30

  4. Performing Repetitive Calculations
     The following program performs the calculation without using an array:

        data charity;
           set mylib.donate;
           Qtr1=Qtr1*1.25;
           Qtr2=Qtr2*1.25;
           Qtr3=Qtr3*1.25;
           Qtr4=Qtr4*1.25;
        run;

        proc print data=charity noobs;
        run;

  5. Performing Repetitive Calculations
     Partial PROC PRINT Output
        ID        Qtr1    Qtr2    Qtr3    Qtr4
        E00224   15.00   41.25   27.50       .
        E00367   43.75   60.00   50.00   37.50
        E00441       .   78.75  111.25  112.50
        E00587   20.00   23.75   37.50   36.25
        E00598    5.00   10.00    7.50    1.25

     What if you want to similarly modify 52 weeks of data stored in Week1 through Week52?
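One way to answer the Week1 through Week52 question is the array technique introduced on the following slides. The sketch below is only an illustration: the data set names work.weekly and work.weekly_adj, the ID value, and the placeholder amounts are assumptions, not part of the original material.

```sas
/* Hypothetical input: one employee with 52 weekly amounts (made-up values). */
data work.weekly(drop=i);
   length ID $ 6;
   array Week{52} Week1-Week52;
   ID='E00001';
   do i=1 to 52;
      Week{i}=i;                 /* placeholder amount for week i */
   end;
run;

/* One loop replaces 52 separate assignment statements. */
data work.weekly_adj(drop=i);
   set work.weekly;
   array Week{52} Week1-Week52;
   do i=1 to 52;
      Week{i}=Week{i}*1.25;      /* same 25 percent supplement as the quarterly example */
   end;
run;
```

The second DATA step has the same shape as the four-quarter program; only the array size and the variable list change.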

  6. What Is a SAS Array?
     A SAS array
     • is a temporary grouping of SAS variables that are arranged in a particular order
     • is identified by an array name
     • exists only for the duration of the current DATA step
     • is not a variable.

  7. What Is a SAS Array?
     Each value in an array is
     • called an element
     • identified by a subscript that represents the position of the element in the array.
     When you use an array reference, the corresponding value is substituted for the reference.

  8. What Is a SAS Array?
     [Diagram: the variables ID, Qtr1, Qtr2, Qtr3, and Qtr4, with Qtr1 through Qtr4 grouped under the array name CONTRIB as its first, second, third, and fourth elements.]

  9. What Is a SAS Array?
     [Diagram: the array name CONTRIB groups Qtr1 through Qtr4; each element has an array reference.]
        Element           Variable   Array reference
        First element     Qtr1       CONTRIB{1}
        Second element    Qtr2       CONTRIB{2}
        Third element     Qtr3       CONTRIB{3}
        Fourth element    Qtr4       CONTRIB{4}

  10. The ARRAY Statement
      The ARRAY statement defines the elements in an array. These elements can be processed as a group. You refer to elements of the array by the array name and subscript.

         ARRAY array-name{array-subscript} <$> <length> <array-elements> <(initial-value-list)>;

  11. The ARRAY Statement
      The ARRAY statement
      • must contain all numeric or all character elements
      • must be used to define an array before the array name can be referenced
      • creates variables if they do not already exist in the PDV.

  12. Some Warnings When Using Arrays
      • Do not give an array the same name as a variable in the same DATA step.
      • Avoid using the name of a SAS function as an array name. The array still works, but the function cannot be used in the same DATA step, and SAS writes a warning to the log.
      • Array names cannot be used in LABEL, FORMAT, DROP, KEEP, or LENGTH statements.
      • Arrays do not become part of the output data set. They are temporary names.

  13. Creating a One-Dimensional Array
      An example of using a one-dimensional array to reduce the number of program statements. The following program converts Fahrenheit to Celsius temperatures for each day of the week without using an array:

         data temperature_convert;
            set Fahrenheit;
            Mon=5*(Mon-32)/9;
            Tue=5*(Tue-32)/9;
            Wed=5*(Wed-32)/9;
            Thr=5*(Thr-32)/9;
            Fri=5*(Fri-32)/9;
            Sat=5*(Sat-32)/9;
            Sun=5*(Sun-32)/9;
         run;

  14. The following program converts Fahrenheit to Celsius temperatures for each day of the week with arrays:

         data temperature_convert(drop=i);
            set Fahrenheit;
            array wkday{7} mon tue wed thr fri sat sun;
            array celtemp{7} cmon ctue cwed cthr cfri csat csun;
            do i=1 to 7;
               celtemp{i}=5*(wkday{i}-32)/9;
            end;
         run;

      NOTE:
      • The array name is wkday.
      • The number of elements defined in the array is 7.
      • The seven elements are the variables mon, tue, and so on.
      • A second array, celtemp, holds the converted values in the new variables cmon through csun.
      • The arrays are processed in the DO loop.
      • The index of the DO loop, i, is a new variable created by the program. It should be dropped unless it is needed for another purpose in the same DATA step.

  15. Defining an Array
      Write an ARRAY statement that defines the four quarterly contribution variables as elements of an array.

         array Contrib{4} Qtr1 Qtr2 Qtr3 Qtr4;

      [Diagram: the variables ID, Qtr1, Qtr2, Qtr3, and Qtr4, with Qtr1 through Qtr4 labeled as the first through fourth elements of CONTRIB.]

  16. Defining an Array
      Variables that are elements of an array do not need to have similar, related, or numbered names.

         array Contrib2{4} Q1 Qrtr2 ThrdQ Qtr4;

      [Diagram: the variables ID, Q1, Qrtr2, ThrdQ, and Qtr4, with Q1, Qrtr2, ThrdQ, and Qtr4 labeled as the first through fourth elements of CONTRIB2.]

  17. Specifying the Array Subscript
      Besides giving the size of the array, there are other ways to specify the array subscript:
      • Specify a range of values as the dimension:
           array sales{05:09} mon05 mon06 mon07 mon08 mon09;
      • Use an asterisk (*) as the dimension. SAS determines the size of the array by counting the number of elements:
           array contrib{*} qtr1 qtr2 qtr3 qtr4;
      • Enclose the dimension in { }, [ ], or ( ):
           array sales[4] qtr1 qtr2 qtr3 qtr4;
           array sales(4) qtr1 qtr2 qtr3 qtr4;
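When a range such as {5:9} is used as the dimension, the loop bounds can be taken from the array itself with the LBOUND and HBOUND functions. The following is a minimal sketch; the data set name work.monthly, the input values, and the 10 percent adjustment are assumptions chosen only for illustration.

```sas
data work.monthly(drop=m);
   input mon05-mon09;                /* hypothetical monthly sales figures */
   array sales{5:9} mon05-mon09;
   /* LBOUND returns 5 and HBOUND returns 9, so m matches the month number. */
   do m=lbound(sales) to hbound(sales);
      sales{m}=sales{m}*1.1;         /* illustrative 10 percent adjustment */
   end;
   datalines;
100 200 300 400 500
;
run;
```

With this subscript range, sales{5} refers to mon05 and sales{9} to mon09; a reference such as sales{1} would be out of range.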

  18. Specifying Array Elements

         array sales{4} qtr1 qtr2 qtr3 qtr4;

      This array has four elements, which are defined by the four variables qtr1, qtr2, qtr3, and qtr4.
      • It can be simplified by using qtr1-qtr4:
           array sales[4] qtr1-qtr4;
      • Because array elements are a list of variables, any of the usual variable-list forms can be used as array elements:

           A numbered range of variables         Var1-Varn
           The list of variables from A to B     A--B
           All numeric variables                 _NUMERIC_
           All character variables               _CHARACTER_
           All variables                         _ALL_

      • NOTE: When _ALL_ is used, all variables must be of the same type, either all numeric or all character. A mixed list of variables is not allowed.

  19. Some Examples of ARRAY Statements

         array sales(6) mon7 mon8 mon9 mon10 mon11 mon12;
         array sales{7:12} mon7 mon8 mon9 mon10 mon11 mon12;
         array sales(*) july august sept oct nov dec;
         array sales{*} july--dec;   /* NOTE: use A--B for the entire list of variables from A to B */
         array sales[*] mon7-mon12;
         array sales(*) _numeric_;
         array names{*} _character_;

      If all the variables are either all numeric or all character, the ARRAY statement can be written as:

         array names{*} _ALL_;

      NOTE: The asterisk is used as the dimension when the number of elements is not known. SAS counts the number of elements in the list.

  20. In some situations there is no predefined variable list in the DATA step for the array. Can an ARRAY statement be written without listing the array elements? The answer is yes: SAS creates a list of default names for the elements.

      General syntax:

         array a_name{dimension};

      SAS creates a default list of variables named a_name1, a_name2, ..., a_namen, where n is the array dimension.

      Example:

         array wtdif{4};

      creates the four variables wtdif1, wtdif2, wtdif3, and wtdif4.

  21. Processing an Array
      Array processing often occurs within DO loops. An iterative DO loop that processes an array typically has the following form:

         DO index-variable=1 TO number-of-elements-in-array;
            additional SAS statements using array-name{index-variable}...
         END;

      To execute the loop as many times as there are elements in the array, specify that the values of index-variable range from 1 to number-of-elements-in-array.

  22. Processing an Array

         array Contrib{4} Qtr1 Qtr2 Qtr3 Qtr4;
         do i=1 to 4;
            Contrib{i}=Contrib{i}*1.25;
         end;

      [Diagram: CONTRIB{i} points to Qtr1, Qtr2, Qtr3, and Qtr4 as the first through fourth elements.]

  23. Processing an Array
      (Same statements as slide 22.)
      [Diagram: when the value of the index variable i is 1, CONTRIB{i} refers to the first element, Qtr1.]

  24. Processing an Array
      (Same statements as slide 22.)

         Value of index variable i   Array reference   Variable
         1                           CONTRIB{1}        Qtr1 (first element)
         2                           CONTRIB{2}        Qtr2 (second element)
         3                           CONTRIB{3}        Qtr3 (third element)
         4                           CONTRIB{4}        Qtr4 (fourth element)

  25. Use the DIM(array_name) function to obtain the number of elements in an array when the dimension is not specified in the ARRAY statement. When an asterisk is used instead of a dimension, SAS determines the size of the array by counting its elements. The DO loop that processes the array still needs an upper bound; because SAS has already counted the elements, DIM simply returns that count.

      The following example converts the distances from six cities to Mt. Pleasant from miles to kilometers:

         array distance(*) Dist1-Dist6;
         do k=1 to dim(distance);
            distance{k}=distance{k}*1.6;
         end;
         run;
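The DIM fragment can be placed in a complete DATA step as follows. This is a sketch under stated assumptions: the data set name work.dist_km and the six mileage values are invented for the demonstration, and the conversion factor 1.6 is the one used on the slide.

```sas
data work.dist_km(drop=k);
   input Dist1-Dist6;                   /* hypothetical distances in miles */
   array distance{*} Dist1-Dist6;       /* SAS counts six elements */
   do k=1 to dim(distance);             /* DIM returns that count */
      distance{k}=distance{k}*1.6;      /* miles to kilometers, factor from the slide */
   end;
   datalines;
10 25 40 65 80 120
;
run;
```

Because the loop bound comes from DIM, adding Dist7 to the variable list would require no change to the DO statement.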

  26. Performing Repetitive Calculations

         data charity(drop=i);
            set mylib.donate;
            array Contrib{4} Qtr1 Qtr2 Qtr3 Qtr4;
            do i=1 to 4;
               Contrib{i}=Contrib{i}*1.25;
            end;
         run;

      When i=1:
         Contrib{1}=Contrib{1}*1.25;   is executed as   Qtr1=Qtr1*1.25;

  27. Performing Repetitive Calculations
      (Same program as slide 26.)
      When i=2:
         Contrib{2}=Contrib{2}*1.25;   is executed as   Qtr2=Qtr2*1.25;

  28. Performing Repetitive Calculations
      (Same program as slide 26.)
      When i=3:
         Contrib{3}=Contrib{3}*1.25;   is executed as   Qtr3=Qtr3*1.25;

  29. Performing Repetitive Calculations
      (Same program as slide 26.)
      When i=4:
         Contrib{4}=Contrib{4}*1.25;   is executed as   Qtr4=Qtr4*1.25;

  30. Performing Repetitive Calculations

         proc print data=charity noobs;
         run;

      Partial PROC PRINT Output
         ID        Qtr1    Qtr2    Qtr3    Qtr4
         E00224   15.00   41.25   27.50       .
         E00367   43.75   60.00   50.00   37.50
         E00441       .   78.75  111.25  112.50
         E00587   20.00   23.75   37.50   36.25
         E00598    5.00   10.00    7.50    1.25

  31. Creating Variables with Arrays
      Calculate the percentage that each quarter's contribution represents of the employee's total annual contribution. Base the percentage only on the employee's actual contribution and ignore the company contributions.

      Partial Listing of mylib.donate
         ID      Qtr1  Qtr2  Qtr3  Qtr4
         E00224    12    33    22     .
         E00367    35    48    40    30

  32. Creating Variables with Arrays

         data percent(drop=i);
            set mylib.donate;
            Total=sum(of Qtr1-Qtr4);
            array Contrib{4} Qtr1-Qtr4;
            array Percent{4};
            do i=1 to 4;
               Percent{i}=Contrib{i}/Total;
            end;
         run;

      The second ARRAY statement automatically creates four numeric variables: Percent1, Percent2, Percent3, and Percent4.

  33. Creating Variables with Arrays

         proc print data=percent noobs;
            var ID Percent1-Percent4;
            format Percent1-Percent4 percent6.;
         run;

      Partial PROC PRINT Output
         ID      Percent1  Percent2  Percent3  Percent4
         E00224       18%       49%       33%         .
         E00367       23%       31%       26%       20%
         E00441         .       26%       37%       37%
         E00587       17%       20%       32%       31%
         E00598       21%       42%       32%        5%

      NOTE: percent6. is a display format that shows values as percentages with a % sign.

  34. Creating Variables with Arrays
      Calculate the difference in each employee's actual contribution from one quarter to the next.

      Partial Listing of mylib.donate
         ID      Qtr1  Qtr2  Qtr3  Qtr4
         E00224    12    33    22     .
         E00367    35    48    40    30

      [Diagram label: First difference]

  35. Creating Variables with Arrays
      (Same task and listing as slide 34.)
      [Diagram labels: First difference, Second difference]

  36. Creating Variables with Arrays
      (Same task and listing as slide 34.)
      [Diagram labels: First difference, Second difference, Third difference]

  37. Creating Variables with Arrays

         data change(drop=i);
            set mylib.donate;
            array Contrib{4} Qtr1-Qtr4;
            array Diff{3};
            do i=1 to 3;
               Diff{i}=Contrib{i+1}-Contrib{i};
            end;
         run;

  38. Creating Variables with Arrays

         proc print data=change noobs;
            var ID Diff1-Diff3;
         run;

      Partial PROC PRINT Output
         ID      Diff1  Diff2  Diff3
         E00224     21    -11      .
         E00367     13     -8    -10
         E00441      .     26      1
         E00587      3     11     -1
         E00598      4     -2     -5

  39. Assigning Initial Values in an ARRAY Statement
      Sometimes the elements of an array need initial values. For example, suppose a sales goal is set for each quarter and the difference between the actual sales and the goal is to be computed each quarter. Initial values are assigned in the ARRAY statement with the following syntax:

         ARRAY array_name(dim) variable-names (initial-values);

      Example:

         array sales[4] sale1-sale4;
         array goals[4] goal1-goal4 (5000 6000 7500 9000);

      NOTE: The initial values are assigned to the variables in order: the first value to the first variable, the second value to the second variable, and so on.

  40. Assigning Initial Values in an Array Without Creating New Variables in the SAS Data Set
      By default, when an array is defined, its elements are either the variables listed in the ARRAY statement or variables that SAS creates with the names array_name1, array_name2, and so on.

      Example:

         array sales(4) sale1-sale4;
         array diff{4};

      The second statement creates four variables: diff1, diff2, diff3, and diff4.

         array goal{4} (5000 6000 7500 9000);

      This statement creates four variables in the SAS data set: goal1=5000, goal2=6000, goal3=7500, and goal4=9000. Because these values are used only to determine the difference between goals and sales, there is no need to add them to the SAS data set. Use _TEMPORARY_ in the ARRAY statement to create temporary elements that are not added to the SAS data set:

         ARRAY array_name{dim} _TEMPORARY_ (initial-values);

      Example:

         array goal[4] _temporary_ (5000 6000 7500 9000);
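The sales and goals statements can be combined into one DATA step along these lines. This is a sketch, not the original author's program: the data set name work.sales_vs_goal, the Rep variable, and the sales figures are hypothetical, while the goal values are the ones given above.

```sas
data work.sales_vs_goal(drop=q);
   input Rep $ sale1-sale4;                            /* made-up sales data */
   array sales{4} sale1-sale4;
   array goal{4} _temporary_ (5000 6000 7500 9000);    /* not written to the output data set */
   array diff{4};                                      /* creates diff1-diff4 */
   do q=1 to 4;
      diff{q}=sales{q}-goal{q};
   end;
   datalines;
A 5200 5900 8000 9100
B 4800 6100 7000 9500
;
run;
```

The output data set contains Rep, sale1-sale4, and diff1-diff4. No goal1-goal4 variables appear, and no DROP= entry is needed for them.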

  41. Assigning Initial Values in an ARRAY Statement
      Determine the difference between employee contributions and last year's average quarterly goals of $10, $15, $5, and $10 per employee.

         data compare(drop=i Goal1-Goal4);
            set mylib.donate;
            array Contrib{4} Qtr1-Qtr4;
            array Diff{4};
            array Goal{4} Goal1-Goal4 (10,15,5,10);
            do i=1 to 4;
               Diff{i}=Contrib{i}-Goal{i};
            end;
         run;

      When an array is initialized with constants, the constants must be enclosed in parentheses.

  42. Assigning Initial Values

         proc print data=compare noobs;
            var ID Diff1 Diff2 Diff3 Diff4;
         run;

      Partial PROC PRINT Output
         ID      Diff1  Diff2  Diff3  Diff4
         E00224      2     18     17      .
         E00367     25     33     35     20
         E00441      .     48     84     80
         E00587      6      4     25     19
         E00598     -6     -7      1     -9

  43. Creating Variables with Arrays: Compilation

      Partial Listing of mylib.donate
         ID      Qtr1  Qtr2  Qtr3  Qtr4
         E00224    12    33    22     .
         E00367    35    48    40    30

         data compare(drop=i Goal1-Goal4);
            set mylib.donate;
            array Contrib{4} Qtr1-Qtr4;
            array Diff{4};
            array Goal{4} Goal1-Goal4 (10,15,5,10);
            do i=1 to 4;
               Diff{i}=Contrib{i}-Goal{i};
            end;
         run;

      PDV at this stage:
         ID  Qtr1  Qtr2  Qtr3  Qtr4

  44. Creating Variables with Arrays: Compilation
      (Same listing and program as slide 43.)
      PDV at this stage:
         ID  Qtr1  Qtr2  Qtr3  Qtr4  Diff1  Diff2  Diff3  Diff4

  45. Creating Variables with Arrays: Compilation
      (Same listing and program as slide 43.)
      PDV at this stage:
         ID  Qtr1  Qtr2  Qtr3  Qtr4  Diff1  Diff2  Diff3  Diff4  Goal1  Goal2  Goal3  Goal4
                                                                    10     15      5     10

  46. Creating Variables with Arrays: Compilation
      (Same listing and program as slide 43.)
      PDV at this stage:
         ID  Qtr1  Qtr2  Qtr3  Qtr4  Diff1  Diff2  Diff3  Diff4  Goal1  Goal2  Goal3  Goal4  i
                                                                    10     15      5     10

  47. Creating Variables with Arrays: Compilation
      (Same listing and program as slide 43.)
      PDV at this stage (five variables are flagged D for drop, matching drop=i Goal1-Goal4):
         ID  Qtr1  Qtr2  Qtr3  Qtr4  Diff1  Diff2  Diff3  Diff4  Goal1  Goal2  Goal3  Goal4  i
                                                                     D      D      D      D  D
                                                                    10     15      5     10

  48. Performing a Table Lookup Without Creating New Variables in the ARRAY Statement
      To define temporary array elements, use the keyword _TEMPORARY_ instead of specifying variable names when you create the array.

         data compare(drop=i);
            set mylib.donate;
            array Contrib{4} Qtr1-Qtr4;
            array Diff{4};
            array Goal{4} _temporary_ (10,15,5,10);
            do i=1 to 4;
               Diff{i}=Contrib{i}-Goal{i};
            end;
         run;

      When an array is defined, its element variables are written to the output data set unless they are dropped or the array is declared _TEMPORARY_. In this example, SAS creates the new variables Diff1-Diff4. Without _TEMPORARY_, SAS would have created Goal1-Goal4 as well.

  49. Defining a SAS Array with Character Variables
      So far, arrays have been defined and processed with numeric variables. SAS arrays can also handle character variables. The ARRAY statement has the following syntax:

         ARRAY array_name{dim} $ <length> <list of character variables>;

      • $ is required for character arrays.
      • By default, all character variables created by the ARRAY statement have length 8. Specify a length to override the default. The specified length is assigned to all character variables created by the ARRAY statement.

      Examples:

         array Employ{4} $ Emp90 Emp95 Emp2000 Emp2005;
         array Employ(4) $;       /* creates Employ1-Employ4, each with length 8 */
         array Employ[4] $ 30;    /* creates Employ1-Employ4, each with length 30 */
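A character array is processed in a DO loop in the same way as a numeric one. The following minimal sketch assumes a data set name (work.jobs), a length of 12, and job-title values that are not in the original; only the variable names Emp90 through Emp2005 come from the example above.

```sas
data work.jobs(drop=i);
   length Emp90 Emp95 Emp2000 Emp2005 $ 12;        /* assumed length for the demo */
   input Emp90 $ Emp95 $ Emp2000 $ Emp2005 $;      /* hypothetical job titles */
   array Employ{4} $ Emp90 Emp95 Emp2000 Emp2005;
   do i=1 to dim(Employ);
      Employ{i}=upcase(Employ{i});                 /* apply one action to every element */
   end;
   datalines;
clerk analyst manager director
;
run;
```

Because the four variables already exist with length 12 when the ARRAY statement is compiled, the array uses that length rather than the default of 8.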

  50. Rotating a SAS Data Set
      Rotating, or transposing, a SAS data set can be accomplished by using array processing. When a data set is rotated, the values of an observation in the input data set become values of a variable in the output data set.

      Partial Listing of mylib.donate
         ID      Qtr1  Qtr2  Qtr3  Qtr4
         E00224    12    33    22     .
         E00367    35    48    40    30
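The transcript stops before showing a rotation program, so the following is only one possible sketch of the idea, not the presentation's own code. The data set work.donate is a stand-in for mylib.donate built from the two listed rows, and the output names work.rotate, Qtr, and Amount are assumptions.

```sas
/* Stand-in for mylib.donate, using the two rows from the partial listing. */
data work.donate;
   input ID $ Qtr1-Qtr4;
   datalines;
E00224 12 33 22 .
E00367 35 48 40 30
;
run;

/* Rotate: each input observation becomes four output observations. */
data work.rotate(keep=ID Qtr Amount);
   set work.donate;
   array Contrib{4} Qtr1-Qtr4;
   do Qtr=1 to 4;
      Amount=Contrib{Qtr};     /* value of the current quarter */
      output;                  /* write one row per quarter */
   end;
run;
```

With the two listed employees this produces eight observations, four per ID, including one with a missing Amount for the fourth quarter of E00224.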
