
Evolution of SPSS: Layout, syntax and change





  1. Evolution of SPSS: Layout, syntax and change

  2. Layout

  3. It’s back to the 80-column card

  4. Key to layout of Hollerith card

  5. This determined layout of early SPSS setup files
     • Columns 1 to 15 were reserved for commands
     • Columns 16 to 72 were reserved for sub-commands and specifications
     • Columns 73 to 80 were for numbering the cards
     • Commands had to start in column 1
     • Sub-commands and specifications could start in or after column 16
     • Continuation lines had to start in or after column 16, but variable names could not be wrapped.
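The column rules above can be sketched in a few lines of Python. This is only an illustration of the layout, not anything SPSS itself ran: the function name `split_card` and the sequence number `00000170` are invented for the example, and the command shown is borrowed from the 1973 setup file later in this transcript.

```python
def split_card(line: str) -> dict:
    """Split one 80-column card image into the three early-SPSS fields."""
    # Pad or trim so that slicing always sees exactly 80 columns.
    card = line.rstrip("\n").ljust(80)[:80]
    return {
        "command": card[0:15].rstrip(),         # columns 1-15
        "specification": card[15:72].rstrip(),  # columns 16-72
        "sequence": card[72:80].strip(),        # columns 73-80 (card numbering)
    }


# Hypothetical card: command in column 1, specification in column 16,
# made-up card number in columns 73-80.
card = "INPUT MEDIUM".ljust(15) + "INDATA".ljust(57) + "00000170"
fields = split_card(card)
print(fields)

# A continuation card leaves columns 1-15 blank.
continuation = split_card(" " * 15 + "VAR141,VAR144,VAR145,VAR148")
print(continuation["command"] == "")
```

A blank command field is how a continuation card shows up under this split.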

  6. 001110204+57462235696172244322232422- 2O- 322K2- 3$62$$5 05902-- 89564$-147321
     0012$$$% 1 23 0 19$0$78$$6110$Q31111010 23463110 4113+2211207637321
     002119051-44689428858-45242524431442324T31$3823+84$8354$77 158-5-7M$6$O6$$417321
     0022$$$$ 2 1 3 1$1$$$$22F$11222-41010011022113100 310002220107637321
     003114202+355-953273--3324454341415591+N91238-2+8257$$55+- $- 4-7$$5$$5$2137321
     0032$$$$ 1 32 0 12$$$26N$11222$51111011012122010 310122215127637321
     Raw data (including multi-punches) from 80-column cards (SSRC Quality of Life: 1st Pilot Survey 1971; 2 cards per case, first 3 cases only; multipunches shown in red on the slide). UK Data Archive study 247. The survey was conducted March – May 1971, but SPSS files were not created until 1972-73.

  7. 001110204+57462235696172244322232422- 2O- 322K2- 3$62$$5 05902-- 89564$-147321
     0012$$$% 1 23 0 19$0$78$$6110$Q31111010 23463110 4113+2211207637321
     00139000000101000090000001001000101000110100009000000009000000001010010001007321
     00140001001011000001010001000000000001000010000000000001010000009000000000017321
     00150000000100100001000100000100100110010000011112222222222222212900000000017321
     0016000000001011100000010000000101 7321
     Spread out multi-punching: first case only … done with LSE program MUTOS

  8. Standard 80-column data preparation sheet modified for SPSS use at SSRC Survey Unit

  9. These restrictions were later lifted but it is still helpful for beginners (or even veterans) to retain these distinctions visually by using tabs to inset sub-commands and specifications

  10. Syntax

  11. Some changes to syntax since 1972
      • VARxxx TO VARyyy → Vx to Vy, Qx to Qy etc.
      • Labels allowed in UPPER CASE only → Any printing characters in primes
      • Limits to no. of characters in labels (40 for variables) (20 for values) → Removed, theoretically 255, but printout constraints apply
      • VARIABLE LIST, INPUT FORMAT, INPUT MEDIUM → DATA LIST FILE = RECORDS =
      • BREAKDOWN → MEANS

  12. Effects of changes: many setup jobs from the 1970s and 1980s will no longer work. (1) Fortran format statements have been replaced by DATA LIST. (2) Much data was received in multipunched format and had to be read as alphabetic, but data can't be recoded into the same variables any more.

  13. Data input and transformation

  14. Variable Names
      • Had to be in upper case in the form VARddd, eg VAR001 TO VAR010
      • Later changed to any upper case letter(s) and any digit(s), eg VAR1 TO VAR10 or Q1 to Q10
      • Later still, lower case letters allowed, eg q1 to q10, but print format is still in upper case
      • Still can't do any letter(s) and any letter(s), eg q1a to q1g

  15. Mnemonic variable names
      • Demonic more like!
      • Names look like what they represent and help you to remember them
      • We shall see!
      • sex age income are self-evident
      • but what about idstrng = "strength of identity with political party supported"?

  16. Positional variable names
      • First digit defines card (record)
      • 2nd pair of digits defines start column
      • VAR311 is not the 311th variable, but the variable which starts on record 3 column 11 (field width is determined by the format statement)
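The naming convention can be decoded mechanically. A minimal Python sketch follows. The helper name `decode_positional` is invented here, and it assumes names of exactly the form VAR plus one record digit plus two column digits, as in the examples on these slides.

```python
import re


def decode_positional(name: str) -> tuple[int, int]:
    """Return (record, start_column) for a positional name such as VAR311."""
    match = re.fullmatch(r"VAR(\d)(\d{2})", name.upper())
    if match is None:
        raise ValueError(f"not a positional variable name: {name!r}")
    record = int(match.group(1))        # first digit: card (record) number
    start_column = int(match.group(2))  # next two digits: start column
    return record, start_column


print(decode_positional("VAR311"))  # (3, 11)
print(decode_positional("var105"))  # (1, 5)
```

The field width is not recoverable from the name; as the slide says, that comes from the format statement.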

  17. Read in data in alpha format: 1973

      RUN NAME       QL1UK1 - PILOT 1 FIRST SYSTEM FILE
      FILE NAME      QL1UK1 QUALITY OF LIFE PILOT I UK
      VARIABLE LIST  VAR101 VAR105 VAR109 TO VAR137
                     VAR141,VAR144,VAR145,VAR148 VAR149 VAR152 VAR155 VAR158
                     VAR159 VAR162 VAR165 VAR166 VAR169 VAR172 VAR175,VAR176,
                     VAR209 TO VAR223 VAR225 VAR230 VAR234 TO VAR237
                     VAR240 TO VAR256 VAR263 VAR264 VAR266 TO VAR268 VAR270
      INPUT MEDIUM   INDATA
      INPUT FORMAT   FIXED (F3.0,1X,A4,F1.0,13A1,
                     14F1.0,A1,3X,A1,2X,F1.0,A1,
                     2X,2A1,2X,A1,2X,A1,2X,2A1,2X,A1,
                     2X,2A1,2X,A1,2X,A1,2X,2A1,4X/
                     8X,15A1,1X,1A1,4X,A2,2X,A1,
                     F1.0,2A1,2X,17A1,6X,A1,A2,2A1,A2,A4)
      NO. OF CASES   213

  18. Converting alpha to numeric: 1973

      RECODE         VAR105 ('++++'=9999) (CONVERT)/
                     VAR110 ('+'=2)('-'=1)('0'=88) (CONVERT)/
                     VAR111 TO VAR122 VAR137 VAR141 VAR145 VAR149 VAR152
                     VAR155 VAR158 VAR162 VAR166 VAR169 VAR172
                     ('-'=10)('+'=99) (CONVERT)/
                     VAR144 (1=2)/
                     VAR148 VAR165 ('+'=1) ('-'=2) (CONVERT)/
                     VAR159 (' '=1) ('-'=0) (CONVERT)/
                     VAR175 ('+' ' '=88) ('4'=3) (CONVERT)/
                     VAR176 (' ','+'=99) (CONVERT)

  19. Recode of alpha to numeric format 1973

  20. This doesn’t work any more. You have to use dummy variables and then:
      RECODE <dummy varlist> (<old value list> = <new value>) (CONVERT) into <new varlist>
      Now that's syntax!
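For readers who don't have SPSS to hand, here is a rough Python sketch of what RECODE ... (CONVERT) into ... does to one string value. It assumes the usual behaviour: explicit value pairs are applied first, remaining numerals are converted to numbers, and anything else ends up missing. The function name `recode_convert` is invented, `None` stands in for system-missing, and Python's `float()` accepts a few strings SPSS would not, so treat it as an approximation only.

```python
from typing import Optional


def recode_convert(value: str, mapping: dict[str, float]) -> Optional[float]:
    """Recode one string code into a number, leaving the source untouched."""
    if value in mapping:       # explicit (old = new) pairs take priority
        return mapping[value]
    try:
        return float(value)    # stand-in for (CONVERT): numerals become numbers
    except ValueError:
        return None            # assumption: unconvertible codes become missing


# Value pairs taken from the VAR110 line of the 1973 RECODE on slide 18.
var110_rules = {"+": 2, "-": 1, "0": 88}
dummy = ["+", "-", "0", "5", " "]  # made-up string codes for illustration
new = [recode_convert(code, var110_rules) for code in dummy]
print(new)
```

Because the result goes into a new list, the original string codes survive, which is the point of the dummy variables.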

  21. Data List for dummy variables (alphanumeric data) 2002

      data list file 'f:qluk1.dat' records 6
        /1 serial 1-3 v105 to v180 5-80 (a)
        /2 v209 to v280 9-80 (a).

  22. Output from Data List

      Data List will read 6 records from the command file

      Variable   Rec   Start   End   Format
      SERIAL       1       1     3   F3.0
      V105         1       5     5   A1
      V106         1       6     6   A1
      V107         1       7     7   A1
      ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
      V278         2      78    78   A1
      V279         2      79    79   A1
      V280         2      80    80   A1

  23. When I first used SPSS for Windows: My data files were in another directory (oops! folder) and I couldn’t get SPSS to find them. Small data files were placed on dsk:a, eg ‘a:fifth.dat’, but files larger than 1.4 MB presented problems. I got round it by opening the raw data file, dragging the data into the setup file and then bracketing the data with BEGIN DATA and END DATA, but it took ages to wait for the copy. I did this with huge files for 3 years, until a friend gave me a memory stick and I could use dsk:f, eg ‘f:ess2002.dat’. Now I also use a plug-in rewriter and back up on CD.

  24. Embedded data: begin data & end data

  25. Read in data in alpha format: 2002

  26. Recode dummy string variables into numeric variables

      RECODE V209 TO V222 (' ','+','-'=0) ('1'=2) ('2'=1) ('3'=-2) (CONVERT)
               into VAR209 TO var218 xvar219 var220 to VAR222 /
             V223 V225 V234 ('+'=88) (CONVERT) into VAR223 VAR225 VAR234 /
             V230 ('99'=98) ('++'=99) (CONVERT) into var230/
             V236 V237 ('+'=99) (CONVERT) into var236 var237/
             V240 ('+'=88) ('1' '2'=3)(CONVERT) into xvar240/
             V241 TO V252(' '=88)(CONVERT) into VAR241 TO VAR252/
             V253 (' '=88) ('4' '5'=3)(CONVERT) into var253/
             V254 TO V256(' '=88)(CONVERT) into VAR254 TO VAR256.
      RECODE V263 ('+'=88) (CONVERT) into var263 /
             V264 ('++'=88) (CONVERT) into var264 /
             V266 V267 ('+'=1) ('-'=2) ('0'=3) ('1'=4) ('2'=5) ('3'=6) ('4'=7)
               ('5'=8) ('6'=9) ('7'=99) (' '=99)(CONVERT) into var266 var267 /
             V268 (CONVERT) into var268 /
             V270 ('++++'=88) (CONVERT) into var270.

  27. Variable Labels

  28. Variable labels: 1973 (SSRC Quality of Life Survey 1st Pilot 1971)

  29. Variable labels: 1973 (SSRC Quality of Life Survey 1973). Note change of format mid-setup!

  30. Variable labels: 1981 (Fifth form survey in North London)

  31. Variable labels: 1989 (NUS Student Finance Survey 1989)

  32. Value Labels

  33. Value Labels 1973
      • UPPER CASE only
      • VALUE LABELS in cols 1-16
      • Values in round brackets, no primes needed
      • 20 characters for rows
      • 16 characters for columns (in 2 blocks of 8)
      • Tortuous spellings and abbreviations
      • Formatted with packing spaces

  34. Value Labels 1973 (Attitudes and Opinions of Senior Girls: St Trinian’s)

      VALUE LABELS   FORM(1)LOWER FIFTH(2)UPPER FIFTH(3)LOWER SIXTH
                     (4)UPPER SIXTH
                     /YEARBORN(1)1954(2)1955(3)1956(4)1957(5)1958
                     /MONTH(1)JANUARY(2)FEBRUARY(3)MARCH(4)APRIL(5)MAY(6)JUNE(7)JULY
                     (8)AUGUST(9)SEPTEMBR(10)OCTOBER(11)NOVEMBER(12)DECEMBER
                     /VAR111 TO VAR119(1)MOST IMPORTNT(2)NEITHER(3)LEAST IMPORTNT
                     /JOB1 TO JOBAT25(1)ACCNTNCY,FINANCE(2)ARCHIT- ECTURE
                     (3)CIVIL ENGINEER(4)CREATIVE ARTIST(5)DOCTOR, DENTIST
                     (6)FASHION(7)GOVNMNT,ADMIN.(8)HOUSE -WIFE(9)INDUST. TECH.
                     (10)JOURN- ALISM(11)MILITARY SERVICE(12)NURSING
                     (13)OUTDOOR,ATHLETIC(14)OWN BUSINESS(15)PERFORM-ING ARTS
                     (16)PERSONN-EL MNGMT(17)POLITICS(18)PUBLISH -ING
                     (19)SALES + MARKETNG(20)SCIENCE-MATHS(21)SCIENCE-BIOLOGY
                     (22)SCIENCE-SOCIAL(23)SECRET -ARY(24)SOCIAL WORK
                     (25)SOLICTR,BARRISTR(26)TEACHER-PRIMARY(27)TEACHER-SECNDARY
                     (28)TOWN PLANNING(29)TV,FILM PRODUCER(30)UNIVSTY LECTURER
                     (31)LIBRAR -IAN(32)PUBLIC RELATNS(33)COMP- UTERS(34)OTHER

  35. Value Labels 2002: St Trinian’s (twice modified)

      Before                        After
      /JOB1 TO JOBAT25              /job1 to jobat25
      (1) ACCNTNCY,FINANCE          (1) Accntncy,finance
      (2) ARCHIT- ECTURE            (2) Archit- ecture
      (3) CIVIL ENGINEER            (3) Civil engineer
      (4) CREATIVE ARTIST           (4) Creative artist
      (5) DOCTOR, DENTIST           (5) Doctor, dentist
      (6) FASHION                   (6) Fashion
      (7) GOVNMNT,ADMIN.            (7) Govnmnt,admin.
      (8) HOUSE -WIFE               (8) House -wife
      (9) INDUST. TECH.             (9) Indust. tech.
      (10) JOURN- ALISM             (10) Journ- alism

  36. Value Labels 1973 (SSRC Quality of Life: 1st pilot survey 1971)

      VALUE LABELS   VAR109 (1) LOT MORE (2) LITTLE MORE (3) SAME
                     (4) LITTLE LESS (5) LOT LESS
                     /VAR110 (1) FORWARDS (2) BACKWRDS
                     /VAR123 (1) UNSKILLDMAN WKRS (2) SKILLDMAN WKRS
                     (3) OFFICE WORKERS (4) PROFES- SIONAL
                     (5) COMPANY DIRECTRS (6) SHOP KPRS ETC (7) OAP'S
                     (8) INVESTRS ETC (9) NOT KNOWN

  37. Output formats (before Windows)

  38. FREQUENCY COUNT WITH LABELS (1973)

      AGEGRP: Age group
                                     Relative   Adjusted     Cum
                          Absolute     freq       freq       freq
                 Code       freq      ( % )      ( % )      ( % )
      17-29        1.        206       22.1       22.4       22.4
      30-44        2.        214       23.0       23.3       45.8
      45-59        3.        242       26.0       26.4       72.1
      60+          4.        256       27.5       27.9      100.0
                  99.         14        1.5     Missing     100.0
                           ------     ------    -------
      Total                  932      100.0      100.0

      Valid cases 918     Missing cases 14

  39. HISTOGRAM PLOT (with optional statistics) 1973 [1]

      VAR147  SATISFACTION WITH WHOLE LIFE

      Code
            I
         1  ** ( 1)
            I
         2  ** ( 1)
            I
         3  ** ( 2)
            I
         4  ****** ( 9)
            I
         5  ********** ( 18)
            I
         6  ********** ( 17)
            I
         7  ******************* ( 35)
            I
         8  ******************************* ( 60)
            I
         9  ****************** ( 34)
            I
        10  ****************** ( 33)
            I.........I.........I.........I.........I.........I
            0        20        40        60        80       100
            Frequency

      Mean 7.610     Median 7.867     Std dev 1.801
      Valid cases 210     Missing cases 0

      [1] This was done before the graphics facilities were added to SPSS

  40. CONDENSED FORMAT FREQUENCY COUNT (not available in Windows)

      AGE  AGE OF R IN COMPLETE YEARS

                  Adj  Cum               Adj  Cum               Adj  Cum
      Code  Freq   %    %    Code  Freq   %    %    Code  Freq   %    %
        18    15   2    2      42    14   2   42      66    14   2   83
        19    16   2    3      43    14   2   44      67    20   2   85
        20    19   2    5      44    19   2   46      68    12   1   87
        21    17   2    7      45    11   1   47      69    18   2   89
        22    19   2    9      46    15   2   49      70    13   1   90
        23    16   2   11      47    14   2   50      71     8   1   91
        24    16   2   13      48    17   2   52      72     8   1   92
        25    14   2   14      49    15   2   54      73    12   1   93
        26    19   2   16      50    24   3   56      74     9   1   94
        27    25   3   19      51    16   2   58      75     8   1   95
        28    13   1   21      52    15   2   60      76     7   1   96
        29    16   2   22      53    19   2   62      77     7   1   97
        30    13   1   24      54    14   2   63      78     6   1   97
        31    13   1   25      55    15   2   65      79     4   0   98
        32    24   3   28      56    13   1   66      80     5   1   98
        33     7   1   29      57    19   2   68      81     3   0   98
        34    19   2   31      58    16   2   70      82     6   1   99
        35    13   1   32      59    19   2   72      83     2   0   99
        36     7   1   33      60    10   1   73      85     1   0   99
        37    12   1   34      61    15   2   75      86     1   0  100
        38    14   2   36      62    17   2   77      87     1   0  100
        39    13   1   37      63    14   2   78      88     2   0  100
        40    15   2   39      64    17   2   80      90     1   0  100
        41    17   2   41      65    15   2   82

      M i s s i n g   d a t a
      Code  Freq    Code  Freq    Code  Freq
      Wild    15

  41. CONTINGENCY TABLE WITH LABELS

      SEX  SEX OF RESPONDENT  by  HAPPY  HOW HAPPY IS R?

                        HAPPY
               Count  :
               Row %  :NOT TOO  PRETTY   VERY       Row
                      :HAPPY    HAPPY    HAPPY      Total
                      :    1   :    2   :    3   :
      SEX     --------:--------:--------:--------:
                  1   :    24  :   230  :   131  :   385
        MEN           :   6.2  :  59.7  :  34.0  :  41.6
                     -:--------:--------:--------:
                  2   :    33  :   286  :   222  :   541
        WOMEN         :   6.1  :  52.9  :  41.0  :  58.4
                     -:--------:--------:--------:
               Column      57      516      353      926
               Total      6.2     55.7     38.1    100.0

      Number of missing observations = 6

  42. CONTINGENCY TABLE WITH ALL PERCENTAGES

      SEX  SEX OF RESPONDENT  by  AGEGROUP  GROUPED OF R

                        AGEGROUP
               Count  :
               Row %  :17-29    30-44    45-59    60+        Row
               Col %  :                                      Total
               Tot %  :    1   :    2   :    3   :    4   :
      SEX     --------:--------:--------:--------:--------:
                  1   :    88  :    90  :   110  :    92  :   380
        MEN           :  23.2  :  23.7  :  28.9  :  24.2  :  41.4
                      :  42.7  :  42.1  :  45.5  :  35.9  :
                      :   9.6  :   9.8  :  12.0  :  10.0  :
                     -:--------:--------:--------:--------:
                  2   :   118  :   124  :   132  :   164  :   538
        WOMEN         :  21.9  :  23.0  :  24.5  :  30.5  :  58.6
                      :  57.3  :  57.9  :  54.5  :  64.1  :
                      :  12.9  :  13.5  :  14.4  :  17.9  :
                     -:--------:--------:--------:--------:
               Column     206      214      242      256      918
               Total     22.4     23.3     26.4     27.9    100.0

      Number of missing observations = 14

      (NB: Extensive use of this format in analysis is usually a sign of inexperience and anxiety in researchers (or their supervisors) who are either too proud to ask for advice and assistance or who are possibly even completely incompetent. It is also a waste of paper, time and money!)

  43. MEANS SEXISM BY SEX

      SEXISM  Q33 Sexism score
      SEX     Sex of respondent

      Variable   Value   Label       Mean    Cases
      For Entire Population        2.8810      86
      SEX          1     Boys      3.9729      42
      SEX          2     Girls     1.8389      44

      Total Cases = 86

  44. MEANS SEXISM BY SEX BY ETHNIC

      SEX     Sex of respondent

      Variable   Value   Label       Mean    Cases
      For Entire Population        2.8810      86
      ETHNIC       1     White     3.2600      38
        SEX        1     Boys      4.6300      19
        SEX        2     Girls     1.8900      19
      ETHNIC       2     Black     2.5810      48
        SEX        1     Boys      3.4300      23
        SEX        2     Girls     1.8000      25

      Total cases = 86

  45. CROSSBREAK (not available in Windows)

      MEANS VARIABLES = SEXISM(0,9) V348(1,2) ETHNIC(1,2)
        /CROSSBREAK = SEXISM BY V348 BY ETHNIC
        /CELLS = MEAN COUNT

  46. CROSSBREAK output (not available in Windows)

                        ETHNIC
                Mean  :
               Count  : White      Black        Row
                      :                         Total
                      :     1    :     2    :
      SEX     --------:----------:----------:
                  1   :    4.63  :    3.43  :    3.98
        Boys          :      19  :      23  :      42
                     -:----------:----------:
                  2   :    1.89  :    1.80  :    1.84
        Girls         :      19  :      25  :      44
                     -:----------:----------:
        Column Total       3.26       2.58       2.88
                             38         48         86

  47. Crafty use of CROSSBREAK

      RECODE SEXISM (2 THRU 7 = 100) (0,1 = 0) (ELSE = SYSMIS)
      MEANS VARIABLES = SEXISM (0,100) V348 (1,2) ETHNIC (1,2)
        /CROSSBREAK = SEXISM BY V348 BY ETHNIC
        /CELLS = MEAN COUNT

  48. Crafty use of CROSSBREAK: output

                        ETHNIC
                Mean  :
               Count  : White      Other        Row
                      :                         Total
                      :     1    :     2    :
      V348    --------:----------:----------:
                  1   :  100.00  :   82.61  :   90.48
        Boys          :      19  :      23  :      42
                     -:----------:----------:
                  2   :   47.37  :   44.00  :   45.45
        Girls         :      19  :      25  :      44
                     -:----------:----------:
        Column Total      73.68      62.50      67.44
                             38         48         86

      Number of missing observations = 56

      Used with RECODE: cells are % "sexist" and base n
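Why the trick works: once a variable has been recoded to 0 and 100, its mean in any cell is the percentage of cases coded 100. A small Python check of that arithmetic follows. The function name is invented, and the split of 19 "sexist" out of 23 cases is back-calculated from the 82.61 and base n of 23 in the output above, not taken from the raw data.

```python
def percent_via_mean(flags: list[bool]) -> float:
    """Mean of a 0/100 recode, which equals the percentage of True cases."""
    recoded = [100 if flag else 0 for flag in flags]  # like RECODE ... = 100 / = 0
    return sum(recoded) / len(recoded)


# Assumed split for the Boys / Other cell: 19 of 23 cases coded 100.
boys_other = [True] * 19 + [False] * 4
cell = round(percent_via_mean(boys_other), 2)
print(cell, len(boys_other))
```

So a MEANS table with MEAN and COUNT in each cell doubles as a percentage table with its base n.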

  49. Back to variable names

  50. Data List with mnemonic variable names (British Social Attitudes 1987: Curtice)

      /2 VERSION 8 READPAP 9 WHPAPER 10-11
         SUPPARTY CLOSEPTY 12-13 PARTYID1 14-15
         IDSTRNG CNTLCNCL RATES RENTS EEC NATO NATION USANUKE OWNNUKE
         UKNUCPOL DEFPARTY PEACE NIRELAND TROOPOUT 16-29
         HINCDIFF HINCPAST HINCXPCT 31-55 RECONACT 56-57
         RFTEDUC RTRAING RPAIDWRK RWAITWRK RREGISTD RSEEKWRK RNTLOOK
         RSICK RRETIRD RATHOME RELSE REMPLOYE 58-69
         EJBHOURS 70-71
         EJBHRCAT WAGENOW PAYGAP WAGEXPCT NUMEMP EMSMEWRK EMSEXWRK
         EMWOMCLD EMWOMWLD 72-80
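When several names share one column range, DATA LIST divides the columns equally among them. A minimal Python sketch of that rule is below. The helper name `assign_columns` is invented, and raising an error when the range does not divide evenly is a choice made for this sketch rather than a description of SPSS's own messages.

```python
def assign_columns(names: list[str], start: int, end: int) -> dict[str, tuple[int, int]]:
    """Share the columns start..end equally among the listed variable names."""
    width, remainder = divmod(end - start + 1, len(names))
    if remainder:
        raise ValueError("column range does not divide evenly among the names")
    return {
        name: (start + i * width, start + (i + 1) * width - 1)
        for i, name in enumerate(names)
    }


# Two fragments from the data list on this slide.
print(assign_columns(["SUPPARTY", "CLOSEPTY"], 12, 13))
print(assign_columns(["WHPAPER"], 10, 11))
```

One name over two columns gives a two-column field; two names over the same width give one column each.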
