Understand how to access and analyze large-scale secondary data, exploring the benefits and drawbacks, methods of obtaining datasets, and analytical possibilities. Enhance skills in data management and analysis using software tools like SPSS and SAS.
 
                
Lecture 9 of 47C5 Social Research Process I: Using Secondary Datasets Paul Lambert, 8.10.03, 9-10am
Resources for lectures 8,9,11,12 • Lecture slides on WebCT site • 2 Reading lists: • Initial list in 47C5 unit outlines • Some additions on further list at WebCT site Also: http://staff.stir.ac.uk/paul.lambert/teaching.htm
L9: Using Secondary Datasets Introduction and background Accessing secondary datasets Qualities of secondary datasets Data analysis / management issues Key variables in survey research
1) Introduction and Background • Vast quantity of surveys conducted • An efficient step would be to analyse existing data (secondary) rather than personally collect your own (primary) • Data archives collate survey datasets and supply them for secondary analysis
Large scale data • Lecture 8: modern social survey analysis is most often either large scale secondary or small scale primary • Several assets of large scale surveys: • Generalise • Multivariate (more variables and more cases)
Large surveys are expensive: • Government funds many large surveys (also EU; LAs; charities; commercial) • Often made available freely or at low cost • An ideal research tool (see ESRC): • Quick to access • Methodological rigour • Falsifiable – others can also access the data
Secondary analysis of surveys • Makes particular sense when large scale datasets are desirable • Also often applies to smaller surveys • Involves particular issues of data analysis, management and interpretation • …Is a highly marketable skill!
2) Accessing Secondary datasets • Internet and computing developments have revolutionised delivery of data resources • Three steps to data access: • Find out survey details / documentation • Apply for access from archive or collectors • Obtain and analyse the data
2.1) Finding Details • The modern way: • Internet search, eg UK data archive, UK Question Bank, many others (reading list) • The old fashioned way: • Look out for research reports using datasets and contact authors / data collectors directly
The UK data archive www.data-archive.ac.uk • ESRC Efforts to encourage usage • ‘Athens’ authentication • Survey descriptions and lists of research • Variable lists • ‘NESSTAR’ to browse data • Links to more sources for secondary data
2.2) Applying for access • The modern way: • Email / webpage forms, agree to conditions of access (anonymised data to reduce ‘disclosure risk’) • The old fashioned way: • Personal contacts and requests to original data collectors
2.3) Obtaining / analysing data • The modern way: • Download data from supplier (usually compressed and portable format), use with documentation and variable lists in data analysis package (eg SPSS) • The old fashioned way: • A plain text computer file on disk, and copy of original questionnaire, arrive by post: good luck!
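As an illustration of why the documentation matters with a plain text file, here is a minimal Python sketch that reads fixed-width records using column positions of the kind a codebook would supply. The variable names, column positions, value labels and the two data lines are all invented for the example; a real survey's documentation would provide its own.

```python
# Hypothetical codebook: variable name -> (start, end) character positions.
# In practice these positions come from the survey documentation.
CODEBOOK = {"pid": (0, 4), "sex": (4, 5), "age": (5, 8)}

# Assumed value labels for the 'sex' variable (invented for this sketch).
SEX_LABELS = {1: "male", 2: "female"}

# Two invented fixed-width records, as they might appear in a plain text file.
RAW_LINES = ["00011034", "00022061"]


def parse_record(line, codebook=CODEBOOK):
    """Slice one fixed-width line into named integer variables."""
    return {name: int(line[start:end]) for name, (start, end) in codebook.items()}


records = [parse_record(line) for line in RAW_LINES]

# Attach the human-readable label alongside the numeric code.
for record in records:
    record["sex_label"] = SEX_LABELS[record["sex"]]

for record in records:
    print(record)
```

Without the codebook, the string "00011034" cannot be interpreted at all, which is the point of the "good luck!" above.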
3) Qualities of Secondary data • Efficient: cheap & quick to access / analyse • Scale of data larger than most can afford • Methodological rigour of major suppliers: • Sampling • Questionnaire and variable design • Trained interviewers and data entry • Falsifiable nature of analysis
Some drawbacks • Distance from data collection • Harder to assess reliability / validity • Many variables already pre-coded • Can’t change / add anything in study • Time delays in accessing results • Data analysis / management complex • May be bracketed with survey originators
Analytical possibilities vary by survey data type • One division: micro-social vs macro-social (most social survey analysis uses the former) • Macro-social data • Government statistics www.statistics.gov.uk • Cross-national statistics (UN, OECD) • Macro-economic time series (trends / forecasts) • Beware: many critiques of ‘official statistics’
Types of micro-social data • Censuses • General overview of whole population • Disclosure risk issues • Cross-sectional surveys • Most widely used sources • Huge range of topic coverage • May be used to study small / rare populations
..more types of micro-social data • Longitudinal datasets • Repeated cross-sections • Panel datasets • Cohort studies • Retrospective studies • Strengths: understand process and causality • Problems: sampling and attrition; complexity
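The attrition problem in panel datasets can be made concrete with a small Python sketch. The respondent IDs and wave membership below are invented; the sketch only shows one simple way to measure how much of the original sample remains at each wave and which respondents form a balanced panel (present at every wave).

```python
# Hypothetical panel: wave number -> set of respondent IDs interviewed.
waves = {
    1: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10},
    2: {1, 2, 3, 4, 5, 6, 7, 8},
    3: {1, 2, 3, 5, 6, 8},
}


def retention_rates(waves):
    """Share of the first-wave sample still responding at each wave."""
    first_wave = waves[min(waves)]
    return {
        wave: len(ids & first_wave) / len(first_wave)
        for wave, ids in sorted(waves.items())
    }


def balanced_panel(waves):
    """Respondents observed at every wave."""
    return set.intersection(*waves.values())


rates = retention_rates(waves)
balanced = balanced_panel(waves)

print(rates)
print(sorted(balanced))
```

If those who drop out differ systematically from those who stay, the balanced panel is no longer representative of the original sample, which is one reason attrition complicates analysis.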
..more types of micro-social data • Cross-nationally comparative datasets • Focussed surveys (IPUMS censuses; ISSP; World Values Survey; European Social Survey) • Longitudinal studies (LIS; ECHP; CHER) • Many analytical attractions, but issues of comparable analysis are complex
Some major UK social surveys • Cross-sectional:
Some major UK social surveys • Longitudinal:
4) Data analysis & management • ..become core skills in using secondary surveys… • Software packages – SPSS, SAS, STATA, .. – with wide capabilities • Good and bad practice – should only do sensible things with data… (see 47C6)
Data Analysis • Good practice • Reflects properties of variables • Describes output in appropriate context • Bad practice (..is widespread) • Forcing data into style of analysis • Attributing false properties to data • Over zealous conclusions
Data Analysis • Assessing the appropriateness of data analysis techniques is inherent to assessing survey research findings (need to learn about statistics and analysis..) • Misuse of secondary data analysis is common – too easy to get data & run (bad) analyses • Primary theme: must remember social context and theories throughout analysis
Data management • Matching data files • Coding / transforming variables • Dealing with ‘missing’ data • Secondary dataset management tends to be: • More complex ☹ • More error prone ☹ • Subject to external scrutiny ☺
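The three data management tasks listed here can be sketched in a few lines of Python. Everything in the example is hypothetical: the household and individual files, the variable names, the age bands, and the use of -9 and -8 as missing-value codes are assumptions made for illustration, not the conventions of any particular survey.

```python
# Assumed missing-value codes (e.g. 'refused', 'don't know'); real surveys
# document their own codes.
MISSING_CODES = {-9, -8}

# Hypothetical household-level file, keyed by household ID.
households = {
    101: {"region": "Scotland"},
    102: {"region": "Wales"},
}

# Hypothetical individual-level file.
individuals = [
    {"pid": 1, "hid": 101, "age": 34, "income": 21000},
    {"pid": 2, "hid": 101, "age": 61, "income": -9},
    {"pid": 3, "hid": 102, "age": 17, "income": -8},
    {"pid": 4, "hid": 103, "age": 45, "income": 30000},  # no household record
]


def clean(value):
    """Dealing with 'missing' data: turn missing codes into None."""
    return None if value in MISSING_CODES else value


def age_group(age):
    """Coding / transforming variables: recode age into invented bands."""
    if age < 18:
        return "under 18"
    if age < 60:
        return "18-59"
    return "60+"


def build_analysis_file(individuals, households):
    """Matching data files: attach household data to each individual."""
    matched, unmatched = [], []
    for person in individuals:
        household = households.get(person["hid"])
        if household is None:
            unmatched.append(person["pid"])  # keep a record of failed matches
            continue
        matched.append({
            "pid": person["pid"],
            "region": household["region"],
            "age_group": age_group(person["age"]),
            "income": clean(person["income"]),
        })
    return matched, unmatched


matched, unmatched = build_analysis_file(individuals, households)
valid_incomes = [row["income"] for row in matched if row["income"] is not None]

print(matched)
print("Unmatched person IDs:", unmatched)
print("Valid incomes:", valid_incomes)
```

Treating -9 as a real income, or silently dropping the unmatched case, are the kinds of errors that make secondary dataset management error prone.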
5) Key variables in social investigations • Variable operationalisation key to surveys • Choices: - in initial data collection - in data recoding / analytic treatment • In secondary analysis, researcher can only influence latter • Here: Comment on some widely used variables (cf Burgess 1986, others)
Age and gender: • Age • Linear or grouping or quadratic.. - which has most social significance? • Age / Period / Cohort confusion • Gender: • Deceptively simple, politically sensitive • Concepts of sexuality; masculinity
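The alternative treatments of age can be shown side by side in a short Python sketch. The age bands are invented for the example, and the birth year calculation is approximate because it ignores whether the birthday has passed in the survey year. It also illustrates the age / period / cohort confusion: once any two of age, survey year and birth year are known, the third is fixed, so their effects cannot be separated without further assumptions.

```python
# Invented age bands: (lowest age, highest age, label).
AGE_BANDS = ((0, 17, "0-17"), (18, 39, "18-39"), (40, 64, "40-64"), (65, 200, "65+"))


def age_terms(age, bands=AGE_BANDS):
    """Return linear, quadratic and grouped versions of the same age."""
    group = next(label for low, high, label in bands if low <= age <= high)
    return {"age": age, "age_squared": age ** 2, "age_group": group}


def birth_year(survey_year, age):
    """Approximate cohort: period (survey year) minus age."""
    return survey_year - age


terms = age_terms(40)
cohort = birth_year(2003, 40)

print(terms)
print(cohort)
```

Which of the three versions enters an analysis is a substantive choice about how age is thought to matter socially, not just a technical one.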
Education and occupation: • Education • Changing ‘levels’ of education over time • Education as proxy for ability, intelligence? • Occupation • Contested meanings of labour market status • Occupational indicators of stratification • Occupational gender segregation
Ethnicity and Health: • Ethnicity • Existence of groups or racist language? • Identity vs nationality vs religion vs .. • Health • Subjective nature of self-reports • Changing terminology and social stigmas
Income and crime • Income • High non-response and recording errors • Current income ≠ general well-being? • Crime • Most crimes not reported • Categories of crimes arbitrary / debated / changing
Key variables: summary • Methods guidelines on appropriate handling: ‘Harmonised concepts and questions’; textbooks; papers / debates on specific issues • Choices / approximations always used • Research reports and methods appendices must explain and justify position taken
Summary: Secondary datasets • Rich resource for survey analysis • Issues and problems in use – but benefits outweigh disadvantages • To understand, best tactic is to read social science research reports based on relevant secondary datasets