Introduction to Factor Analysis

Bonnie Halpern-Felsher, Ph.D., and Megie Okumura, MD, MAS

Road Map • Definition and purpose of factor analysis (example) • Types of factor analysis • Considerations when conducting an Exploratory Factor Analysis (EFA) • Beyond EFA

What is Factor Analysis? A statistical technique used to analyze interrelationships among many variables

Example: Adolescent Invulnerability Scale “A felt sense of invulnerability to injury, harm, and danger.” (Lapsley & Hill, 2010)

Psychological Invulnerability “One’s felt invulnerability to personal or psychological distress.”

Danger Invulnerability “A sense of indestructibility and propensity to take physical risks.”

[Diagram: two circles labeled Psychological Invulnerability and Danger Invulnerability] Circles = factors (also called constructs or latent variables)

[Diagram: the Psychological Invulnerability and Danger Invulnerability factors with Items 2, 4, 7, 19, 1, 3, 5, 20, … drawn as boxes] Boxes = measured variables (also called observed or manifest variables)

[Diagram: the same two-factor model, with one factor-to-item path labeled 0.50] Factor Loading (λ): A measure of the influence of a factor on an observed variable; it gives the strength and direction of that influence

λ² = percent of variance in the measured variable that is accounted for by the factor. Example: 0.696² = 0.48 = 48%. Interpretation: 48% of the variance in the item is accounted for by Factor 1.
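The arithmetic on this slide can be sketched in a few lines of Python. The function name `variance_explained` is hypothetical, and the interpretation assumes standardized variables and a loading taken from a single factor (or from uncorrelated factors).

```python
def variance_explained(loading: float) -> float:
    """Proportion of an item's variance accounted for by one factor.

    Assumes a standardized loading; the sign of the loading gives the
    direction of influence and does not change the variance share.
    """
    return loading ** 2


share = variance_explained(0.696)
print(f"{share:.2f}")  # prints 0.48, i.e. 48% of the item's variance
```

A loading of -0.696 would give the same 48%, because only the strength of the relationship matters for variance explained.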

Principles Behind Factor Analytic Theory • Interrelationships between all possible observed variables may be explained by a small number of factors • Given a set of data, we want to determine the number and nature of underlying factors and the pattern of influence those factors have on the observed variables

Types of Factor Analysis • Exploratory Factor Analysis (EFA) • You do not know what factors you will find (although you may have some idea) • Often used in scale development • Confirmatory Factor Analysis (CFA) • You specify which measured variables will load on which factors • This is a special case of something called Structural Equation Modeling (SEM)

Exploratory Factor Analysis

Item-Level Factor Analysis • Often used to analyze many items that comprise a self-report measurement scale • Is the scale unidimensional or multidimensional? • Does it measure one underlying construct (factor), or… • Does it measure several underlying constructs?

Things to Consider • What statistical software should I use? • Do I use Factor Analysis or Principal Components Analysis? • Have I created my scale items responsibly? • Do my data meet the assumptions of FA? • Which estimation method should I use? • Should I use a rotated solution? What type? • How do I decide how many factors to extract? • Do I have an adequate sample size?

Statistical Software • SPSS • STATA • SAS • AMOS • EQS • LISREL • Different statistical programs label their output differently • You will need to find out how your program labels its output

Comparing FA and PCA (Preacher & MacCallum, 2003)

Issues with Principal Components Analysis • Many people compute a PCA and say it is a FA • This is wrong because FA and PCA are not the same thing • The two methods may give similar results, but not always • Warning: Some programs carry out PCA as the default (e.g., SPSS)

Developing Scale Items • Make sure you have enough items • You may have to delete some later • Make sure items are at least face valid, and are based on theory/previous research • Choose your response scale carefully • Ordinal response scales (e.g., Likert scales) can introduce additional analytic concerns

Testing Assumptions of Factor Analysis • No outliers • Interval data • Linearity • Multivariate normality • Homoscedasticity • No perfect multicollinearity

What is Estimation? • Process of using a set of mathematical procedures to estimate a statistical model (“find the solution”) • Also called extraction

Choosing an Estimation Method • There are a variety of available methods • Limited information on relative strengths and weaknesses • Inconsistent names • Vary by statistical software • General recommendation • Maximum Likelihood Estimation for normally distributed data • Principal Factors Method if data are non-normal (Costello & Osborne, 2005)
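To make "extraction" concrete, here is a minimal sketch of one common variant of the principal factors method, iterated principal axis factoring, written with NumPy. It is an illustration under stated assumptions and not the routine any particular statistics package uses: the function name, the squared-multiple-correlation starting values, and the one-factor example correlation matrix are all hypothetical choices.

```python
import numpy as np


def principal_axis_factoring(R, n_factors, max_iter=200, tol=1e-8):
    """Iterated principal axis factoring on a correlation matrix R."""
    R = np.asarray(R, dtype=float)
    # Starting communalities: squared multiple correlations (an assumption;
    # other starting values are possible).
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(max_iter):
        reduced = R.copy()
        # Communalities replace the 1s on the diagonal. Leaving the 1s in
        # place and stopping after one pass would give PCA loadings instead.
        np.fill_diagonal(reduced, h2)
        eigvals, eigvecs = np.linalg.eigh(reduced)
        top = np.argsort(eigvals)[::-1][:n_factors]
        # Clip tiny negative eigenvalues before taking the square root.
        loadings = eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))
        new_h2 = (loadings ** 2).sum(axis=1)
        converged = np.max(np.abs(new_h2 - h2)) < tol
        h2 = new_h2
        if converged:
            break
    # Eigenvector signs are arbitrary; flip columns so they sum positively.
    loadings = loadings * np.where(loadings.sum(axis=0) < 0, -1.0, 1.0)
    return loadings, h2


# Hypothetical population correlation matrix implied by one factor.
true_loadings = np.array([0.8, 0.7, 0.6, 0.5])
R = np.outer(true_loadings, true_loadings)
np.fill_diagonal(R, 1.0)

loadings, h2 = principal_axis_factoring(R, n_factors=1)
print(np.round(loadings[:, 0], 3))
```

Because the example matrix is built from an exact one-factor model, the recovered loadings should be close to the generating values; with real data the solution will only approximate the observed correlations.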

Rotating Your Solution • Rotation is used to find the most easily interpretable solution • Orthogonal rotation • Forces your factors to be uncorrelated • Several types (e.g., Varimax rotation) • Oblique rotation • Allows your factors to be correlated • Several types (e.g., Promax rotation)
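As an illustration of orthogonal rotation, the sketch below implements the widely used SVD-based varimax iteration in NumPy. The function name and the unrotated loadings are hypothetical, and oblique rotations such as Promax are not covered here.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of an items x factors loading matrix."""
    L = np.asarray(loadings, dtype=float)
    n_items, n_factors = L.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = L @ rotation
        # Gradient-like term of the raw varimax criterion.
        column_ss = np.diag((rotated ** 2).sum(axis=0))
        target = L.T @ (rotated ** 3 - rotated @ column_ss / n_items)
        u, s, vt = np.linalg.svd(target)
        rotation = u @ vt          # always an orthogonal matrix
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break                  # no meaningful improvement
        criterion = new_criterion
    return L @ rotation, rotation


# Hypothetical unrotated loadings: every item loads on both factors.
unrotated = np.array([
    [0.6, 0.5],
    [0.7, 0.4],
    [0.6, -0.5],
    [0.7, -0.4],
])
rotated, rotation = varimax(unrotated)
print(np.round(rotated, 2))
```

The rotation changes which factor each item appears to belong to, but each item's communality (row sum of squared loadings) and the reproduced correlations stay the same.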

Orthogonal vs. Oblique Rotation • How often are constructs in the behavioral sciences completely unrelated in practice?

[Diagram: the two-factor model with a correlation of r = .50 between Psychological Invulnerability and Danger Invulnerability; Items 2, 4, 7, 19, 1, 3, 5, 20, … shown as measured variables]

Which Type of Rotation? • Oblique rotation is often the safest bet • If your factors are actually uncorrelated, you will get roughly the same solution as if you used an orthogonal rotation • Rotations are mathematically equivalent and do not affect how well the model fits the data

Which Rotated Solution is Best? • Look at simple structure • Each item loads heavily on one and only one factor

Choosing Number of Factors • Two common methods • K1 Rule • Cattell’s Scree Test • Much more accurate method • Parallel Analysis

K1 Rule: # of Factors = # of Eigenvalues > 1 (Eigenvalues represent the total amount of variance explained by a factor) *Not very accurate*
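The K1 rule is easy to sketch with NumPy. The function name and the simulated data set (500 cases, 8 items, 2 uncorrelated factors with loadings of 0.8) are hypothetical and only serve to show the computation.

```python
import numpy as np


def k1_rule(data):
    """Count eigenvalues of the item correlation matrix that exceed 1."""
    corr = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]  # largest first
    return int((eigenvalues > 1.0).sum()), eigenvalues


# Hypothetical simulated data: items 1-4 load on factor 1, items 5-8 on factor 2.
rng = np.random.default_rng(0)
pattern = np.zeros((8, 2))
pattern[:4, 0] = 0.8
pattern[4:, 1] = 0.8
factors = rng.standard_normal((500, 2))
unique = rng.standard_normal((500, 8)) * 0.6  # sqrt(1 - 0.8**2)
items = factors @ pattern.T + unique

n_factors, eigenvalues = k1_rule(items)
print(n_factors, np.round(eigenvalues, 2))
```

With clean simulated data like this the rule finds the two generating factors; the slide's warning applies to real data, where eigenvalues near 1 make the cutoff unreliable.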

Cattell’s Scree Test: Choose the number of factors that precedes the last big drop on the scree plot. Can be subjective. *Not very accurate*

Parallel Analysis: Number of Factors = Number of points on the Factor Analysis line that are above the Parallel Analysis line *Accurate, but rarely used* Software: STATA, SPSS, SAS
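A minimal NumPy sketch of parallel analysis follows. It assumes one common variant: eigenvalues of the observed correlation matrix are compared with the 95th percentile of eigenvalues from random normal data of the same size. The function name, the number of simulations, the percentile, and the simulated data are all hypothetical choices, and packaged implementations may differ in detail.

```python
import numpy as np


def parallel_analysis(data, n_sims=100, percentile=95, seed=0):
    """Number of leading observed eigenvalues above the random-data line."""
    data = np.asarray(data, dtype=float)
    n_cases, n_items = data.shape
    rng = np.random.default_rng(seed)
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    simulated = np.empty((n_sims, n_items))
    for i in range(n_sims):
        noise = rng.standard_normal((n_cases, n_items))
        simulated[i] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]
    threshold = np.percentile(simulated, percentile, axis=0)
    above = observed > threshold
    # Count only the leading run of eigenvalues that beat the random data.
    n_factors = n_items if above.all() else int(np.argmin(above))
    return n_factors, observed, threshold


# Hypothetical simulated data: 500 cases, 8 items, 2 uncorrelated factors.
rng = np.random.default_rng(1)
pattern = np.zeros((8, 2))
pattern[:4, 0] = 0.8
pattern[4:, 1] = 0.8
items = rng.standard_normal((500, 2)) @ pattern.T + rng.standard_normal((500, 8)) * 0.6

n_factors, observed, threshold = parallel_analysis(items)
print(n_factors)
```

Plotting `observed` and `threshold` against factor number gives the two lines described on the slide.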

Procedural Options • Determine number of factors based on parallel analysis • Re-run factor analysis with a few more or a few fewer factors • Compare results of different factor analyses with regard to interpretability, residuals, communalities

Interpretability • Are the factors even interpretable? • Which variables load on which factors? • Do the loadings make sense according to previous research, theory, and common sense?

Looking at Communalities • Communality (h²): percent of variance in a given measured variable that is explained by all of the factors jointly • Implications of low communality • For one item → Factor model is not working well for that item; consider deleting item • For several items → Items are not very related to each other

Example: Communalities • Get table of communalities in computer output • For “Nothing can harm me” • h² = .59 • The extracted factors explain 59% of the variation in this item
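When the factors are uncorrelated (an orthogonal solution), an item's communality is the sum of its squared loadings across factors. The sketch below shows that calculation; the function name and the two loadings are hypothetical and are not the values behind the .59 reported above.

```python
def communality(item_loadings):
    """Sum of squared loadings for one item (assumes uncorrelated factors)."""
    return sum(loading ** 2 for loading in item_loadings)


# Hypothetical loadings of one item on two orthogonal factors.
h2 = communality([0.70, 0.30])
print(f"{h2:.2f}")  # prints 0.58
```

For an oblique solution the factor correlations also enter the calculation, so this simple sum no longer applies.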

Evaluating Residual Correlations • Two sets of correlations among items • Correlations predicted by your factor model (reproduced correlations) • Observed correlations • Residual correlations • (Reproduced) – (Observed) • Should be close to zero if your model fits your data well
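A small NumPy sketch of the residual calculation, assuming an orthogonal solution so that the reproduced correlations are the loading matrix times its transpose. The function name, the loadings, and the observed correlations are hypothetical. The subtraction follows the order given on the slide; some software reports observed minus reproduced instead, which only flips the sign.

```python
import numpy as np


def residual_correlations(observed, loadings):
    """Reproduced minus observed correlations, off-diagonal elements only."""
    observed = np.asarray(observed, dtype=float)
    loadings = np.asarray(loadings, dtype=float)
    reproduced = loadings @ loadings.T   # valid for uncorrelated factors
    residuals = reproduced - observed
    np.fill_diagonal(residuals, 0.0)     # diagonal reflects communalities, not fit
    return residuals


# Hypothetical one-factor loadings and observed correlations for three items.
loadings = [[0.8], [0.7], [0.6]]
observed = [
    [1.00, 0.58, 0.45],
    [0.58, 1.00, 0.42],
    [0.45, 0.42, 1.00],
]
residuals = residual_correlations(observed, loadings)
print(np.round(residuals, 2))
print(round(float(np.abs(residuals).max()), 2))  # largest absolute residual
```

Scanning for the largest absolute residual is a quick way to spot item pairs the model reproduces poorly.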

Sample Size for Factor Analysis • EFA is a large sample procedure • Old rule of thumb: Ratio of cases to items should be at least 10:1 • FA is still prone to error at ratio of 20:1 (Costello & Osborne, 2005)

Sample Size Cont’d • If you have “strong” data, you may get by with a smaller sample size • How to define strong data? • High communalities (> .80, > .40) • No cross-loadings (a cross-loading is λ ≥ .32 on ≥ 2 factors) • Several items loading on each factor (not < 3 items; preferably > 5 items) (Costello & Osborne, 2005)
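The "strong data" checklist can be sketched as a simple screening function. Everything here is illustrative: the function name and example loadings are hypothetical, the .40 communality cutoff is one reading of the thresholds on the slide, and each item is counted toward the factor on which it loads most strongly.

```python
def strong_data_report(loadings, communalities,
                       min_communality=0.40, cross_cutoff=0.32, min_items=3):
    """Flag items and factors that fall short of the 'strong data' criteria.

    loadings: one row per item, one loading per factor.
    communalities: one value per item.
    """
    n_factors = len(loadings[0])
    low_communality = [i for i, h2 in enumerate(communalities)
                       if h2 < min_communality]
    # Cross-loading: at least cross_cutoff in absolute value on 2+ factors.
    cross_loading = [i for i, row in enumerate(loadings)
                     if sum(abs(value) >= cross_cutoff for value in row) >= 2]
    items_per_factor = [0] * n_factors
    for row in loadings:
        strongest = max(range(n_factors), key=lambda j: abs(row[j]))
        items_per_factor[strongest] += 1
    thin_factors = [j for j, count in enumerate(items_per_factor)
                    if count < min_items]
    return {
        "low_communality_items": low_communality,
        "cross_loading_items": cross_loading,
        "thin_factors": thin_factors,
    }


# Hypothetical rotated loadings for six items on two orthogonal factors.
loadings = [
    [0.75, 0.10],
    [0.68, 0.05],
    [0.71, 0.40],  # loads on both factors
    [0.12, 0.80],
    [0.08, 0.66],
    [0.20, 0.30],  # weak on both factors
]
communalities = [sum(value ** 2 for value in row) for row in loadings]
report = strong_data_report(loadings, communalities)
print(report)
```

Item indices in the report are zero-based, so index 2 is the third item and index 5 is the sixth.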

Naming Your Factors • We often assign a “meaningful” label to each factor • e.g., Danger Invulnerability • Beware the naming fallacy! • Just because a factor is named does not mean that the hypothetical construct is understood or even correctly labeled • Beware reification! • The belief that a hypothetical construct must correspond to a real thing (Kline, 2005)

Extensions • Confirmatory Factor Analysis (CFA) • Perhaps you have done an EFA, and now you want to replicate the results of your EFA in a new sample • Structural Equation Modeling (SEM) • You can look at predictive relationships among latent constructs

Confirmatory Factor Analysis [Diagram: two latent factors with their measured indicators: Nicotine Dependence (MNWS, HONC, FTND) and Depression (Hamilton, CES-D, BDI-II)]

Structural Equation Modeling [Diagram: the same two latent factors and indicators (MNWS, HONC, FTND; Hamilton, CES-D, BDI-II), with a path marked “?” between Nicotine Dependence and Depression]

In Summary… • Definition and purpose of factor analysis (example) • Types of factor analysis • Considerations when conducting an Exploratory Factor Analysis (EFA) • Beyond EFA (CFA/SEM)
