Factor Analysis: What It Is, EFA vs. CFA, and How to Interpret Results

Learn what factor analysis is, the difference between exploratory and confirmatory approaches, and how to interpret factor loadings in research.

What Is Factor Analysis?

Factor analysis is a statistical method that reduces a large number of observed variables into a smaller set of underlying dimensions called factors. It works on the premise that groups of correlated variables share a common latent construct that isn't directly measured. For example, if survey respondents who rate a brand highly on "reliable" also tend to rate it highly on "trustworthy" and "dependable," factor analysis identifies these three items as indicators of a single underlying factor you might label "Brand Trust." Instead of analyzing 25 individual survey questions, you end up working with 4-6 interpretable factors that explain most of the variation in responses.

Why Factor Analysis Matters

Survey instruments often include dozens of items, and analyzing them individually creates noise and redundancy. Factor analysis solves this by revealing the structure underneath your data: which questions are really measuring the same thing and how many distinct constructs your survey actually captures. It's also essential for scale development and validation: if you're building a customer satisfaction instrument, factor analysis confirms whether your items group into the dimensions you intended.

How Factor Analysis Works

Exploratory Factor Analysis (EFA)

EFA is used when you don't have a strong hypothesis about the underlying structure. You let the data reveal how variables cluster together. The process involves:

  1. Calculate a correlation matrix: measure how every variable relates to every other variable
  2. Extract factors: identify the underlying dimensions that account for the correlations (common methods: principal axis factoring, maximum likelihood)
  3. Determine the number of factors: use eigenvalues > 1 (Kaiser criterion), the scree plot, or parallel analysis
  4. Rotate the solution: make factors easier to interpret (common rotations: varimax for uncorrelated factors, oblimin for correlated factors)
  5. Interpret and label factors: examine which variables load on each factor and assign meaningful names
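
The first three steps can be sketched with NumPy alone. This is a minimal illustration on synthetic data, not a full EFA: it builds the correlation matrix, takes its eigenvalues, and counts how many exceed 1. The data-generating setup (500 respondents, six items, two latent factors, loadings of 0.8) is an assumption made purely for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 500 respondents, 6 items driven by 2 latent factors.
n_respondents = 500
latent = rng.normal(size=(n_respondents, 2))
pattern = np.array([
    [0.8, 0.0], [0.8, 0.0], [0.8, 0.0],   # items 1-3 load on factor 1
    [0.0, 0.8], [0.0, 0.8], [0.0, 0.8],   # items 4-6 load on factor 2
])
unique = rng.normal(size=(n_respondents, 6)) * 0.6  # item-specific noise
items = latent @ pattern.T + unique

# Step 1: correlation matrix (columns are variables).
corr = np.corrcoef(items, rowvar=False)

# Steps 2-3 (simplified): eigenvalues of the correlation matrix, largest first.
eigenvalues = np.linalg.eigvalsh(corr)[::-1]

# Kaiser criterion: count eigenvalues greater than 1.
n_kaiser = int((eigenvalues > 1.0).sum())

print("Eigenvalues:", eigenvalues.round(2))
print("Factors suggested by the Kaiser criterion:", n_kaiser)
```

Proper extraction (principal axis factoring or maximum likelihood) and rotation would normally be handled by a dedicated statistics package; the sketch only shows where the eigenvalues used in step 3 come from.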

Confirmatory Factor Analysis (CFA)

CFA tests whether a pre-specified factor structure fits the data. You hypothesize which items belong to which factors and then evaluate model fit. CFA is used when you're validating an existing instrument or confirming a structure found through EFA in a new sample.

Key fit indices for CFA:

  • CFI (Comparative Fit Index): > 0.90 acceptable, > 0.95 good
  • RMSEA (Root Mean Square Error of Approximation): < 0.08 acceptable, < 0.06 good
  • SRMR (Standardized Root Mean Square Residual): < 0.08 good
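
If you are screening several CFA models, the cutoffs above are easy to encode. The helper below is a hypothetical convenience function, not part of any CFA package, and the "poor" label for values outside the listed ranges is an assumption added for the example.

```python
def rate_fit(cfi: float, rmsea: float, srmr: float) -> dict:
    """Label each fit index using the rule-of-thumb cutoffs listed above."""
    if cfi > 0.95:
        cfi_label = "good"
    elif cfi > 0.90:
        cfi_label = "acceptable"
    else:
        cfi_label = "poor"  # assumed label for values at or below 0.90

    if rmsea < 0.06:
        rmsea_label = "good"
    elif rmsea < 0.08:
        rmsea_label = "acceptable"
    else:
        rmsea_label = "poor"  # assumed label for values at or above 0.08

    # Only one cutoff is listed for SRMR.
    srmr_label = "good" if srmr < 0.08 else "poor"

    return {"CFI": cfi_label, "RMSEA": rmsea_label, "SRMR": srmr_label}


# Hypothetical fit statistics from two candidate models.
print(rate_fit(cfi=0.96, rmsea=0.05, srmr=0.04))
print(rate_fit(cfi=0.92, rmsea=0.07, srmr=0.09))
```

Treat the labels as a screening aid: cutoffs like these are conventions, and fit should be judged alongside theory and the size of the loadings.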

EFA vs. CFA

|             | EFA                                  | CFA                              |
|-------------|--------------------------------------|----------------------------------|
| Purpose     | Discover structure                   | Confirm structure                |
| When to use | New instruments, exploratory research | Validating scales, testing theory |
| Hypothesis  | None, data-driven                    | Pre-specified model              |
| Software    | SPSS, R, Python                      | AMOS, Mplus, R (lavaan)          |
| Output      | Factor loadings, eigenvalues         | Fit indices, path coefficients   |

Worked Example

You surveyed 500 consumers on 12 brand perception attributes for a fast-food chain. Running EFA reveals three factors:

Factor 1, "Quality" (eigenvalue = 4.2, 35% of variance)

  • Fresh ingredients: loading = 0.82
  • Taste quality: loading = 0.79
  • Food presentation: loading = 0.71
  • Menu variety: loading = 0.65

Factor 2, "Convenience" (eigenvalue = 2.1, 17.5% of variance)

  • Speed of service: loading = 0.85
  • Location accessibility: loading = 0.78
  • Mobile ordering: loading = 0.72

Factor 3, "Value" (eigenvalue = 1.5, 12.5% of variance)

  • Price fairness: loading = 0.81
  • Portion size: loading = 0.74
  • Deals and promotions: loading = 0.69

These three factors together explain 65% of the total variance. Instead of comparing brands on 12 individual attributes, you can now compare them on three meaningful dimensions. Items with loadings below 0.40 on all factors would be candidates for removal.
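
The variance percentages follow directly from the eigenvalues. Assuming the eigenvalues come from a correlation matrix of the 12 items, each standardized item contributes one unit of variance, so a factor's share is its eigenvalue divided by 12. A quick check of the arithmetic:

```python
n_items = 12  # attributes in the survey
eigenvalues = {"Quality": 4.2, "Convenience": 2.1, "Value": 1.5}

# Share of total variance = eigenvalue / number of standardized items.
share = {name: value / n_items for name, value in eigenvalues.items()}
total = sum(share.values())

for name, proportion in share.items():
    print(f"{name}: {proportion:.1%}")
print(f"Total: {total:.1%}")
# Quality: 35.0%
# Convenience: 17.5%
# Value: 12.5%
# Total: 65.0%
```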

Key Decisions in Factor Analysis

How many factors to retain? The Kaiser criterion (eigenvalues > 1) often over-extracts. Parallel analysis is more reliable: it compares your eigenvalues to those obtained from random data with the same number of respondents and variables. Retain only the factors whose eigenvalues exceed the random baseline.
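
Parallel analysis is simple enough to sketch in NumPy. The version below is one common variant, assumed here for illustration: it uses eigenvalues of the full correlation matrix and the 95th percentile of eigenvalues from random normal data as the baseline. The function name, the number of simulations, and the synthetic two-factor dataset are all choices made for the example.

```python
import numpy as np


def parallel_analysis(data, n_sims=200, percentile=95, seed=0):
    """Count leading eigenvalues that exceed a random-data baseline."""
    rng = np.random.default_rng(seed)
    n, p = data.shape

    # Eigenvalues of the observed correlation matrix, largest first.
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]

    # Eigenvalues from uncorrelated random data of the same shape.
    random_eigs = np.empty((n_sims, p))
    for i in range(n_sims):
        noise = rng.normal(size=(n, p))
        random_eigs[i] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]
    baseline = np.percentile(random_eigs, percentile, axis=0)

    # Retain factors until the first eigenvalue that fails to beat the baseline.
    n_factors = 0
    for obs, base in zip(observed, baseline):
        if obs <= base:
            break
        n_factors += 1
    return n_factors, observed, baseline


# Hypothetical data: 500 respondents, 6 items, 2 underlying factors.
rng = np.random.default_rng(1)
latent = rng.normal(size=(500, 2))
pattern = np.array([[0.8, 0.0]] * 3 + [[0.0, 0.8]] * 3)
items = latent @ pattern.T + rng.normal(size=(500, 6)) * 0.6

n_factors, observed, baseline = parallel_analysis(items)
print("Observed eigenvalues:", observed.round(2))
print("Random baseline:     ", baseline.round(2))
print("Factors to retain:   ", n_factors)
```

Because the baseline depends on sample size and the number of variables, it has to be recomputed for each dataset rather than reused.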

Which rotation? Use varimax if you believe factors are independent. Use oblimin or promax if you expect factors to correlate with each other, which is common in real data: "quality" and "value" perceptions, for example, often correlate.
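
To see what an orthogonal rotation does, here is a compact sketch of varimax using the widely circulated SVD-based iteration. It is an illustration rather than a reference implementation, and the loading matrix is hypothetical: a clean two-factor structure that has been deliberately rotated by 30 degrees so that every item loads on both factors.

```python
import numpy as np


def varimax(loadings, max_iter=200, tol=1e-8):
    """Orthogonally rotate a loading matrix to increase the varimax criterion."""
    p, k = loadings.shape
    rotation = np.eye(k)
    objective = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        col_ss = (rotated ** 2).sum(axis=0)  # column sums of squared loadings
        gradient = loadings.T @ (rotated ** 3 - rotated * col_ss / p)
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt  # closest orthogonal matrix to the gradient
        new_objective = s.sum()
        if objective != 0.0 and new_objective < objective * (1 + tol):
            break
        objective = new_objective
    return loadings @ rotation


def varimax_criterion(loadings):
    """Sum over factors of the variance of squared loadings."""
    return (loadings ** 2).var(axis=0).sum()


# Hypothetical simple structure, then blurred by a 30-degree rotation.
simple = np.array([[0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
                   [0.0, 0.8], [0.0, 0.7], [0.0, 0.6]])
theta = np.deg2rad(30)
blur = np.array([[np.cos(theta), -np.sin(theta)],
                 [np.sin(theta), np.cos(theta)]])
unrotated = simple @ blur

rotated = varimax(unrotated)
print(np.abs(rotated).round(2))
```

The rotation changes how variance is distributed across factors but leaves each item's communality (its row sum of squared loadings) unchanged. Oblique rotations such as oblimin relax the orthogonality constraint and are not covered by this sketch.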

What's a good loading? Generally, 0.40 or higher is the minimum threshold. Loadings above 0.70 are strong. Items that load above 0.40 on two or more factors (cross-loadings) are problematic and may need to be removed or rewritten.
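
These thresholds translate directly into a screening rule. The function below is a hypothetical helper, and the item names and loadings are invented for illustration; it flags items with no loading of at least 0.40, items with two or more such loadings, and items whose single salient loading is above 0.70.

```python
def review_items(loadings, min_loading=0.40, strong=0.70):
    """Classify each item from its row of factor loadings."""
    report = {}
    for item, row in loadings.items():
        # Loadings are compared by absolute value; sign only shows direction.
        salient = [abs(value) for value in row if abs(value) >= min_loading]
        if not salient:
            report[item] = "low loading"
        elif len(salient) > 1:
            report[item] = "cross-loading"
        elif salient[0] > strong:
            report[item] = "strong"
        else:
            report[item] = "acceptable"
    return report


# Hypothetical rotated loadings on three factors.
example = {
    "item_a": [0.82, 0.10, 0.05],
    "item_b": [0.55, 0.48, 0.02],
    "item_c": [0.30, 0.22, 0.15],
    "item_d": [0.10, 0.65, 0.20],
}
for item, status in review_items(example).items():
    print(item, "->", status)
```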

When to Use Factor Analysis

  • Scale development to confirm that your survey items measure the constructs you intend
  • Data reduction to collapse many variables into a manageable number of composite scores for further analysis
  • Brand perception mapping to identify the key dimensions along which consumers evaluate brands
  • Segmentation preprocessing to supply factor scores as inputs for cluster analysis
  • Questionnaire refinement to identify redundant or poorly performing items

Common Mistakes to Avoid

  • Running factor analysis with too few observations: aim for at least 5-10 respondents per variable, with a minimum of 200 total; below this, factor solutions are unstable
  • Labeling factors based on one or two items: a factor defined by fewer than three items is weak and may not replicate; strong factors have 3+ items with loadings above 0.50
  • Using principal component analysis interchangeably with factor analysis: PCA extracts components that maximize total variance, while factor analysis models shared variance; they answer different questions and can produce different results

How Quali-Fi Supports Factor Analysis

Quali-Fi's Intelligence tier includes an exploratory factor analysis module that calculates eigenvalues, generates scree plots, and produces rotated factor loading tables directly from your survey data. The platform flags cross-loadings and low-loading items, making it easy to refine your instrument without switching to external statistical software.

Frequently Asked Questions

How many variables do I need for factor analysis?

You generally need at least 3 variables per expected factor, and a total of at least 10-12 variables. More importantly, you need an adequate sample size: a common guideline is a minimum of 200 observations, or 5-10 observations per variable, whichever is larger.
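
The sample-size guideline reduces to taking the larger of two numbers. A minimal sketch, with a hypothetical function name and the stricter 10-per-variable ratio as the default:

```python
def minimum_sample_size(n_variables, per_variable=10, floor=200):
    """Return the larger of the overall floor and the per-variable rule."""
    return max(floor, per_variable * n_variables)


print(minimum_sample_size(12))                   # 200 (the floor applies)
print(minimum_sample_size(25))                   # 250 (10 per variable)
print(minimum_sample_size(25, per_variable=5))   # 200 (floor exceeds 125)
```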

Can I run factor analysis on ordinal data (like Likert scales)?

Technically, factor analysis assumes continuous data. In practice, Likert scales with 5 or more points are routinely analyzed with factor analysis, and the results are generally reliable. For scales with fewer categories (binary or 3-point), consider polychoric correlations (tetrachoric for binary items) instead of Pearson correlations as inputs.

What's the difference between factor analysis and principal component analysis?

Factor analysis models the shared variance among variables to identify latent constructs. PCA models total variance to create composites that maximize variance explained. For the same number of dimensions, PCA will typically account for more of the total variance, but factor analysis is better for understanding underlying constructs. If your goal is theoretical (what's driving these responses?), use factor analysis. If your goal is data reduction (give me fewer variables), PCA may suffice.
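
One way to see the difference is to look at the matrix each method decomposes. PCA works on the correlation matrix with 1s on the diagonal (all of each variable's variance). Principal axis factoring, one common factor-analysis method, replaces the diagonal with communality estimates; squared multiple correlations are a usual starting value. The sketch below uses a hypothetical 3-variable correlation matrix and shows only this first step, not the full iterative procedure.

```python
import numpy as np

# Hypothetical correlation matrix for three survey items.
corr = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])

# PCA: eigenvalues of the full matrix; they sum to the number of variables.
pca_eigs = np.linalg.eigvalsh(corr)[::-1]

# Squared multiple correlation of each item with the others:
# SMC_i = 1 - 1 / (i-th diagonal element of the inverse correlation matrix).
smc = 1.0 - 1.0 / np.diag(np.linalg.inv(corr))

# Factor analysis (first step of principal axis factoring):
# put the shared-variance estimates on the diagonal instead of 1s.
reduced = corr.copy()
np.fill_diagonal(reduced, smc)
fa_eigs = np.linalg.eigvalsh(reduced)[::-1]

print("Variance analyzed by PCA:", pca_eigs.sum().round(3))
print("Shared variance analyzed by factor analysis:", fa_eigs.sum().round(3))
```

The second total is smaller because each item's unique variance has been set aside, which is why the two methods can give different loadings on the same data.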
