Latent Class Analysis: What It Is and How to Use It

Learn what latent class analysis is, how it identifies hidden segments in survey data, and when to use LCA for market research segmentation.

What Is Latent Class Analysis?

Latent class analysis (LCA) is a statistical method that identifies unobserved subgroups within a population based on patterns of responses across multiple observed variables. Unlike distance-based cluster analysis such as k-means, which groups cases using distance measures on continuous variables, LCA works with categorical data and uses probability-based classification. Each respondent receives a probability of belonging to each latent class rather than a hard assignment to a single group. The technique was formalized by Lazarsfeld and Henry in 1968 and has since become a standard tool in market research segmentation, behavioral science, and health research. When you suspect your survey respondents aren't one homogeneous group but you can't see the subgroups directly, LCA gives you a model-based way to estimate them.

Why Latent Class Analysis Matters

Traditional segmentation approaches often rely on demographics or a single behavior variable, which misses the reality that people form groups based on combinations of attitudes, preferences, and behaviors simultaneously. LCA captures these multivariate patterns and reveals segments that demographic cuts alone would miss entirely. A Yankelovich study found that attitudinal segments identified through latent class methods predicted brand choice 2-3x better than demographic segments for consumer packaged goods.

How Latent Class Analysis Works

From Observed Responses to Hidden Groups

Suppose you run a survey asking 1,000 consumers about their attitudes toward organic food, price sensitivity, brand loyalty, shopping frequency, and preferred retail channels. Each respondent answers five categorical questions. LCA examines the joint distribution of all responses together and finds that certain answer patterns cluster into distinct profiles. One group might combine high organic preference, low price sensitivity, and specialty-store shopping. Another might show moderate organic interest, high price sensitivity, and supermarket-only shopping. These profiles are the latent classes.
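For readers who want the model behind this description, the standard latent class model can be written as below. The symbols are notation introduced here for illustration, not taken from the article: C is the number of classes, J the number of survey items, K_j the number of response categories for item j, \pi_c the share of respondents in class c, and \rho_{jk|c} the probability that a member of class c picks category k on item j. The formula assumes local independence, meaning that answers are unrelated to each other once class membership is known.

```latex
P(\mathbf{y}_i) \;=\; \sum_{c=1}^{C} \pi_c \prod_{j=1}^{J} \prod_{k=1}^{K_j} \rho_{jk|c}^{\,\mathbb{1}[y_{ij}=k]},
\qquad \sum_{c=1}^{C} \pi_c = 1,
\qquad \sum_{k=1}^{K_j} \rho_{jk|c} = 1 \ \text{for every } j, c.
```

Fitting the model means choosing the \pi_c and \rho_{jk|c} values that make the observed answer patterns most likely, which software typically does with an iterative algorithm such as expectation-maximization.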

Choosing the Number of Classes

You don't specify the segments in advance; you fit models with different numbers of classes (typically 2 through 7) and compare fit statistics to find the best solution. The key metrics are the Bayesian Information Criterion (BIC), where lower values indicate a better balance between fit and model complexity, and entropy, which measures how cleanly respondents are classified on a scale from 0 to 1. An entropy value above 0.80 is commonly read as good separation between classes. You also look at whether each class is substantively interpretable and large enough to be actionable. A technically optimal 6-class solution where one class contains 3% of respondents rarely works for marketing decisions.
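As a rough illustration of how these two statistics are computed, here is a minimal Python sketch. It assumes you already have each fitted model's log-likelihood, parameter count, and matrix of membership probabilities from your LCA software. The function names and the log-likelihood values in the `fits` dictionary are hypothetical, invented only to show the comparison.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: -2 * logL + k * ln(N). Lower is better."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def relative_entropy(posteriors):
    """Relative entropy of a classification, between 0 and 1.

    posteriors[i][c] is respondent i's probability of belonging to class c.
    1 means every respondent is assigned with certainty; 0 means the
    probabilities are spread evenly across classes.
    """
    n_obs = len(posteriors)
    n_classes = len(posteriors[0])
    uncertainty = 0.0
    for row in posteriors:
        for p in row:
            if p > 0:  # treat 0 * ln(0) as 0
                uncertainty -= p * math.log(p)
    return 1.0 - uncertainty / (n_obs * math.log(n_classes))


# Hypothetical fits for 2 to 5 classes: (log-likelihood, number of parameters).
# The parameter counts assume five items with three categories each.
fits = {2: (-6400.0, 21), 3: (-6250.0, 32), 4: (-6180.0, 43), 5: (-6165.0, 54)}
n_obs = 1000

bics = {n_classes: bic(ll, k, n_obs) for n_classes, (ll, k) in fits.items()}
best = min(bics, key=bics.get)  # class count with the lowest BIC
```

In this made-up comparison the 5-class model has the highest log-likelihood, but its extra parameters are penalized enough that the 4-class model ends up with the lowest BIC.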

Interpreting the Output

LCA produces two key outputs. First, class membership probabilities (often called posterior probabilities) tell you how likely each respondent is to belong to each class, given that respondent's answers. Second, item-response probabilities show the likelihood of each response category within each class. If members of Class 2 have an 85% probability of answering "very price sensitive" and a 70% probability of preferring online shopping, you've identified a price-conscious e-commerce segment. You can then profile each class against demographics, product usage, or media consumption to build actionable personas.
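The two outputs are linked by Bayes' rule: a respondent's membership probabilities follow from the class sizes and the item-response probabilities. The sketch below shows that calculation for a two-class, two-item case. Only the 85% and 70% figures for Class 2 come from the example above; the equal class sizes and the Class 1 probabilities are assumptions made up for illustration, and the function name is hypothetical.

```python
def posterior_membership(responses, class_sizes, item_probs):
    """Return P(class | responses) for one respondent via Bayes' rule.

    responses:   chosen category index for each item
    class_sizes: prior proportion of each class
    item_probs:  item_probs[c][j][k] = P(item j = category k | class c)
    """
    joint = []
    for c, prior in enumerate(class_sizes):
        likelihood = prior
        for j, k in enumerate(responses):
            likelihood *= item_probs[c][j][k]  # local independence assumption
        joint.append(likelihood)
    total = sum(joint)
    return [value / total for value in joint]


# Item 0: price sensitivity (0 = not very sensitive, 1 = very price sensitive)
# Item 1: preferred channel (0 = in-store, 1 = online)
class_sizes = [0.5, 0.5]                # assumed equal class sizes
item_probs = [
    [[0.75, 0.25], [0.60, 0.40]],       # Class 1: assumed values
    [[0.15, 0.85], [0.30, 0.70]],       # Class 2: 85% and 70% from the text
]

# A respondent who is very price sensitive and prefers online shopping
probs = posterior_membership([1, 1], class_sizes, item_probs)
```

Under these assumed numbers the respondent's probability of belonging to Class 2 is 0.2975 / 0.3475, or roughly 0.86, so the assignment is likely but not certain.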

A Worked Example

A meal-kit delivery service surveyed 2,500 subscribers on cooking frequency, dietary restrictions, ordering motivation (convenience vs. cooking enjoyment), price tier preference, and ingredient flexibility. A 4-class LCA solution emerged with strong fit (BIC = 12,450; entropy = 0.84). Class 1 (32%) were "convenience seekers" who rarely cooked otherwise and chose the cheapest tier. Class 2 (24%) were "cooking enthusiasts" who ordered premium tiers and wanted exotic ingredients. Class 3 (28%) were "health-focused planners" who filtered by dietary restrictions and meal-prepped. Class 4 (16%) were "occasional treaters" who ordered sporadically for weekend meals. The company redesigned its email campaigns to target each segment with different messaging and saw a 22% increase in reorder rates.

LCA vs. K-Means Clustering

Both methods find groups, but they differ in important ways. K-means works with continuous variables and assigns each case to exactly one cluster based on distance from centroids. LCA works with categorical variables and assigns probabilistic membership. LCA also provides formal statistical criteria for selecting the number of groups, while K-means relies on heuristics like the elbow method. For survey data with Likert scales or multiple-choice responses, LCA is usually the better choice because it respects the categorical nature of the data.

When to Use Latent Class Analysis

Market segmentation studies where you want to identify attitude-based or behavior-based segments from survey data
Customer typology research grouping users by their combined product usage, preferences, and needs
Health behavior research classifying patients by combinations of risk factors, adherence behaviors, and treatment preferences
Conjoint analysis extensions using latent class conjoint to discover preference-based segments with different utility structures
Any categorical survey dataset where you suspect hidden subgroups drive different response patterns

Common Mistakes

Selecting the number of classes based only on fit statistics without checking whether each class is substantively meaningful and large enough to act on
Treating class assignments as certain when many respondents' highest membership probability is below 0.70, which is a sign that the classes aren't well separated
Applying standard LCA to continuous data without discretizing it first; the standard LCA model assumes categorical indicators, so use latent profile analysis for continuous variables instead
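The second mistake is easy to check once you have the membership probabilities. The following is a minimal sketch, assuming a list of per-respondent probability rows exported from your LCA software; the function names and the sample rows are hypothetical, and the 0.70 cutoff is the one mentioned above.

```python
def modal_assignment(posteriors, threshold=0.70):
    """For each respondent, return (most likely class, its probability, uncertain?)."""
    results = []
    for row in posteriors:
        best = max(range(len(row)), key=lambda c: row[c])
        results.append((best, row[best], row[best] < threshold))
    return results


def share_uncertain(posteriors, threshold=0.70):
    """Fraction of respondents whose highest membership probability is below the threshold."""
    flagged = sum(1 for _, _, uncertain in modal_assignment(posteriors, threshold) if uncertain)
    return flagged / len(posteriors)


# Hypothetical membership probabilities for four respondents, two classes
sample = [[0.90, 0.10], [0.55, 0.45], [0.20, 0.80], [0.40, 0.60]]
uncertain_share = share_uncertain(sample)
```

If a large share of respondents is flagged, it is usually safer to report the segments with their uncertainty, or to revisit the number of classes, than to treat the modal assignments as fixed labels.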

How Quali-Fi Supports Latent Class Analysis

Quali-Fi's Research plan includes built-in cross-tabulation and segmentation tools that help you identify preliminary patterns before running LCA in dedicated statistical software. The platform exports clean, labeled datasets in SPSS and CSV formats with variable metadata intact, which saves significant data-prep time when moving to LCA analysis.

Frequently Asked Questions

How large a sample do I need for latent class analysis?

Most researchers recommend a minimum of 300-500 respondents for stable LCA results, though the exact requirement depends on the number of indicators and the number of classes you're testing. Models with more indicators and more classes need larger samples. A rough guideline is at least 50 cases per estimated parameter.
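To apply a per-parameter guideline you first need the parameter count. For an unrestricted LCA with categorical indicators, that is (classes - 1) class-size parameters plus, for each class, (categories - 1) parameters per item. The sketch below does the arithmetic; the function names are hypothetical, and the example of five items with three response categories each is an assumption chosen for illustration.

```python
def lca_param_count(n_classes, categories_per_item):
    """Number of free parameters in an unrestricted latent class model."""
    class_size_params = n_classes - 1
    item_params = n_classes * sum(k - 1 for k in categories_per_item)
    return class_size_params + item_params


def guideline_sample_size(n_classes, categories_per_item, cases_per_param=50):
    """Sample size implied by a cases-per-parameter rule of thumb."""
    return cases_per_param * lca_param_count(n_classes, categories_per_item)


# Assumed survey: five items, three response categories each
items = [3, 3, 3, 3, 3]
params_4_class = lca_param_count(4, items)        # 3 + 4 * 10 = 43
needed_4_class = guideline_sample_size(4, items)  # 43 * 50 = 2150
```

For models with many classes or many response categories, the per-parameter rule implies a sample well above the 300-500 floor, which is one reason to keep indicators and categories to what the analysis really needs.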

Can LCA handle ordinal data like Likert scales?

Standard LCA treats variables as nominal (unordered categories). For ordinal data, you can either collapse Likert responses into fewer categories or use an ordinal LCA variant that respects the ordered structure. Software such as Mplus and Latent GOLD supports ordinal indicators; R's poLCA package handles multi-category indicators but treats them as nominal.

How is latent class analysis different from factor analysis?

Factor analysis identifies latent continuous dimensions underlying observed variables. LCA identifies latent categorical groups. Factor analysis tells you "these items measure the same construct." LCA tells you "these respondents form distinct subpopulations." They answer fundamentally different questions, and some studies use both.

Build segmentation-ready surveys -- try Quali-Fi free for 14 days.

Related Guides

Factor Analysis Applied to Survey Data: Walkthrough

Cross-Tabulation Analysis: Applied Walkthrough

Conjoint Analysis: Complete Guide for Researchers

Brand Tracking Data Analysis: Applied Guide

Sample Size Formula: Detailed Walkthrough With Examples

Ready to apply this in your research?
