Discriminant Analysis Explained

Learn what discriminant analysis is, how it classifies cases into groups based on predictor variables, and how it's used in segmentation and research.

What Is Discriminant Analysis?

Discriminant analysis is a statistical technique that identifies which variables best distinguish between two or more predefined groups and uses those variables to classify new cases into the correct group. In market research, it's most commonly used as a follow-up to cluster analysis: after you've identified customer segments, discriminant analysis tells you which variables most differentiate those segments and builds a classification function that assigns new customers to segments automatically. It answers two questions at once: what makes these groups different, and how can I predict group membership for someone I haven't measured yet? That combination of description and prediction makes it one of the most practically useful multivariate techniques in applied research.

Why Discriminant Analysis Matters

Segmentation loses most of its value if you can't classify new people into segments. The cluster analysis that identified your segments used a large set of survey variables, but you can't give every new customer a 30-minute survey. Discriminant analysis identifies the small subset of variables that do most of the differentiating, which makes it possible to build typing tools: short classifiers that assign individuals to segments using just 5-8 questions. It's the bridge between a one-time analytical insight and an ongoing operational capability.

How Discriminant Analysis Works

The Core Logic

Discriminant analysis finds linear combinations of predictor variables (discriminant functions) that maximize the separation between groups. Think of it as analysis of variance run in reverse: instead of asking whether predefined groups differ on a continuous outcome, you're predicting group membership from continuous (or at least ordered) variables.

For two groups, there's one discriminant function. For k groups, there are at most k-1 functions (or as many functions as there are predictors, if that number is smaller). Each function is a weighted combination of the predictor variables, and the weights tell you which variables contribute most to separating the groups.

The Mathematical Process

  1. Calculate group means on all predictor variables: how does each group score on each variable?
  2. Compute the between-group and within-group variance matrices: how much do groups differ versus how much do individuals within groups vary?
  3. Find the discriminant functions: linear combinations that maximize the ratio of between-group to within-group variance.
  4. Calculate discriminant scores for each case: apply the function weights to each individual's variable scores.
  5. Classify cases: assign each case to the group whose centroid (mean discriminant score) it's closest to.
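
For the two-group case, the five steps can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not production code: the data are made up, there are only two predictors (so the 2x2 matrix inverse can be written out by hand), and all names are hypothetical. With exactly two groups, maximizing the between-to-within ratio has a closed-form answer: the weights are the inverse of the within-group matrix multiplied by the difference in group means, so the between-group matrix never has to be built explicitly.

```python
# Made-up scores on two predictors for two predefined groups.
group_a = [(2.0, 7.0), (3.0, 8.0), (2.5, 6.5), (3.5, 7.5)]
group_b = [(6.0, 3.0), (7.0, 4.0), (6.5, 2.5), (7.5, 3.5)]


def mean_vector(rows):
    """Mean of each of the two predictors."""
    return [sum(r[j] for r in rows) / len(rows) for j in range(2)]


def scatter(rows, centre):
    """2x2 matrix of sums of squares and cross-products around `centre`."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = (r[0] - centre[0], r[1] - centre[1])
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


# Step 1: group means.
mean_a, mean_b = mean_vector(group_a), mean_vector(group_b)

# Step 2: within-group matrix W (the two group scatter matrices added together).
s_a, s_b = scatter(group_a, mean_a), scatter(group_b, mean_b)
W = [[s_a[i][j] + s_b[i][j] for j in range(2)] for i in range(2)]

# Step 3: with two groups, the weights are W^-1 (mean_a - mean_b).
det = W[0][0] * W[1][1] - W[0][1] * W[1][0]
W_inv = [[W[1][1] / det, -W[0][1] / det],
         [-W[1][0] / det, W[0][0] / det]]
diff = [mean_a[0] - mean_b[0], mean_a[1] - mean_b[1]]
weights = [W_inv[i][0] * diff[0] + W_inv[i][1] * diff[1] for i in range(2)]


# Step 4: a discriminant score is the weighted sum of a case's predictor values.
def score(case):
    return weights[0] * case[0] + weights[1] * case[1]


# Step 5: assign to the group with the nearest centroid (equal group sizes assumed).
centroid_a, centroid_b = score(mean_a), score(mean_b)


def classify(case):
    s = score(case)
    return "A" if abs(s - centroid_a) <= abs(s - centroid_b) else "B"


print(weights)
print(classify((3.0, 7.0)), classify((7.0, 3.0)))
```

In real projects a statistics package handles more predictors, more groups, and unequal group sizes; the sketch is only meant to make the sequence of steps concrete.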

Key Outputs

Discriminant function coefficients: the standardized weights for each variable in the function. Larger absolute values indicate stronger contributions to group separation. These tell you which variables matter most for distinguishing between groups.

Structure matrix: correlations between each predictor and the discriminant functions. Often more stable and interpretable than the function coefficients, especially when predictors are correlated.
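
To see why the two outputs can tell different stories, here is a small sketch that derives both from the same fitted function. The inputs are assumed, made-up values for a two-group, two-predictor model (a within-group matrix and a set of raw weights), and the variable names are hypothetical. Standardized coefficients are computed by rescaling the function so its scores have a within-group variance of 1 and multiplying each weight by the predictor's within-group standard deviation; structure loadings are the within-group correlations between each predictor and the scores.

```python
import math

# Assumed inputs from an already-fitted two-group model (made-up example values).
W = [[2.5, 1.5], [1.5, 2.5]]   # within-group sums of squares and cross-products
raw_weights = [-4.0, 4.0]      # unstandardized discriminant function weights
n_cases, n_groups = 8, 2
df = n_cases - n_groups

# Pooled within-group covariance matrix.
S = [[W[i][j] / df for j in range(2)] for i in range(2)]

# Within-group covariance of each predictor with the discriminant scores (S w),
# and the within-group variance of the scores themselves (w' S w).
Sw = [S[i][0] * raw_weights[0] + S[i][1] * raw_weights[1] for i in range(2)]
score_var = raw_weights[0] * Sw[0] + raw_weights[1] * Sw[1]

# Standardized coefficients: unit-variance scores, weights scaled by predictor SD.
standardized = [raw_weights[j] / math.sqrt(score_var) * math.sqrt(S[j][j])
                for j in range(2)]

# Structure loadings: within-group correlation of each predictor with the scores.
structure = [Sw[j] / math.sqrt(S[j][j] * score_var) for j in range(2)]

print([round(c, 3) for c in standardized])
print([round(c, 3) for c in structure])
```

In this example the two predictors are positively correlated within groups, so the standardized coefficients (about 1.12 in absolute value) are much larger than the structure loadings (about 0.45). That gap is the kind of instability the structure matrix helps you spot.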

Classification accuracy: the percentage of cases correctly classified by the discriminant functions. Reported as a classification matrix (also called a confusion matrix) showing actual group membership against predicted group membership.
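
A classification matrix is simple to tabulate once you have actual and predicted group labels. The sketch below uses made-up labels for ten cases in three hypothetical segments; rows are actual groups and columns are predicted groups.

```python
from collections import Counter

# Made-up actual and predicted segment labels for ten cases.
actual = ["A", "A", "A", "B", "B", "B", "C", "C", "C", "C"]
predicted = ["A", "A", "B", "B", "B", "C", "C", "C", "C", "A"]

labels = sorted(set(actual))
counts = Counter(zip(actual, predicted))

# Rows = actual group, columns = predicted group.
matrix = [[counts[(a, p)] for p in labels] for a in labels]

# Overall accuracy = cases on the diagonal divided by all cases.
accuracy = sum(counts[(g, g)] for g in labels) / len(actual)

for label, row in zip(labels, matrix):
    print(label, row)
print("accuracy:", accuracy)
```

Reading across a row shows where a segment's members end up; here one actual "A" case is mistaken for "B", and the overall hit rate is 70%.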

Wilks' lambda: a measure of overall separation between groups. Values close to 0 indicate strong separation; values close to 1 indicate poor separation. It's also used to test whether the discriminant functions are statistically significant.
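
Wilks' lambda is the determinant of the within-group matrix divided by the determinant of the total matrix. A minimal sketch on made-up data for two groups and two predictors follows. The chi-square value uses Bartlett's approximation, -(N - 1 - (p + k) / 2) * ln(lambda) with p(k - 1) degrees of freedom; looking up the p-value is left to a statistics package.

```python
import math

# Made-up scores on two predictors for two predefined groups.
group_a = [(2.0, 7.0), (3.0, 8.0), (2.5, 6.5), (3.5, 7.5)]
group_b = [(6.0, 3.0), (7.0, 4.0), (6.5, 2.5), (7.5, 3.5)]


def scatter(rows):
    """2x2 sums of squares and cross-products around the rows' own mean."""
    n = len(rows)
    m = [sum(r[j] for r in rows) / n for j in range(2)]
    return [[sum((r[i] - m[i]) * (r[j] - m[j]) for r in rows)
             for j in range(2)] for i in range(2)]


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Within-group matrix: each group measured around its own mean, then summed.
s_a, s_b = scatter(group_a), scatter(group_b)
within = [[s_a[i][j] + s_b[i][j] for j in range(2)] for i in range(2)]

# Total matrix: every case measured around the grand mean.
total = scatter(group_a + group_b)

wilks_lambda = det2(within) / det2(total)

# Bartlett's chi-square approximation for testing overall significance.
n_cases, n_predictors, n_groups = len(group_a) + len(group_b), 2, 2
chi_square = -(n_cases - 1 - (n_predictors + n_groups) / 2) * math.log(wilks_lambda)
degrees_of_freedom = n_predictors * (n_groups - 1)

print(round(wilks_lambda, 4), round(chi_square, 2), degrees_of_freedom)
```

The groups in this toy data barely overlap, so lambda comes out close to 0 (about 0.015).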

Types of Discriminant Analysis

Linear discriminant analysis (LDA) assumes equal covariance matrices across groups and creates linear classification boundaries. It's the most common form and works well when the equal covariance assumption is approximately met.

Quadratic discriminant analysis (QDA) allows each group to have its own covariance structure, creating curved classification boundaries. Use it when groups differ in variance patterns, not just means.
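
The difference between the two is easiest to see when one group is tightly packed and the other is spread out. The sketch below scores a case against each group using the log of a normal density; the only thing that changes between LDA and QDA is whether the covariance matrix is pooled across groups or estimated per group. The data, group names, and the equal-priors assumption are all illustrative.

```python
import math

# Made-up data: a tightly packed group and a widely spread group.
groups = {
    "tight": [(-0.5, 0.0), (0.5, 0.0), (0.0, -0.5), (0.0, 0.5)],
    "wide": [(1.0, 0.0), (7.0, 0.0), (4.0, -3.0), (4.0, 3.0)],
}


def mean_vector(rows):
    return [sum(r[j] for r in rows) / len(rows) for j in range(2)]


def covariance(rows):
    """2x2 sample covariance matrix of one group."""
    m = mean_vector(rows)
    return [[sum((r[i] - m[i]) * (r[j] - m[j]) for r in rows) / (len(rows) - 1)
             for j in range(2)] for i in range(2)]


def gaussian_score(x, mean, cov, prior):
    """Log of prior times normal density, dropping constants shared by all groups."""
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    mahalanobis = (cov[1][1] * dx * dx - 2 * cov[0][1] * dx * dy
                   + cov[0][0] * dy * dy) / det
    return -0.5 * math.log(det) - 0.5 * mahalanobis + math.log(prior)


means = {g: mean_vector(rows) for g, rows in groups.items()}
covs = {g: covariance(rows) for g, rows in groups.items()}
n_total = sum(len(rows) for rows in groups.values())

# LDA: one pooled covariance matrix shared by every group.
pooled = [[sum((len(rows) - 1) * covs[g][i][j] for g, rows in groups.items())
           / (n_total - len(groups)) for j in range(2)] for i in range(2)]


def classify(x, quadratic):
    prior = 1.0 / len(groups)  # equal priors assumed

    def score(g):
        cov = covs[g] if quadratic else pooled
        return gaussian_score(x, means[g], cov, prior)

    return max(groups, key=score)


case = (-2.0, 0.0)
print("LDA:", classify(case, quadratic=False))
print("QDA:", classify(case, quadratic=True))
```

The case at (-2, 0) is closer to the "tight" group's mean, so LDA assigns it there. QDA assigns it to "wide" because a point that far out is implausible for a group with so little spread.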

Stepwise discriminant analysis adds variables one at a time, selecting those that contribute the most to group separation. It's useful for variable reduction, identifying the minimum set of variables needed for accurate classification. This is the basis for most typing tool development.
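
A simplified forward-selection loop shows the idea. At each step it adds whichever remaining variable gives the lowest Wilks' lambda (the strongest separation) in combination with the variables already chosen. This is a sketch, not a replica of any package's procedure: statistical software typically uses F-to-enter and F-to-remove thresholds rather than a fixed variable count, and the data and variable names here are made up.

```python
# Made-up data: each row is (spend, visits, age_band) for one customer.
NAMES = ["spend", "visits", "age_band"]
groups = {
    "X": [(1, 5, 5), (2, 6, 7), (1, 7, 5), (2, 6, 3)],
    "Y": [(8, 7, 5), (9, 8, 6), (8, 9, 5), (9, 8, 4)],
}


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [list(map(float, row)) for row in matrix]
    n, d = len(m), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[p][c]) < 1e-12:
            return 0.0
        if p != c:
            m[c], m[p] = m[p], m[c]
            d = -d
        d *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= f * m[c][k]
    return d


def scatter(rows, cols):
    """Sums of squares and cross-products for the chosen columns."""
    n = len(rows)
    means = {c: sum(r[c] for r in rows) / n for c in cols}
    return [[sum((r[a] - means[a]) * (r[b] - means[b]) for r in rows)
             for b in cols] for a in cols]


def wilks_lambda(cols):
    parts = [scatter(rows, cols) for rows in groups.values()]
    within = [[sum(p[i][j] for p in parts) for j in range(len(cols))]
              for i in range(len(cols))]
    everyone = [r for rows in groups.values() for r in rows]
    return det(within) / det(scatter(everyone, cols))


def forward_select(max_vars):
    selected = []
    while len(selected) < max_vars:
        remaining = [c for c in range(len(NAMES)) if c not in selected]
        best = min(remaining, key=lambda c: wilks_lambda(selected + [c]))
        selected.append(best)
    return [NAMES[c] for c in selected]


print(forward_select(2))
```

On this toy data the loop picks "spend" first and "visits" second; "age_band" has the same mean in both groups and adds nothing to the separation.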

Building a Typing Tool

The practical payoff of discriminant analysis in segmentation is the typing tool:

  1. Run stepwise discriminant analysis to identify the 5-8 variables that produce acceptable classification accuracy.
  2. Convert those variables into survey questions that can be fielded independently.
  3. Apply the discriminant function to new respondents' scores to assign them to segments.
  4. Validate classification accuracy on a holdout sample that wasn't used to develop the function.

A well-built typing tool correctly classifies 70-85% of cases, depending on how distinct the original segments are.
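
Once the function is fitted, the deployed typing tool is often little more than a lookup table of coefficients. One common form uses a classification function per segment: compute each segment's score from the respondent's answers and assign the respondent to the segment with the highest score. The sketch below assumes that form; the segment names, question names, and coefficient values are all hypothetical placeholders for what a fitted model would supply.

```python
# Hypothetical classification-function coefficients, one set per segment.
# In a real project these numbers would come from the fitted discriminant model.
TYPING_TOOL = {
    "Bargain hunters": {
        "constant": -10.0, "q_price": 3.0, "q_brand": 0.5, "q_convenience": 1.0},
    "Brand loyalists": {
        "constant": -12.0, "q_price": 0.5, "q_brand": 3.5, "q_convenience": 1.0},
    "Convenience seekers": {
        "constant": -11.0, "q_price": 1.0, "q_brand": 1.0, "q_convenience": 3.0},
}


def segment_scores(answers):
    """Score a respondent on every segment's classification function."""
    scores = {}
    for segment, coefs in TYPING_TOOL.items():
        total = coefs["constant"]
        for question, weight in coefs.items():
            if question != "constant":
                total += weight * answers[question]
        scores[segment] = total
    return scores


def assign_segment(answers):
    """Return the segment with the highest classification score."""
    scores = segment_scores(answers)
    return max(scores, key=scores.get)


# A new respondent's answers on 1-5 scales (made up).
new_respondent = {"q_price": 5, "q_brand": 2, "q_convenience": 3}
print(segment_scores(new_respondent))
print(assign_segment(new_respondent))
```

Because the tool is just arithmetic on a handful of answers, it can run inside a survey platform, a spreadsheet, or a CRM without the original modelling software.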

When to Use Discriminant Analysis

  • Segmentation follow-up: identifying which variables best differentiate customer segments identified through cluster analysis.
  • Typing tool development: building short classification instruments that assign new individuals to existing segments.
  • Group comparison: understanding what distinguishes promoters from detractors, churners from retained customers, or buyers from non-buyers.
  • Predictive classification: assigning new cases to predefined groups based on measured characteristics.
  • Variable importance assessment: determining which survey items contribute most to group separation.

Common Mistakes to Avoid

  • Using discriminant analysis to find groups: it doesn't discover segments; it distinguishes between groups you've already defined. Use cluster analysis or latent class analysis for discovery, then discriminant analysis for classification and profiling.
  • Evaluating accuracy on the training data only: classification accuracy is almost always optimistic on the data used to build the model. Validate on a holdout sample or use cross-validation to get a realistic accuracy estimate.
  • Ignoring assumptions: LDA assumes multivariate normality, equal covariance matrices, and no severe multicollinearity among the predictors. Violations don't necessarily invalidate the results, but severe violations warrant QDA or logistic regression as alternatives.
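
The training-data mistake is easy to demonstrate with leave-one-out cross-validation, where each case is classified by a model built without it. The sketch below keeps things deliberately small: one made-up predictor, two groups, and equal priors, in which case LDA reduces to assigning each case to the group with the nearest mean. All names and values are illustrative.

```python
# Made-up data: (predictor score, actual group).
data = [(1.0, "A"), (2.0, "A"), (3.0, "A"), (5.0, "A"),
        (5.5, "B"), (7.0, "B"), (8.0, "B"), (9.0, "B")]


def fit(train):
    """Estimate each group's mean on the single predictor."""
    means = {}
    for label in sorted({g for _, g in train}):
        values = [x for x, g in train if g == label]
        means[label] = sum(values) / len(values)
    return means


def predict(means, x):
    """Assign the case to the group with the nearest mean."""
    return min(means, key=lambda g: abs(x - means[g]))


def accuracy(means, cases):
    return sum(predict(means, x) == g for x, g in cases) / len(cases)


# Accuracy on the same cases used to build the model.
train_accuracy = accuracy(fit(data), data)

# Leave-one-out: refit without each case, then classify that case.
hits = 0
for i, (x, g) in enumerate(data):
    rest = data[:i] + data[i + 1:]
    hits += predict(fit(rest), x) == g
loo_accuracy = hits / len(data)

print("training accuracy:", train_accuracy)
print("leave-one-out accuracy:", loo_accuracy)
```

Here the model classifies every training case correctly, but the leave-one-out estimate drops to 87.5% because one borderline case is misclassified once it no longer helps define its own group's mean.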

Quali-Fi Support

Quali-Fi's data exports to SPSS, R, and Python support discriminant analysis workflows directly from survey data. For segmentation studies using the Intelligence product, typing tool development is included as a standard deliverable: the platform helps you identify the minimum question set for segment classification and validates accuracy before deployment.

Frequently Asked Questions

How is discriminant analysis different from logistic regression?

Both predict group membership, but they approach the problem differently. Discriminant analysis models the predictor distributions within each group. Logistic regression models the probability of group membership directly. Logistic regression makes fewer distributional assumptions and handles categorical predictors more naturally. In practice, they often produce similar classification accuracy.

What classification accuracy should I aim for?

It depends on the number of groups and their distinctiveness. For 3-4 segment typing tools, 70-85% accuracy is typical and acceptable. Compare your accuracy to the baseline rate you would get by classifying everyone into the largest group: if the largest segment is 40% of cases, 70% accuracy represents a meaningful improvement over that baseline.
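
The comparison is simple arithmetic. The sketch below works through it for a hypothetical four-segment solution; only the 40% largest segment and the 70% accuracy come from the example above, and the other segment shares are assumed for illustration. It also computes the proportional chance criterion (the sum of squared segment shares), a second common benchmark that reflects guessing in proportion to segment size.

```python
# Hypothetical segment shares; only the 40% figure comes from the example above.
segment_shares = {
    "Segment 1": 0.40,
    "Segment 2": 0.30,
    "Segment 3": 0.20,
    "Segment 4": 0.10,
}
observed_accuracy = 0.70

# Baseline 1: put everyone in the largest segment.
majority_baseline = max(segment_shares.values())

# Baseline 2: guess at random in proportion to segment size.
proportional_chance = sum(share ** 2 for share in segment_shares.values())

lift_over_majority = observed_accuracy - majority_baseline

print("majority baseline:", round(majority_baseline, 2))
print("proportional chance:", round(proportional_chance, 2))
print("lift over majority baseline:", round(lift_over_majority, 2))
```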

Can I use discriminant analysis with categorical predictors?

Standard LDA assumes continuous predictors. If you have categorical predictors, use logistic regression (for 2 groups) or multinomial logistic regression (for 3+ groups) instead. Alternatively, convert categorical variables to dummy codes, though 0/1 variables cannot be normally distributed, so this strains the assumptions behind discriminant analysis.
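
If you do go the dummy-coding route, each categorical variable with m levels becomes m-1 indicator columns, with one level left out as the reference. A minimal sketch, using a made-up "region" variable and a hypothetical helper name:

```python
def dummy_code(values, reference):
    """Turn one categorical variable into 0/1 columns, omitting the reference level."""
    levels = sorted(set(values) - {reference})
    columns = [f"is_{level}" for level in levels]
    rows = [[1 if value == level else 0 for level in levels] for value in values]
    return columns, rows


# Made-up categorical predictor for five respondents.
region = ["north", "south", "west", "south", "north"]
columns, rows = dummy_code(region, reference="north")

print(columns)
for value, row in zip(region, rows):
    print(value, row)
```

Leaving out the reference level matters: including an indicator for every level makes the columns perfectly collinear, which the within-group matrix cannot handle.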


