Survival Analysis: What It Is and How It Works

Learn what survival analysis is, how to model time-to-event data in research, and when to use Kaplan-Meier curves and Cox regression for survey studies.

What Is Survival Analysis?

Survival analysis is a branch of statistics focused on modeling the time until a specific event occurs, such as customer churn, product failure, subscription cancellation, or survey dropout. The defining feature is that it handles censored data, where some subjects haven't yet experienced the event by the end of the study period, so you know they survived at least that long without knowing exactly when the event will occur. Standard regression can't accommodate censoring without introducing bias. Originally developed for medical and engineering applications (hence the name), survival analysis is now widely used in marketing, customer analytics, HR research, and any domain where "how long until something happens" is the key question.

Why Survival Analysis Matters

If 20% of customers churned this year, that single number hides enormous variation. Did most leave in the first month, or did churn accelerate after month six? Survival analysis reveals the timing and shape of event patterns, not just whether events occurred. This distinction drives different interventions: early churn suggests onboarding problems, while late churn suggests value-delivery issues. Research by Peter Fader (Wharton) and Bruce Hardie (London Business School) showed that survival-based churn models predicted customer lifetime value 35% more accurately than models that ignored the time dimension.

How Survival Analysis Works

Censoring: The Core Concept

Censoring occurs when you don't observe the event for some subjects. Right censoring is the most common type: with a 12-month observation window, a customer who's still active when the window closes is right-censored because you know they survived at least 12 months, but you don't know when they'll eventually churn. Dropping censored observations from your analysis (analyzing only churned customers) biases your estimates downward, making time-to-event look shorter than it actually is. Survival methods keep censored observations in the analysis, using their partial information appropriately.
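To make the downward bias concrete, here is a minimal sketch using six made-up customers in a 12-month window. The records and function names are illustrative assumptions, not data from any study. It compares the mean duration of churned customers only with the mean of all observed durations, which is itself only a lower bound because censored customers are still active.

```python
from statistics import mean

# Hypothetical customers: (months_observed, churned).
# churned=False means still active when the 12-month window closed (right-censored).
customers = [
    (2, True), (3, True), (5, True),
    (12, False), (12, False), (12, False),
]

def naive_mean_churned_only(records):
    """Mean duration using only customers whose churn was observed."""
    return mean(d for d, churned in records if churned)

def lower_bound_mean(records):
    """Mean of all observed durations. Censored durations are lower bounds,
    so the true mean time-to-churn is at least this large."""
    return mean(d for d, _ in records)

naive = naive_mean_churned_only(customers)
bound = lower_bound_mean(customers)
print(f"churned-only mean: {naive:.2f} months")
print(f"lower bound with censored customers kept: {bound:.2f} months")
```

Even the lower bound is more than twice the churned-only figure here, which is why discarding censored records understates time-to-event.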

The Kaplan-Meier Estimator

The Kaplan-Meier (KM) curve is the most intuitive survival analysis tool. It plots the probability of surviving past each observed event time, producing a step function that descends over time. Each step down represents one or more events (churns, failures). Censored observations are marked on the curve but don't cause steps. You can compare KM curves across groups (premium vs. basic subscribers) using the log-rank test, which tells you whether the survival distributions differ significantly.
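The step-function calculation described above can be sketched in a few lines of plain Python. This is an illustrative implementation under simple assumptions (right censoring only, a small made-up dataset); in practice you would likely use a statistics package. At each event time the running survival probability is multiplied by (1 - events / number at risk).

```python
from collections import Counter

def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: how long each subject was observed
    events: 1 if the event was observed at that time, 0 if right-censored
    """
    event_counts = Counter(t for t, e in zip(durations, events) if e)
    curve = []
    survival = 1.0
    for t in sorted(event_counts):
        # Subjects still at risk just before t (those censored at t still count).
        at_risk = sum(1 for d in durations if d >= t)
        survival *= 1 - event_counts[t] / at_risk
        curve.append((t, survival))
    return curve

# Made-up data: six subjects, two of them censored (event = 0).
durations = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(durations, events)
for t, s in km_curve:
    print(f"t={t}: S(t)={s:.3f}")
```

Note that the censored subjects at times 2 and 4 never produce a step, but they do shrink the at-risk count for later steps.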

For a SaaS company, the KM curve might show that 90% of customers survive past month 1, 78% past month 3, 65% past month 6, and 52% past month 12. In these figures the curve falls fastest in the first three months (10 points in month 1, then 12 more by month 3), which would mark that period as the critical retention window.
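The log-rank comparison mentioned in this section can also be sketched directly. The version below is a simplified two-group test written for illustration, with made-up "basic" and "premium" data; it compares observed and expected events in one group at every event time and converts the resulting chi-square statistic (1 degree of freedom) to a p-value.

```python
import math

def logrank_test(durations_a, events_a, durations_b, events_b):
    """Two-group log-rank test. Returns (chi_square, p_value), 1 degree of freedom."""
    event_times = sorted(
        {t for t, e in zip(durations_a, events_a) if e}
        | {t for t, e in zip(durations_b, events_b) if e}
    )
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for d in durations_a if d >= t)  # at risk in group A
        n_b = sum(1 for d in durations_b if d >= t)  # at risk in group B
        d_a = sum(1 for d, e in zip(durations_a, events_a) if e and d == t)
        d_b = sum(1 for d, e in zip(durations_b, events_b) if e and d == t)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:  # variance term is undefined when only one subject remains
            variance += n_a * n_b * d * (n - d) / (n ** 2 * (n - 1))
    if variance == 0:
        return 0.0, 1.0
    chi_square = observed_minus_expected ** 2 / variance
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value

# Made-up data: every subject churned; basic subscribers churned earlier.
basic = ([1, 2, 3, 4, 5], [1, 1, 1, 1, 1])
premium = ([6, 7, 8, 9, 10], [1, 1, 1, 1, 1])
chi_square, p_value = logrank_test(*basic, *premium)
print(f"chi-square={chi_square:.2f}, p={p_value:.4f}")
```

A small p-value suggests the two survival curves differ; with identical groups the statistic is 0 and the p-value is 1.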

Cox Proportional Hazards Regression

When you want to estimate how predictor variables affect survival time, Cox regression is the standard method. It models the hazard rate, which represents the instantaneous risk of the event occurring at any given time, as a function of covariates. A hazard ratio of 1.4 for "no onboarding call" means customers who didn't receive an onboarding call have a 40% higher risk of churning at any given point compared to those who did, holding other factors constant.
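As a rough illustration of how a hazard ratio arises, the sketch below assumes the standard Cox form h(t | x) = h0(t) * exp(beta * x) with a single binary covariate (x = 1 for "no onboarding call"). The baseline hazard values are made up; the point is that the ratio between the two groups is exp(beta) regardless of the baseline.

```python
import math

def hazard(baseline_hazard, beta, x):
    """Cox model hazard for one covariate: h(t | x) = h0(t) * exp(beta * x)."""
    return baseline_hazard * math.exp(beta * x)

def percent_change_in_hazard(hazard_ratio):
    """Express a hazard ratio as a percentage change in risk."""
    return (hazard_ratio - 1) * 100

# Coefficient chosen so that exp(beta) = 1.4, matching the example in the text.
beta_no_call = math.log(1.4)

# The baseline hazard can vary over time; these values are arbitrary.
ratios = []
for h0 in (0.02, 0.05, 0.10):
    ratio = hazard(h0, beta_no_call, x=1) / hazard(h0, beta_no_call, x=0)
    ratios.append(ratio)
    print(f"h0={h0:.2f}  hazard ratio={ratio:.2f}")

print(f"HR 1.40 -> {percent_change_in_hazard(1.4):+.0f}% hazard")
print(f"HR 0.35 -> {percent_change_in_hazard(0.35):+.0f}% hazard")
```

A ratio above 1 reads as higher risk and a ratio below 1 as lower risk, which is why a hazard ratio of 0.35 corresponds to a 65% lower hazard.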

The "proportional hazards" assumption means the ratio of hazard rates between groups stays constant over time. If onboarding calls reduce early churn but have no effect after six months, this assumption is violated, and you'd need a time-varying coefficient or a different model.
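One informal way to see a violation like the onboarding example is to compare crude event rates (events per person-month) in separate time periods. The sketch below uses entirely hypothetical counts and is only a rough screening aid, not a substitute for a formal check of the assumption.

```python
def event_rate(events, person_months):
    """Crude hazard estimate: events per person-month of exposure."""
    return events / person_months

def period_hazard_ratio(group_a, group_b):
    """Ratio of crude event rates (group_a / group_b) within one time period."""
    return event_rate(*group_a) / event_rate(*group_b)

# Hypothetical (events, person_months) per period; not real data.
periods = {
    "months 0-6": {"no_call": (60, 2000), "call": (30, 2000)},
    "months 7-12": {"no_call": (20, 1600), "call": (20, 1600)},
}

ratios = {
    name: period_hazard_ratio(groups["no_call"], groups["call"])
    for name, groups in periods.items()
}
for name, ratio in ratios.items():
    print(f"{name}: crude hazard ratio = {ratio:.2f}")
```

In this made-up case the ratio is about 2 early and about 1 later, so a single hazard ratio for the whole period would misdescribe both halves.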

A Practical Example

An online education platform wanted to understand what drove course completion. They tracked 3,200 enrolled students over 16 weeks, where the "event" was dropping out. At week 16, 1,100 students had completed the course (treated as right-censored, since they never dropped out), 1,400 had dropped out at various points, and 700 were still in progress (also right-censored). The KM curve showed a steep drop in weeks 2-4, with 30% of observed dropouts occurring in that window. Cox regression found that students who completed the first graded assignment had a 65% lower hazard of dropping out (HR = 0.35), while students who logged in fewer than twice per week had 2.1 times the hazard of dropping out (HR = 2.1). The platform redesigned its week-2 experience to prioritize assignment completion.
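Before any modeling, each student in an example like this has to be turned into a (duration, event) pair. The sketch below shows one plausible encoding; the status labels and function name are assumptions made for illustration, and the counts are the ones quoted above.

```python
def encode_student(status, last_active_week):
    """Map a student's status to (duration_weeks, event) for a dropout analysis.

    event = 1 only when dropout was observed; completers and students still
    in progress are right-censored (event = 0).
    """
    if status == "dropped":
        return last_active_week, 1
    if status in ("completed", "in_progress"):
        return last_active_week, 0
    raise ValueError(f"unknown status: {status!r}")

# Counts from the example above.
counts = {"completed": 1100, "dropped": 1400, "in_progress": 700}
total = sum(counts.values())
events = counts["dropped"]
censored = total - events
censoring_share = censored / total

print(encode_student("dropped", 3))       # dropout observed in week 3
print(encode_student("in_progress", 16))  # censored at the end of tracking
print(f"{events} events, {censored} censored ({censoring_share:.1%} of {total})")
```

More than half of the students are censored here, which is exactly the information an events-only analysis would throw away.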

When to Use Survival Analysis

  • Customer churn modeling where you need to predict not just who will leave, but when, and what factors accelerate or delay departure
  • Subscription and trial conversion studies analyzing time-to-upgrade or time-to-cancel with censored observations from ongoing subscribers
  • Employee attrition research modeling tenure and identifying predictors of early versus late departure
  • Survey panel attrition understanding when and why respondents drop out of longitudinal studies
  • Product reliability studies estimating time-to-failure or time-to-first-complaint for physical products

Common Mistakes

  • Excluding censored observations and analyzing only subjects who experienced the event, which biases survival estimates by ignoring all the people who survived longer than your observation window
  • Interpreting Cox regression output without checking the proportional hazards assumption, which can produce misleading hazard ratios; always test this assumption using Schoenfeld residuals or log-log plots first
  • Confusing survival time with probability when communicating results; a median survival of 8 months means half of subjects have experienced the event by month 8, not that all subjects will experience it at month 8
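The median-survival point can be illustrated with a small helper that reads the median off a survival curve. This is a sketch with an assumed function name; the first curve reuses the SaaS percentages quoted earlier (sampled only at the months given there), and the second is made up.

```python
def median_survival(curve):
    """Return the first time at which survival falls to 0.5 or below.

    curve: list of (time, survival_probability) sorted by time.
    Returns None when the curve never reaches 0.5, meaning the median
    has not been reached within the observation window.
    """
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

# SaaS figures from the Kaplan-Meier section: still 52% surviving at month 12.
saas_curve = [(1, 0.90), (3, 0.78), (6, 0.65), (12, 0.52)]
# A made-up curve that crosses 50% at month 8.
other_curve = [(2, 0.85), (5, 0.70), (8, 0.50), (11, 0.40)]

print(median_survival(saas_curve))
print(median_survival(other_curve))
```

For the second curve, a median of 8 months says only that half of subjects had experienced the event by month 8; 40% were still event-free at month 11.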

How Quali-Fi Supports Survival Analysis

Quali-Fi's panel survey tools track respondent participation across waves with timestamps and completion status, giving you the time-stamped event data survival analysis requires. The platform's automated re-invitation system and attrition flags make it straightforward to identify censored versus event observations when exporting data for survival modeling.

Frequently Asked Questions

What sample size do I need for survival analysis?

A common rule of thumb for Cox regression is 10-20 events per predictor variable. If your model includes 5 predictors, you need at least 50-100 observed events (not total subjects, but subjects who experienced the event). More events mean more stable hazard ratio estimates. Studies with high censoring rates need larger total samples to accumulate enough events.
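The rule of thumb translates into a quick back-of-the-envelope calculation. In the sketch below, the 40% and 20% event rates are assumptions chosen to show how heavier censoring inflates the total sample you need; the function names are illustrative.

```python
import math

def required_events(n_predictors, events_per_variable):
    """Events needed under an events-per-variable rule of thumb."""
    return n_predictors * events_per_variable

def required_subjects(n_events, expected_event_percent):
    """Total subjects needed if only expected_event_percent of them have the event."""
    return math.ceil(n_events * 100 / expected_event_percent)

low = required_events(5, 10)
high = required_events(5, 20)
print(f"5 predictors: {low}-{high} events")

# Assumed event rates: 40% of subjects churn vs. only 20% (heavier censoring).
for percent in (40, 20):
    print(
        f"{percent}% event rate: "
        f"{required_subjects(low, percent)}-{required_subjects(high, percent)} subjects"
    )
```

Halving the share of subjects who experience the event doubles the total sample required for the same number of events.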

Can survival analysis handle recurring events?

Standard survival models assume one event per subject. For recurring events (repeated purchases, multiple support tickets), you can use extensions like Andersen-Gill models or frailty models that account for within-subject correlation across events.
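Recurrent-event models of the Andersen-Gill type are usually fitted to data in a "counting process" layout, with one row per at-risk interval rather than one row per subject. The helper below is a sketch of that reshaping step only (the name and the example customers are hypothetical); fitting the model itself would be done in a statistics package.

```python
def to_counting_process(subject_id, event_times, followup_end):
    """Split one subject's history into (id, start, stop, event) intervals.

    Each observed event closes an interval with event = 1; any remaining
    follow-up after the last event becomes a censored interval (event = 0).
    """
    rows = []
    start = 0
    for t in sorted(event_times):
        rows.append((subject_id, start, t, 1))
        start = t
    if start < followup_end:
        rows.append((subject_id, start, followup_end, 0))
    return rows

# Hypothetical customers followed for 12 months.
repeat_buyer = to_counting_process("c1", [2, 5], followup_end=12)
never_bought = to_counting_process("c2", [], followup_end=12)
for row in repeat_buyer + never_bought:
    print(row)
```

Because one subject can contribute several rows, the rows are not independent, which is the within-subject correlation these extensions are designed to handle.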

What's the difference between survival analysis and logistic regression for churn?

Logistic regression predicts whether churn happens within a fixed window (yes/no at 12 months) and ignores timing. Survival analysis models when churn happens and handles cases where you haven't observed the full window yet. If you have time-to-event data with any censoring, survival analysis provides more information and more efficient estimates.
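The difference shows up as soon as you try to build the outcome variable. The sketch below, using made-up customers and an assumed helper name, labels each customer for a "churned within 12 months" logistic model; anyone censored before month 12 has no valid label, while a survival model can still use their partial follow-up.

```python
def logistic_label(duration, churned, window=12):
    """Binary label for 'churned within `window` months'.

    Returns 1 if churn was observed inside the window, 0 if the customer was
    followed through the whole window without churning inside it, and None
    if they were censored before the window closed (outcome unknown).
    """
    if churned and duration <= window:
        return 1
    if duration >= window:
        return 0
    return None

# Hypothetical customers: (months_observed, churned).
customers = [(3, True), (12, False), (5, False), (14, True)]
labels = [logistic_label(d, c) for d, c in customers]
usable = [label for label in labels if label is not None]

print(labels)
print(f"{len(usable)} of {len(customers)} customers usable for logistic regression")
```

The customer observed for only 5 months is dropped from the logistic model but would still contribute 5 months of event-free time to a survival analysis.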


Track time-to-event in your research -- try Quali-Fi free for 14 days.
