Sampling With Replacement: What It Is and How to Use It in Research

Learn what sampling with replacement is, how allowing the same unit to be selected more than once affects estimation, and when this approach is used in research.

What Is Sampling With Replacement?

Sampling with replacement is a selection method where each unit drawn from the population is returned to the pool before the next draw, making it possible for the same unit to be selected more than once. After you randomly select a person, household, or data point, you "replace" it in the population so it's eligible for selection again on the next draw. This means every draw is independent of all previous draws, and the selection probability stays constant throughout the sampling process. In theory, one individual could appear in your sample multiple times. The method is foundational in statistics and probability theory (it's how bootstrap resampling works) and has specific practical applications in multi-stage survey sampling, simulation, and variance estimation. In most day-to-day survey research, sampling without replacement is the default, but understanding with-replacement logic is essential because many statistical formulas and estimation methods assume it.

Why Sampling With Replacement Matters

The distinction between with-replacement and without-replacement sampling affects your variance estimates, your confidence intervals, and whether you need to apply the finite population correction. Many standard statistical formulas, including the textbook formula for sampling variance, assume sampling with replacement. When you sample without replacement (as most surveys do), the actual variance is smaller than these formulas suggest. Understanding the with-replacement assumption tells you when your standard formulas overestimate uncertainty and when corrections are appropriate.

How Sampling With Replacement Works

The mechanics are straightforward, but the implications for survey design and estimation require careful thought.

The Basic Process

Start with a defined population of N units. Randomly select one unit, note its characteristics, then return it to the population. The population still has N units, and every unit (including the one just selected) has the same probability of being chosen on the next draw. Repeat until you've made n draws.
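The draw-and-return process above can be sketched in a few lines of Python. This is a minimal illustration, not a production sampler: the function name `sample_with_replacement`, the ten-unit population, and the seed are all assumptions made for the example.

```python
import random

def sample_with_replacement(population, n, seed=None):
    """Make n independent draws; every unit stays eligible on every draw."""
    rng = random.Random(seed)
    # Each call to rng.choice picks uniformly from the FULL population,
    # which is equivalent to returning the drawn unit before the next draw.
    return [rng.choice(population) for _ in range(n)]

# Hypothetical population of N = 10 unit IDs
population = list(range(1, 11))
sample = sample_with_replacement(population, n=6, seed=42)

# The sample always has n draws, but may contain fewer than n unique units
print(sample, "unique units:", len(set(sample)))
```

Because nothing is ever removed from `population`, the selection probability is 1/N on every draw, which is the constant-probability property described above.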

Because units can be selected more than once, your sample of n draws might contain fewer than n unique units. The expected number of unique units in a with-replacement sample of size n from a population of N is N(1 - (1 - 1/N)^n). For large populations relative to sample size, duplicates are rare. For small populations or large samples, they're more common.
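As a rough check on the expected-unique-units formula, the sketch below compares it with a small simulation. The helper names `expected_unique` and `simulated_unique`, the number of repetitions, and the seed are assumptions chosen for illustration.

```python
import random

def expected_unique(N, n):
    """Expected number of distinct units in n with-replacement draws from N."""
    return N * (1 - (1 - 1 / N) ** n)

def simulated_unique(N, n, reps=2000, seed=0):
    """Average count of distinct units over repeated simulated samples."""
    rng = random.Random(seed)
    total = 0
    for _ in range(reps):
        # A set keeps only the distinct unit indices among the n draws
        total += len({rng.randrange(N) for _ in range(n)})
    return total / reps

# Small population, large sample: duplicates are common
print(expected_unique(100, 100), simulated_unique(100, 100))
# Large population, small sample: duplicates are rare
print(expected_unique(1_000_000, 100))
```

With N = 100 and n = 100 the formula gives about 63.4 unique units, so roughly a third of the draws are repeats. With N = 1,000,000 and n = 100 the expected count is just under 100.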

Independence of Draws

The defining property of with-replacement sampling is that each draw is statistically independent. The probability of selecting any unit on draw k doesn't depend on what happened on draws 1 through k-1. This independence simplifies the math considerably: variance formulas, probability calculations, and sampling distributions all take simpler forms when draws are independent.

Without-replacement sampling creates negative dependence between draws. If you select Person A on the first draw, Person A can't be selected on the second draw, which slightly increases everyone else's probability. This dependence is what creates the finite population correction factor, the adjustment that accounts for sampling a meaningful fraction of the population.

Application in PPS Sampling

In probability-proportional-to-size (PPS) sampling, commonly used for selecting primary sampling units in multi-stage designs, the with-replacement version is simpler to implement and analyze. PPS with replacement selects each PSU independently with probability proportional to its size measure. A large PSU might be selected more than once, in which case you'd draw independent subsamples from it for each time it was selected.

The Hansen-Hurwitz estimator, designed for PPS with-replacement sampling, produces unbiased estimates with straightforward variance calculations. PPS without replacement typically uses the Horvitz-Thompson estimator, whose variance estimation requires joint inclusion probabilities and is computationally more complex.
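A minimal sketch of PPS with-replacement selection and the Hansen-Hurwitz estimate of a population total follows. It assumes, for simplicity, that the total y for each selected PSU is observed directly (no second-stage subsampling). The function names and the toy size measures are hypothetical.

```python
import random

def pps_wr_sample(sizes, n, seed=None):
    """Draw n PSU indices independently, with probability proportional to size."""
    rng = random.Random(seed)
    total_size = sum(sizes)
    probs = [s / total_size for s in sizes]  # per-draw selection probabilities
    drawn = rng.choices(range(len(sizes)), weights=sizes, k=n)
    return drawn, probs

def hansen_hurwitz(y_drawn, p_drawn):
    """Hansen-Hurwitz estimate of the population total and its variance estimate."""
    n = len(y_drawn)
    ratios = [y / p for y, p in zip(y_drawn, p_drawn)]
    total_hat = sum(ratios) / n
    # Variance estimate: spread of the per-draw ratios around the estimate
    var_hat = sum((r - total_hat) ** 2 for r in ratios) / (n * (n - 1))
    return total_hat, var_hat

# Hypothetical PSUs: size measures and (normally unknown) PSU totals
sizes = [10, 20, 70]
y = [12, 19, 71]
drawn, probs = pps_wr_sample(sizes, n=5, seed=1)
total_hat, var_hat = hansen_hurwitz([y[i] for i in drawn], [probs[i] for i in drawn])
print(drawn, total_hat, var_hat)
```

A PSU that appears more than once in `drawn` simply contributes one ratio per selection, which is what keeps the variance formula this simple.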

Bootstrap Resampling

The most common practical application of with-replacement sampling is the bootstrap. In bootstrap estimation, you repeatedly resample (with replacement) from your observed data to estimate the sampling distribution of a statistic. Each bootstrap sample is the same size as the original data but usually contains duplicates and omits some original observations. The variation across thousands of bootstrap samples approximates the true sampling variability.

Bootstrap methods don't require the original data to come from with-replacement sampling. They use with-replacement resampling as a computational tool regardless of how the original data was collected.
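The resampling loop can be sketched as follows, using only the Python standard library. This is an illustrative bootstrap standard error for a simple statistic, assuming independent observations. The function name `bootstrap_se`, the default of 2,000 replicates, and the toy data are assumptions for the example.

```python
import random
import statistics

def bootstrap_se(data, stat=statistics.mean, reps=2000, seed=0):
    """Estimate the standard error of `stat` by resampling data with replacement."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(reps):
        # Each bootstrap sample has the same size as the original data
        resample = rng.choices(data, k=n)
        estimates.append(stat(resample))
    # The spread of the replicate estimates approximates the standard error
    return statistics.stdev(estimates)

# Hypothetical observed data
data = list(range(1, 21))
print(bootstrap_se(data))
print(bootstrap_se(data, stat=statistics.median))
```

Passing a different `stat` (the median here) is where the bootstrap is most useful, since many statistics have no simple closed-form variance formula.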

Variance Comparison

For the same sample size n, sampling with replacement produces higher variance than sampling without replacement. The difference is captured by the finite population correction factor, (N - n) / (N - 1), which multiplies the with-replacement variance of the sample mean to give the without-replacement variance under simple random sampling. When n is small relative to N (say, under 5%), the correction is close to 1 and the methods produce nearly identical variance. When n is a large fraction of N, sampling without replacement is substantially more precise.

This is why without-replacement sampling is preferred for surveys: it's more efficient. With-replacement sampling's advantage is mathematical simplicity, not statistical efficiency.
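To make the size of the correction concrete, the sketch below computes the finite population correction and the standard error of a sample mean with and without it. It assumes simple random sampling and a known population standard deviation `sigma`. The function names and the example numbers are hypothetical.

```python
import math

def fpc(N, n):
    """Finite population correction factor applied to the variance."""
    return (N - n) / (N - 1)

def se_mean(sigma, n, N=None):
    """Standard error of the sample mean.

    With N=None this is the with-replacement formula sigma / sqrt(n).
    With N given, the variance is multiplied by the correction factor.
    """
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= math.sqrt(fpc(N, n))
    return se

# Sampling fraction of 1%: the correction barely matters
print(se_mean(10, 100), se_mean(10, 100, N=10_000))
# Sampling fraction of 50%: the corrected standard error is much smaller
print(se_mean(10, 100), se_mean(10, 100, N=200))
```

Note that the factor applies to the variance, so the standard error is scaled by its square root.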

When to Use Sampling With Replacement

  • PPS sampling in multi-stage designs where with-replacement selection simplifies estimation and variance calculation
  • Bootstrap variance estimation where resampling from observed data with replacement approximates sampling distributions
  • Simulation studies where independent draws from a distribution are needed to model probabilistic processes
  • Theoretical work and teaching where with-replacement assumptions simplify derivations and make sampling concepts clearer
  • Very large populations where the probability of selecting the same unit twice is negligible, making with- and without-replacement sampling functionally equivalent

Common Mistakes to Avoid

  • Using with-replacement variance formulas for without-replacement samples. This overestimates your margin of error when you've sampled a substantial fraction of the population. Apply the finite population correction when your sampling fraction exceeds 5%.
  • Confusing methodological with-replacement sampling with data duplication errors. If the same respondent appears twice in your dataset, that's a quality control problem, not with-replacement sampling. True with-replacement designs account for duplicates in the estimation procedure.
  • Assuming bootstrap results automatically apply to your population. Bootstrap resampling estimates the variability of statistics within your sample data. If your original sample is biased, the bootstrap faithfully estimates the variability of that biased estimate.

How Quali-Fi Supports Sampling With Replacement

Quali-Fi's survey platform is designed for without-replacement data collection (each respondent completes once), which is appropriate for virtually all applied survey research. For teams needing bootstrap variance estimates or resampling analyses, Quali-Fi's data export integrates with R, Python, and SPSS, where bootstrap procedures can be applied to the collected data.

Frequently Asked Questions

When would I actually sample the same person twice in a survey?

In practical survey research, almost never. With-replacement logic is more relevant for selecting clusters or PSUs in multi-stage designs, where a large cluster might be selected more than once and subsampled independently each time. Individual-level with-replacement sampling is primarily a theoretical concept used in statistical formulas.

Does with-replacement sampling affect my sample size requirements?

Marginally. With-replacement sampling has slightly higher variance than without-replacement for the same sample size. In practice, the difference is negligible unless you're sampling a large fraction (10%+) of a small population. For most surveys, sample size calculations that assume with-replacement are conservative (slightly overestimating the required n).

What's the relationship between with-replacement sampling and the binomial distribution?

When you sample a binary characteristic with replacement, the number of sampled units that have the characteristic follows the binomial distribution exactly, because each draw is independent with constant probability. Under without-replacement sampling, that count follows the hypergeometric distribution. For populations that are large relative to the sample, the hypergeometric distribution converges to the binomial, which is why we use binomial approximations in most survey calculations.
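The convergence can be checked numerically. The sketch below computes both probability mass functions with `math.comb` and reports the largest gap between them for a small and a large population. The function names and the population figures are assumptions chosen for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(k successes in n independent draws), i.e. with replacement."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def hypergeom_pmf(k, N, K, n):
    """P(k successes in n draws without replacement; K successes among N units)."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def max_gap(N, K, n):
    """Largest absolute difference between the two distributions over k = 0..n."""
    p = K / N
    return max(
        abs(binom_pmf(k, n, p) - hypergeom_pmf(k, N, K, n)) for k in range(n + 1)
    )

# Same 30% prevalence and n = 50; only the population size changes
print(max_gap(N=100, K=30, n=50))          # sample is half the population
print(max_gap(N=100_000, K=30_000, n=50))  # sample is a tiny fraction
```

The gap shrinks as the sampling fraction falls, which mirrors the finite population correction approaching 1.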


Understand the foundations of sampling theory. Start a free trial with Quali-Fi and use statistically sound survey designs with built-in quality controls and exportable data for advanced analysis.
