What Is Undercoverage Bias? | Definition & Example
Undercoverage bias occurs when a part of the population is excluded from your sample. As a result, the sample is no longer representative of the target population. Non-probability sampling designs are susceptible to this type of research bias.
Undercoverage is a type of selection bias.
What is undercoverage bias?
Undercoverage bias is the systematic distortion of a study’s findings that arises when the way the sample was selected leaves out, or underrepresents, part of the target population.
Ideally, researchers should draw a sample that, like a snapshot, adequately captures characteristics that are both present in the target population and relevant for the research. In other words, researchers aim to collect a representative sample.
In some cases, researchers may sample too few units from a specific segment of the population. If the segment is small in comparison to others in the population, this may not impact the research findings much. However, if the segment is larger, it can lead to a sample that doesn’t accurately capture the characteristics of the population.
In more extreme cases, researchers may fail to include a part of the population at all, which can severely distort the findings.
Keep in mind that two things must both happen for undercoverage bias to occur:
- Some segments of the population have not been included in your sample but should have been
- The included segments are different from the excluded ones in terms of one or more variables of interest
If your sampling frame excludes a large part of your target population, you need to step back and consider how the excluded units may systematically differ from those included in your sample.
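The two conditions above can be made concrete with a little arithmetic. The sketch below assumes a population split into a covered segment and an excluded segment. The function name and all numbers are hypothetical and chosen only for illustration. It shows that the bias in the covered mean disappears when nothing is excluded or when the excluded segment does not differ on the variable of interest.

```python
def coverage_bias(n_covered, mean_covered, n_excluded, mean_excluded):
    """Return the covered-segment mean minus the full-population mean.

    Hypothetical helper for illustration: it treats the covered segment's
    mean as what a study limited to that segment would estimate.
    """
    n_total = n_covered + n_excluded
    population_mean = (
        n_covered * mean_covered + n_excluded * mean_excluded
    ) / n_total
    return mean_covered - population_mean


# Illustrative numbers only (not taken from any real study).
# Both conditions hold: 20% excluded, and the excluded segment differs.
print(coverage_bias(800, 50.0, 200, 30.0))

# Excluded segment has the same mean: no bias despite the exclusion.
print(coverage_bias(800, 50.0, 200, 50.0))

# Nothing is excluded: no bias.
print(coverage_bias(1000, 50.0, 0, 30.0))
```

Algebraically, the same quantity equals the excluded share of the population multiplied by the gap between the covered and excluded means, which is why a large excluded segment that differs strongly is the worst case.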
Undercoverage bias vs. nonresponse bias
Although undercoverage bias and nonresponse bias may seem similar, they are actually quite different.
- Undercoverage bias occurs when some members of a population are totally excluded from the sampling frame you use for your study.
- Nonresponse bias occurs when some of the respondents you selected to be in your sample don’t respond.
In other words, undercoverage means that some units never make it into the sample or are inadequately represented. Nonresponse means that some units are included in the sample, but their responses are missing.
What causes undercoverage bias?
There are two main sources of undercoverage bias:
Non-probability sampling
Non-probability sampling designs like convenience sampling are especially prone to bias. When researchers recruit study participants based on proximity or ease of access, their results are unlikely to be representative of the population. The reason is that some members of the population of interest have little or no chance of being selected for the survey.
For example, if you stand at a shopping mall and select shoppers as they walk by to fill out a survey, you are neglecting to survey everyone not at the mall that day.
Incomplete sampling frames
Probability samples are not immune to undercoverage bias either. Simple random sampling can also yield biased results if the sampling frame is incomplete.
For example, if you use email lists or phone lists as a sampling frame, error may be introduced due to the makeup of the list. In other words, the individuals included in the frame may differ from those who are not. As a result, a segment of the population is not sampled at all or is underrepresented in the sample.
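A small simulation can illustrate this point. The sketch below is a minimal example under invented assumptions: a hypothetical population of 10,000 people, 70% of whom appear on an email list, where people on the list spend more hours online per week than those who are not. All names and parameters are made up for illustration. It compares a simple random sample drawn from the incomplete frame with one drawn from the whole population.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical variable of interest: weekly hours spent online.
# Assumption: people on the email list differ from those who are not.
on_list = [rng.gauss(20, 5) for _ in range(7000)]   # covered by the frame
off_list = [rng.gauss(8, 4) for _ in range(3000)]   # missing from the frame
population = on_list + off_list

# Simple random samples of the same size from each frame.
frame_sample = rng.sample(on_list, 500)       # incomplete frame (email list only)
full_sample = rng.sample(population, 500)     # complete frame (everyone)

population_mean = mean(population)
frame_sample_mean = mean(frame_sample)
full_sample_mean = mean(full_sample)

print(f"Population mean:          {population_mean:.2f}")
print(f"Sample from email list:   {frame_sample_mean:.2f}")
print(f"Sample from full frame:   {full_sample_mean:.2f}")
```

Both samples are drawn at random, yet only the one taken from the complete frame should land near the population mean. Random selection cannot recover units that the frame never contained.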
Undercoverage bias example
Undercoverage varies in severity depending on the population studied.
How to avoid undercoverage bias
There are a few steps you can take to shield your research from undercoverage bias:
- Familiarize yourself with your target population. Understanding your target population allows you to capture all relevant characteristics and subgroups.
- Run a pilot survey. Before launching your survey, consider performing a trial run with fewer respondents. In this way, you can spot errors like undercoverage bias before launching the actual survey.
- Combine multiple sources of data to build your sampling frame. For example, researchers can create housing unit frames by using an already-existing list of addresses. Next, they update them in the field by adding units that are missing and removing those that don’t exist anymore or are not residential.
- Use probability sampling. If the goal of your research is to draw a representative sample, probability sampling lets you generalize your findings to the wider population more safely than non-probability sampling.
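The frame-building step described above (start from an existing address list, add units found to be missing, remove units that no longer exist or are not residential) can be sketched with set operations. The function name and the addresses below are hypothetical placeholders, not a real data source.

```python
def update_frame(listed, found_in_field, not_eligible):
    """Return listed units plus field additions, minus ineligible units."""
    return (set(listed) | set(found_in_field)) - set(not_eligible)


# Made-up addresses for illustration only.
listed = {"1 Oak St", "2 Oak St", "3 Oak St"}   # existing address list
found_in_field = {"2A Oak St"}                  # unit missing from the list
not_eligible = {"3 Oak St"}                     # demolished or non-residential

frame = update_frame(listed, found_in_field, not_eligible)
print(sorted(frame))
```

Using sets means a unit that appears in more than one source is counted once, which matters when several lists are merged into one frame.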
Frequently asked questions
- What is the difference between undercoverage and nonresponse bias?
Undercoverage bias happens when segments of the target population are entirely excluded or less represented in the sample than they are in the population. This means that these segments are partly or wholly left out of the sampling process.
Nonresponse bias occurs when parts of the sampled population are unable or refuse to respond. In other words, nonrespondents are included in the sampling process, but their answers (responses) are not registered.
- What is undercoverage bias in statistics?
Undercoverage bias in statistics is the underrepresentation of a segment of the target population in the sample. If the distribution of characteristics in the sample differs significantly from that in the target population, the dataset may have undercoverage bias.
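One simple way to run this kind of check is to compare each group’s share of the sample with its known share of the population. The sketch below assumes you have population shares from an external source such as a census. The function name, the age groups, and all figures are hypothetical.

```python
def representation_gap(population_shares, sample_counts):
    """Return sample share minus population share for each group.

    Negative values mean the group is underrepresented in the sample.
    """
    total = sum(sample_counts.values())
    if total == 0:
        raise ValueError("sample is empty")
    return {
        group: sample_counts.get(group, 0) / total - share
        for group, share in population_shares.items()
    }


# Illustrative figures only.
population_shares = {"18-34": 0.30, "35-64": 0.50, "65+": 0.20}
sample_counts = {"18-34": 180, "35-64": 200, "65+": 20}

gaps = representation_gap(population_shares, sample_counts)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```

A large negative gap flags a group that may be undercovered. Whether this biases the findings still depends on whether that group differs on the variables of interest.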