Surveys gather information about a population by interviewing a smaller subset or sample. Confidence level and margin of error are calculated using the size of the population and the number of survey respondents to determine how accurately the sample represents the population.
Confidence level refers to the percentage of times the survey results would fall within the margin of error if the survey were administered again. Surveys most commonly use a 90%, 95% or 99% confidence level, with 95% as the default.
Margin of error refers to the largest expected difference between the survey results and the results we would see if the entire population had taken the survey. A 5% margin of error is generally considered the default.
Let’s assume a 5% margin of error and evaluate the confidence level for populations of 500 to 10,000 people and surveys with 200 to 500 respondents (Figure 1). If we want the confidence level to be at least 90%, 200 respondents are sufficient only when the population size is 500 people. However, with 300 or more respondents, the confidence level is always 90% or higher, regardless of the population size.
Figure 1
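The numbers behind a chart like Figure 1 can be approximated with a short Python sketch. The article does not state which formula it used, so this assumes the standard normal approximation for a proportion with a finite population correction and the conservative response split p = 0.5. The function name `confidence_level` and its parameters are illustrative, not taken from the original.

```python
from math import sqrt
from statistics import NormalDist


def confidence_level(n, population, margin=0.05, p=0.5):
    """Approximate confidence level reached by n responses at a fixed margin of error.

    Assumption: normal approximation with a finite population correction;
    p=0.5 is the most conservative (widest) choice.
    """
    if n >= population:
        # Everyone responded, so there is no sampling error left.
        return 1.0
    fpc = sqrt((population - n) / (population - 1))
    standard_error = sqrt(p * (1 - p) / n) * fpc
    z = margin / standard_error
    # Two-sided coverage of the standard normal distribution at +/- z.
    return 2 * NormalDist().cdf(z) - 1


# One row per population size, one column per number of respondents.
for population in (500, 1_000, 5_000, 10_000):
    row = [f"{confidence_level(n, population):.1%}" for n in (200, 300, 400, 500)]
    print(population, row)
```

Under these assumptions, 200 respondents clear 90% only for the population of 500, while 300 respondents stay above 90% across the whole range, which matches the pattern described above.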
When we instead fix the confidence level at 95% and use the same population and respondent sizes, we confirm that we need at least 300 responses (Figure 2). With only 200 responses, the margin of error is larger than 5%, regardless of population size. At 300 responses, the margin of error hovers around 5%. At 400 and 500 responses, the margin of error is even lower.
Figure 2
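A similar sketch reproduces the kind of values plotted in Figure 2 by fixing the confidence level and solving for the margin of error. As before, it assumes the normal approximation with a finite population correction and p = 0.5. The name `margin_of_error` is a hypothetical helper rather than the method used to build the original chart.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(n, population, confidence=0.95, p=0.5):
    """Approximate margin of error for n responses at a fixed confidence level.

    Assumption: normal approximation with a finite population correction.
    """
    if n >= population:
        # A full census has no sampling error.
        return 0.0
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    fpc = sqrt((population - n) / (population - 1))
    return z * sqrt(p * (1 - p) / n) * fpc


for population in (500, 1_000, 5_000, 10_000):
    row = [f"{margin_of_error(n, population):.1%}" for n in (200, 300, 400, 500)]
    print(population, row)
```

With these assumptions, 200 responses stay above 5% at every population size in the range. For large populations, 300 responses land slightly above 5% and 400 responses slightly below it.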
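Figure 1 slopes down while Figure 2 slopes up: as the population grows, a fixed number of responses yields a lower confidence level and a higher margin of error. The two charts read in opposite directions because a higher confidence level is desirable, whereas a lower margin of error is desirable.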
When the population size is at least 5,000, the confidence level and margin of error plateau regardless of the number of responses. When we enlarge the population size to 30,000, we see a slight change in confidence level and margin of error around 20,000, which is negligible because it’s less than one percentage point.
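One way to see why the curves flatten is to look at the finite population correction on its own. This is an assumption: the correction is the usual term that ties these calculations to population size, but the article does not name its formula. The factor multiplies the standard error and approaches 1 as the population grows, so beyond a few thousand people it barely moves.

```python
from math import sqrt


def finite_population_correction(n, population):
    """Factor that shrinks the standard error when n is a large share of the population."""
    return sqrt((population - n) / (population - 1))


# Hold the number of responses at 300 and let the population grow.
for population in (500, 1_000, 5_000, 10_000, 20_000, 30_000):
    print(population, round(finite_population_correction(300, population), 3))
```

For 300 responses the factor is far from 1 at a population of 500, but it changes by only a few hundredths between 5,000 and 30,000, which is consistent with the plateau described above.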
Commonly analyzed segments in a survey include customer size, industry, location and job function. When there are at least 300 responses to a survey, we can begin to segment the data. Specifically, we begin to have enough responses to analyze sub-segments within a segment. For instance, purchasing is a sub-segment of the job function segment, and its data is considered thin unless there are around 50 respondents.
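To illustrate why small sub-segments count as thin data, the sketch below estimates the margin of error from the number of responses in a segment alone. It assumes a 95% confidence level, p = 0.5, and a sub-segment population large enough that no finite population correction applies. The function `segment_margin_of_error` is a hypothetical helper for illustration.

```python
from math import sqrt
from statistics import NormalDist


def segment_margin_of_error(segment_responses, confidence=0.95, p=0.5):
    """Approximate margin of error using only the responses inside one segment.

    Assumption: the segment's population is large relative to its responses,
    so no finite population correction is applied.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sqrt(p * (1 - p) / segment_responses)


for responses in (25, 50, 100, 300):
    print(responses, f"{segment_margin_of_error(responses):.1%}")
```

Under these assumptions, a sub-segment of about 50 respondents carries a margin of error well above 10%, so results at that level are better read as directional than precise.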
At least 300 responses are required to run a survey with a margin of error of roughly 5% at a 95% confidence level. This is a starting point: the more responses there are, the more deeply the survey can be analyzed by segment.