Digging Out of the Data Quality Conundrum


Data quality has been a top-of-mind issue in the market research industry for as long as data has been collected, but online surveys and an increasingly sophisticated pool of respondents have made it more pressing than ever. The risk of fraud is very real, and the tools people can use to get around insufficient safeguards are more capable than ever before.

That’s why, even with extensive fraud detection in place to identify potential high-risk respondents during the registration process, it’s important to have additional measures that protect data quality within the studies themselves. This can be done through sample design and management, survey design, and ongoing member management. Let’s take a closer look at each of these three factors:

1. Sample Design and Management

The method your vendor uses to source sample, along with the management and incentive systems behind it, will all affect the quality you receive. Ask your vendor about the factors that directly shape how your sample is built, including:

  • How sample outgo is balanced
  • The measures implemented to ensure high quality
  • Demographic balance
  • Survey field time
  • Invitation and introductory language used
  • Competing survey inventory
  • Survey frequency and variation
  • Routing and project prioritization methods

Know how your vendor sources their sample, what sampling methods they use to match your client’s needs, and the steps they have taken to ensure quality in their panel.

2. Survey Design

Question design is an area where we have more work to do, especially as attention spans shorten and mobile devices provide a faster, more accessible method of interaction for users. What should you focus on in your survey design? There are several factors to keep in mind, including:

  • Non-Leading Wording
  • Outs for Respondents
  • Sparing Use of Open-Ended Questions
  • Avoidance of Yes/No Formats
  • Avoidance of Burdensome Question Formats
  • Concise Wording
  • Reduced Visual Clutter
  • Mobile-Friendly Design

Good survey design not only makes it easier to track and identify potential fraud; it also enables good respondents to provide the best possible answers in the format that is most accessible to them.

3. Member Management

Next, there’s member management. How do you ensure survey respondents are legitimate once they are presented with a survey, and how do you maintain the integrity of your results over time?

From a technology standpoint, there are several tools you can use to identify potential fraud in a survey:

  • Honey Pots – You can programmatically add a hidden question to your survey, one that human respondents cannot see but that a bot can. Any answer to it suggests an automated respondent.
  • Algorithmic Solutions – Algorithms that track activity over time and flag suspicious length-of-interview (LOI) completions, repeat issues, or invalid responses are highly effective when properly implemented.
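
As a rough illustration of the two checks above, the Python sketch below flags a response when a hidden honeypot field has been filled in or when its length of interview (LOI) is far below the median for the study. The field names (`hp_field`, `loi_seconds`) and the one-third-of-median cutoff are assumptions made for this example, not an industry standard.

```python
from statistics import median

def flag_suspect_responses(responses, honeypot_field="hp_field", speed_ratio=1 / 3):
    """Return {respondent_id: [reasons]} for responses that look automated or rushed.

    Field names and the speed_ratio cutoff are illustrative assumptions.
    """
    typical_loi = median(r["loi_seconds"] for r in responses)
    flags = {}
    for r in responses:
        reasons = []
        # People cannot see the hidden question, so any answer suggests a bot.
        if str(r.get(honeypot_field) or "").strip():
            reasons.append("honeypot_filled")
        # Finishing far faster than the typical respondent suggests speeding.
        if r["loi_seconds"] < typical_loi * speed_ratio:
            reasons.append("too_fast")
        if reasons:
            flags[r["id"]] = reasons
    return flags

# Hypothetical sample data for the example.
responses = [
    {"id": "a", "loi_seconds": 600, "hp_field": ""},
    {"id": "b", "loi_seconds": 540, "hp_field": ""},
    {"id": "c", "loi_seconds": 90, "hp_field": ""},
    {"id": "d", "loi_seconds": 580, "hp_field": "yes"},
]
print(flag_suspect_responses(responses))
# {'c': ['too_fast'], 'd': ['honeypot_filled']}
```

A flag here is a prompt for review rather than an automatic rejection; a sensible cutoff depends on the survey's length and audience.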

In addition to technology, there are several common-sense, hands-on steps you can take to validate user identity over time:

  • Profiling and Third-Party Data Validation – There are services that perform these validations and help ensure the data in your member list is accurate and remains that way.
  • Demographic Consistency Checks – If you ask for basic demographic information during registration, this information can be rechecked later with validation questions.
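
A demographic consistency check can be sketched in a few lines of Python: compare what a member says in a later validation question against what they gave at registration. The profile fields used here (`birth_year`, `gender`, `postal_code`) and the one-year tolerance on age are assumptions for illustration only.

```python
from datetime import date

def _norm(value):
    """Normalize free-text answers so case and stray spaces do not cause mismatches."""
    return str(value or "").strip().lower()

def consistency_issues(profile, answers, today=None):
    """List the fields where validation answers disagree with the registration profile."""
    today = today or date.today()
    issues = []
    expected_age = today.year - profile["birth_year"]
    # Allow one year of slack: the birthday may not have occurred yet this year.
    if "age" in answers and not (expected_age - 1 <= answers["age"] <= expected_age):
        issues.append("age")
    for field in ("gender", "postal_code"):
        if field in answers and _norm(answers[field]) != _norm(profile.get(field)):
            issues.append(field)
    return issues

# Hypothetical member record and later validation answers.
profile = {"birth_year": 1985, "gender": "Female", "postal_code": "30301"}
answers = {"age": 39, "gender": "female ", "postal_code": "10001"}
print(consistency_issues(profile, answers, today=date(2025, 6, 1)))
# ['postal_code']
```

Some mismatches are legitimate (people move, for example), so a single inconsistency is better treated as a signal to follow up than as proof of fraud.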

Finally, there is the use of trap questions, which can be highly effective if used properly. These can be tricky, however, because they tend to skew either too complicated or too simple. At one end of the spectrum, they can be frustrating and burdensome for users. At the other, machines can learn to overcome them, so it’s important to build them carefully.

The key to effective use of trap questions is to implement multiple measures and not rely on a single question to measure quality. At the same time, don’t place these questions at the end of a long survey, where tired but legitimate respondents are more likely to slip and the resulting false positives can invalidate good surveys. Use them in places where users are more likely to respond accurately if they are legitimate respondents.
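
One way to avoid relying on a single question is to combine several quality checks and reject a response only when more than one fails. The sketch below assumes each check has already been reduced to pass or fail; the check names and the two-failure threshold are hypothetical choices for this example.

```python
def should_reject(check_results, min_failures=2):
    """Decide whether to reject a response based on several quality checks.

    check_results maps a check name to True (passed) or False (failed).
    min_failures is an illustrative threshold, not a recommended value.
    Returns (reject, failed_check_names).
    """
    failures = [name for name, passed in check_results.items() if not passed]
    return len(failures) >= min_failures, failures

# One missed trap question alone does not invalidate the response.
print(should_reject({"trap_early": True, "trap_mid": False, "honeypot": True}))
# (False, ['trap_mid'])

# Two independent failures do.
print(should_reject({"trap_early": False, "trap_mid": False, "honeypot": True}))
# (True, ['trap_early', 'trap_mid'])
```

Requiring agreement between checks trades a little sensitivity for fewer false positives, which matches the goal of not discarding good surveys over one slip.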

By fully understanding the problems that can develop when evaluating sample quality, it’s easier to build and maintain sophisticated safeguards and checks that protect data quality. Learn more about how this is done and the benefits of a good system that evolves over time in our new eBook, Defining Quality in Sample.