How to spot fraudulent respondents and data bots in your data

Charlotte Crichton

As a researcher, one of the things I most look forward to on a project is getting a first look at the data. Don’t get me wrong, I enjoy the set-up phase, but when you’ve done all the hard work designing a programme of research to answer a specific problem or question, you really want to get your hands on the data to see what people are saying and start to think about what that means for the client.

Ensuring high-quality data has always been very important to us, and we always review data before we start analysing it to make sure that no poor-quality respondents took part in the research. Because new technologies are constantly being developed, we also regularly review our procedures so that we always have the best-quality data.

On one recent project, it was the open-ended responses that made us question who was in our data. We reviewed some of the interim data and found that a significant number of responses had the following features:

  • They were all a single word when we had asked for three (see the picture on the right)
  • The words didn’t relate to the question – they were all adjectives
  • They were all spelt perfectly and with the first letter capitalised. Any aficionado of online surveys knows that this just isn’t how respondents type!
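As a rough illustration of how the first and third features could be checked automatically, here is a minimal Python sketch. The function name, the reason labels and the default of three expected words are our own assumptions for this example, not a standard. Whether a word actually relates to the question still needs a human to read it.

```python
def flag_open_end(answer: str, expected_words: int = 3) -> list[str]:
    """Return the reasons an open-ended answer looks suspicious (empty if none)."""
    reasons = []
    words = answer.split()

    # We asked for three words; fewer than that is a warning sign.
    if len(words) < expected_words:
        reasons.append("too_few_words")

    # A lone, purely alphabetic word with only its first letter capitalised
    # is tidier than most genuine respondents type.
    if len(words) == 1 and words[0].isalpha() and words[0] == words[0].capitalize():
        reasons.append("single_capitalised_word")

    return reasons


# Hypothetical answers, for illustration only.
for text in ["Reliable", "good", "cheap fast friendly"]:
    print(repr(text), flag_open_end(text))
```

A flag on its own shouldn't remove anyone from the data. It is a prompt to look at the rest of that respondent's answers.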

Whether these were fraudulent respondents or data bots, we didn’t want them in our data, so we applied a handful of checks and rules to make sure that these ‘respondents’ weren’t included in our final data, which was of much better quality as a result.

Based on our experience on this and other online surveys, here are our tips for spotting fraudulent respondents and data bots:

  1. Look for patterns in the data, for example scale questions always being answered in the same way
  2. Look for unnatural answers, for example a high percentage of people saying they prefer not to say their age or gender
  3. Review the answers to all open-ended questions – are they the same (or almost the same), do they make sense?
  4. Look at the time taken to complete the survey
  5. Finally, and most importantly, build some quality checks into your survey
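Tips 1, 3 and 4 lend themselves to simple automated flags. The sketch below is one possible way to do it in Python, using only the standard library. All of the function names are hypothetical, and the cut-off of a third of the median completion time is an assumption we have picked for illustration; the right threshold depends on your survey.

```python
from collections import Counter
from statistics import median


def is_straight_liner(scale_answers):
    """Tip 1: every scale question in a grid was answered the same way."""
    return len(scale_answers) > 1 and len(set(scale_answers)) == 1


def duplicate_open_ends(answers):
    """Tip 3: open-ended answers that appear more than once, ignoring case and spacing."""
    normalised = [" ".join(a.lower().split()) for a in answers]
    counts = Counter(normalised)
    return {text for text, n in counts.items() if text and n > 1}


def is_speeder(seconds, all_durations, fraction=1 / 3):
    """Tip 4: completed in less than `fraction` of the median time (assumed cut-off)."""
    return seconds < fraction * median(all_durations)


# Hypothetical data, for illustration only.
durations = [600, 540, 660, 120, 580]  # completion times in seconds
print(is_straight_liner([3, 3, 3, 3]))
print(duplicate_open_ends(["Great value", "great  value", "Too pricey"]))
print([d for d in durations if is_speeder(d, durations)])
```

These flags work best in combination: a respondent who trips several of them is a much stronger candidate for removal than one who trips a single check.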

Ultimately, we need to work together across the industry to make sure that fraudulent respondents and bots don’t infiltrate our data. On your next project, make sure you ask your suppliers what they will be doing to ensure there are only genuine respondents in the data. And hey, why not do some of the checks yourself too?

Watch this space for more detail on the checks and rules you can use to identify fraudulent respondents and data bots.