Researchers who collect data online may encounter fraudulent or low-quality responses, especially in studies that offer compensation. The following best practices are recommended to help protect data integrity while maintaining ethical research standards.
- Utilize a two-step recruitment process.
Do not post a publicly accessible survey link. Instead, have interested participants complete a brief screener or contact the research team directly and provide their contact information. Researchers can then distribute the survey link individually and compare screener responses with survey data to assess consistency.
This reduces the likelihood of bots or repeat responders accessing the survey and allows the research team to verify participant eligibility.
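The screener-to-survey comparison described above could be sketched in Python as follows. This is only an illustration under assumptions: the function names and the field names (`age`, `zip_code`, `state`) are hypothetical, and the real fields depend on what the screener and survey both ask.

```python
def _normalize(value) -> str:
    """Compare answers as trimmed, lowercase text so formatting differences are ignored."""
    return "" if value is None else str(value).strip().lower()


def screener_mismatches(screener: dict, survey: dict, fields: list[str]) -> list[str]:
    """Return the names of fields whose screener and survey answers disagree."""
    return [
        field
        for field in fields
        if _normalize(screener.get(field)) != _normalize(survey.get(field))
    ]


# Hypothetical answers from the same participant at the two stages.
screener_answers = {"age": "34", "zip_code": "18015 ", "state": "PA"}
survey_answers = {"age": 34, "zip_code": "18015", "state": "nj"}

mismatched = screener_mismatches(
    screener_answers, survey_answers, ["age", "zip_code", "state"]
)
```

A mismatch is a prompt for closer review rather than proof of fraud, since participants can make honest errors when answering the same question twice.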
- Distribute individualized survey links.
Qualtrics allows researchers to generate unique, non-shareable links for each participant. Individualized links reduce the risk that a survey is widely circulated on forums or shared among ineligible participants.
Individualized survey links are identifiable because they are linked to each participant’s email address. Therefore, the consent form must state that participants’ survey responses will be associated with their email address.
- Delay compensation until data quality checks are completed.
Immediate payment can incentivize fraudulent participation. By delaying compensation, the research team can ensure that incentives are tied to valid data.
Clearly state in the consent form that compensation will only be provided after responses are reviewed for completeness and validity.
- Include data quality and attention checks.
Incorporate strategies such as attention check items, consistency checks across related questions, and minimum completion time thresholds. These methods help the research team identify inattentive or automated responses.
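As a rough illustration, the three kinds of checks named above could be applied to each response along these lines. This is a minimal Python sketch, not a prescribed method: the field names, the expected attention-check answer, the survey year, and the 120-second threshold are all hypothetical and would need to be set for the specific study.

```python
from dataclasses import dataclass

# Hypothetical values; choose these per study.
EXPECTED_ATTENTION_ANSWER = "strongly disagree"
MIN_DURATION_SECONDS = 120
SURVEY_YEAR = 2025


@dataclass
class Response:
    response_id: str
    attention_item: str     # answer to an item such as "Select 'Strongly disagree'"
    age: int                # self-reported age
    birth_year: int         # self-reported birth year, asked elsewhere in the survey
    duration_seconds: int   # total completion time


def quality_issues(r: Response) -> list[str]:
    """Return the names of the quality checks that this response fails."""
    issues = []
    # Attention check: the participant was told which option to select.
    if r.attention_item.strip().lower() != EXPECTED_ATTENTION_ANSWER:
        issues.append("failed_attention_check")
    # Consistency check: age and birth year should agree to within one year.
    if abs((SURVEY_YEAR - r.birth_year) - r.age) > 1:
        issues.append("inconsistent_age")
    # Minimum completion time threshold.
    if r.duration_seconds < MIN_DURATION_SECONDS:
        issues.append("too_fast")
    return issues
```

An empty list means the response passed all three checks; a non-empty list names the checks that warrant a closer look.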
- Incorporate at least one open-ended response item.
Add a free-text question relevant to the study topic. Open-ended responses can help distinguish genuine participants from bots or scripted responses, which often produce vague or inconsistent answers.
- Monitor for duplicate or suspicious entries.
Review metadata such as IP addresses, timestamps, and completion times. Multiple submissions from the same IP address or unusually fast completion times may indicate patterns of fraudulent activity.
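One way such a metadata review might be automated is sketched below in Python. The row keys (`response_id`, `ip_address`, `duration_seconds`) and the 120-second cutoff are assumptions made for illustration; an actual survey export will use its own column names, and the cutoff should reflect the survey's length.

```python
from collections import Counter


def duplicate_ips(rows: list[dict]) -> set[str]:
    """Return the IP addresses that appear on more than one submission."""
    counts = Counter(row["ip_address"] for row in rows)
    return {ip for ip, n in counts.items() if n > 1}


def suspicious_rows(rows: list[dict], min_seconds: int = 120) -> list[tuple]:
    """List (response_id, reasons) for submissions that merit manual review."""
    shared = duplicate_ips(rows)
    flagged = []
    for row in rows:
        reasons = []
        if row["ip_address"] in shared:
            reasons.append("shared_ip")
        if row["duration_seconds"] < min_seconds:
            reasons.append("fast_completion")
        if reasons:
            flagged.append((row["response_id"], reasons))
    return flagged


# Hypothetical rows using documentation-reserved IP addresses.
rows = [
    {"response_id": "R1", "ip_address": "203.0.113.5", "duration_seconds": 300},
    {"response_id": "R2", "ip_address": "203.0.113.5", "duration_seconds": 40},
    {"response_id": "R3", "ip_address": "198.51.100.7", "duration_seconds": 280},
]
flagged = suspicious_rows(rows)
```

A shared IP address is only an indicator: participants in the same household or on the same campus network can legitimately share one, so flagged rows are candidates for review, not automatic exclusions.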
- Use a structured approach to flag potential fraud.
Consider implementing a simple checklist of fraud indicators (e.g., failed attention checks, inconsistent responses, duplicate entries). Responses meeting multiple criteria can be flagged for further review.
Incorporating a standardized approach promotes consistency and transparency in data cleaning decisions.
Clearly state in the consent form that respondents whose responses do not pass quality checks will not be compensated.
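The checklist approach in this item could be sketched as a simple count of indicators met. This Python example is illustrative only: the indicator names mirror the examples given above, while the function name and the threshold of two criteria are hypothetical choices that each research team would set and document for itself.

```python
# Indicator names taken from the examples in the text; extend as needed.
FRAUD_INDICATORS = (
    "failed_attention_check",
    "inconsistent_responses",
    "duplicate_entry",
)

# Hypothetical threshold: flag a response once it meets this many indicators.
REVIEW_THRESHOLD = 2


def indicators_met(checklist: dict[str, bool]) -> list[str]:
    """Return the checklist indicators that are marked True for a response."""
    return [name for name in FRAUD_INDICATORS if checklist.get(name, False)]


def needs_review(checklist: dict[str, bool], threshold: int = REVIEW_THRESHOLD) -> bool:
    """Flag a response for further review when it meets multiple indicators."""
    return len(indicators_met(checklist)) >= threshold
```

Applying the same checklist and threshold to every response is what makes the resulting data cleaning decisions consistent and easy to explain.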
- Maintain records on decisions not to compensate participants.
PIs should document the rationale for each decision not to compensate a participant. If a participant files a complaint about non-compensation, the researcher will then have a detailed record of the decision.
Important Considerations
These strategies should be applied only in ways that are consistent with participant privacy protections and with the ethical guidelines described in the Policy on the Protection of Human Subjects in Research, and they must be reviewed and approved by Lehigh’s Institutional Review Board (IRB). Not all strategies are appropriate for every study, and Principal Investigators should select approaches that align with their study design, population, and risk level.
This list is not intended to be exhaustive. Methods for preventing and detecting fraudulent responses continue to evolve alongside changes in technology, and researchers are encouraged to stay informed about emerging best practices and relevant literature.