CrowdTest: Gamification as Cognitive Support to Improve Bug Report Quality in Crowdsourced Software Testing

How to Cite

Gautam, Prabin. 2026. “CrowdTest: Gamification as Cognitive Support to Improve Bug Report Quality in Crowdsourced Software Testing”. Journal of Soft Computing Paradigm 8 (2): 96-118. https://doi.org/10.36548/jscp.2026.2.001.

Keywords

— Crowdsourced Testing
— Non-Reward Gamification
— Cognitive Scaffolding
— Bug Report Quality
— Non-Expert Testers
Published: 10-04-2026

Abstract

Non-expert testers often struggle to write clear, reproducible, and useful bug reports. This study examines whether cognitive support in crowdsourced software testing improves bug report quality. A web-based prototype, CrowdTest, was used to collect 30 bug reports from anonymous participants: 15 submitted through a non-reward gamified interface and 15 through a baseline interface. Three independent evaluators blindly rated each report on clarity, reproducibility, completeness, and usefulness. Inter-rater reliability was acceptable for clarity, usefulness, and reproducibility but lower for completeness. Because the score distributions were non-normal, group differences were analyzed using the Mann-Whitney U test. The non-reward gamified condition scored significantly higher on clarity, usefulness, and the overall composite score, while reproducibility showed a positive but non-significant trend. Because of its lower inter-rater reliability, completeness is reported for observational purposes only, and the overall composite score should therefore be interpreted with caution. Overall, the results provide preliminary evidence that cognitive support may improve bug report quality, although the findings are limited by the small sample size, scenario variation, and missing telemetry data.
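
For readers unfamiliar with the Mann-Whitney U test named in the abstract, the following is a minimal sketch of how the U statistic could be computed for two independent groups of ratings. It is not the study's analysis code. The function name `mann_whitney_u` and the example scores are hypothetical placeholders, not data from the paper, and the sketch computes only the statistic, not a p-value.

```python
from typing import Sequence


def mann_whitney_u(
    group_a: Sequence[float], group_b: Sequence[float]
) -> tuple[float, float]:
    """Return (U_a, U_b), using midranks for tied values."""
    # Pool both groups, tagging each value with its group (0 = a, 1 = b).
    pooled = sorted(
        [(value, 0) for value in group_a] + [(value, 1) for value in group_b]
    )
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Ranks are 1-based; tied values share the average rank of the run.
        midrank = (i + 1 + j) / 2
        rank_sum_a += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n_a, n_b = len(group_a), len(group_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Hypothetical ratings for illustration only (not the study's data).
gamified = [4, 5, 3, 5, 4]
baseline = [2, 3, 3, 4, 2]
print(mann_whitney_u(gamified, baseline))  # (22.0, 3.0)
```

A U value for one group that approaches the product of the two group sizes indicates that its ratings tend to rank above those of the other group. Judging significance additionally requires the statistic's null distribution, which statistical packages provide.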
