CrowdTest: Gamification as Cognitive Support to Improve Bug Report Quality in Crowdsourced Software Testing

Prabin Gautam

doi:10.36548/jscp.2026.2.001

Open Access

Vol. 8, No. 2 (2026)

Published: 10 April, 2026

Pages: 96-118

Prabin Gautam

Department of Electronics and Computer Engineering, Paschimanchal Campus, Tribhuvan University, Pokhara, Nepal

How to Cite

Gautam, Prabin. 2026. “CrowdTest: Gamification as Cognitive Support to Improve Bug Report Quality in Crowdsourced Software Testing.” Journal of Soft Computing Paradigm 8 (2): 96–118. https://doi.org/10.36548/jscp.2026.2.001.

Keywords

Crowdsourced Testing

Non-Reward Gamification

Cognitive Scaffolding

Bug Report Quality

Non-Expert Testers

Abstract

Non-expert testers often struggle to write clear, reproducible, and useful bug reports. This study examines whether providing cognitive support in crowdsourced software testing improves bug report quality. A web-based prototype, CrowdTest, was used to collect 30 bug reports from anonymous participants: 15 submitted through a non-reward gamified interface and 15 through a baseline interface. Three independent evaluators blindly rated each report on clarity, reproducibility, completeness, and usefulness. Inter-rater reliability was acceptable for clarity, usefulness, and reproducibility but lower for completeness. Because the score distributions were non-normal, group differences were analyzed using the Mann-Whitney U test. The non-reward gamified condition scored significantly higher on clarity, usefulness, and the overall composite score, while reproducibility showed a positive but non-significant trend. Completeness is reported for observational purposes only because of its lower inter-rater reliability, and the overall composite score should therefore be interpreted with caution. Overall, the results provide preliminary evidence that cognitive support may improve the quality of bug reports, although the findings are limited by the small sample size, scenario variation, and missing telemetry data.
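
As an illustration of the group comparison described in the abstract, the following is a minimal sketch of how a Mann-Whitney U statistic can be computed for two independent sets of scores. It is not the study's analysis code. The function names `average_ranks` and `mann_whitney_u` and the sample scores are hypothetical and chosen only for demonstration; they are not data from the study. The sketch computes the U statistics only and does not derive a p-value.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a  # the two statistics always sum to n_a * n_b
    return u_a, u_b


# Hypothetical ratings, NOT taken from the study.
gamified = [4.0, 4.3, 3.7, 5.0, 4.7]
baseline = [3.0, 3.3, 3.7, 2.7, 4.0]

u_gamified, u_baseline = mann_whitney_u(gamified, baseline)
print(u_gamified, u_baseline)  # prints: 23.0 2.0
```

A large gap between the two U values, as in this toy example, indicates that scores in one group tend to rank above those in the other. Because the test works on ranks rather than raw values, it suits ordinal rating data with non-normal distributions.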

