Context-Aware MCQ Generation with Large Language Models: A Novel Framework

Keywords

MCQ Generation
Large Language Models
Automated Question Creation
Online Assessments
Text Summarization
Distractor Generation
Adaptive Learning

How to Cite

Context-Aware MCQ Generation with Large Language Models: A Novel Framework. (2025). Journal of Information Technology and Digital World, 7(2), 90-105. https://doi.org/10.36548/jitdw.2025.2.001

Abstract

The methods of conducting examinations are evolving as institutions increasingly adopt online systems, making Multiple-Choice Questions (MCQs) important for their efficiency and scalability. However, constructing high-quality MCQs remains a manual, time-consuming process. Existing automated systems, which rely mainly on BERT-based summarization and lexical distractor generation (e.g., via WordNet), suffer from limited contextual understanding and poor scalability. To address these challenges, this research proposes a solution based on Large Language Models (LLMs), specifically Gemini AI, for automated MCQ generation. The methodology applies LLM-based text summarization to extract key concepts, followed by direct MCQ and distractor generation with enhanced contextual relevance, diversity, and minimal manual intervention. Real-time feedback and adaptive difficulty adjustment are also integrated to support personalized learning. Comparative analysis with recent models such as T5, GPT-3.5, and BERT shows that Gemini AI outperforms them in contextual quality, distractor coherence, and generation efficiency, achieving a 20% improvement in human-rated question quality and highlighting the potential of LLMs to transform automated assessment design.
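The two-stage pipeline described in the abstract (LLM-based summarization, then direct MCQ and distractor generation) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the prompt wording, the pipe-delimited output format, and the `llm` callable are assumptions introduced here so the flow stays model-agnostic.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class MCQ:
    stem: str
    options: List[str]  # options[0] is the correct answer; the rest are distractors
    answer: str

def summarize_prompt(text: str) -> str:
    # Stage 1: ask the model to extract the key concepts of the passage.
    return f"Summarize the key concepts of the following passage in 3 sentences:\n\n{text}"

def mcq_prompt(summary: str, n: int = 1) -> str:
    # Stage 2: request questions in a fixed, machine-parseable format.
    return (
        f"From this summary, write {n} multiple-choice question(s).\n"
        "Format each on one line as: "
        "stem | correct answer | distractor 1 | distractor 2 | distractor 3\n\n"
        f"{summary}"
    )

def parse_mcqs(raw: str) -> List[MCQ]:
    # Parse the pipe-delimited lines; skip anything that does not match.
    mcqs = []
    for line in raw.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 5 and all(parts):
            mcqs.append(MCQ(stem=parts[0], options=parts[1:], answer=parts[1]))
    return mcqs

def generate_mcqs(text: str, llm: Callable[[str], str], n: int = 1) -> List[MCQ]:
    # Summarize first, then generate questions and distractors from the summary.
    summary = llm(summarize_prompt(text))
    return parse_mcqs(llm(mcq_prompt(summary, n)))
```

In practice `llm` would wrap a call to the chosen model's client (for Gemini, a function returning `model.generate_content(prompt).text`); keeping it as a plain callable lets the same pipeline be compared across T5, GPT-3.5, BERT-based, and Gemini backends, as the paper's evaluation does.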


References

Al Shuraiqi, Somaiya, et al. "Automatic Generation of Medical Case-Based Multiple-Choice Questions (MCQs): A Review of Methodologies, Applications, Evaluation, and Future Directions." Big Data and Cognitive Computing 8.10 (2024): 139.

Omopekunola, Moses Oluoke, and Elena Yu Kardanova. "Automatic generation of physics items with Large Language Models (LLMs)." REID (Research and Evaluation in Education) 10.2 (2024): 4.

Tan, Sieow-Yeek, Ching-Chieh Kiu, and Dickson Lukose. "Evaluating multiple choice question generator." Knowledge Technology Week. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011. 283-292.

Kumar, A., A. Nayak, Manjula Shenoy K., Chaitanya, and K. Ghosh. "A Novel Framework for the Generation of Multiple Choice Question Stems Using Semantic and Machine-Learning Techniques." International Journal of Artificial Intelligence in Education (2023). https://doi.org/10.1007/s40593-023-00333-6

Indran, Inthrani Raja, Priya Paranthaman, Neelima Gupta, and Nurulhuda Mustafa. "Twelve tips to leverage AI for efficient and effective medical question generation: a guide for educators using ChatGPT." Medical Teacher 46, no. 8 (2024): 1021-1026.

Parlapalli, H. K. "Mitigating Order Sensitivity Challenges in Large Language Models using Policy Frameworks." (2024).

Prakash, Vijay, Kartikay Agrawal, and Syaamantak Das. "Q-Genius: A GPT-based modified MCQ generator for identifying learner deficiency." In International Conference on Artificial Intelligence in Education, Cham: Springer Nature Switzerland, 2023. 632-638.

Mehta, Pritam Kumar, Prachi Jain, Chetan Makwana, and C. M. Raut. "Automated MCQ generator using natural language processing." In Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications, 284-290. 2021.

Kıyak, Yavuz Selim, and Andrzej A. Kononowicz. "Case-based MCQ generator: a custom ChatGPT based on published prompts in the literature for automatic item generation." Medical Teacher 46, no. 8 (2024): 1018-1020.

Gill, Gurnoor S., Joby Tsai, Jillene Moxam, Harshal A. Sanghvi, and Shailesh Gupta. "Comparison of Gemini Advanced and ChatGPT 4.0’s Performances on the Ophthalmology Resident Ophthalmic Knowledge Assessment Program (OKAP) Examination Review Question Banks." Cureus 16, no. 9 (2024).

Myrzakhan, Aidar, Sondos Mahmoud Bsharat, and Zhiqiang Shen. "Open-llm-leaderboard: From multi-choice to open-style questions for llms evaluation, benchmark, and arena." arXiv preprint arXiv:2406.07545 (2024).

Säuberli, Andreas, and Simon Clematide. "Automatic Generation and Evaluation of Reading Comprehension Test Items with Large Language Models." arXiv preprint arXiv:2404.07720 (2024).

Zhou, Jincheng, Yue Hu, and Ya Wang. "QOG: Question and Options Generation based on language model." In IET Conference Proceedings CP915, vol. 2025, no. 2, Stevenage, UK: The Institution of Engineering and Technology, 2025. 174-179.

Mucciaccia, Sérgio Silva, et al. "Automatic Multiple-Choice Question Generation and Evaluation Systems Based on LLM: A Study Case With University Resolutions." Proceedings of the 31st International Conference on Computational Linguistics (COLING 2025), 2246–2260.