
Evaluating Random Forest and Decision Tree Algorithms for Resolving Lexical Ambiguity in the Gujarati Language

Avani N Dave, Sanjay M Shah, Nakul R Dave
Open Access
Volume - 7 • Issue - 4 • December 2025
Pages 727-752
Abstract

Word sense disambiguation (WSD) is the task of determining the intended meaning of a word from its context, and it is central to natural language processing. Earlier attempts at WSD for Gujarati have produced poor results, with various models showing low accuracy, largely because labeled datasets are scarce and the language is structurally complex, with idiomatic usage and subtle semantic shifts. To address this, we created a new, manually sense-annotated dataset of ambiguous Gujarati words. The corpus covers 50 ambiguous words, each annotated according to its context of use, which makes it a useful starting point for evaluating supervised learning models. Using this corpus, we systematically evaluate two supervised machine learning algorithms, Decision Tree and Random Forest, under 3-fold and 5-fold cross-validation. Random Forest achieves the higher accuracy, indicating that, of the two methods evaluated, it is the better suited to this task. The main contributions of this work are a much-needed annotated corpus and evidence that supervised learning can substantially improve WSD for Gujarati when suitable annotated data is available.
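The evaluation protocol described in the abstract (Decision Tree versus Random Forest under 3-fold and 5-fold cross-validation) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' pipeline: the bag-of-words features, the hyperparameters, the `evaluate` helper, and the English placeholder contexts for a single ambiguous word are all assumptions made for the example and are not taken from the paper or its corpus.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data: contexts of ONE ambiguous word, each with a sense label.
# Placeholder English sentences stand in for sense-annotated Gujarati contexts.
contexts = [
    "deposit money in the bank account",
    "the bank approved the loan",
    "bank interest rate increased today",
    "she works at the bank as a cashier",
    "withdraw cash from the bank branch",
    "the bank opened a new savings scheme",
    "we sat on the river bank",
    "the boat reached the bank of the river",
    "grass grows along the muddy bank",
    "fishing from the bank of the stream",
    "flood water crossed the river bank",
    "children played on the sandy bank",
]
senses = ["finance"] * 6 + ["river"] * 6


def evaluate(contexts, senses, folds=(3, 5), seed=0):
    """Return mean cross-validated accuracy keyed by (model name, number of folds)."""
    models = {
        "DecisionTree": DecisionTreeClassifier(random_state=seed),
        "RandomForest": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    results = {}
    for name, clf in models.items():
        # Assumed feature extraction: bag-of-words counts over the context.
        # Wrapping it in a pipeline refits the vocabulary on each training fold.
        pipeline = make_pipeline(CountVectorizer(), clf)
        for k in folds:
            cv = StratifiedKFold(n_splits=k, shuffle=True, random_state=seed)
            scores = cross_val_score(
                pipeline, contexts, senses, cv=cv, scoring="accuracy"
            )
            results[(name, k)] = float(scores.mean())
    return results


results = evaluate(contexts, senses)
for (name, k), accuracy in sorted(results.items()):
    print(f"{name:12s} {k}-fold mean accuracy: {accuracy:.3f}")
```

Keeping the vectorizer inside the pipeline means each fold builds its vocabulary from training contexts only, so the held-out contexts do not leak into feature extraction. Stratified folds keep the sense proportions similar across splits, which matters when a word has only a handful of examples per sense.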

Cite this article
Dave, Avani N, Sanjay M Shah, and Nakul R Dave. "Evaluating Random Forest and Decision Tree Algorithms for Resolving Lexical Ambiguity in the Gujarati Language." Journal of Trends in Computer Science and Smart Technology 7, no. 4 (2025): 727-752. doi: 10.36548/jtcsst.2025.4.006
Dave, A. N., Shah, S. M., & Dave, N. R. (2025). Evaluating Random Forest and Decision Tree Algorithms for Resolving Lexical Ambiguity in the Gujarati Language. Journal of Trends in Computer Science and Smart Technology, 7(4), 727-752. https://doi.org/10.36548/jtcsst.2025.4.006
Dave, Avani N, et al. "Evaluating Random Forest and Decision Tree Algorithms for Resolving Lexical Ambiguity in the Gujarati Language." Journal of Trends in Computer Science and Smart Technology, vol. 7, no. 4, 2025, pp. 727-752. DOI: 10.36548/jtcsst.2025.4.006.
Dave AN, Shah SM, Dave NR. Evaluating Random Forest and Decision Tree Algorithms for Resolving Lexical Ambiguity in the Gujarati Language. Journal of Trends in Computer Science and Smart Technology. 2025;7(4):727-752. doi: 10.36548/jtcsst.2025.4.006
A. N. Dave, S. M. Shah, and N. R. Dave, "Evaluating Random Forest and Decision Tree Algorithms for Resolving Lexical Ambiguity in the Gujarati Language," Journal of Trends in Computer Science and Smart Technology, vol. 7, no. 4, pp. 727-752, Dec. 2025, doi: 10.36548/jtcsst.2025.4.006.
Dave, A.N., Shah, S.M. and Dave, N.R. (2025) 'Evaluating Random Forest and Decision Tree Algorithms for Resolving Lexical Ambiguity in the Gujarati Language', Journal of Trends in Computer Science and Smart Technology, vol. 7, no. 4, pp. 727-752. Available at: https://doi.org/10.36548/jtcsst.2025.4.006.
@article{dave2025,
  author    = {Avani N Dave and Sanjay M Shah and Nakul R Dave},
  title     = {{Evaluating Random Forest and Decision Tree Algorithms for Resolving Lexical Ambiguity in the Gujarati Language}},
  journal   = {Journal of Trends in Computer Science and Smart Technology},
  volume    = {7},
  number    = {4},
  pages     = {727--752},
  year      = {2025},
  publisher = {IRO Journals},
  doi       = {10.36548/jtcsst.2025.4.006},
  url       = {https://doi.org/10.36548/jtcsst.2025.4.006}
}
Keywords
Machine Learning, Word Sense Disambiguation, Natural Language Processing, Decision Tree, Random Forest, Sense Annotated Corpus, Gujarati Language
Published
02 December, 2025