Abstract
Plagiarism refers to presenting another person's ideas or work as one's own without proper acknowledgment. It is inappropriate and dishonest for many reasons, especially in the academic world. Academics are aware of this and strive to avoid it. Teaching and learning are now widely conducted through digital platforms, which increases the opportunity for plagiarized content. Because digital teaching-learning activities lack plagiarism detection features, this research provides them. The proposed system handles documents in text format and uses the Winnowing algorithm to fingerprint assignment documents, with a rolling hash function as the hashing technique. The similarity value is calculated using the Jaccard coefficient. The test results show the combinations of parameters (n-gram size, window length, and prime base) that lead to a successful implementation of the system. The system successfully detects plagiarism in student assignments. The overall system is developed using the Python web framework Django, with MySQL as the database.
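The pipeline described above (rolling hash over n-grams, winnowing, Jaccard similarity) can be illustrated with a minimal Python sketch. This is not the system's actual code: the function names, the preprocessing step, the modulus, and the default values n=5, window=4, and base=11 are assumptions chosen only for illustration.

```python
import re


def preprocess(text):
    """Lowercase the text and keep only letters and digits (assumed preprocessing)."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def rolling_hashes(text, n, base, mod=2**32):
    """Hash every character n-gram of text with a polynomial rolling hash."""
    if len(text) < n:
        return []
    high = pow(base, n - 1, mod)  # weight of the leftmost character
    h = 0
    for ch in text[:n]:
        h = (h * base + ord(ch)) % mod
    hashes = [h]
    for i in range(n, len(text)):
        # Remove the leftmost character, shift by the base, add the new character.
        h = ((h - ord(text[i - n]) * high) * base + ord(text[i])) % mod
        hashes.append(h)
    return hashes


def winnow(hashes, window):
    """Keep the minimum hash of every window; the set of minima is the fingerprint."""
    if not hashes:
        return set()
    last_start = max(len(hashes) - window, 0)
    return {min(hashes[i:i + window]) for i in range(last_start + 1)}


def jaccard(a, b):
    """Jaccard coefficient |A & B| / |A | B|; defined here as 0.0 for two empty sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity(doc_a, doc_b, n=5, window=4, base=11):
    """Similarity of two documents; n, window, and base are illustrative values."""
    fp_a = winnow(rolling_hashes(preprocess(doc_a), n, base), window)
    fp_b = winnow(rolling_hashes(preprocess(doc_b), n, base), window)
    return jaccard(fp_a, fp_b)
```

Calling `similarity(text_a, text_b)` returns a value between 0 and 1, where higher values indicate more shared fingerprints. Because the fingerprint is stored as a set of hash values, the positions of the selected hashes are discarded; a system that needs to highlight matching passages would have to keep them.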
