A Hybrid Mechanism for Auto Text Categorization in Web Documents

How to Cite

Yogi, Manas Kumar, Ch Manikanta Kalyan, and Dwarampudi Aiswarya. 2023. “A Hybrid Mechanism for Auto Text Categorization in Web Documents”. Journal of Soft Computing Paradigm 4 (4): 272-82. https://doi.org/10.36548/jscp.2022.4.006.

Keywords

— Embedded
— particle swarm optimization
— dimensionality
— text categorization
Published: 20-01-2023

Abstract

Web personalization has become so popular that almost all e-commerce websites now incorporate it. The main objective of web personalization is achieved by grouping similar web pages. Text categorization becomes a challenge when users visit numerous pages daily. This paper develops a hybrid framework that categorizes the text extracted from a web document by applying the Neighbourhood Preserving Embedding algorithm and then the Particle Swarm Optimization algorithm to the extracted text groups, resulting in a group of web documents that contain similar text. The proposed mechanism achieves relatively high performance that improves over time, and as the size of the web documents increases, the particle swarm algorithm also evolves.
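The clustering stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the Neighbourhood Preserving Embedding step has already reduced each document to a short numeric vector, and it uses a generic textbook particle swarm update in which each particle encodes a set of candidate cluster centroids. The function names (`pso_cluster`, `fitness`, `nearest`) and the parameter values (`w`, `c1`, `c2`, swarm size, iteration count) are illustrative assumptions and do not come from the paper.

```python
import random

def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centroids]
    return dists.index(min(dists))

def fitness(position, points, k, dim):
    """Mean squared distance from each point to its nearest centroid (lower is better)."""
    centroids = [position[i * dim:(i + 1) * dim] for i in range(k)]
    total = 0.0
    for p in points:
        total += min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids)
    return total / len(points)

def pso_cluster(points, k, n_particles=20, n_iter=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Group already-embedded document vectors into k clusters with a basic PSO.

    Assumption: points are low-dimensional vectors produced by an earlier
    embedding step; w, c1 and c2 are conventional illustrative values.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    # Each particle is a flat list of k centroids, initialised from random data points.
    swarm = []
    for _ in range(n_particles):
        pos = [x for p in rng.sample(points, k) for x in p]
        swarm.append({"pos": pos, "vel": [0.0] * (k * dim),
                      "best_pos": pos[:], "best_fit": fitness(pos, points, k, dim)})
    leader = min(swarm, key=lambda s: s["best_fit"])
    gbest_pos, gbest_fit = leader["best_pos"][:], leader["best_fit"]
    for _ in range(n_iter):
        for s in swarm:
            for j in range(k * dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward the particle's own best + pull toward the swarm's best.
                s["vel"][j] = (w * s["vel"][j]
                               + c1 * r1 * (s["best_pos"][j] - s["pos"][j])
                               + c2 * r2 * (gbest_pos[j] - s["pos"][j]))
                s["pos"][j] += s["vel"][j]
            fit = fitness(s["pos"], points, k, dim)
            if fit < s["best_fit"]:
                s["best_fit"], s["best_pos"] = fit, s["pos"][:]
                if fit < gbest_fit:
                    gbest_fit, gbest_pos = fit, s["pos"][:]
    centroids = [gbest_pos[i * dim:(i + 1) * dim] for i in range(k)]
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels

# Hypothetical 2-D embeddings of six web documents forming two loose groups.
docs = [(0.0, 0.1), (0.1, 0.0), (0.2, 0.1), (5.0, 5.1), (5.1, 5.0), (4.9, 5.2)]
centroids, labels = pso_cluster(docs, k=2)
print(labels)
```

The fitness used here is the mean squared distance to the nearest centroid, which is one common choice for swarm-based clustering; the paper may define its objective differently.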
