Volume - 6 | Issue - 1 | March 2024

TF-IDF Vectorization and Clustering for Extractive Text Summarization
Muthu Virumeshwaran T, R Thirumahal
Pages: 96-111
Cite this article
T, Muthu Virumeshwaran, and R Thirumahal. "TF-IDF Vectorization and Clustering for Extractive Text Summarization." Journal of Information Technology and Digital World 6, no. 1 (2024): 96-111
Published
29 April, 2024
Abstract

Extractive document summarization condenses large volumes of text while retaining key information. This research introduces a dynamic feature space mapping approach to improve it. The proposed method first extracts document properties such as term frequency, sentence length, and position to describe the content. A mapping function then projects these features into a dynamic feature space, which improves summarization efficiency and feature clarity. Clustering similar phrases in this space simplifies sentence grouping and supports summary creation. Using TF-IDF vectorization, the most representative phrases are chosen from each cluster based on importance and diversity, producing a high-quality document summary quickly and systematically. The approach addresses challenges in extractive summarization and contributes to automated text summarization. It applies to domains that require rapid extraction of information from vast textual data.

Keywords

Extractive Summarization; Dynamic Feature Space Mapping; TF-IDF Vectorization; K-Means Clustering; Document Preprocessing; Sentence Clustering; Summarization Efficiency; Feature Extraction; Natural Language Processing; Information Retrieval
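
The general pipeline named in the abstract and keywords (TF-IDF vectorization, K-Means clustering, one representative per cluster) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the dynamic feature space mapping and the sentence length and position features are omitted, and the selection rule (the sentence closest to each cluster centroid) is an assumption standing in for the paper's importance and diversity criteria. The function names `split_sentences`, `tfidf_vectors`, `kmeans`, and `summarize` are hypothetical.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tfidf_vectors(sentences):
    """Treat each sentence as a document; return L2-normalised TF-IDF vectors."""
    tokens = [re.findall(r"[a-z0-9]+", s.lower()) for s in sentences]
    vocab = sorted({t for toks in tokens for t in toks})
    index = {t: i for i, t in enumerate(vocab)}
    n = len(sentences)
    df = Counter(t for toks in tokens for t in set(toks))  # document frequency
    vectors = []
    for toks in tokens:
        vec = [0.0] * len(vocab)
        for term, count in Counter(toks).items():
            idf = math.log((1 + n) / (1 + df[term])) + 1.0  # smoothed IDF (an assumption)
            vec[index[term]] = count * idf
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vectors


def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Lloyd's algorithm with deterministic farthest-point seeding."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(max(vectors, key=lambda v: min(dist2(v, c) for c in centroids)))
    labels = None
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda j: dist2(v, centroids[j])) for v in vectors]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def summarize(text, k=3):
    """Pick the sentence nearest each cluster centroid, in document order."""
    sentences = split_sentences(text)
    if len(sentences) <= k:
        return sentences
    vectors = tfidf_vectors(sentences)
    labels, centroids = kmeans(vectors, k)
    chosen = []
    for j in range(k):
        members = [i for i, lab in enumerate(labels) if lab == j]
        if members:
            chosen.append(min(members, key=lambda i: dist2(vectors[i], centroids[j])))
    return [sentences[i] for i in sorted(chosen)]


if __name__ == "__main__":
    sample = ("Cats purr. Cats sleep a lot. Dogs bark loudly. "
              "Dogs chase balls. Cats chase mice. Dogs sleep too.")
    for sentence in summarize(sample, k=2):
        print(sentence)
```

The number of clusters `k` sets the summary length in this sketch, since one sentence is drawn from each non-empty cluster. A practical system would more likely use a library vectorizer and clusterer than these hand-written versions.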
