Volume - 7 | Issue - 4 | December 2025
Published: 07 November 2025
In the medical field, identifying various pathological conditions is challenging because it typically requires invasive, contact-based data collection. Non-invasive, non-contact vital data, such as speech signals, can therefore be used to identify pathological conditions. Speech signals carry distinguishing phonetic characteristics that change when a pathological condition arises in the human body, and these changes allow pathological signals to be classified by training machine learning and deep learning models on acoustic features extracted from speech. This work proposes an acoustic spectrogram transformer in which all transformer layers are trained on acoustic characteristics extracted from the speech signals of voice- and lung-disease patients. Mel-frequency cepstral coefficients (MFCCs), Mel spectrograms, and spectral variables such as centroid, bandwidth, roll-off, and zero-crossing rate are extracted from the voice and lung datasets. These acoustic features train the transformer blocks and depth-adaptive parameters, enabling the model to capture complex patterns for effective signal classification. The architecture also includes frequency-focused attention mechanisms that extract the spectral characteristics most indicative of pathological conditions, while multiple pooling strategies aggregate temporal information effectively. Owing to this targeted design, the system serves as an effective clinical classification tool, minimizing computational complexity while achieving approximately 83% accuracy in voice pathology classification and 99% in lung pathology classification.
Keywords: Voice Pathology, Lung Pathology, Acoustic Spectrogram Transformer, Mel Spectrogram
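As a rough illustration of the feature-extraction step described in the abstract, the sketch below computes the named features with librosa. The file path, sampling rate, and dimensions (n_mfcc, n_mels) are illustrative assumptions, not values taken from the paper.

```python
import numpy as np
import librosa

def extract_acoustic_features(path, sr=16000, n_mfcc=13, n_mels=128):
    """Compute the acoustic features named in the abstract for one recording."""
    y, sr = librosa.load(path, sr=sr)

    # Mel-frequency cepstral coefficients and (log-)Mel spectrogram
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    log_mel = librosa.power_to_db(mel)

    # Spectral descriptors: centroid, bandwidth, roll-off, zero-crossing rate
    centroid = librosa.feature.spectral_centroid(y=y, sr=sr)
    bandwidth = librosa.feature.spectral_bandwidth(y=y, sr=sr)
    rolloff = librosa.feature.spectral_rolloff(y=y, sr=sr)
    zcr = librosa.feature.zero_crossing_rate(y)

    # Stack the frame-level features into one (n_features, n_frames) matrix;
    # all use the same default hop length, so frame counts line up
    return np.vstack([mfcc, log_mel, centroid, bandwidth, rolloff, zcr])
```

The combination of attention and multiple pooling strategies could look like the following PyTorch sketch. The paper's exact frequency-focused attention design, layer sizes, and depth-adaptive parameters are not given in the abstract, so every name and dimension here is a placeholder: generic self-attention runs over time frames whose embeddings encode per-frame frequency content, and mean and max pooling are concatenated as two temporal aggregation strategies.

```python
import torch
import torch.nn as nn

class SpectrogramAttentionClassifier(nn.Module):
    """Illustrative stand-in for the paper's architecture: attention over
    spectrogram frames plus mean+max temporal pooling. Not the authors'
    exact frequency-focused attention mechanism."""

    def __init__(self, n_freq_bins=128, d_model=64, n_heads=4, n_classes=2):
        super().__init__()
        self.proj = nn.Linear(n_freq_bins, d_model)   # embed each time frame
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.classifier = nn.Linear(2 * d_model, n_classes)

    def forward(self, spec):                          # spec: (batch, time, freq)
        x = self.proj(spec)                           # (batch, time, d_model)
        x, _ = self.attn(x, x, x)                     # self-attention over frames
        # Multiple pooling strategies: aggregate the time axis two ways
        pooled = torch.cat([x.mean(dim=1), x.amax(dim=1)], dim=-1)
        return self.classifier(pooled)
```

Under these assumptions, a spectrogram of shape (batch, time, n_freq_bins), e.g. the transposed log-Mel matrix from the first sketch, would be classified with model(spec).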