Integration of Sentiment Analysis and Speech Text Processing in Phonetic Flow System

Vandana; Ritik Rana; Akhilendra Khare; Subash Harizan

doi:10.36548/jucct.2025.4.001

Open Access

https://doi.org/10.36548/jucct.2025.4.001

Vol. 7, No. 4 (2025)

Published: 04 November, 2025

Pages: 327-338

Vandana

Department of CSE, Chitkara University Institute of Engineering and Technology, Punjab

Ritik Rana

Department of CSE, Chitkara University Institute of Engineering and Technology, Punjab

Akhilendra Khare

Department of CSE, Galgotias University, Noida

Subash Harizan

Department of CSE, SRMIST, NCR Delhi, Gaziabad

How to Cite

Vandana, Ritik Rana, Akhilendra Khare, and Subash Harizan. 2025. “Integration of Sentiment Analysis and Speech Text Processing in Phonetic Flow System”. Journal of Ubiquitous Computing and Communication Technologies 7 (4): 327-38. https://doi.org/10.36548/jucct.2025.4.001.

Keywords

HCI

STT

TTS

Bi-LSTM

Abstract

Human-computer interaction (HCI) applications increasingly rely on recognizing speech, understanding emotional context, and generating natural language. Existing solutions use a heterogeneous set of approaches for sentiment analysis, speech synthesis, and speech recognition, which results in an inconsistent user experience. This paper addresses the problem of designing an integrated system that performs speech-to-text (STT) and text-to-speech (TTS) processing, sentiment analysis of the input text, and context-aware speech generation. A Bi-LSTM is selected to handle complexities such as sarcasm and negation because it learns context from both the preceding and the following words in a sentence. Even under data sparsity, GloVe embeddings improve model generalisation by supplying semantic representations learned from large corpora. In experimental verification, the Bi-LSTM with GloVe embeddings achieves 90% sentiment classification accuracy, which is 8 and 15 percentage points higher than the standard baselines SVM (82%) and Naïve Bayes (75%), respectively. With true positive rates above 88%, the model performs evenly across the positive, neutral, and negative classes. Its low latency and roughly 87% accuracy during live testing make the system well suited to interactive applications. The Phonetic Flow System combines all of these features into an extensible system that supports faster, more natural, and emotionally intelligent human-machine interaction.
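
The abstract reports two kinds of figures: overall classification accuracy and a per-class true positive value for the positive, neutral, and negative classes. The following is a minimal sketch of how such figures are commonly computed from true and predicted labels. The function names and the toy labels are illustrative assumptions; they are not the authors' evaluation code or data.

```python
from collections import Counter

# Class names taken from the abstract; the ordering is arbitrary.
LABELS = ("negative", "neutral", "positive")


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_tpr(y_true, y_pred, labels=LABELS):
    """True positive rate (recall) per class: correct hits / samples of that class."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    # Classes with no samples are skipped to avoid division by zero.
    return {lab: hits[lab] / support[lab] for lab in labels if support[lab]}


# Toy labels for illustration only (NOT results from the paper).
y_true = ["positive"] * 4 + ["neutral"] * 3 + ["negative"] * 3
y_pred = [
    "positive", "positive", "positive", "neutral",
    "neutral", "neutral", "neutral",
    "negative", "negative", "positive",
]

overall = accuracy(y_true, y_pred)
tpr = per_class_tpr(y_true, y_pred)
print(f"accuracy = {overall:.2f}")
for label, rate in tpr.items():
    print(f"TPR[{label}] = {rate:.2f}")
```

Reporting the per-class rate alongside accuracy matters for a three-class sentiment task because a model can reach high accuracy while performing poorly on a minority class.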

