Decoding Marathi Emotions: Enhanced Speech Emotion Recognition via Deep Belief Network-SVM Integration

Authors

  • Varsha Nilesh Gaikwad, RMD Sinhgad School of Engineering
  • Rahul Kumar Budania

DOI:

https://doi.org/10.6977/IJoSI.202508_9(4).0006

Abstract

Speech emotion recognition (SER) in Marathi is challenging because of the language's distinct grammatical and emotional characteristics. This paper presents a robust methodology for classifying emotions in Marathi speech using advanced signal processing, feature extraction, and machine learning techniques. The method involves collecting a diverse set of Marathi speech samples and applying pre-processing steps such as pre-emphasis and voice activity detection (VAD) to improve signal quality. Speech signals are segmented into frames with a Hamming window to reduce discontinuities at frame boundaries, and features such as Mel-frequency cepstral coefficients (MFCCs), pitch, intensity, and spectral properties are extracted. For classification, an attentive deep belief network (DBN) is paired with a support vector machine (SVM); the model uses attention mechanisms and batch normalization to improve performance and reduce overfitting. The proposed approach surpasses existing models, achieving 98% accuracy, 98% F1-score, 99% specificity, 99% sensitivity, 98% precision, and 98% recall.
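
The pre-emphasis and Hamming-window framing steps named in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the pre-emphasis coefficient (0.97), the 16 kHz sample rate, the 25 ms frame length, the 10 ms hop, and all function names are assumptions chosen for illustration, since the abstract does not state them.

```python
import math

def pre_emphasis(signal, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1]; the first sample is passed through.

    alpha = 0.97 is a commonly used value, assumed here (not from the paper).
    """
    if not signal:
        return []
    return [signal[0]] + [
        signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))
    ]

def hamming_window(length):
    """Return coefficients w[n] = 0.54 - 0.46 * cos(2*pi*n / (length - 1))."""
    if length == 1:
        return [1.0]
    return [
        0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
        for n in range(length)
    ]

def frame_signal(signal, frame_len, hop_len):
    """Split the signal into overlapping frames, each tapered by a Hamming window.

    Trailing samples that do not fill a whole frame are dropped.
    """
    window = hamming_window(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop_len):
        frames.append([signal[start + n] * window[n] for n in range(frame_len)])
    return frames

# Hypothetical input: 1 s synthetic 200 Hz tone sampled at 16 kHz,
# framed into 25 ms frames (400 samples) with a 10 ms hop (160 samples).
sample_rate = 16000
tone = [math.sin(2 * math.pi * 200 * t / sample_rate) for t in range(sample_rate)]
frames = frame_signal(pre_emphasis(tone), frame_len=400, hop_len=160)
```

Tapering each frame toward its edges is what reduces the boundary discontinuities mentioned above; features such as MFCCs would then be computed per frame.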

Published

2025-08-15