
Journal articles on the topic 'Speech Emotion Recognition (SER)'



Consult the top 50 journal articles for your research on the topic 'Speech Emotion Recognition (SER).'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and the bibliographic reference to the chosen work will be generated automatically in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

A, Prof Swethashree. "Speech Emotion Recognition." International Journal for Research in Applied Science and Engineering Technology 9, no. 8 (2021): 2637–40. http://dx.doi.org/10.22214/ijraset.2021.37375.

Full text
Abstract:
Speech Emotion Recognition, abbreviated as SER, is the task of identifying a person's emotional and affective state from speech. This is possible because tone of voice often reflects underlying feelings. Emotion recognition has become a fast-growing field of research in recent years. Unlike humans, machines do not have the ability to comprehend and express emotions, but human communication with the computer can be improved by using automatic emotion recognition, accordingly reducing the need for human intervention. In this project, basic emotions such as peace,
APA, Harvard, Vancouver, ISO, and other styles
2

Venkateswarlu, Dr S. China. "Speech Emotion Recognition using Machine Learning." INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 09, no. 05 (2025): 1–9. https://doi.org/10.55041/ijsrem48705.

Full text
Abstract:
Speech signals are considered one of the most effective means of communication between human beings. Many researchers have proposed different methods and systems to identify emotions from speech signals. Here, various features of speech are used to classify emotions; features like pitch, tone, and intensity are essential for classification. A large number of datasets are available for speech emotion recognition. First, features are extracted from emotional speech, and then another important part is the classification of emotions based on those features. Hence, different classif
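
A rough sketch of the kind of pipeline this abstract describes (hand-crafted pitch, intensity, and MFCC features fed to a conventional classifier) is given below using librosa and scikit-learn; the file names, label set, and SVM settings are placeholders, not details taken from the paper.

import numpy as np
import librosa
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def extract_features(path, sr=16000):
    """Return a fixed-length vector of pitch, intensity, and MFCC statistics."""
    y, sr = librosa.load(path, sr=sr)
    f0 = librosa.yin(y, fmin=50, fmax=400, sr=sr)        # pitch track
    rms = librosa.feature.rms(y=y)[0]                    # intensity proxy
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # spectral shape
    return np.concatenate([
        [f0.mean(), f0.std(), rms.mean(), rms.std()],
        mfcc.mean(axis=1), mfcc.std(axis=1),
    ])

# Placeholder corpus: replace with a real list of labelled recordings.
dataset = [("angry_01.wav", "angry"), ("happy_01.wav", "happy")]
X = np.array([extract_features(path) for path, _ in dataset])
y = np.array([label for _, label in dataset])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10))
clf.fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))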
APA, Harvard, Vancouver, ISO, and other styles
3

Dr. M. Narendra and Lankala Suvarchala. "An Enhanced Human Speech Based Emotion Recognition." International Journal of Scientific Research in Science and Technology 11, no. 3 (2024): 518–28. http://dx.doi.org/10.32628/ijsrst24113128.

Full text
Abstract:
Speech Emotion Recognition (SER) is a Machine Learning (ML) topic that has attracted substantial attention from researchers, particularly in the field of emotional computing. This is because of its growing potential, improvements in algorithms, and real-world applications. Pitch, intensity, and Mel-Frequency Cepstral Coefficients (MFCC) are examples of quantitative variables that can be used to represent the paralinguistic information found in human speech. The three main processes of data processing, feature selection/extraction, and classification based on the underlying emotional traits are
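
The three-stage flow described here (data processing, feature selection/extraction, classification) can be sketched with a scikit-learn pipeline as below; the random feature matrix, the chosen classifier, and the number of retained features are illustrative assumptions, not the paper's setup.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 40))        # placeholder per-utterance feature matrix
y = rng.integers(0, 4, size=200)      # placeholder labels for four emotions

ser_pipeline = Pipeline([
    ("scale", StandardScaler()),                 # data processing
    ("select", SelectKBest(f_classif, k=20)),    # feature selection
    ("clf", RandomForestClassifier(n_estimators=200, random_state=0)),
])

scores = cross_val_score(ser_pipeline, X, y, cv=5)
print("cross-validated accuracy:", scores.mean())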
APA, Harvard, Vancouver, ISO, and other styles
4

Sri Murugharaj B R, Shakthy B, Sabari L, and Kamaraj K. "Speech Based Emotion Recognition System." International Journal of Engineering Technology and Management Sciences 7, no. 1 (2023): 332–37. http://dx.doi.org/10.46647/ijetms.2023.v07i01.050.

Full text
Abstract:
Emotion recognition from speech signals is a crucial yet difficult part of human-computer interaction (HCI). Several well-known speech analysis and classification methods have been employed in the literature on speech emotion recognition (SER) to extract emotions from signals. Deep learning algorithms have recently been proposed as an alternative to conventional ones for SER. We develop an SER system based on different classifiers and feature extraction techniques. Features from the speech signals are utilised to train different classifiers. To identify the broadest feasible appropriate cha
APA, Harvard, Vancouver, ISO, and other styles
5

Abbaschian, Babak Joze, Daniel Sierra-Sosa, and Adel Elmaghraby. "Deep Learning Techniques for Speech Emotion Recognition, from Databases to Models." Sensors 21, no. 4 (2021): 1249. http://dx.doi.org/10.3390/s21041249.

Full text
Abstract:
The advancements in neural networks and the on-demand need for accurate and near real-time Speech Emotion Recognition (SER) in human–computer interactions make it mandatory to compare available methods and databases in SER to achieve feasible solutions and a firmer understanding of this open-ended problem. The current study reviews deep learning approaches for SER with available datasets, followed by conventional machine learning techniques for speech emotion recognition. Ultimately, we present a multi-aspect comparison between practical neural network approaches in speech emotion recognition.
APA, Harvard, Vancouver, ISO, and other styles
6

Samyuktha S and Sarwath Unnisa. "Emotional Speech Recognition using CNN model." International Journal of Information Technology, Research and Applications 4, no. 1 (2025): 30–38. https://doi.org/10.59461/ijitra.v4i1.164.

Full text
Abstract:
Speech Emotion Recognition (SER) is a new area of artificial intelligence that deals with recognizing human emotions from speech signals. Emotions are an important aspect of communication, affecting social interactions and decision-making processes. This paper introduces a complete SER system that uses state-of-the-art deep learning methods to recognize emotions like Happy, Sad, Angry, Neutral, Surprise, Calm, Fear, and Disgust. The suggested model uses Mel-Spectrograms, MFCCs, and Chroma features for efficient feature extraction. Convolutional layers are utilized to capture complex patterns i
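
A minimal sketch of the feature-extraction step this abstract mentions, stacking Mel-spectrogram, MFCC, and chroma frames into one matrix that a CNN could consume, is shown below; the file name and feature dimensions are placeholders.

import numpy as np
import librosa

def frame_features(path, sr=16000, n_mels=64, n_mfcc=20):
    """Stack Mel-spectrogram, MFCC, and chroma frames into one 2-D array."""
    y, sr = librosa.load(path, sr=sr)
    mel = librosa.power_to_db(
        librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels))
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    chroma = librosa.feature.chroma_stft(y=y, sr=sr)
    # With the same hop length, all three share the frame axis, so they
    # stack row-wise into a (n_mels + n_mfcc + 12) x frames matrix.
    return np.vstack([mel, mfcc, chroma])

feats = frame_features("sample_utterance.wav")   # placeholder file
print(feats.shape)   # (96, T); add a channel axis before feeding a Conv2D layer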
APA, Harvard, Vancouver, ISO, and other styles
7

Setyono, Jonathan Christian, and Amalia Zahra. "Data augmentation and enhancement for multimodal speech emotion recognition." Bulletin of Electrical Engineering and Informatics 12, no. 5 (2023): 3008–15. http://dx.doi.org/10.11591/eei.v12i5.5031.

Full text
Abstract:
Humans’ fundamental need is interaction with each other such as using conversation or speech. Therefore, it is crucial to analyze speech using computer technology to determine emotions. The speech emotion recognition (SER) method detects emotions in speech by examining various aspects. SER is a supervised method to decide the emotion class in speech. This research proposed a multimodal SER model using one of the deep learning based enhancement techniques, which is the attention mechanism. Additionally, this research addresses the imbalanced dataset problem in the SER field using generative adv
APA, Harvard, Vancouver, ISO, and other styles
8

Harikant, Shashidhar, Rakshitha Prasad, Vijaya Lakshmi R, and Sidhramappa H. "SPEECH EMOTION RECOGNITION USING DEEP LEARNING." International Research Journal of Computer Science 9, no. 8 (2022): 267–71. http://dx.doi.org/10.26562/irjcs.2022.v0908.22.

Full text
Abstract:
Speech Emotion Recognition is a present topic of the research since it has wide range of application. SER is a vital part of effective human interaction in the speech processing. Speech Emotion recognition is a domain that is growing rapidly in the recent years. Unlike humans, machines deficit the potential to perceive and express emotions. But the improvisation of human-computer interaction can be done by automated SER thereby turn down the need of human mediation in recent time. The primary goal of SER is to improve man-machine interface. This paper covers Deep Learning to train the model, L
APA, Harvard, Vancouver, ISO, and other styles
9

Gawali, Swayam. "Audio Aura - Speech Emotion Recognition System." International Journal for Research in Applied Science and Engineering Technology 13, no. 4 (2025): 7082–88. https://doi.org/10.22214/ijraset.2025.70092.

Full text
Abstract:
Speech emotion recognition (SER) plays a crucial role in human-computer interaction, enabling systems to interpret and respond to user emotions effectively. In human-computer interaction, speech emotion recognition (SER) is essential because it allows systems to efficiently understand and react to user emotions. In this research, we introduce Audio Aura, a machine learning-based system for voice signal emotion classification. To improve classification accuracy and extract rich speech representations, the system uses a transformer-based model called Wav2Vec2. By leveraging Wav2Vec
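
A hedged sketch of how Wav2Vec2 representations can be pooled into utterance embeddings for a downstream emotion classifier is shown below; the Hugging Face checkpoint name, the mean-pooling step, and the audio file are assumptions, not the exact Audio Aura configuration.

import librosa
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

checkpoint = "facebook/wav2vec2-base"            # assumed pretrained checkpoint
extractor = Wav2Vec2FeatureExtractor.from_pretrained(checkpoint)
encoder = Wav2Vec2Model.from_pretrained(checkpoint).eval()

speech, sr = librosa.load("sample_utterance.wav", sr=16000)   # placeholder file
inputs = extractor(speech, sampling_rate=sr, return_tensors="pt")

with torch.no_grad():
    hidden = encoder(**inputs).last_hidden_state   # (1, frames, 768)
embedding = hidden.mean(dim=1)                     # mean-pool over time
print(embedding.shape)   # (1, 768); feed this vector to any classifier head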
APA, Harvard, Vancouver, ISO, and other styles
10

Kumar, Balbant. "Speech Emotion Recognition using CNN." INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 09, no. 04 (2025): 1–9. https://doi.org/10.55041/ijsrem45881.

Full text
Abstract:
Abstract Speech Emotion Recognition (SER) is a growing area in affective computing that aims to detect and understand human emotions through speech signals. It finds extensive use in human-computer interaction, virtual assistants, mental health tracking, and automating customer service. This project introduced a deep learning method for SER utilizing Convolutional Neural Networks (CNNs). The system extracts mel-frequency cepstral coefficients (MFCCs) and spectrograms from raw audio inputs and transforms speech signals into two-dimensional images. These images were then processed by a CNN frame
APA, Harvard, Vancouver, ISO, and other styles
11

Benzirar, Abdelkader, Mohamed Hamidi, and Mouncef Filali Bouami. "Conception of speech emotion recognition methods: a review." Indonesian Journal of Electrical Engineering and Computer Science 37, no. 3 (2025): 1856. https://doi.org/10.11591/ijeecs.v37.i3.pp1856-1864.

Full text
Abstract:
In recent years, speech emotion recognition (SER) has emerged as a pivotal tool for understanding and enhancing human-computer interaction (HCI), thus garnering significant attention from researchers due to its diverse range of applications. However, SER systems encounter numerous challenges, particularly concerning the selection of appropriate features and classifiers for emotion recognition. This paper provides a concise survey of the field of speech emotion recognition, elucidating its classification algorithms and various feature extraction techniques across multiple languages. Additionall
APA, Harvard, Vancouver, ISO, and other styles
12

Benzirar, Abdelkader, Mohamed Hamidi, and Mouncef Filali Bouami. "Conception of speech emotion recognition methods: a review." Indonesian Journal of Electrical Engineering and Computer Science 37, no. 3 (2025): 1856–64. https://doi.org/10.11591/ijeecs.v37.i3.pp1856-1864.

Full text
Abstract:
In recent years, speech emotion recognition (SER) has emerged as a pivotal tool for understanding and enhancing human-computer interaction (HCI), thus garnering significant attention from researchers due to its diverse range of applications. However, SER systems encounter numerous challenges, particularly concerning the selection of appropriate features and classifiers for emotion recognition. This paper provides a concise survey of the field of speech emotion recognition, elucidating its classification algorithms and various feature extraction techniques across multiple languages. Additionall
APA, Harvard, Vancouver, ISO, and other styles
13

Ganapathy, Apoorva. "Speech Emotion Recognition Using Deep Learning Techniques." ABC Journal of Advanced Research 5, no. 2 (2016): 113–22. http://dx.doi.org/10.18034/abcjar.v5i2.550.

Full text
Abstract:
The advancements in neural networks and the high demand for accurate and near real-time Speech Emotion Recognition in human-computer interfaces make it necessary to compare existing methods and datasets in speech emotion detection to achieve practicable solutions and a firmer understanding of this open-ended problem. The present investigation assessed deep learning methods for speech emotion detection with accessible datasets, followed by conventional machine learning methods for SER. Finally, we present a multi-aspect assessment of practical neural network approaches in SER.
APA, Harvard, Vancouver, ISO, and other styles
14

Kawade, Rupali, and Sonal Jagtap. "Comprehensive Study of Automatic Speech Emotion Recognition Systems." International Journal on Recent and Innovation Trends in Computing and Communication 11, no. 9s (2023): 709–17. http://dx.doi.org/10.17762/ijritcc.v11i9s.7743.

Full text
Abstract:
Speech emotion recognition (SER) is the technology that recognizes psychological characteristics and feelings from speech signals through dedicated techniques and methodologies. SER is challenging because of the considerable variation in arousal and valence levels across different languages. Various technical developments in artificial intelligence and signal processing methods have encouraged and made it possible to interpret emotions. SER plays a vital role in remote communication. This paper offers a recent survey of SER using machine learning (ML) and deep learning (DL)-based techniques. It focuses on
APA, Harvard, Vancouver, ISO, and other styles
15

Pavithra, Avvari, Sukanya Ledalla, J. Sirisha Devi, Golla Dinesh, Monika Singh, and G. Vijendar Reddy. "Deep Learning-based Speech Emotion Recognition: An Investigation into a sustainably Emotion-Speech Relationship." E3S Web of Conferences 430 (2023): 01091. http://dx.doi.org/10.1051/e3sconf/202343001091.

Full text
Abstract:
Speech Emotion Recognition (SER) poses a significant challenge with promising applications in psychology, speech therapy, and customer service. This research paper proposes the development of an SER system utilizing machine learning techniques, particularly deep learning and recurrent neural networks. The model will be trained on a carefully labeled dataset of diverse speech samples representing various emotions. By analyzing crucial audio features such as pitch, rhythm, and prosody, the system aims to achieve accurate emotion recognition for novel speech samples. The primary objective of this
APA, Harvard, Vancouver, ISO, and other styles
16

Assunção, Gustavo, Paulo Menezes, and Fernando Perdigão. "Speaker Awareness for Speech Emotion Recognition." International Journal of Online and Biomedical Engineering (iJOE) 16, no. 04 (2020): 15. http://dx.doi.org/10.3991/ijoe.v16i04.11870.

Full text
Abstract:
The idea of recognizing human emotion through speech (SER) has recently received considerable attention from the research community, mostly due to the current machine learning trend. Nevertheless, even the most successful methods are still rather lacking in terms of adaptation to specific speakers and scenarios, evidently reducing their performance when compared to humans. In this paper, we evaluate a large-scale machine learning model for classification of emotional states. This
APA, Harvard, Vancouver, ISO, and other styles
17

Anvarjon, Tursunov, Mustaqeem, and Soonil Kwon. "Deep-Net: A Lightweight CNN-Based Speech Emotion Recognition System Using Deep Frequency Features." Sensors 20, no. 18 (2020): 5212. http://dx.doi.org/10.3390/s20185212.

Full text
Abstract:
Artificial intelligence (AI) and machine learning (ML) are employed to make systems smarter. Today, the speech emotion recognition (SER) system evaluates the emotional state of the speaker by investigating his/her speech signal. Emotion recognition is a challenging task for a machine. In addition, making it smarter so that the emotions are efficiently recognized by AI is equally challenging. The speech signal is quite hard to examine using signal processing methods because it consists of different frequencies and features that vary according to emotions, such as anger, fear, sadness, happiness
APA, Harvard, Vancouver, ISO, and other styles
18

Thilaga, P. Jothi, S. Kavipriya, and K. Vijayalakshmi. "Deep Learning based Speech Emotion Recognition System." Journal of University of Shanghai for Science and Technology 23, no. 12 (2021): 212–23. http://dx.doi.org/10.51201/jusst/21/121003.

Full text
Abstract:
Emotions are elementary for humans, impacting perception and everyday activities like communication, learning, and decision-making. Speech Emotion Recognition (SER) systems aim to facilitate natural interaction with machines through direct voice interaction, rather than conventional input devices, to understand verbal content and make it straightforward for human listeners to react. This SER system is primarily composed of two sections, called the feature extraction and feature classification phases. SER is implemented on bots to communicate with humans in a non-lexical manner. The speech emotion reco
APA, Harvard, Vancouver, ISO, and other styles
19

Maryamah, Maryamah, Nicholas Juan Kalvin Pradiptamurty, Hafiyyah Khayyiroh Shafro, Mohammad Sihabudin Al Qurtubi, Giovanny Alberta Tambahjong, and Qothrotunnidha' Almaulidiyah. "Speech Emotion Recognition (SER) dengan Metode Bidirectional LSTM." PROSIDING SEMINAR NASIONAL SAINS DATA 3, no. 1 (2023): 153–61. http://dx.doi.org/10.33005/senada.v3i1.105.

Full text
Abstract:
Emotions are a part of humans as a form of response to experienced events. Emotion analysis or known as speech emotion recognition (SER) is a field many researchers are interested in because voice recognition systems can assist in criminal investigations, monitoring, and detection of potentially dangerous events, and assisting the health care system. Therefore, this study proposes the detection of SER using the Bidirectional Long short-term memory (Bi-LSTM) model approach. The dataset used was scraped on the YouTube platform. The dataset is manually labeled then feature extraction is performed
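
A minimal PyTorch sketch of a bidirectional LSTM emotion classifier over MFCC frame sequences, in the spirit of the Bi-LSTM approach described here, is given below; the feature size, hidden width, and number of emotion classes are illustrative.

import torch
import torch.nn as nn

class BiLSTMSER(nn.Module):
    """Bidirectional LSTM over MFCC frames, pooled and mapped to emotion logits."""
    def __init__(self, n_mfcc=40, hidden=128, n_classes=6):
        super().__init__()
        self.lstm = nn.LSTM(n_mfcc, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):            # x: (batch, frames, n_mfcc)
        out, _ = self.lstm(x)
        pooled = out.mean(dim=1)     # average over the time axis
        return self.head(pooled)     # (batch, n_classes)

model = BiLSTMSER()
dummy = torch.randn(8, 200, 40)      # batch of 8 utterances, 200 MFCC frames
print(model(dummy).shape)            # torch.Size([8, 6])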
APA, Harvard, Vancouver, ISO, and other styles
20

Shirbhate, Tanvi, Devashish Deshmukh, Chetan Rajurkar, Sayali Sagane, and Prof. (Dr) Anup W. Burange. "Speech Emotion Recognition Using Machine Learning." International Journal of Ingenious Research, Invention and Development (IJIRID) 3, no. 2 (2024): 101–9. https://doi.org/10.5281/zenodo.11049046.

Full text
Abstract:
Language is the most important medium of communication. Emotions play an important role in human life. Recognizing emotion in speech is both important and challenging because we are dealing with human-computer interaction. Speech Emotion Recognition (SER) has many applications, and a lot of research has focused on this interest in recent years. Speech Emotion Recognition (SER) has become an important collaboration at the intersection of music processing and machine learning. The goal of the system is to identify and classify emotions in speech, leading to human-computer applications, psych
APA, Harvard, Vancouver, ISO, and other styles
21

Kambale, Prof Jagdish, Abhijeet Khedkar, Prasad Patil, and Tejas Sonone. "Speech Emotion Recognition Using Deep Learning." International Journal for Research in Applied Science and Engineering Technology 11, no. 5 (2023): 4829–33. http://dx.doi.org/10.22214/ijraset.2023.49703.

Full text
Abstract:
Abstract: Due to different technical developments, speech signals have evolved into a kind of human-machine communication in the digital age. Recognizing the emotions of the person behind his or her speech is a crucial part of Human-Computer Interaction (HCI). Many methods, including numerous well-known speech analysis and classification algorithms, have been employed to extract emotions from signals in the literature on voice emotion recognition (SER). Speech Emotion Recognition (SER) approaches have become obsolete as the Deep Learning concept has come into play. In this paper, the algorithm
APA, Harvard, Vancouver, ISO, and other styles
22

Barua, Manya. "Decoding Emotions: Machine Learning Approach to Speech Emotion Recognition." INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 08, no. 06 (2024): 1–5. http://dx.doi.org/10.55041/ijsrem36178.

Full text
Abstract:
Speech Emotion Recognition (SER) stands at the forefront of human-computer interaction, offering profound implications for fields such as healthcare, education, and entertainment. This project report delves into the application of Machine Learning (ML) techniques for SER, aiming to discern the emotional content from speech signals. The report begins with an overview of the significance of SER in various domains, emphasizing the need for accurate and robust emotion detection systems. Following this,a detailed exploration of the methodologies employed in SER is presented, encompassing feature ex
APA, Harvard, Vancouver, ISO, and other styles
23

Michael, Stefanus, and Amalia Zahra. "Multimodal speech emotion recognition optimization using genetic algorithm." Bulletin of Electrical Engineering and Informatics 13, no. 5 (2024): 3309–16. http://dx.doi.org/10.11591/eei.v13i5.7409.

Full text
Abstract:
Speech emotion recognition (SER) is a technology that can detect emotions in speech. Various methods have been used in developing SER, such as convolutional neural networks (CNNs), long short-term memory (LSTM), and multilayer perceptron. However, sometimes in addition to model selection, other techniques are still needed to improve SER performance, namely optimization methods. This paper compares manual hyperparameter tuning using grid search (GS) and hyperparameter tuning using genetic algorithm (GA) on the LSTM model to prove the performance increase in the multimodal SER model after optimi
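
An illustrative grid-search tuning loop is sketched below; the cited work tunes an LSTM, so the small scikit-learn MLP and the parameter grid here are stand-ins chosen only to keep the example short.

import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 40))       # placeholder utterance-level features
y = rng.integers(0, 4, size=300)     # placeholder emotion labels

param_grid = {
    "hidden_layer_sizes": [(64,), (128,), (128, 64)],
    "learning_rate_init": [1e-3, 1e-4],
}
search = GridSearchCV(MLPClassifier(max_iter=500, random_state=0),
                      param_grid, cv=3)
search.fit(X, y)
print(search.best_params_, search.best_score_)
# A genetic algorithm would instead evolve a population of such parameter
# settings, keeping the best-scoring candidates and mutating them each generation.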
APA, Harvard, Vancouver, ISO, and other styles
24

Mohamed. A. Ahmad. "Artificial Neural Network vs. Support Vector Machine For Speech Emotion Recognition." Tikrit Journal of Pure Science 21, no. 6 (2023): 167–72. http://dx.doi.org/10.25130/tjps.v21i6.1097.

Full text
Abstract:
Today, the subject of emotion recognition from speech has attracted the attention of many researchers interested in speech recognition, and it has been employed in many applications. Furthermore, Speech Emotion Recognition (SER) is a pivotal part of effective human interaction and has been a modern challenge in speech processing. SER has two basic phases: feature extraction and emotion classification. This paper presents a comparison of the performance of two popular classification techniques, the Artificial Neural Network (ANN) and the Support Vector Machine (
APA, Harvard, Vancouver, ISO, and other styles
25

Sondawale, Shweta. "Face and Speech Emotion Recognition System." International Journal for Research in Applied Science and Engineering Technology 12, no. 4 (2024): 5621–28. http://dx.doi.org/10.22214/ijraset.2024.61278.

Full text
Abstract:
Abstract: Emotions serve as the cornerstone of human communication, facilitating the expression of one's inner thoughts and feelings to others. Speech Emotion Recognition (SER) represents a pivotal endeavour aimed at deciphering the emotional nuances embedded within a speaker's voice signal. Universal emotions such as neutrality, anger, happiness, and sadness form the basis of this recognition process, allowing for the identification of fundamental emotional states. To achieve this, spectral and prosodic features are leveraged, each offering unique insights into the emotional content of speech
APA, Harvard, Vancouver, ISO, and other styles
26

Tajalsir, Mohammed, Susana Muñoz Hernández, and Fatima Abdalbagi Mohammed. "ASERS-LSTM: Arabic Speech Emotion Recognition System Based on LSTM Model." Signal & Image Processing: An International Journal 13, no. 1 (2022): 19–27. http://dx.doi.org/10.5121/sipij.2022.13102.

Full text
Abstract:
The swift progress in the field of human-computer interaction (HCI) has increased interest in systems for speech emotion recognition (SER). A speech emotion recognition system is a system that can identify the emotional states of human beings from their voice. There is substantial work on Speech Emotion Recognition for different languages, but little research has been carried out for Arabic SER systems, because of the shortage of available Arabic speech emotion databases. The most commonly considered languages for SER are English and other European and Asian languages. Several
APA, Harvard, Vancouver, ISO, and other styles
27

Alam Monisha, Syeda Tamanna, and Sadia Sultana. "A Review of the Advancement in Speech Emotion Recognition for Indo-Aryan and Dravidian Languages." Advances in Human-Computer Interaction 2022 (December 1, 2022): 1–11. http://dx.doi.org/10.1155/2022/9602429.

Full text
Abstract:
Speech emotion recognition (SER) has grown to be one of the most trending research topics in computational linguistics in the last two decades. Speech being the primary communication medium, understanding the emotional state of humans from speech and responding accordingly have made the speech emotion recognition system an essential part of the human-computer interaction (HCI) field. Although there are a few review works carried out for SER, none of them discusses the development of SER system for the Indo-Aryan or Dravidian language families. This paper focuses on some studies carried out for
APA, Harvard, Vancouver, ISO, and other styles
28

Yerawar, Atharva. "Research on Speech Emotion Recognition System Using Machine Learning." INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 08, no. 04 (2024): 1–5. http://dx.doi.org/10.55041/ijsrem30141.

Full text
Abstract:
Speech Emotion Recognition (SER) remains a hot topic in the domain of affective computing, drawing considerable research interest. Its growing potential, advancements in algorithms, and real-world applications contribute to this interest. Human speech carries paralinguistic cues that can be quantitatively represented through features like pitch, intensity, and Mel-Frequency Cepstral Coefficients (MFCC). SER typically involves three main stages: data processing, feature extraction/selection, and classification based on emotional features present. These steps, tailored to the unique attributes o
APA, Harvard, Vancouver, ISO, and other styles
29

Lu, Cheng, Yuan Zong, Chuangao Tang, et al. "Implicitly Aligning Joint Distributions for Cross-Corpus Speech Emotion Recognition." Electronics 11, no. 17 (2022): 2745. http://dx.doi.org/10.3390/electronics11172745.

Full text
Abstract:
In this paper, we investigate the problem of cross-corpus speech emotion recognition (SER), in which the training (source) and testing (target) speech samples belong to different corpora. This case thus leads to a feature distribution mismatch between the source and target speech samples. Hence, the performance of most existing SER methods drops sharply. To solve this problem, we propose a simple yet effective transfer subspace learning method called joint distribution implicitly aligned subspace learning (JIASL). The basic idea of JIASL is very straightforward, i.e., building an emotion discr
APA, Harvard, Vancouver, ISO, and other styles
30

Muhammad, Fahreza Alghifari, Surya Gunawan Teddy, and Kartiwi Mira. "Speech Emotion Recognition Using Deep Feedforward Neural Network." Indonesian Journal of Electrical Engineering and Computer Science 10, no. 2 (2018): 554–61. https://doi.org/10.11591/ijeecs.v10.i2.pp554-561.

Full text
Abstract:
Speech emotion recognition (SER) is currently a research hotspot due to its challenging nature but bountiful future prospects. The objective of this research is to utilize Deep Neural Networks (DNNs) to recognize human speech emotion. First, the chosen speech feature Mel-frequency cepstral coefficient (MFCC) were extracted from raw audio data. Second, the speech features extracted were fed into the DNN to train the network. The trained network was then tested onto a set of labelled emotion speech audio and the recognition rate was evaluated. Based on the accuracy rate the MFCC, number of neuro
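
A short PyTorch sketch of the MFCC-to-feedforward-DNN flow described here is given below; the layer sizes, emotion count, and random inputs are placeholders rather than the paper's exact topology.

import torch
import torch.nn as nn

n_mfcc, n_classes = 40, 7
dnn = nn.Sequential(                  # deep feedforward network over pooled MFCCs
    nn.Linear(n_mfcc, 256), nn.ReLU(),
    nn.Linear(256, 128), nn.ReLU(),
    nn.Linear(128, n_classes),
)

features = torch.randn(16, n_mfcc)    # batch of pooled MFCC vectors (placeholder)
targets = torch.randint(0, n_classes, (16,))
logits = dnn(features)
loss = nn.CrossEntropyLoss()(logits, targets)
loss.backward()                       # gradients for one training step
print(logits.shape, float(loss))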
APA, Harvard, Vancouver, ISO, and other styles
31

Kumar, K. Ashok, and Dr J. L. Mazher Iqbal. "Machine Learning Based Emotion Recognition using Speech Signal." International Journal of Engineering and Advanced Technology 9, no. 1s5 (2019): 295–302. http://dx.doi.org/10.35940/ijeat.a1068.1291s519.

Full text
Abstract:
A challenging module in computer-aided services (CAS) is recognizing emotion from speech signals. In speech emotion recognition (SER), several schemes have been used for extracting emotions from signals, comprising various classification and speech analysis methods. This manuscript presents an outline of methods and explores some contemporary literature where existing models have been used for emotion recognition based on speech. This literature review presents contributions made towards emotion recognition from speech and the features extracted for determining emotions.
APA, Harvard, Vancouver, ISO, and other styles
32

Alghifari, Muhammad Fahreza, Teddy Surya Gunawan, and Mira Kartiwi. "Speech Emotion Recognition Using Deep Feedforward Neural Network." Indonesian Journal of Electrical Engineering and Computer Science 10, no. 2 (2018): 554. http://dx.doi.org/10.11591/ijeecs.v10.i2.pp554-561.

Full text
Abstract:
Speech emotion recognition (SER) is currently a research hotspot due to its challenging nature but bountiful future prospects. The objective of this research is to utilize Deep Neural Networks (DNNs) to recognize human speech emotion. First, the chosen speech feature Mel-frequency cepstral coefficient (MFCC) were extracted from raw audio data. Second, the speech features extracted were fed into the DNN to train the network. The trained network was then tested onto a set of labelled emotion speech audio and the recognition rate was evaluated. Based on the accuracy rate the MFCC, number of neuro
APA, Harvard, Vancouver, ISO, and other styles
33

Farooq, Misbah, Fawad Hussain, Naveed Khan Baloch, Fawad Riasat Raja, Heejung Yu, and Yousaf Bin Zikria. "Impact of Feature Selection Algorithm on Speech Emotion Recognition Using Deep Convolutional Neural Network." Sensors 20, no. 21 (2020): 6008. http://dx.doi.org/10.3390/s20216008.

Full text
Abstract:
Speech emotion recognition (SER) plays a significant role in human–machine interaction. Emotion recognition from speech and its precise classification is a challenging task because a machine is unable to understand its context. For an accurate emotion classification, emotionally relevant features must be extracted from the speech data. Traditionally, handcrafted features were used for emotional classification from speech signals; however, they are not efficient enough to accurately depict the emotional states of the speaker. In this study, the benefits of a deep convolutional neural network (D
APA, Harvard, Vancouver, ISO, and other styles
34

Jubear, Sk Mohammed, D. Pavan Kumar Reddy, G. Subramanyam, Sk Farooq, T. Sreenivasulu, and N. Srinivasa Rao. "A Review on Speech Emotion Recognition Using Machine Learning." International Journal of Innovative Research in Computer Science and Technology 10, no. 3 (2022): 406–11. http://dx.doi.org/10.55524/ijircst.2022.10.3.65.

Full text
Abstract:
This paper focuses on the development of a robust speech emotion recognition system using a combination of different speech features with feature optimization techniques and speech de-noising technique to acquire improved emotion classification accuracy, decreasing the system complexity and obtain noise robustness. Additionally, we create original methods for SER to merge features. We employ feature optimization methods that are based on the feature transformation and feature selection machine learning techniques in order to build SER. The following is a list of the upcoming events. A neural n
APA, Harvard, Vancouver, ISO, and other styles
35

Li, Hui, Jiawen Li, Hai Liu, Tingting Liu, Qiang Chen, and Xinge You. "MelTrans: Mel-Spectrogram Relationship-Learning for Speech Emotion Recognition via Transformers." Sensors 24, no. 17 (2024): 5506. http://dx.doi.org/10.3390/s24175506.

Full text
Abstract:
Speech emotion recognition (SER) is not only a ubiquitous aspect of everyday communication, but also a central focus in the field of human–computer interaction. However, SER faces several challenges, including difficulties in detecting subtle emotional nuances and the complicated task of recognizing speech emotions in noisy environments. To effectively address these challenges, we introduce a Transformer-based model called MelTrans, which is designed to distill critical clues from speech data by learning core features and long-range dependencies. At the heart of our approach is a dual-stream f
APA, Harvard, Vancouver, ISO, and other styles
36

B, Chakradhar. "Machine Learning Based Speech Emotion Recognition System." INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 08, no. 04 (2024): 1–5. http://dx.doi.org/10.55041/ijsrem32211.

Full text
Abstract:
In the last decade, there has been significant research into Automatic Speech Emotion Recognition (SER). The primary goal of SER is to improve human-machine interfaces. It can also monitor someone's psychological state for lie detection applications. Recently, speech emotion recognition has found uses in medicine and forensics. This paper recognizes 7 emotions using pitch and prosody features. The majority of speech features used here are in the time domain. A Support Vector Machine (SVM) classifier categorizes the emotions. The Berlin emotional database was used for this task. A good recognit
APA, Harvard, Vancouver, ISO, and other styles
37

Chen, Shouyan, Mingyan Zhang, Xiaofen Yang, Zhijia Zhao, Tao Zou, and Xinqi Sun. "The Impact of Attention Mechanisms on Speech Emotion Recognition." Sensors 21, no. 22 (2021): 7530. http://dx.doi.org/10.3390/s21227530.

Full text
Abstract:
Speech emotion recognition (SER) plays an important role in real-time applications of human-machine interaction. The Attention Mechanism is widely used to improve the performance of SER. However, the applicable rules of attention mechanism are not deeply discussed. This paper discussed the difference between Global-Attention and Self-Attention and explored their applicable rules to SER classification construction. The experimental results show that the Global-Attention can improve the accuracy of the sequential model, while the Self-Attention can improve the accuracy of the parallel model when
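
A toy illustration of self-attention applied to frame-level speech features, the mechanism compared in this work, is shown below; it covers only the attention step, with invented dimensions, not the full sequential and parallel models evaluated in the paper.

import torch
import torch.nn as nn

frames, d_model = 200, 64
x = torch.randn(8, frames, d_model)              # (batch, frames, features)

self_attn = nn.MultiheadAttention(embed_dim=d_model, num_heads=4,
                                  batch_first=True)
attended, weights = self_attn(x, x, x)           # query = key = value = x
utterance_vec = attended.mean(dim=1)             # pooled vector for classification
print(attended.shape, weights.shape)             # (8, 200, 64) and (8, 200, 200)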
APA, Harvard, Vancouver, ISO, and other styles
38

Farhan Fadhil, Muhammad, and Amalia Zahra. "The use of generative adversarial network as a domain adaptation method for cross-corpus speech emotion recognition." Bulletin of Electrical Engineering and Informatics 14, no. 1 (2025): 297–306. http://dx.doi.org/10.11591/eei.v14i1.8339.

Full text
Abstract:
The research of speech emotion recognition (SER) is growing rapidly. However, SER still faces a cross-corpus SER problem which is performance degradation when a single SER model is tested in different domains. This study shows the impact of implementing a generative adversarial network (GAN) model for adapting speech data from different domains and performs emotion classification from the speech features using a 1D convolutional neural network (CNN) model. The results of this study found that the domain adaptation approach using a GAN model could improve the accuracy of emotion classification
APA, Harvard, Vancouver, ISO, and other styles
39

Tank, Vishal P., and S. K. Hadia. "Creation of speech corpus for emotion analysis in Gujarati language and its evaluation by various speech parameters." International Journal of Electrical and Computer Engineering (IJECE) 10, no. 5 (2020): 4752. http://dx.doi.org/10.11591/ijece.v10i5.pp4752-4758.

Full text
Abstract:
In the last couple of years, emotion recognition has proven its significance in the area of artificial intelligence and man-machine communication. Emotion recognition can be done using speech or images (facial expressions); this paper deals with SER (speech emotion recognition) only. For emotion recognition, an emotional speech database is essential. In this paper we propose an emotional database developed in the Gujarati language, one of the official languages of India. The proposed speech corpus covers six emotional states: sadness, surprise, anger, disgust, fear, happiness. To obse
APA, Harvard, Vancouver, ISO, and other styles
40

Vishal, P. Tank, and K. Hadia S. "Creation of speech corpus for emotion analysis in Gujarati language and its evaluation by various speech parameters." International Journal of Electrical and Computer Engineering (IJECE) 10, no. 5 (2020): 4752–58. https://doi.org/10.11591/ijece.v10i5.pp4752-4758.

Full text
Abstract:
In the last couple of years, emotion recognition has proven its significance in the area of artificial intelligence and man-machine communication. Emotion recognition can be done using speech or images (facial expressions); this paper deals with SER (speech emotion recognition) only. For emotion recognition, an emotional speech database is essential. In this paper we propose an emotional database developed in the Gujarati language, one of the official languages of India. The proposed speech corpus covers six emotional states: sadness, surprise, anger, disgust, fear, happiness. T
APA, Harvard, Vancouver, ISO, and other styles
41

Kothuri, Jhansi. "Speech Emotion Recognition: An LSTM Approach." INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 09, no. 04 (2025): 1–9. https://doi.org/10.55041/ijsrem45580.

Full text
Abstract:
Abstract – This paper presents a novel approach to Speech Emotion Recognition (SER) utilizing a Long Short-Term Memory (LSTM) network to classify emotions from audio inputs in real-time. The primary goal of this research is to accurately identify various emotions, including happiness, sadness, anger, fear, and surprise, enhancing user experience in applications such as human-computer interaction, virtual assistants, and mental health monitoring. The methodology involves a comprehensive process that begins with the preprocessing of audio signals to ensure clarity and consistency. This is follow
APA, Harvard, Vancouver, ISO, and other styles
42

Alluhaidan, Ala Saleh, Oumaima Saidani, Rashid Jahangir, Muhammad Asif Nauman, and Omnia Saidani Neffati. "Speech Emotion Recognition through Hybrid Features and Convolutional Neural Network." Applied Sciences 13, no. 8 (2023): 4750. http://dx.doi.org/10.3390/app13084750.

Full text
Abstract:
Speech emotion recognition (SER) is the process of predicting human emotions from audio signals using artificial intelligence (AI) techniques. SER technologies have a wide range of applications in areas such as psychology, medicine, education, and entertainment. Extracting relevant features from audio signals is a crucial task in the SER process to correctly identify emotions. Several studies on SER have employed short-time features such as Mel frequency cepstral coefficients (MFCCs), due to their efficiency in capturing the periodic nature of audio signals. However, these features are limited
APA, Harvard, Vancouver, ISO, and other styles
43

Ahn, Youngdo, Sangwook Han, Seonggyu Lee, and Jong Won Shin. "Speech Emotion Recognition Incorporating Relative Difficulty and Labeling Reliability." Sensors 24, no. 13 (2024): 4111. http://dx.doi.org/10.3390/s24134111.

Full text
Abstract:
Emotions in speech are expressed in various ways, and the speech emotion recognition (SER) model may perform poorly on unseen corpora that contain different emotional factors from those expressed in training databases. To construct an SER model robust to unseen corpora, regularization approaches or metric losses have been studied. In this paper, we propose an SER method that incorporates relative difficulty and labeling reliability of each training sample. Inspired by the Proxy-Anchor loss, we propose a novel loss function which gives higher gradients to the samples for which the emotion label
APA, Harvard, Vancouver, ISO, and other styles
44

Ullah, Rizwan, Muhammad Asif, Wahab Ali Shah, et al. "Speech Emotion Recognition Using Convolution Neural Networks and Multi-Head Convolutional Transformer." Sensors 23, no. 13 (2023): 6212. http://dx.doi.org/10.3390/s23136212.

Full text
Abstract:
Speech emotion recognition (SER) is a challenging task in human–computer interaction (HCI) systems. One of the key challenges in speech emotion recognition is to extract the emotional features effectively from a speech utterance. Despite the promising results of recent studies, they generally do not leverage advanced fusion algorithms for the generation of effective representations of emotional features in speech utterances. To address this problem, we describe the fusion of spatial and temporal feature representations of speech emotion by parallelizing convolutional neural networks (CNNs) and
APA, Harvard, Vancouver, ISO, and other styles
45

Hafsa, Qazi, and Nath Kaushik Baij. "A Hybrid Technique using CNN+LSTM for Speech Emotion Recognition." International Journal of Engineering and Advanced Technology (IJEAT) 9, no. 5 (2020): 1126–30. https://doi.org/10.35940/ijeat.E1027.069520.

Full text
Abstract:
Automatic speech emotion recognition is a very necessary activity for effective human-computer interaction. This paper is motivated by using spectrograms as inputs to the hybrid deep convolutional LSTM for speech emotion recognition. In this study, we trained our proposed model using four convolutional layers for high-level feature extraction from input spectrograms, LSTM layer for accumulating long-term dependencies and finally two dense layers. Experimental results on the SAVEE database shows promising performance. Our proposed model is highly capable as it obtained an accuracy of 94.26%.
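
A compact PyTorch sketch of a CNN+LSTM hybrid over spectrogram input, in the spirit of the model described here (convolutional feature extractor, LSTM for temporal context, dense head), is given below; the layer counts and sizes are illustrative, not the reported four-convolutional-layer configuration.

import torch
import torch.nn as nn

class CNNLSTM(nn.Module):
    """Conv2d feature extractor over a spectrogram, LSTM over time, dense head."""
    def __init__(self, n_classes=7):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.lstm = nn.LSTM(32 * 16, 128, batch_first=True)  # 64 mel bins / 4 = 16
        self.head = nn.Sequential(nn.Linear(128, 64), nn.ReLU(),
                                  nn.Linear(64, n_classes))

    def forward(self, spec):                  # spec: (batch, 1, 64 mel bins, frames)
        h = self.conv(spec)                   # (batch, 32, 16, frames // 4)
        h = h.permute(0, 3, 1, 2).flatten(2)  # (batch, frames // 4, 32 * 16)
        out, _ = self.lstm(h)
        return self.head(out[:, -1])          # last time step into the dense layers

model = CNNLSTM()
print(model(torch.randn(4, 1, 64, 200)).shape)   # torch.Size([4, 7])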
APA, Harvard, Vancouver, ISO, and other styles
46

Saumard, Matthieu. "Enhancing Speech Emotions Recognition Using Multivariate Functional Data Analysis." Big Data and Cognitive Computing 7, no. 3 (2023): 146. http://dx.doi.org/10.3390/bdcc7030146.

Full text
Abstract:
Speech Emotions Recognition (SER) has gained significant attention in the fields of human–computer interaction and speech processing. In this article, we present a novel approach to improve SER performance by interpreting the Mel Frequency Cepstral Coefficients (MFCC) as a multivariate functional data object, which accelerates learning while maintaining high accuracy. To treat MFCCs as functional data, we preprocess them as images and apply resizing techniques. By representing MFCCs as functional data, we leverage the temporal dynamics of speech, capturing essential emotional cues more effecti
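
A small sketch of the preprocessing idea mentioned here, treating each MFCC matrix as an image and resizing it to a common grid so utterances of different lengths become comparable, is shown below; the target size and file name are assumptions.

import librosa
import numpy as np
from scipy.ndimage import zoom

def mfcc_as_fixed_grid(path, n_mfcc=20, target_frames=128, sr=16000):
    """Compute an MFCC matrix and resize its time axis to a common length."""
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)   # (n_mfcc, T)
    scale = target_frames / mfcc.shape[1]
    return zoom(mfcc, (1.0, scale), order=1)   # roughly (n_mfcc, target_frames)

grid = mfcc_as_fixed_grid("sample_utterance.wav")            # placeholder file
print(grid.shape)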
APA, Harvard, Vancouver, ISO, and other styles
47

He, ZiRui. "Research Advanced in Speech Emotion Recognition based on Deep Learning." Theoretical and Natural Science 86, no. 1 (2025): 45–52. https://doi.org/10.54254/2753-8818/2025.20333.

Full text
Abstract:
The burgeoning significance of Speech Emotion Recognition (SER) within intelligent systems is underscored by its transformative impact across various fields, from human-computer interaction and virtual assistants to mental health monitoring. Over the rapid development of this technology in the past two decades, studies have continuously confronted and overcome various real-world challenges, such as data scarcity, environmental noise, and cross-language differences. This survey focuses on recent innovations in SER, particularly deep learning architectures and synthetic data augmentation, and addre
APA, Harvard, Vancouver, ISO, and other styles
48

Ramli, Izzad, Nursuriati Jamil, Norizah Ardi, and Raseeda Hamzah. "Emolah: A Malay Language Spontaneous Speech Emotion Recognition on iOS Platform." International Journal of Engineering & Technology 7, no. 3.15 (2018): 151. http://dx.doi.org/10.14419/ijet.v7i3.15.17520.

Full text
Abstract:
This paper presents the implementation of spontaneous speech emotion recognition (SER) using a smartphone on the iOS platform. The novelty of this work is that, at the time of writing, no similar work has been done using Malay-language spontaneous speech. The development of SER on a mobile device is important for ease of use anytime and anywhere. The main factor to be considered is the computational complexity of classifying the emotions in real time. Therefore, we introduced EmoLah, a Malay-language spontaneous SER that is able to recognize emotions on the go with a satisfactory accuracy rate. Pitch a
APA, Harvard, Vancouver, ISO, and other styles
49

Ramana, M. Venkata. "EmoTeluNet: A Deep Learning Architecture for Telugu Speech Emotion Recognition." International Scientific Journal of Engineering and Management 04, no. 05 (2025): 1–9. https://doi.org/10.55041/isjem03765.

Full text
Abstract:
Speech Emotion Recognition (SER) is pivotal for advancing human-centric artificial intelligence, yet regional languages like Telugu, spoken by over 80 million people, lack robust SER frameworks. This paper introduces Deep Telugu Emotion, a deep learning framework designed to recognize emotions in Telugu speech. We curated a novel dataset of Telugu emotional speech and evaluated six neural network models: Artificial Neural Network (ANN), Multi-Layer Perceptron (MLP), Bidirectional Long Short-Term Memory (BiLSTM), Attention-based BiLSTM, Convolutional Recurrent Neural Network (CRNN),
APA, Harvard, Vancouver, ISO, and other styles
50

A. Poongodai, Y. Nandini, T. Mounika, A. Karishma, and N. Kevalya Kumar. "Speech Emotion Recognition using Convolutional Neural Networks with Attention Mechanisms." International Research Journal of Innovations in Engineering and Technology 09, Special Issue ICCIS (2025): 162–67. https://doi.org/10.47001/irjiet/2025.iccis-202526.

Full text
Abstract:
Speech Emotion Recognition (SER) is a crucial component in enhancing human-computer interaction by enabling machines to recognize and respond to human emotions effectively. This study proposes a novel SER framework using Convolutional Neural Networks (CNNs) augmented with attention mechanisms. The CNNs are employed to capture hierarchical and spatial features from spectrogram representations of speech signals, while attention mechanisms focus on emotionally salient regions, improving interpretability and accuracy. The proposed model is evaluated on benchmark datasets, demonstrating
APA, Harvard, Vancouver, ISO, and other styles