Journal articles on the topic 'Emotion detection from speech'

Consult the top 50 journal articles for your research on the topic 'Emotion detection from speech.'

1

Tank, Vishal P., and S. K. Hadia. "Creation of speech corpus for emotion analysis in Gujarati language and its evaluation by various speech parameters." International Journal of Electrical and Computer Engineering (IJECE) 10, no. 5 (2020): 4752–58. https://doi.org/10.11591/ijece.v10i5.pp4752-4758.

Abstract:
In the last couple of years, emotion recognition has proven its significance in the areas of artificial intelligence and man-machine communication. Emotion recognition can be done using speech or images (facial expressions); this paper deals with speech emotion recognition (SER) only. An emotional speech database is essential for emotion recognition. In this paper we propose an emotional database developed in Gujarati, one of the official languages of India. The proposed speech corpus distinguishes six emotional states: sadness, surprise, anger, disgust, fear, and happiness. …
2

Thamaraiselvi, D., J. Pranay, and S. Hruthik Kasyap. "Emotion Detection from Video and Audio and Text." International Journal of Scientific Research in Engineering and Management 09, no. 01 (2025): 1–9. https://doi.org/10.55041/ijsrem40494.

Abstract:
Emotion detection from video, audio, and text has emerged as a vital area of research within the fields of artificial intelligence and human-computer interaction. As digital communication increasingly integrates multiple modalities, understanding human emotions through these various channels has become essential for enhancing user experience, improving mental health diagnostics, and advancing affective computing technologies. This paper presents a comprehensive overview of the methodologies and frameworks developed for detecting emotions from video, audio, and text inputs, highlighting the syn
3

Rastogi, Rohit, Tushar Anand, Shubham Kumar Sharma, and Sarthak Panwar. "Emotion Detection via Voice and Speech Recognition." International Journal of Cyber Behavior, Psychology and Learning 13, no. 1 (2023): 1–24. http://dx.doi.org/10.4018/ijcbpl.333473.

Abstract:
Emotion detection from voice signals is needed for human-computer interaction (HCI), which is a difficult challenge. In the literature on speech emotion recognition, various well known speech analysis and classification methods have been used to extract emotions from signals. Deep learning strategies have recently been proposed as a workable alternative to conventional methods and discussed. Several recent studies have employed these methods to identify speech-based emotions. The review examines the databases used, the emotions collected, and the contributions to speech emotion recognition. Th
4

Handwi, Narendra Rahman, and Rila Mandala. "Enhancing Work Safety Systems Through Real-Time Speech Emotion Detection Classifier Using CNN Algorithm." Eduvest - Journal of Universal Studies 5, no. 7 (2025): 9344–60. https://doi.org/10.59188/eduvest.v5i7.51808.

Abstract:
Speech emotion detection has emerged as a significant research area due to its potential applications in various domains. In work safety systems, the ability to accurately recognize emotions can provide vital information about the mental state of workers, which can be utilized to prevent work accidents and ensure a safer work environment. The objective of this study is to develop a speech emotion detection classifier using the CNN algorithm. The classifier aims to accurately classify emotions from speech signals, enabling real-time recognition of workers' emotional states. By achieving this ob
5

Sai Srinivas, T. Aditya, and M. Bharathi. "EmoSonics: Emotion Detection via Voice and Speech Recognition." JOURNAL OF COMPUTER SCIENCE AND SYSTEM SOFTWARE 1, no. 2 (2024): 1–7. http://dx.doi.org/10.48001/jocsss.2024.121-7.

Abstract:
Understanding emotions from speech is like deciphering a rich tapestry of human expression in the realm of human-computer interaction. It's akin to listening to someone's tone and inflection to discern whether they're happy, surprised, or experiencing a range of other feelings. Researchers use a variety of techniques, from analyzing speech patterns to utilizing advanced technologies like fMRI, to decode these emotional cues. Emotions aren't just simple labels; they're complex and nuanced, demanding sophisticated methods for accurate interpretation. Some methods break emotions down into simple
6

Shreya, S., P. Likitha, G. Sai Charan, and Shruti Bhargava Choubey. "Speech Emotion Detection Through Live Calls." International Journal for Research in Applied Science and Engineering Technology 11, no. 5 (2023): 691–95. http://dx.doi.org/10.22214/ijraset.2023.51575.

Abstract:
Speech emotion recognition is a popular study area right now, with the goal of enhancing human-machine interaction. Most of the research being done in this field classifies emotions into different groups by extracting discriminatory features, and most of it concerns verbal expressions used for lexical analysis and emotion recognition. In our project, emotions are categorized as angry, calm, fearful, happy, and sad. Speech Emotion Recognition (SER) is a technology that takes advantage of the fact that tone and pitch in a s…
7

Venkateswarlu, S. China. "Speech Emotion Recognition using Machine Learning." International Journal of Scientific Research in Engineering and Management 09, no. 05 (2025): 1–9. https://doi.org/10.55041/ijsrem48705.

Abstract:
Speech signals are considered the most effective means of communication between human beings. Many researchers have developed methods or systems to identify emotions from speech signals; various features of the speech are used to classify emotions. Features like pitch, tone, and intensity are essential for classification. A large number of datasets are available for speech emotion recognition. First, features are extracted from the emotional speech, and then the emotions are classified on the basis of those features. Hence, different classif…
8

Kamińska, Dorota, and Adam Pelikant. "Recognition of Human Emotion from a Speech Signal Based on Plutchik's Model." International Journal of Electronics and Telecommunications 58, no. 2 (2012): 165–70. http://dx.doi.org/10.2478/v10177-012-0024-4.

Abstract:
Machine recognition of human emotional states is an essential part of improving man-machine interaction. During expressive speech, the voice conveys the semantic message as well as information about the emotional state of the speaker. The pitch contour is one of the most significant properties of speech affected by the emotional state; therefore, pitch features have been commonly used in systems for automatic emotion detection. In this work, different intensities of emotions and their influence on pitch features have…
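Pitch-contour features like those discussed in the entry above start from a frame-level estimate of the fundamental frequency (F0). The sketch below is a minimal, generic autocorrelation-based F0 estimator, not the method used in any of the cited papers; the function name and parameter defaults are illustrative assumptions.

```python
import math

def estimate_f0(samples, sample_rate, fmin=80.0, fmax=400.0):
    """Estimate the fundamental frequency (pitch) of a voiced frame by
    picking the autocorrelation peak within a plausible F0 lag range."""
    n = len(samples)
    lag_min = int(sample_rate / fmax)   # shortest period considered
    lag_max = int(sample_rate / fmin)   # longest period considered
    best_lag, best_corr = 0, 0.0
    for lag in range(lag_min, min(lag_max, n - 1) + 1):
        corr = sum(samples[i] * samples[i + lag] for i in range(n - lag))
        if corr > best_corr:
            best_corr, best_lag = corr, lag
    return sample_rate / best_lag if best_lag else 0.0

# A 200 Hz sine sampled at 8 kHz: the estimate should land on 200 Hz.
sr = 8000
frame = [math.sin(2 * math.pi * 200 * t / sr) for t in range(400)]
print(round(estimate_f0(frame, sr)))  # → 200
```

Real pitch trackers add voicing decisions and octave-error correction on top of this basic idea; the per-frame F0 values then form the pitch contour from which emotion features are computed.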
9

Reddy, N. V. Rajasekhar. "Speech Emotion Recognition Using Convolutional Neural Networks." International Journal for Research in Applied Science and Engineering Technology 12, no. 8 (2024): 30–36. http://dx.doi.org/10.22214/ijraset.2024.63859.

Abstract:
Speech is a powerful way to express our thoughts and feelings. It can give us valuable insights into human emotions. Speech emotion recognition (SER) is a crucial tool used in various fields like human-computer interaction (HCI), medical diagnosis, and lie detection. However, understanding emotions from speech is challenging. This research aims to address this challenge. It uses multiple datasets, including CREMA-D, RAVDESS, TESS, and SAVEE, to identify different emotional states. The researchers reviewed existing literature to inform their methodology. They used spectrograms and mel…
10

Gao, Xiyuan, Shekhar Nayak, and Matt Coler. "Enhancing sarcasm detection through multimodal data integration: A proposal for augmenting audio with text and emoticon." Journal of the Acoustical Society of America 155, no. 3_Supplement (2024): A264. http://dx.doi.org/10.1121/10.0027441.

Abstract:
Sarcasm detection presents unique challenges in speech technology, particularly for individuals with disorders that affect pitch perception or those lacking contextual auditory cues. While previous research [1, 2] has established the significance of pitch variation in sarcasm detection, these studies have primarily focused on singular modalities, often overlooking the potential synergies of integrating multimodal data. We propose an approach that synergizes auditory, textual, and emoticon data to enhance sarcasm detection. This involves augmenting sarcastic audio data with corresponding text u
11

Silviana, Widya Lestari, Kahar Saliyah, and Dwi Trismayanti. "Deep learning techniques for speech emotion recognition: A review." International Research Journal of Science, Technology, Education, and Management 3, no. 2 (2023): 78–91. https://doi.org/10.5281/zenodo.8139722.

Abstract:
Speech emotion recognition is gaining significant importance in the domains of pattern recognition and natural language processing. In recent years, there has been notable progress in voice emotion detection within this field, primarily attributed to the successful application of deep learning techniques. Some research in this area lacks a thorough comparative study of different deep learning models and techniques related to speech emotion detection. This makes it difficult to identify the best performing approaches and their relative strengths and weaknesses. Therefore, the purpose of this wo
12

Alghifari, Muhammad Fahreza, Teddy Surya Gunawan, Mimi Aminah binti Wan Nordin, Syed Asif Ahmad Qadri, Mira Kartiwi, and Zuriati Janin. "On the use of voice activity detection in speech emotion recognition." Bulletin of Electrical Engineering and Informatics 8, no. 4 (2019): 1324–32. https://doi.org/10.11591/eei.v8i4.1646.

Abstract:
Emotion recognition through speech has many potential applications, however the challenge comes from achieving a high emotion recognition while using limited resources or interference such as noise. In this paper we have explored the possibility of improving speech emotion recognition by utilizing the voice activity detection (VAD) concept. The emotional voice data from the Berlin Emotion Database (EMO-DB) and a custom-made database LQ Audio Dataset are firstly preprocessed by VAD before feature extraction. The features are then passed to the deep neural network for classification. In this pap
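The VAD preprocessing step described above can be approximated, in its simplest form, by short-time energy thresholding: frames whose energy falls below a threshold are treated as silence and dropped before feature extraction. The sketch below is a generic energy-based VAD, not the EMO-DB pipeline from the paper; the frame length and threshold values are illustrative assumptions.

```python
import math

def frame_energies(samples, frame_len):
    """Split a signal into non-overlapping frames; return mean energy per frame."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def vad_mask(samples, frame_len=160, threshold=0.01):
    """Flag frames whose short-time energy exceeds a fixed threshold."""
    return [e > threshold for e in frame_energies(samples, frame_len)]

# 1 s of silence followed by 1 s of a 300 Hz tone standing in for speech.
sr = 16000
silence = [0.0] * sr
tone = [0.5 * math.sin(2 * math.pi * 300 * t / sr) for t in range(sr)]
mask = vad_mask(silence + tone)
print(mask[:3], mask[-3:])  # silence frames flagged False, voiced frames True
```

Practical VADs adapt the threshold to the noise floor and add hangover smoothing, but the principle of gating feature extraction on per-frame energy is the same.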
13

Hartmann, Kim, Ingo Siegert, David Philippou-Hübner, and Andreas Wendemuth. "Emotion Detection in HCI: From Speech Features to Emotion Space." IFAC Proceedings Volumes 46, no. 15 (2013): 288–95. http://dx.doi.org/10.3182/20130811-5-us-2037.00049.

14

Ghosh, Prolay. "An Improved Convolutional Neural Network for Speech Detection." Journal of Information Systems Engineering and Management 10, no. 3 (2025): 621–30. https://doi.org/10.52783/jisem.v10i3.5951.

Abstract:
The detection of emotions from speech is the aim of this paper. Speech conveying anger, joy, and fear has a very high and wide pitch range, whereas speech conveying sadness and tiredness has a very low pitch. Speech emotion detection technology can recognize human emotions to help machines better understand a user's intentions and improve human-computer interaction. A classification model based on a Convolutional Neural Network (CNN), using mainly Mel-Frequency Cepstral Coefficient (MFCC) features to detect emotion, is presented here. Different approaches have been discussed and…
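MFCC features like those used in the entry above begin with a mel-scaled filterbank applied to the spectrum. As a hedged sketch of that first step, the code below implements the standard HTK mel-scale mapping and evenly spaced filter center frequencies; the filter count and frequency range are illustrative, not this paper's configuration.

```python
import math

def hz_to_mel(f):
    """Standard HTK mel-scale mapping."""
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filter_centers(n_filters, fmin, fmax):
    """Center frequencies (Hz) of triangular filters spaced evenly on the mel scale."""
    lo, hi = hz_to_mel(fmin), hz_to_mel(fmax)
    return [mel_to_hz(lo + (hi - lo) * (i + 1) / (n_filters + 1))
            for i in range(n_filters)]

centers = mel_filter_centers(10, 0.0, 8000.0)
print([round(c) for c in centers])  # centers bunch up at low frequencies
```

Because the mel scale is logarithmic above ~1 kHz, the centers are dense at low frequencies and sparse at high ones, mimicking human pitch perception; a DCT of the log filterbank energies then yields the cepstral coefficients.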
15

G, Apeksha. "Speech Emotion Recognition Using ANN." International Journal of Scientific Research in Engineering and Management 08, no. 05 (2024): 1–5. http://dx.doi.org/10.55041/ijsrem32584.

Abstract:
Speech is the most effective means of communication, and recognizing the emotions in speech is a crucial task. In this paper we use an Artificial Neural Network to recognize the emotions in speech; providing an efficient and accurate technique for speech-based emotion recognition is therefore an important task. This study is focused on seven basic human emotions (angry, disgust, fear, happy, neutral, surprise, sad). The training and validation accuracy, as well as the loss, can be seen in a graph while training the dataset, and from it a confusion matrix for the model is created. The se…
16

Graterol, Wilfredo, Jose Diaz-Amado, Yudith Cardinale, Irvin Dongo, Edmundo Lopes-Silva, and Cleia Santos-Libarino. "Emotion Detection for Social Robots Based on NLP Transformers and an Emotion Ontology." Sensors 21, no. 4 (2021): 1322. http://dx.doi.org/10.3390/s21041322.

Abstract:
For social robots, knowledge regarding human emotional states is an essential part of adapting their behavior or associating emotions to other entities. Robots gather the information from which emotion detection is processed via different media, such as text, speech, images, or videos. The multimedia content is then properly processed to recognize emotions/sentiments, for example, by analyzing faces and postures in images/videos based on machine learning techniques or by converting speech into text to perform emotion detection with natural language processing (NLP) techniques. Keeping this inf
17

Pallavi, E. S. "Speech Emotion Recognition Based on Machine Learning." International Journal of Scientific Research in Engineering and Management 08, no. 05 (2024): 1–5. http://dx.doi.org/10.55041/ijsrem33995.

Abstract:
Speech is the most effective means of communication, and recognizing the emotions in speech is a crucial task. In this paper we use an Artificial Neural Network to recognize the emotions in speech; providing an efficient and accurate technique for speech-based emotion recognition is therefore an important task. This study is focused on seven basic human emotions (angry, disgust, fear, happy, neutral, surprise, sad). The training and validation accuracy, as well as the loss, can be seen in a graph while training the dataset, and from it a confusion matrix for the model is created. The fea…
18

G, Apeksha. "Speech Emotion Recognition Using Machine Learning." International Journal of Scientific Research in Engineering and Management 08, no. 05 (2024): 1–5. http://dx.doi.org/10.55041/ijsrem32388.

Abstract:
Speech is the most effective means of communication, and recognizing the emotions in speech is a crucial task. In this paper we use an Artificial Neural Network to recognize the emotions in speech; providing an efficient and accurate technique for speech-based emotion recognition is therefore an important task. This study is focused on seven basic human emotions (angry, disgust, fear, happy, neutral, surprise, sad). The training and validation accuracy, as well as the loss, can be seen in a graph while training the dataset, and from it a confusion matrix for the model is created. The se…
19

Hazra, Sumon Kumar, Romana Rahman Ema, Syed Md Galib, Shalauddin Kabir, and Nasim Adnan. "Emotion recognition of human speech using deep learning method and MFCC features." Radioelectronic and Computer Systems, no. 4 (November 29, 2022): 161–72. http://dx.doi.org/10.32620/reks.2022.4.13.

Abstract:
Subject matter: Speech emotion recognition (SER) is an ongoing, interesting research topic. Its purpose is to establish interactions between humans and computers through speech and emotion. To recognize speech emotions, five deep learning models are used in this paper: a Convolutional Neural Network, Long Short-Term Memory, an Artificial Neural Network, a Multi-Layer Perceptron, and a merged CNN-LSTM network. The Toronto Emotional Speech Set (TESS), Surrey Audio-Visual Expressed Emotion (SAVEE), and Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) datasets were used for this…
20

Khedkar, Shilpa. "Activity Recommendation System Based on Emotion Recognition." International Journal of Scientific Research in Engineering and Management 09, no. 06 (2025): 1–9. https://doi.org/10.55041/ijsrem49641.

Abstract:
An Emotion Recognition-Based Activity Recommendation System aims to provide users with suitable activity recommendations based on their emotional states, using the latest developments in emotion recognition technology. This project applies speech emotion recognition techniques, focusing on current state-of-the-art methods in Triangular Region Cut-Mix augmentation to enhance the accuracy of emotion classification while preserving the audio-spectrogram information related to key emotions. Furthermore, it involves a dual learning framework integr…
21

Chen, Jing, Haifeng Li, Lin Ma, and Hongjian Bo. "Improving Emotion Analysis for Speech-Induced EEGs Through EEMD-HHT-Based Feature Extraction and Electrode Selection." International Journal of Multimedia Data Engineering and Management 12, no. 2 (2021): 1–18. http://dx.doi.org/10.4018/ijmdem.2021040101.

Abstract:
Emotion detection using EEG signals has advantages in eliminating social masking to obtain a better understanding of underlying emotions. This paper presents the cognitive response to emotional speech and emotion recognition from EEG signals. A framework is proposed to recognize mental states from EEG signals induced by emotional speech: First, speech-evoked emotion cognitive experiment is designed, and EEG dataset is collected. Second, power-related features are extracted using EEMD-HHT, which is more accurate to reflect the instantaneous frequency of the signal than STFT and WT. An extensive
22

Padman, Sweta, and Dhiraj Magare. "Regional language Speech Emotion Detection using Deep Neural Network." ITM Web of Conferences 44 (2022): 03071. http://dx.doi.org/10.1051/itmconf/20224403071.

Abstract:
Speaking is the most basic and efficient mode of human contact. Emotions assist people in communicating and understanding others' viewpoints by transmitting sentiments and providing feedback. The basic objective of speech emotion recognition is to enable computers to comprehend human emotional states such as happiness, fury, and disdain through voice cues. Mel-frequency cepstral coefficients have been proposed as effective features for this problem. The Mel-frequency cepstral coefficient (MFCC) features and the audio-based textual characteristics are extracted from the audio ch…
23

Suchitra, R. "Cross Modal Emotion Detection: Leveraging Speech and Facial Expression Features." International Journal of Scientific Research in Engineering and Management 09, no. 07 (2025): 1–9. https://doi.org/10.55041/ijsrem51197.

Abstract:
Emotion recognition plays a vital role in enhancing human-computer interaction by enabling machines to interpret human affective states. This project proposes a cross-modal emotion detection system that leverages both speech and facial expressions to accurately classify emotions. The speech modality utilizes the RAVDESS dataset, where Mel-Spectrogram and MFCC features are extracted and processed through a combination of pre-trained DenseNet and custom CNN models. For the visual modality, facial expressions are analyzed using a Convolutional Neural Network trained on the FER2013 dataset, with r
24

Majeed, Adil, and Hasan Mujtaba. "UMEDNet: a multimodal approach for emotion detection in the Urdu language." PeerJ Computer Science 11 (May 1, 2025): e2861. https://doi.org/10.7717/peerj-cs.2861.

Abstract:
Emotion detection is a critical component of interaction between humans and computer systems, especially in affective computing and health screening. Integrating video, speech, and text information provides better coverage of the basic and derived affective states, with improved estimation of verbal and non-verbal behavior. However, there is a lack of systematic resources and models for the detection of emotions in low-resource languages such as Urdu. To this effect, we propose the Urdu Multimodal Emotion Detection Network (UMEDNet), a new emotion detection model for Urdu that works with video,…
25

Davletcharova, Assel, Sherin Sugathan, Bibia Abraham, and Alex Pappachen James. "Detection and Analysis of Emotion from Speech Signals." Procedia Computer Science 58 (2015): 91–96. http://dx.doi.org/10.1016/j.procs.2015.08.032.

26

Kothuri, Jhansi. "Speech Emotion Recognition: An LSTM Approach." International Journal of Scientific Research in Engineering and Management 09, no. 04 (2025): 1–9. https://doi.org/10.55041/ijsrem45580.

Abstract:
This paper presents a novel approach to Speech Emotion Recognition (SER) utilizing a Long Short-Term Memory (LSTM) network to classify emotions from audio inputs in real time. The primary goal of this research is to accurately identify various emotions, including happiness, sadness, anger, fear, and surprise, enhancing user experience in applications such as human-computer interaction, virtual assistants, and mental health monitoring. The methodology involves a comprehensive process that begins with the preprocessing of audio signals to ensure clarity and consistency. This is follow…
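Before an LSTM can consume audio, the signal is typically sliced into short, overlapping, windowed frames from which a per-frame feature sequence is computed. The sketch below shows that preprocessing stage only; the 25 ms frame length and 10 ms hop at 16 kHz are common defaults assumed here, not parameters taken from the paper.

```python
import math

def frames_with_window(samples, frame_len=400, hop=160):
    """Slice a signal into overlapping frames and apply a Hamming window,
    the usual first step before computing per-frame features for an LSTM."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames

# One second of a 440 Hz tone at 16 kHz as a stand-in for speech.
signal = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
frames = frames_with_window(signal)
print(len(frames), len(frames[0]))  # number of frames, samples per frame
```

Each windowed frame would then be reduced to a feature vector (e.g. MFCCs), and the resulting sequence of vectors is what the recurrent network processes step by step.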
27

Pulatov, Ilkhomjon, Rashid Oteniyazov, Fazliddin Makhmudov, and Young-Im Cho. "Enhancing Speech Emotion Recognition Using Dual Feature Extraction Encoders." Sensors 23, no. 14 (2023): 6640. http://dx.doi.org/10.3390/s23146640.

Abstract:
Understanding and identifying emotional cues in human speech is a crucial aspect of human–computer communication. The application of computer technology in dissecting and deciphering emotions, along with the extraction of relevant emotional characteristics from speech, forms a significant part of this process. The objective of this study was to architect an innovative framework for speech emotion recognition predicated on spectrograms and semantic feature transcribers, aiming to bolster performance precision by acknowledging the conspicuous inadequacies in extant methodologies and rectifying t
28

Fathy, Samar, Nahla El-Haggar, and Mohamed H. Haggag. "A Hybrid Model for Emotion Detection from Text." International Journal of Information Retrieval Research 7, no. 1 (2017): 32–48. http://dx.doi.org/10.4018/ijirr.2017010103.

Abstract:
Emotions can be judged by a combination of cues such as speech, facial expressions, and actions. Emotions are also articulated in text. This paper presents a new hybrid model for detecting emotion from text, which depends on ontology together with keyword semantic similarity. The text is labelled with one of the six basic Ekman emotion categories. The main idea is to extract an ontology from the input sentences and match it against an ontology base, built from simple ontologies and the emotion of each ontology. The ontology is extracted from the input sentence by using a triplet (subject, predicate, and object) ext…
29

Alsubai, Shtwai. "Emotion Detection Using Deep Normalized Attention-Based Neural Network and Modified-Random Forest." Sensors 23, no. 1 (2022): 225. http://dx.doi.org/10.3390/s23010225.

Abstract:
In the contemporary world, emotion detection of humans is procuring huge scope in extensive dimensions such as bio-metric security, HCI (human–computer interaction), etc. Such emotions could be detected from various means, such as information integration from facial expressions, gestures, speech, etc. Though such physical depictions contribute to emotion detection, EEG (electroencephalogram) signals have gained significant focus in emotion detection due to their sensitivity to alterations in emotional states. Hence, such signals could explore significant emotional state features. However, manu
30

Singh, Sukhpreet, Mohammad Nazmul Alam, and Suman Lata. "Facial Emotion Detection Using CNN-Based Neural Network." International Journal of Scientific Research in Engineering and Management 07, no. 10 (2023): 1–11. http://dx.doi.org/10.55041/ijsrem26437.

Abstract:
While computer vision aims to confound humans through the analysis of digital images, humans rely on their sensory perception to decipher emotions. Unlike the challenge of comprehending virtual images, assessing emotions in speech involves evaluating various aspects such as tone, volume, speed, and more. These novel techniques allow for the modulation of emotional "anxiety" in speech. In this model, our objective is to develop a convolutional neural network (CNN) version capable of segmenting input videos into seven distinct symbols representing emotions: anger, hatred, criticism, happiness, …
31

Sri Lalitha, Y., Althaf Hussain Basha Sk, and M. V. Aditya Nag. "Neural Network Modelling of Speech Emotion Detection." E3S Web of Conferences 309 (2021): 01139. http://dx.doi.org/10.1051/e3sconf/202130901139.

Abstract:
In making machines intelligent and enabling them to work like humans, speech recognition is one of the most essential requirements. Human language conveys various types of information, such as the energy, pitch, loudness, and rhythm in the sound, and the speech and its context, such as the gender, age, and emotion of the speaker. Identifying the emotion from a speech pattern is a challenging task and a most useful capability, especially in the era of widely developing speech recognition systems with digital assistants. Digital assistants like Bixby and the BlackBerry assistant are building products that consist of emotion…
32

Baral, Rojina, Sanjivan Satyal, and Anisha Pokhrel. "CNN-Transformer Based Speech Emotion Detection." Journal of Advanced College of Engineering and Management 10 (March 11, 2025): 135–45. https://doi.org/10.3126/jacem.v10i1.76324.

Abstract:
In this study, a parallel network technique trained on the Ryerson Audio-Visual Dataset of Speech and Song (RAVDESS) was used to perform an autonomous speech emotion recognition (SER) challenge to categorize four distinct emotions. To capture both spatial and temporal data, the architecture comprised attention-based networks with CNN-based networks that ran in tandem. Additive White Gaussian Noise (AWGN) was used as augmentation techniques for multiple folds to improve the model’s generalization. The model’s input was MFCC, which was created from the raw audio data. The MFCC were represented a
33

Venkateswarlu, Sonagiri China, Siva Ramakrishna Jeevakala, Naluguru Udaya Kumar, Pidugu Munaswamy, and Dhanalaxmi Pendyala. "Emotion Recognition From Speech and Text using Long Short-Term Memory." Engineering, Technology & Applied Science Research 13, no. 4 (2023): 11166–69. http://dx.doi.org/10.48084/etasr.6004.

Abstract:
Everyday interactions depend on more than just rational discourse; they also depend on emotional reactions. Having this information is crucial to making any kind of practical or even rational decision, as it can help to better understand one another by sharing our responses and providing recommendations on how they may feel. Several studies have recently begun to focus on emotion detection and labeling, proposing different methods for organizing feelings and detecting emotions in speech. Determining how emotions are conveyed through speech has been given major emphasis in social interactions d
34

Barua, Manya. "Decoding Emotions: Machine Learning Approach to Speech Emotion Recognition." International Journal of Scientific Research in Engineering and Management 08, no. 06 (2024): 1–5. http://dx.doi.org/10.55041/ijsrem36178.

Abstract:
Speech Emotion Recognition (SER) stands at the forefront of human-computer interaction, offering profound implications for fields such as healthcare, education, and entertainment. This project report delves into the application of Machine Learning (ML) techniques for SER, aiming to discern the emotional content of speech signals. The report begins with an overview of the significance of SER in various domains, emphasizing the need for accurate and robust emotion detection systems. Following this, a detailed exploration of the methodologies employed in SER is presented, encompassing feature ex…
35

Roy, Dharmendra Kumar, Naga Venkata Gopi Kumbha, Harender Sankhla, G. Teja Alex Raj, and Bashetty Akhilesh. "Deep Learning-Based Feature Extraction for Speech Emotion Recognition." International Journal of Engineering Technology and Management Sciences 8, no. 3 (2024): 166–74. http://dx.doi.org/10.46647/ijetms.2024.v08i03.020.

Abstract:
Emotion recognition from speech signals is an important and challenging component of Human-Computer Interaction. In the field of speech emotion recognition (SER), many techniques have been utilized to extract emotions from speech signals, including many well-established speech analysis and classification techniques. This model can be built by using various methods such as RNN, SVM, deep learning, cepstral coefficients, and various other methods, out of which SVM normally gives us the highest accuracy. We propose a model that can identify emotions present in the speech, which can be identified
36

Chandurkar, Swati S., Shailaja V. Pede, and Shailesh A. Chandurkar. "System for Prediction of Human Emotions and Depression level with Recommendation of Suitable Therapy." Asian Journal of Computer Science and Technology 6, no. 2 (2017): 5–12. http://dx.doi.org/10.51983/ajcst-2017.6.2.1787.

Full text
Abstract:
In today’s competitive world, an individual needs to act smartly and take rapid steps to secure a place in the competition. Young people outnumber the elderly and contribute substantially to the development of society. This paper presents a methodology to extract emotion from text in real time and to add expression to textual content during speech synthesis, using a corpus, an emotion recognition module, etc. Along with recognizing emotions from human textual data, the system will analyze various human body signals such as blo
APA, Harvard, Vancouver, ISO, and other styles
37

Anant Kaulage. "Facial Expression and Speech Pattern Computing using Modern Techniques of Computer Vision." Journal of Information Systems Engineering and Management 10, no. 5s (2025): 602–14. https://doi.org/10.52783/jisem.v10i5s.726.

Full text
Abstract:
Two crucial areas of research in the field of emotional computing are the recognition of facial emotion and of spoken emotion. In this study, we tackle the challenge of precisely identifying and analyzing emotions from speech and facial expressions. A fascinating problem in human-computer interaction, mental health monitoring, social robots, and other fields is the detection of emotions from facial expressions and verbal inputs. By combining different modalities, we can obtain a more complete picture of emotions by utilizing the complementary information included in facial expressions and speech pat
APA, Harvard, Vancouver, ISO, and other styles
38

Singla, Chaitanya, and Sukhdev Singh. "PEMO: A New Validated Dataset for Punjabi Speech Emotion Detection." International Journal on Recent and Innovation Trends in Computing and Communication 10, no. 10 (2022): 52–58. http://dx.doi.org/10.17762/ijritcc.v10i10.5734.

Full text
Abstract:
This research work presents a new validated dataset for Punjabi, the Punjabi Emotional Speech Database (PEMO), which has been developed to assess the ability of both computers and humans to recognize emotions in speech. PEMO includes speech samples from about 60 speakers aged between 20 and 45 years, covering four fundamental emotions: anger, sadness, happiness and neutral. To create the data, Punjabi films were retrieved from various multimedia websites such as YouTube. The movies were processed and transformed into utterances with the software PRAAT. The database c
APA, Harvard, Vancouver, ISO, and other styles
39

Makhmudov, Fazliddin, Alpamis Kultimuratov, and Young-Im Cho. "Enhancing Multimodal Emotion Recognition through Attention Mechanisms in BERT and CNN Architectures." Applied Sciences 14, no. 10 (2024): 4199. http://dx.doi.org/10.3390/app14104199.

Full text
Abstract:
Emotion detection holds significant importance in facilitating human–computer interaction, enhancing the depth of engagement. By integrating this capability, we pave the way for forthcoming AI technologies to possess a blend of cognitive and emotional understanding, bridging the divide between machine functionality and human emotional complexity. This progress has the potential to reshape how machines perceive and respond to human emotions, ushering in an era of empathetic and intuitive artificial systems. The primary research challenge involves developing models that can accurately interpret
APA, Harvard, Vancouver, ISO, and other styles
40

Taha, Thaer Mufeed, Zaineb Ben Messaoud, and Mondher Frikha. "Convolutional Neural Network Architectures for Gender, Emotional Detection from Speech and Speaker Diarization." International Journal of Interactive Mobile Technologies (iJIM) 18, no. 03 (2024): 88–103. http://dx.doi.org/10.3991/ijim.v18i03.43013.

Full text
Abstract:
This paper introduces three system architectures for speaker identification that aim to overcome the limitations of diarization and voice-based biometric systems. Diarization systems utilize unsupervised algorithms to segment audio data based on the time boundaries of utterances, but they do not distinguish individual speakers. On the other hand, voice-based biometric systems can only identify individuals in recordings with a single speaker. Identifying speakers in recordings of natural conversations can be challenging, especially when emotional shifts can alter voice characteristics, making g
APA, Harvard, Vancouver, ISO, and other styles
41

I., Venkata Dwaraka Srihith, and Ashok S. "The Voice of Feeling: Exploring Emotions via Machine Learning." Research and Reviews: Advancement in Cyber Security 1, no. 3 (2024): 1–16. https://doi.org/10.5281/zenodo.11171046.

Full text
Abstract:
This paper encapsulates the project ‘Emotion Detection from Voice with Machine Learning Techniques’, emphasizing the construction of a reliable system adept at discerning and delineating emotions conveyed in audio recordings. Employing sophisticated machine learning methodologies, notably Convolutional Neural Networks (CNNs), the endeavour seeks to furnish intricate emotion classification reliant on the inherent acoustic attributes of human speech. Through the fusion of machine learning prowess and acoustic analysis, the system endeavours to deliver nuanced insights into emotional exp
APA, Harvard, Vancouver, ISO, and other styles
42

Du, Peng. "Emotion detection in artistic creation: A multi-sensor fusion approach leveraging biomechanical cues and enhanced CNN models." Molecular & Cellular Biomechanics 22, no. 4 (2025): 989. https://doi.org/10.62617/mcb989.

Full text
Abstract:
Artistic creation is a means of expressing human emotions. To intuitively capture the emotions conveyed by the artist in their works, we propose an improved CNN-based emotion detection method that incorporates biomechanical elements. Recognizing that emotions are accompanied by physiological and biomechanical responses such as heart rate variations, facial muscle activity, and speech tone fluctuations, we collect and integrate multi-sensor data, including heart rate, facial expression, and verbal expression. This information is processed through a multi-sensor signals fusion method based on an
APA, Harvard, Vancouver, ISO, and other styles
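The multi-sensor fusion idea in the abstract above can be sketched as a simple weighted late fusion of per-modality class probabilities. The weights and probability values below are made-up placeholders, and the paper's actual fusion method is a learned enhanced-CNN model rather than a fixed average; this is only a minimal sketch of the late-fusion concept.

```python
def late_fusion(modality_probs, weights):
    """Weighted late fusion of per-modality class probabilities.

    modality_probs: one probability vector per sensor/modality
    (e.g. heart rate, facial expression, speech); weights express
    relative trust in each modality.  All values are hypothetical.
    """
    total = sum(weights)
    n_classes = len(modality_probs[0])
    # Per class, average the modality probabilities weighted by trust.
    return [
        sum(w * p[c] for w, p in zip(weights, modality_probs)) / total
        for c in range(n_classes)
    ]

# Hypothetical classifier outputs over (neutral, happy, sad),
# trusting the facial-expression modality twice as much:
probs = [[0.2, 0.7, 0.1], [0.1, 0.6, 0.3], [0.3, 0.5, 0.2]]
fused = late_fusion(probs, weights=[1.0, 2.0, 1.0])
print(max(range(3), key=lambda c: fused[c]))  # → 1 (happy)
```

Because the inputs are valid distributions and the weights are normalized, the fused vector is itself a valid probability distribution, which makes late fusion easy to chain with a downstream decision rule.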
43

Rajdeep, Bhoomi, Hardik B. ,. Patel, and Sailesh Iyer. "Human Emotion Identification from Speech using Neural Network." International Journal of Computers 16 (November 10, 2022): 87–103. http://dx.doi.org/10.46300/9108.2022.16.15.

Full text
Abstract:
This work detects the speaker’s mood and behavior through voice analysis, using voice frequency as the primary cue. We present a device that detects moods such as happy and sad, as well as behavior, using machine learning and artificial intelligence. The device detects frequency with a trained model and algorithm; the algorithm is well trained to catch the frequencies that identify whether the speaker’s mood is happy or sad, and the behavior. On the other hand, behavior can be predicted in form, it
APA, Harvard, Vancouver, ISO, and other styles
44

Burileanu, Dragoş, Șerban Mihalache, Valentin Andrei, Alexandru-Lucian Georgescu, Horia Cucu, and Corneliu Burileanu. "MACHINE LEARNING FOR SPOKEN LANGUAGE TECHNOLOGY." Annals of the Academy of Romanian Scientists Series on Science and Technology of Information 14, no. 1-2 (2021): 25–44. http://dx.doi.org/10.56082/annalsarsciinfo.2021.1-2.25.

Full text
Abstract:
Spoken language technology is one of the domains in which machine learning algorithms, and especially neural networks, are used nowadays. Several applications are presented in this paper: detecting overlapped speech on short time frames (down to 25 ms), emotion recognition from speech (including speech stress detection and deceptive speech detection), and the performance of the latest large-vocabulary continuous speech recognition systems for Romanian developed in the SpeeD Laboratory of the Research Institute “CAMPUS”, University POLITEHNICA of Bucharest
APA, Harvard, Vancouver, ISO, and other styles
45

S*, Manisha, Nafisa H. Saida, Nandita Gopal, and Roshni P. Anand. "Bimodal Emotion Recognition using Machine Learning." International Journal of Engineering and Advanced Technology 10, no. 4 (2021): 189–94. http://dx.doi.org/10.35940/ijeat.d2451.0410421.

Full text
Abstract:
The predominant communication channel conveying relevant and high-impact information is the emotion embedded in our communications. Researchers have tried to exploit these emotions in recent years for human-robot interaction (HRI) and human-computer interaction (HCI). Emotion recognition through speech or through facial expression is termed single-mode emotion recognition. The accuracy of these single-mode emotion recognizers is improved by the proposed bimodal method, which combines the modalities of speech and face and recognizes emotions using a Convolutional N
APA, Harvard, Vancouver, ISO, and other styles
46

Manisha, S., Saida H. Nafisa, Gopal Nandita, and P. Anand Roshni. "Bimodal Emotion Recognition using Machine Learning." International Journal of Engineering and Advanced Technology (IJEAT) 10, no. 4 (2021): 189–94. https://doi.org/10.35940/ijeat.D2451.0410421.

Full text
Abstract:
The predominant communication channel conveying relevant and high-impact information is the emotion embedded in our communications. Researchers have tried to exploit these emotions in recent years for human-robot interaction (HRI) and human-computer interaction (HCI). Emotion recognition through speech or through facial expression is termed single-mode emotion recognition. The accuracy of these single-mode emotion recognizers is improved by the proposed bimodal method, which combines the modalities of speech and face and recognizes emotions using a Convolutional N
APA, Harvard, Vancouver, ISO, and other styles
47

Tomar, Divya, Divya Ojha, and Sonali Agarwal. "An Emotion Detection System Based on Multi Least Squares Twin Support Vector Machine." Advances in Artificial Intelligence 2014 (December 23, 2014): 1–11. http://dx.doi.org/10.1155/2014/282659.

Full text
Abstract:
Posttraumatic stress disorder (PTSD), bipolar manic disorder (BMD), obsessive compulsive disorder (OCD), depression, and suicide are some major problems existing in civilian and military life. The change in emotion is responsible for such type of diseases. So, it is essential to develop a robust and reliable emotion detection system which is suitable for real world applications. Apart from healthcare, importance of automatically recognizing emotions from human speech has grown with the increasing role of spoken language interfaces in human-computer interaction applications. Detection of emotio
APA, Harvard, Vancouver, ISO, and other styles
48

Dsouza, Prof Martina, Rohan Adhav, Shivam Dubey, and Sachin Dwivedi. "Speech Based Emotion Detection Using Deep Learning." International Journal for Research in Applied Science and Engineering Technology 10, no. 3 (2022): 2282–91. http://dx.doi.org/10.22214/ijraset.2022.41099.

Full text
Abstract:
Automated Speech Emotion Recognition is a difficult task because of the gap between acoustic characteristics and human emotions, and it depends strongly on the discriminative acoustic features extracted for a given recognition task. Different people have different emotions and altogether different ways of expressing them. Speech emotions carry different energies, and pitch variations are emphasized when considering different subjects. Hence, speech emotion detection is a demanding task in computer vision. Her
APA, Harvard, Vancouver, ISO, and other styles
49

Lim, Jia Zheng, James Mountstephens, and Jason Teo. "Emotion Recognition Using Eye-Tracking: Taxonomy, Review and Current Challenges." Sensors 20, no. 8 (2020): 2384. http://dx.doi.org/10.3390/s20082384.

Full text
Abstract:
The ability to detect users’ emotions for the purpose of emotion engineering is currently one of the main endeavors of machine learning in affective computing. Among the more common approaches to emotion detection are methods that rely on electroencephalography (EEG), facial image processing and speech inflections. Although eye-tracking is fast in becoming one of the most commonly used sensor modalities in affective computing, it is still a relatively new approach for emotion detection, especially when it is used exclusively. In this survey paper, we present a review on emotion recognition usi
APA, Harvard, Vancouver, ISO, and other styles
50

Özer, İlyas. "Biologically-Inspired Speech Emotion Recognition Using Rate Map Representations: An Application to the ShEMO Persian Speech Database." Aintelia Science Notes 2, no. 1 (2023): 24–31. https://doi.org/10.5281/zenodo.10396163.

Full text
Abstract:
This paper presents an innovative Speech Emotion Recognition (SER) model, inspired by the human auditory system, for analyzing and interpreting emotions in speech. Our proposed model utilizes a rate map representation to encode the spectro-temporal characteristics of auditory nerve activity, closely mimicking the intricate processes of human auditory perception. This model comprises several stages: pre-emphasis of the audio signal, cochlear filtering using a Gammatone Filter bank (GTF), neuromechanical transduction modeled by the Dau inner hair cell model, and the asse
APA, Harvard, Vancouver, ISO, and other styles
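The Gammatone Filter bank (GTF) stage mentioned in the abstract above can be sketched in a few lines. The snippet builds the impulse response of a single fourth-order gammatone filter using the common Glasberg–Moore ERB bandwidth convention; these parameter values are a standard textbook choice and may differ from the paper's exact settings.

```python
import math

def gammatone_ir(fc, fs=16000, dur=0.025, order=4):
    """Impulse response of an order-4 gammatone filter at centre
    frequency fc (Hz): t^(n-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t).

    Bandwidth b follows the Glasberg-Moore ERB formula, a common
    assumption; a full rate map convolves a bank of such filters
    with the speech signal and pools the rectified outputs.
    """
    erb = 24.7 * (4.37 * fc / 1000.0 + 1.0)  # equivalent rectangular bandwidth
    b = 1.019 * erb                           # gammatone bandwidth scaling
    n = int(dur * fs)                         # length in samples
    return [
        (t / fs) ** (order - 1)
        * math.exp(-2 * math.pi * b * t / fs)
        * math.cos(2 * math.pi * fc * t / fs)
        for t in range(n)
    ]

ir = gammatone_ir(1000.0)
print(len(ir))  # → 400 samples (25 ms at 16 kHz)
```

Convolving the input signal with one such impulse response per centre frequency, then half-wave rectifying and frame-averaging the outputs, yields the rate map representation the paper describes.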