Academic literature on the topic 'Speech Emotion Recognition (SER)'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Speech Emotion Recognition (SER).'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "Speech Emotion Recognition (SER)"

1

Swethashree, A. "Speech Emotion Recognition." International Journal for Research in Applied Science and Engineering Technology 9, no. 8 (2021): 2637–40. http://dx.doi.org/10.22214/ijraset.2021.37375.

Full text
Abstract:
Speech Emotion Recognition, abbreviated as SER, is the act of trying to identify a person's feelings and affective state from speech. It rests on the fact that tone and pitch of voice often reflect underlying emotion. Emotion recognition has been a fast-growing field of research in recent years. Unlike humans, machines do not have the power to comprehend and express emotions, but human communication with the computer can be improved by using automatic emotion recognition, accordingly reducing the need for human intervention. In this project, basic emotions such as calm,
APA, Harvard, Vancouver, ISO, and other styles
2

Venkateswarlu, S. China. "Speech Emotion Recognition using Machine Learning." International Journal of Scientific Research in Engineering and Management 09, no. 05 (2025): 1–9. https://doi.org/10.55041/ijsrem48705.

Full text
Abstract:
Speech signals are considered one of the most effective means of communication between human beings. Many researchers have devised different methods or systems to identify emotions from speech signals. Here, various features of speech are used to classify emotions. Features like pitch, tone, and intensity are essential for classification. A large number of datasets are available for speech emotion recognition. First, features are extracted from the emotional speech, and then another important part is the classification of emotions based upon speech. Hence, different classif
APA, Harvard, Vancouver, ISO, and other styles
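Entry 2 above outlines the classic SER pipeline: extract pitch, tone, and intensity features and train a conventional classifier. As a hedged illustration (not the paper's own system), the sketch below uses librosa for feature extraction and a scikit-learn SVM; the file list and labels are placeholders.

```python
import numpy as np
import librosa
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

def extract_features(path, sr=16000):
    """Return a fixed-length vector of pitch, intensity, and MFCC statistics."""
    y, sr = librosa.load(path, sr=sr)
    f0, _, _ = librosa.pyin(y, fmin=librosa.note_to_hz("C2"),
                            fmax=librosa.note_to_hz("C7"), sr=sr)   # pitch contour
    rms = librosa.feature.rms(y=y)[0]                               # intensity proxy
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)              # spectral envelope
    return np.hstack([np.nanmean(f0), np.nanstd(f0),
                      rms.mean(), rms.std(),
                      mfcc.mean(axis=1), mfcc.std(axis=1)])

# `files` and `labels` are hypothetical lists of emotion-labelled clip paths and classes.
# X = np.vstack([extract_features(f) for f in files])
# X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.2, random_state=0)
# clf = SVC(kernel="rbf").fit(X_tr, y_tr)
# print(clf.score(X_te, y_te))
```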
3

Narendra, M., and Lankala Suvarchala. "An Enhanced Human Speech Based Emotion Recognition." International Journal of Scientific Research in Science and Technology 11, no. 3 (2024): 518–28. http://dx.doi.org/10.32628/ijsrst24113128.

Full text
Abstract:
Speech Emotion Recognition (SER) is a Machine Learning (ML) topic that has attracted substantial attention from researchers, particularly in the field of emotional computing. This is because of its growing potential, improvements in algorithms, and real-world applications. Pitch, intensity, and Mel-Frequency Cepstral Coefficients (MFCC) are examples of quantitative variables that can be used to represent the paralinguistic information found in human speech. The three main processes of data processing, feature selection/extraction, and classification based on the underlying emotional traits are
APA, Harvard, Vancouver, ISO, and other styles
4

Sri Murugharaj B R, Shakthy B, Sabari L, and Kamaraj K. "Speech Based Emotion Recognition System." International Journal of Engineering Technology and Management Sciences 7, no. 1 (2023): 332–37. http://dx.doi.org/10.46647/ijetms.2023.v07i01.050.

Full text
Abstract:
Emotion recognition from speech signals is a crucial yet difficult part of human-computer interaction (HCI). Several well-known speech analysis and classification methods have been employed in the literature on speech emotion recognition (SER) to extract emotions from signals. Deep learning algorithms have recently been proposed as an alternative to conventional ones for SER. We develop an SER system based on different classifiers and feature extraction techniques. Features extracted from the speech signals are used to train the different classifiers. To identify the most appropriate cha
APA, Harvard, Vancouver, ISO, and other styles
5

Abbaschian, Babak Joze, Daniel Sierra-Sosa, and Adel Elmaghraby. "Deep Learning Techniques for Speech Emotion Recognition, from Databases to Models." Sensors 21, no. 4 (2021): 1249. http://dx.doi.org/10.3390/s21041249.

Full text
Abstract:
The advancements in neural networks and the on-demand need for accurate and near real-time Speech Emotion Recognition (SER) in human–computer interactions make it mandatory to compare available methods and databases in SER to achieve feasible solutions and a firmer understanding of this open-ended problem. The current study reviews deep learning approaches for SER with available datasets, followed by conventional machine learning techniques for speech emotion recognition. Ultimately, we present a multi-aspect comparison between practical neural network approaches in speech emotion recognition.
APA, Harvard, Vancouver, ISO, and other styles
6

Samyuktha S and Sarwath Unnisa. "Emotional Speech Recognition using CNN model." International Journal of Information Technology, Research and Applications 4, no. 1 (2025): 30–38. https://doi.org/10.59461/ijitra.v4i1.164.

Full text
Abstract:
Speech Emotion Recognition (SER) is a new area of artificial intelligence that deals with recognizing human emotions from speech signals. Emotions are an important aspect of communication, affecting social interactions and decision-making processes. This paper introduces a complete SER system that uses state-of-the-art deep learning methods to recognize emotions like Happy, Sad, Angry, Neutral, Surprise, Calm, Fear, and Disgust. The suggested model uses Mel-Spectrograms, MFCCs, and Chroma features for efficient feature extraction. Convolutional layers are utilized to capture complex patterns i
APA, Harvard, Vancouver, ISO, and other styles
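Entry 6 names three acoustic feature types (mel-spectrograms, MFCCs, chroma). The sketch below, a rough illustration rather than the authors' code, computes them with librosa on a synthetic tone standing in for a speech clip; stacking them frame-wise is our own choice.

```python
import numpy as np
import librosa

sr = 22050
y = librosa.tone(440, sr=sr, duration=2.0)          # synthetic stand-in for a speech clip

mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=64)
log_mel = librosa.power_to_db(mel)                  # log-mel spectrogram
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)
chroma = librosa.feature.chroma_stft(y=y, sr=sr)    # 12 chroma bins

# Frame-wise stacking yields a (64 + 20 + 12) x T matrix that a CNN or recurrent
# model could consume as its input representation.
features = np.vstack([log_mel, mfcc, chroma])
print(features.shape)
```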
7

Setyono, Jonathan Christian, and Amalia Zahra. "Data augmentation and enhancement for multimodal speech emotion recognition." Bulletin of Electrical Engineering and Informatics 12, no. 5 (2023): 3008–15. http://dx.doi.org/10.11591/eei.v12i5.5031.

Full text
Abstract:
Interaction with one another, such as through conversation or speech, is a fundamental human need. It is therefore crucial to analyze speech with computer technology to determine emotions. The speech emotion recognition (SER) method detects emotions in speech by examining various aspects; SER is a supervised method for deciding the emotion class of an utterance. This research proposed a multimodal SER model using one of the deep learning based enhancement techniques, namely the attention mechanism. Additionally, this research addresses the imbalanced dataset problem in the SER field using generative adv
APA, Harvard, Vancouver, ISO, and other styles
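Entry 7 tackles imbalanced SER data with GAN-based augmentation. As a much simpler stand-in (plainly not the paper's method), the sketch below shows basic waveform-level augmentation (additive noise, time-stretching, and pitch-shifting) commonly used to oversample minority emotion classes.

```python
import numpy as np
import librosa

def augment(y, sr):
    """Yield simple perturbed copies of a waveform, e.g. to oversample a minority class."""
    yield y + 0.005 * np.random.randn(len(y))               # additive Gaussian noise
    yield librosa.effects.time_stretch(y, rate=0.9)         # slightly slower tempo
    yield librosa.effects.pitch_shift(y, sr=sr, n_steps=2)  # pitch shifted up two semitones

sr = 16000
y = librosa.tone(300, sr=sr, duration=1.0)                  # stand-in for a labelled clip
extra_samples = list(augment(y, sr))
print(len(extra_samples))
```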
8

Harikant, Shashidhar, Rakshitha Prasad, Vijaya Lakshmi R, and Sidhramappa H. "Speech Emotion Recognition Using Deep Learning." International Research Journal of Computer Science 9, no. 8 (2022): 267–71. http://dx.doi.org/10.26562/irjcs.2022.v0908.22.

Full text
Abstract:
Speech Emotion Recognition is a current research topic since it has a wide range of applications. SER is a vital part of effective human interaction in speech processing. Speech emotion recognition is a domain that has grown rapidly in recent years. Unlike humans, machines lack the potential to perceive and express emotions, but human-computer interaction can be improved by automated SER, thereby reducing the need for human mediation. The primary goal of SER is to improve the man-machine interface. This paper covers Deep Learning to train the model, L
APA, Harvard, Vancouver, ISO, and other styles
9

Gawali, Swayam. "Audio Aura - Speech Emotion Recognition System." International Journal for Research in Applied Science and Engineering Technology 13, no. 4 (2025): 7082–88. https://doi.org/10.22214/ijraset.2025.70092.

Full text
Abstract:
Speech emotion recognition (SER) plays a crucial role in human-computer interaction, enabling systems to interpret and respond to user emotions effectively. In this research, we introduce Audio Aura, a machine learning-based system for voice signal emotion classification. To improve classification accuracy and extract rich speech representations, the system uses a transformer-based model called Wav2Vec2. By leveraging Wav2Vec
APA, Harvard, Vancouver, ISO, and other styles
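Entry 9 classifies emotions from Wav2Vec2 representations. Assuming the HuggingFace transformers and torch packages, the sketch below extracts frame-level Wav2Vec2 features, mean-pools them, and attaches an untrained linear head; the checkpoint name and the eight-class head are illustrative assumptions, not details from the paper.

```python
import torch
from transformers import AutoFeatureExtractor, Wav2Vec2Model

extractor = AutoFeatureExtractor.from_pretrained("facebook/wav2vec2-base")
encoder = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")

waveform = torch.randn(16000)                        # stand-in for 1 s of 16 kHz audio
inputs = extractor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    hidden = encoder(**inputs).last_hidden_state     # (1, frames, 768)

utterance_embedding = hidden.mean(dim=1)             # simple mean pooling over frames
logits = torch.nn.Linear(768, 8)(utterance_embedding)  # hypothetical 8 emotion classes
print(logits.shape)
```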
10

Kumar, Balbant. "Speech Emotion Recognition using CNN." International Journal of Scientific Research in Engineering and Management 09, no. 04 (2025): 1–9. https://doi.org/10.55041/ijsrem45881.

Full text
Abstract:
Speech Emotion Recognition (SER) is a growing area in affective computing that aims to detect and understand human emotions through speech signals. It finds extensive use in human-computer interaction, virtual assistants, mental health tracking, and automating customer service. This project introduced a deep learning method for SER utilizing Convolutional Neural Networks (CNNs). The system extracts mel-frequency cepstral coefficients (MFCCs) and spectrograms from raw audio inputs and transforms speech signals into two-dimensional images. These images were then processed by a CNN frame
APA, Harvard, Vancouver, ISO, and other styles
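Entry 10 treats MFCC/spectrogram matrices as two-dimensional images classified by a CNN. The PyTorch sketch below illustrates that idea only; layer sizes, input dimensions, and the eight emotion classes are assumptions rather than the paper's architecture.

```python
import torch
import torch.nn as nn

class SpectrogramCNN(nn.Module):
    def __init__(self, n_classes=8):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        self.classifier = nn.Linear(32 * 4 * 4, n_classes)

    def forward(self, x):                 # x: (batch, 1, n_mfcc, frames)
        h = self.features(x)
        return self.classifier(h.flatten(1))

model = SpectrogramCNN()
dummy = torch.randn(4, 1, 40, 200)        # 4 clips, 40 MFCC bins, 200 frames
print(model(dummy).shape)                 # torch.Size([4, 8])
```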
More sources

Dissertations / Theses on the topic "Speech Emotion Recognition (SER)"

1

Rintala, Jonathan. "Speech Emotion Recognition from Raw Audio using Deep Learning." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-278858.

Full text
Abstract:
Traditionally, in Speech Emotion Recognition, models require a large number of manually engineered features and intermediate representations such as spectrograms for training. However, to hand-engineer such features often requires both expert domain knowledge and resources. Recently, with the emerging paradigm of deep-learning, end-to-end models that extract features themselves and learn from the raw speech signal directly have been explored. A previous approach has been to combine multiple parallel CNNs with different filter lengths to extract multiple temporal features from the audio signal,
APA, Harvard, Vancouver, ISO, and other styles
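The thesis above combines parallel CNNs with different filter lengths applied to the raw waveform. The sketch below is a hedged approximation of that idea with arbitrary kernel sizes, strides, and channel counts, not the thesis's actual network.

```python
import torch
import torch.nn as nn

class ParallelRawConv(nn.Module):
    """Parallel 1D convolutions with different kernel sizes over the raw waveform."""
    def __init__(self, kernel_sizes=(3, 9, 27), channels=16):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(nn.Conv1d(1, channels, k, stride=4, padding=k // 2), nn.ReLU())
            for k in kernel_sizes
        )

    def forward(self, wav):                      # wav: (batch, 1, samples)
        outs = [branch(wav) for branch in self.branches]
        return torch.cat(outs, dim=1)            # concatenate the temporal feature maps

wav = torch.randn(2, 1, 16000)                   # two 1-second clips at 16 kHz
print(ParallelRawConv()(wav).shape)
```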

Books on the topic "Speech Emotion Recognition (SER)"

1

Hess, Wolfgang, and Walter F. Sendlmeier, eds. Speech and Signals: Aspects of Speech Synthesis and Automatic Speech Recognition: Dedicated to Wolfgang Hess on His 60th Birthday. T. Hector, 2000.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
2

Lea, Robert N., James Villarreal, and United States National Aeronautics and Space Administration, Scientific and Technical Information Program, eds. Proceedings of the Second Joint Technology Workshop on Neural Networks and Fuzzy Logic: Proceedings of a workshop sponsored by the National Aeronautics and Space Administration ... and cosponsored by Lyndon B. Johnson Space Center and the University of Houston, Clear Lake, Houston, Texas, April 10-13, 1990. National Aeronautics and Space Administration, Office of Management, Scientific and Technical Information Program, 1991.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
3

Leibo, Joel Z., and Tomaso Poggio. Perception. Oxford University Press, 2018. http://dx.doi.org/10.1093/oso/9780199674923.003.0025.

Full text
Abstract:
This chapter provides an overview of biological perceptual systems and their underlying computational principles, focusing on the sensory sheets of the retina and cochlea and exploring how complex feature detection emerges by combining simple feature detectors in a hierarchical fashion. We also explore how the microcircuits of the neocortex implement such schemes, pointing out similarities to progress in the field of machine vision driven by deep learning algorithms. We see signs that engineered systems are catching up with the brain. For example, vision-based pedestrian detection systems are now a
APA, Harvard, Vancouver, ISO, and other styles
4

Hilgurt, S. Ya, and O. A. Chemerys. Reconfigurable signature-based information security tools of computer systems. PH “Akademperiodyka”, 2022. http://dx.doi.org/10.15407/akademperiodyka.458.297.

Full text
Abstract:
The book is devoted to the research and development of methods for combining computational structures for reconfigurable signature-based information protection tools for computer systems and networks in order to increase their efficiency. Network security tools based, among others, on such AI-based approaches as deep neural networking, despite the great progress shown in recent years, still suffer from nonzero recognition error probability. Even a low probability of such an error in a critical infrastructure can be disastrous. Therefore, signature-based recognition methods with their theoretic
APA, Harvard, Vancouver, ISO, and other styles
5

Kasabov, Nikola. Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering. The MIT Press, 1996. http://dx.doi.org/10.7551/mitpress/3071.001.0001.

Full text
Abstract:
In a clear and accessible style, Kasabov describes rule-based and connectionist techniques and then their combinations, with fuzzy logic included, showing the application of the different techniques to a set of simple prototype problems, which makes comparisons possible. A particularly strong feature of the text is that it is filled with applications in engineering, business, and finance. AI problems that cover most of the application-oriented research in the field (pattern recognition, speech and image processing, classification, planning, optimization, prediction, control, decision making, a
APA, Harvard, Vancouver, ISO, and other styles
6

Bleakley, Chris. Poems That Solve Puzzles. Oxford University Press, 2020. http://dx.doi.org/10.1093/oso/9780198853732.001.0001.

Full text
Abstract:
Algorithms are the hidden methods that computers apply to process information and make decisions. The book tells the story of algorithms from their ancient origins to the present day and beyond. The book introduces readers to the inventors and events behind the genesis of the world’s most important algorithms. Along the way, it explains, with the aid of examples and illustrations, how the most influential algorithms work. The first algorithms were invented in Mesopotamia 4,000 years ago. The ancient Greeks refined the concept, creating algorithms for finding prime numbers and enumerating Pi. A
APA, Harvard, Vancouver, ISO, and other styles
7

Crespo Miguel, Mario. Automatic Corpus-Based Translation of a Spanish FrameNet Medical Glossary. Editorial Universidad de Sevilla, 2020. http://dx.doi.org/10.12795/9788447230051.

Full text
Abstract:
Computational linguistics is the scientific study of language from a computational perspective. Its aim is to provide computational models of natural language processing (NLP) and incorporate them into practical applications such as speech synthesis, speech recognition, automatic translation, and many others where automatic processing of language is required. The use of good linguistic resources is crucial for the development of computational linguistics systems. Real-world applications need resources which systematize the way linguistic information is structured in a certain language. There is
APA, Harvard, Vancouver, ISO, and other styles

Book chapters on the topic "Speech Emotion Recognition (SER)"

1

SaiSree, Rampelly, Battula Pranavi, Chandhu Pullannagari, N. Srinivasa Reddy, and C. N. Sujatha. "Speech Emotion Recognition (SER) on Live Calls While Creating Events." In Advances in Computational Intelligence and Its Applications. CRC Press, 2024. http://dx.doi.org/10.1201/9781003488682-23.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Patil, Ashvini, Krishnanjan Bhattacharjee, Archana Chougule, and Swati Mehta. "LSTM-Based Speech Emotion Recognition (SER) for Analyzing Patient’s Verbal Feedback." In Lecture Notes in Networks and Systems. Springer Nature Singapore, 2025. https://doi.org/10.1007/978-981-97-7190-5_7.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Palo, Hemanta Kumar, Debasis Behera, and Bikash Chandra Rout. "Comparison of Classifiers for Speech Emotion Recognition (SER) with Discriminative Spectral Features." In Lecture Notes in Networks and Systems. Springer Singapore, 2020. http://dx.doi.org/10.1007/978-981-15-2774-6_10.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Hafeez, Yasir, Syed Hasan Adil, Mansoor Ebrahim, and Mirzan Izzfitri Bin Mahadir. "Speech Emotion Recognition-Based Music Recommender." In Advances in Computational Intelligence and Robotics. IGI Global, 2025. https://doi.org/10.4018/979-8-3693-9057-3.ch008.

Full text
Abstract:
The objective of this chapter is to provide a detailed implementation of a research project: song recommendation based on speech emotions through Speech Emotion Recognition (SER). This involves developing a Speech Emotion Recognition model utilizing neural network algorithms or deep learning techniques. The selected algorithms include a Convolutional Neural Network (CNN), a Long Short-Term Memory (LSTM) network, a Dense Neural Network (DNN), and a custom hybrid algorithm combining CNN and LSTM. A PyQT5 application framework was implemented to facilitate song recommendations. Users can record the
APA, Harvard, Vancouver, ISO, and other styles
5

Deeb, Bashar M., Andrey Savchenko, and Ilya Makarov. "CA-SER: Cross-Attention Feature Fusion for Speech Emotion Recognition." In Frontiers in Artificial Intelligence and Applications. IOS Press, 2024. http://dx.doi.org/10.3233/faia241034.

Full text
Abstract:
In this paper, we introduce a novel tool for speech emotion recognition, CA-SER, that borrows self-supervised learning to extract semantic speech representations from a pre-trained wav2vec 2.0 model and combine them with spectral audio features to improve speech emotion recognition. Our approach involves a self-attention encoder on MFCC features to capture meaningful patterns in audio sequences. These MFCC features are combined with high-level representations using a multi-head cross-attention mechanism. Evaluation of speech emotion recognition on the IEMOCAP dataset shows that our system achi
APA, Harvard, Vancouver, ISO, and other styles
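The CA-SER chapter above fuses MFCC-derived features with wav2vec 2.0 representations via multi-head cross-attention. The sketch below shows the generic cross-attention pattern that description implies; all dimensions and projections are illustrative assumptions, not the authors' model.

```python
import torch
import torch.nn as nn

d_model, n_heads = 256, 4
mfcc_proj = nn.Linear(40, d_model)       # project 40-dim MFCC frames
w2v_proj = nn.Linear(768, d_model)       # project wav2vec 2.0 hidden states
cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

mfcc_frames = torch.randn(2, 200, 40)    # (batch, frames, mfcc dim)
w2v_frames = torch.randn(2, 150, 768)    # (batch, frames, wav2vec 2.0 dim)

q = mfcc_proj(mfcc_frames)
kv = w2v_proj(w2v_frames)
fused, _ = cross_attn(q, kv, kv)         # queries from MFCC, keys/values from wav2vec 2.0
utterance = fused.mean(dim=1)            # pooled representation for an emotion head
print(utterance.shape)                   # torch.Size([2, 256])
```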
6

Yang Ningning and Shi Fuqian. "Speech Emotion Recognition Based on Back Propagation Neural Network." In Frontiers in Artificial Intelligence and Applications. IOS Press, 2019. https://doi.org/10.3233/978-1-61499-939-3-216.

Full text
Abstract:
Speech Emotion Recognition (SER) has become a hot topic recently. In this paper, Back Propagation Neural Network (BPNN) was used as a training system for SER classification, and four emotional speeches of the German Berlin Emotional Database (EMO-DB) were selected as the experimental data-set. The recognition accuracy was compared under different number of nodes in the hidden layer, and the best classification model was determined by combining the training time and the mean squared error (MSE). The experimental results showed that when the number of nodes in the hidden layer is 14, the MSE is
APA, Harvard, Vancouver, ISO, and other styles
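The chapter above sweeps the number of hidden-layer nodes in a back-propagation network trained on EMO-DB. The sketch below re-creates that kind of sweep with scikit-learn's MLPClassifier on random placeholder features; it is a minimal analogue, not the authors' setup.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

X = np.random.randn(400, 40)                       # stand-in acoustic feature vectors
y = np.random.randint(0, 4, size=400)              # four emotion classes, as in the chapter
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Compare held-out accuracy for different hidden-layer sizes.
for hidden in (8, 10, 12, 14, 16):
    clf = MLPClassifier(hidden_layer_sizes=(hidden,), max_iter=500, random_state=0)
    clf.fit(X_tr, y_tr)
    print(hidden, "accuracy:", round(clf.score(X_te, y_te), 3))
```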
7

Li, Xiaoke, and Zufan Zhang. "SF-SER: An Efficient Speech-Only Model with Semantic Funnel for Speech Emotion Recognition." In Frontiers in Artificial Intelligence and Applications. IOS Press, 2024. http://dx.doi.org/10.3233/faia240929.

Full text
Abstract:
In practical life, the transcription of speech into text via automatic speech recognition (ASR) models has become very common due to the essential semantic information contained in speech. However, in speech emotion recognition, multimodal models that combine speech and text significantly outperform speech-only models. For this phenomenon, this paper provides an explanation that the existing speech emotion datasets are insufficient for speech-only models to effectively extract crucial semantic information, thereby affecting generalization capability. Based on this explanation, this paper propo
APA, Harvard, Vancouver, ISO, and other styles
8

Gu Yu, Postma Eric, Lin Hai-Xiang, and van den Herik Jaap. "Speech Emotion Recognition Using Voiced Segment Selection Algorithm." In Frontiers in Artificial Intelligence and Applications. IOS Press, 2016. https://doi.org/10.3233/978-1-61499-672-9-1682.

Full text
Abstract:
Speech emotion recognition (SER) poses one of the major challenges in human-machine interaction. We propose a new algorithm, the Voiced Segment Selection (VSS) algorithm, which can produce an accurate segmentation of speech signals. The VSS algorithm treats the voiced signal segment as a texture-image feature, which differs from the traditional method. It uses Log-Gabor filters to extract voiced and unvoiced features from the spectrogram to make the classification. The findings show that the VSS method is a more accurate algorithm for voiced segment detection. Therefor
APA, Harvard, Vancouver, ISO, and other styles
9

Fu, Chen, Yang Yu, Zhiqiang Zhang, Wei Weng, and Jinjia Zhou. "Dynamic Talking-Head Generation with Speech Emotion Recognition and Intensity Detection." In Frontiers in Artificial Intelligence and Applications. IOS Press, 2024. http://dx.doi.org/10.3233/faia231260.

Full text
Abstract:
In recent years, speech emotion recognition (SER) and emotive talking head generation have drawn significant attention. However, existing methods either focus predominantly on emotion recognition without considering its intensity or produce emotionally consistent talking heads, neglecting the rich expressiveness of human interactions. Addressing these limitations, our study aims to bridge these two areas, proposing an innovative talking-head generation approach that further enhances the realism of the generated video. Within the SER domain, we develop a dual-level model that not only improves
APA, Harvard, Vancouver, ISO, and other styles
10

Echim, Sebastian-Vasile, Răzvan-Alexandru Smădu, and Dumitru-Clementin Cercel. "Benchmarking Adversarial Robustness in Speech Emotion Recognition: Insights into Low-Resource Romanian and German Languages." In Frontiers in Artificial Intelligence and Applications. IOS Press, 2024. http://dx.doi.org/10.3233/faia240774.

Full text
Abstract:
Therapy, interviews, and emergency services assisted by artificial intelligence (AI) are applications where speech emotion recognition (SER) plays an essential role, for which performance and robustness are subject to improvement. Deep learning approaches have proven effective in SER; nevertheless, they can underperform when exposed to adversarial attacks. In this paper, we explore and enhance architectures, such as convolutional neural networks with long short-term memory (CNN-LSTM), AlexNet, VGG16, Convolutional Vision Transformer (CvT), Vision Transformer (ViT), and LeViT, by finding the su
APA, Harvard, Vancouver, ISO, and other styles
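The paper above benchmarks SER models under adversarial attacks. As a generic, hedged illustration (not the authors' attack suite), the sketch below applies a fast-gradient-sign-method perturbation to the input features of an arbitrary differentiable model.

```python
import torch
import torch.nn as nn

def fgsm_perturb(model, x, target, epsilon=0.01):
    """Return an adversarially perturbed copy of x using the fast gradient sign method."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x_adv), target)
    loss.backward()
    return (x_adv + epsilon * x_adv.grad.sign()).detach()

# Toy linear "SER model" over flattened spectrogram features, for illustration only.
model = nn.Linear(64 * 100, 6)
x = torch.randn(8, 64 * 100)
target = torch.randint(0, 6, (8,))
x_adv = fgsm_perturb(model, x, target)
print((x_adv - x).abs().max())           # perturbation magnitude bounded by epsilon
```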

Conference papers on the topic "Speech Emotion Recognition (SER)"

1

Chou, Huang-Cheng. "A Tiny Whisper-SER: Unifying Automatic Speech Recognition and Multi-label Speech Emotion Recognition Tasks." In 2024 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC). IEEE, 2024. https://doi.org/10.1109/apsipaasc63619.2025.10848651.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Gomes, Vanessa M., Ana Patrícia F. M. Mascarenhas, Ivanoé J. Rodowanski, et al. "Using Speech and Text in Emotions Recognition." In 2024 Brazilian Symposium on Robotics (SBR), and 2024 Workshop on Robotics in Education (WRE). IEEE, 2024. https://doi.org/10.1109/sbr/wre63066.2024.10838143.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Teng, Haoqun. "SED-Net: Speedy Encoding-Decoding Network for Artistic Style Transfer." In 2024 7th International Conference on Pattern Recognition and Artificial Intelligence (PRAI). IEEE, 2024. https://doi.org/10.1109/prai62207.2024.10827207.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Oh, Qi Qi, Chee Kiat Seow, Mulliana Yusuff, Sugiri Pranata, and Qi Cao. "The Impact of Face Mask and Emotion on Automatic Speech Recognition (ASR) and Speech Emotion Recognition (SER)." In 2023 8th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA). IEEE, 2023. http://dx.doi.org/10.1109/icccbda56900.2023.10154691.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Li, Runnan, Zhiyong Wu, Jia Jia, Yaohua Bu, Sheng Zhao, and Helen Meng. "Towards Discriminative Representation Learning for Speech Emotion Recognition." In Twenty-Eighth International Joint Conference on Artificial Intelligence {IJCAI-19}. International Joint Conferences on Artificial Intelligence Organization, 2019. http://dx.doi.org/10.24963/ijcai.2019/703.

Full text
Abstract:
In intelligent speech interaction, automatic speech emotion recognition (SER) plays an important role in understanding user intention. While sentimental speech has different speaker characteristics but similar acoustic attributes, one vital challenge in SER is how to learn robust and discriminative representations for emotion inference. In this paper, inspired by human emotion perception, we propose a novel representation learning component (RLC) for the SER system, which is constructed with Multi-head Self-attention and Global Context-aware Attention Long Short-Term Memory Recurrent Neural Netwo
APA, Harvard, Vancouver, ISO, and other styles
6

Ainurrochman, Irfanur Ilham Febriansyah, and Umi Laili Yuhana. "SER: Speech Emotion Recognition Application Based on Extreme Learning Machine." In 2021 13th International Conference on Information & Communication Technology and System (ICTS). IEEE, 2021. http://dx.doi.org/10.1109/icts52701.2021.9609016.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Ferreira, Gabriel Gonçalves, and Johnny Marques. "A Speech Emotion Recognition Model to Detect Aggressive Behavior in Dialogues." In Anais Estendidos do Simpósio Brasileiro de Sistemas de Informação. Sociedade Brasileira de Computação (SBC), 2024. http://dx.doi.org/10.5753/sbsi_estendido.2024.238648.

Full text
Abstract:
Speech Emotion Recognition (SER) is a multidisciplinary field that involves the development of computational models to automatically detect and analyze emotional states conveyed through speech signals. Utilizing techniques from signal processing, machine learning, and natural language processing, SER systems extract relevant features from audio data and classify emotions into distinct categories such as happiness, sadness, anger, and more. This work aims to leverage the latest SER techniques to build a robust model that can detect aggressive behavior in dialogues solely based on audio input si
APA, Harvard, Vancouver, ISO, and other styles
8

Sinha, Arryan, and G. Suseela. "Deep Learning-Based Speech Emotion Recognition." In International Research Conference on IOT, Cloud and Data Science. Trans Tech Publications Ltd, 2023. http://dx.doi.org/10.4028/p-0892re.

Full text
Abstract:
Speech Emotion Recognition (SER), as described in this study, uses neural networks to classify the emotions expressed in speech. It is centered upon the concept that voice tone and pitch frequently reflect underlying emotion. Speech emotion recognition aids in the classification of elicited emotions. The MLP classifier is used as a tool for classifying emotions from the wave signal, allowing for flexible learning-rate selection. RAVDESS (the Ryerson Audio-Visual Database of Emotional Speech and Song) will be used. To extract the characteristics from particular audio input, Contrast
APA, Harvard, Vancouver, ISO, and other styles
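The study above trains an MLP classifier on RAVDESS. The sketch below shows one common way such experiments are assembled: deriving labels from RAVDESS-style file names (whose third hyphen-separated field encodes the emotion, per the corpus documentation) and feeding MFCC statistics to scikit-learn's MLPClassifier. The directory layout and feature choice are assumptions, not the authors' protocol.

```python
import glob
import os
import numpy as np
import librosa
from sklearn.neural_network import MLPClassifier

def label_from_filename(path):
    # RAVDESS-style names look like "03-01-06-01-02-01-12.wav"; the third
    # hyphen-separated field is the emotion code (see the corpus documentation).
    return os.path.basename(path).split("-")[2]

def mfcc_vector(path):
    y, sr = librosa.load(path, sr=16000)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=40).mean(axis=1)

# Hypothetical local layout of the downloaded corpus:
# files = glob.glob("ravdess/Actor_*/*.wav")
# X = np.vstack([mfcc_vector(f) for f in files])
# y = [label_from_filename(f) for f in files]
# clf = MLPClassifier(hidden_layer_sizes=(128,), max_iter=500).fit(X, y)
```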
9

Lorenzo Bautista, John, Yun Kyung Lee, Seungyoon Nam, Chanki Park, and Hyun Soon Shin. "Utilizing Dimensional Emotion Representations in Speech Emotion Recognition." In AHFE 2023 Hawaii Edition. AHFE International, 2023. http://dx.doi.org/10.54941/ahfe1004283.

Full text
Abstract:
Speech is a natural way of communication amongst humans and advancements in speech emotion recognition (SER) technology allow further improvement of human-computer interactions (HCI) with speech by understanding human emotions. SER systems are traditionally focused on categorizing emotions into discrete classes. However, discrete classes often overlook some subtleties between each emotion as they are prone to individual differences and cultures. In this study, we focused on the use of dimensional emotional values: valence, arousal, and dominance as outputs for an SER instead of the traditional
APA, Harvard, Vancouver, ISO, and other styles
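The paper above predicts continuous valence, arousal, and dominance values instead of discrete classes. The sketch below shows the corresponding modelling change in PyTorch: a three-output regression head trained with a mean-squared-error loss; the feature dimension and hidden size are illustrative assumptions.

```python
import torch
import torch.nn as nn

class DimensionalHead(nn.Module):
    """Regression head producing valence, arousal, and dominance from pooled features."""
    def __init__(self, in_dim=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, 3))

    def forward(self, pooled):               # pooled: (batch, in_dim)
        return self.net(pooled)              # columns: valence, arousal, dominance

head = DimensionalHead()
pred = head(torch.randn(4, 256))
target = torch.rand(4, 3)                    # annotated V/A/D values scaled to [0, 1]
loss = nn.functional.mse_loss(pred, target)  # regression loss rather than cross-entropy
print(loss.item())
```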
10

Da Silva, Ronnypetson, Valter M. Filho, and Mario Souza. "Interaffection of Multiple Datasets with Neural Networks in Speech Emotion Recognition." In Encontro Nacional de Inteligência Artificial e Computacional. Sociedade Brasileira de Computação - SBC, 2020. http://dx.doi.org/10.5753/eniac.2020.12141.

Full text
Abstract:
Many works that apply Deep Neural Networks (DNNs) to Speech Emotion Recognition (SER) use single datasets or train and evaluate the models separately when using multiple datasets. Those datasets are constructed with specific guidelines and the subjective nature of the labels for SER makes it difficult to obtain robust and general models. We investigate how DNNs learn shared representations for different datasets in both multi-task and unified setups. We also analyse how each dataset benefits from others in different combinations of datasets and popular neural network architectures. We show tha
APA, Harvard, Vancouver, ISO, and other styles
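The paper above studies shared representations across multiple SER datasets. The sketch below shows the generic multi-task pattern this implies: one shared encoder with a separate classification head per corpus. The dataset names, feature dimension, and class counts are made up for illustration.

```python
import torch
import torch.nn as nn

class MultiDatasetSER(nn.Module):
    """Shared encoder with one emotion-classification head per dataset."""
    def __init__(self, feat_dim=40, hidden=128, head_sizes=None):
        super().__init__()
        head_sizes = head_sizes or {"corpus_a": 4, "corpus_b": 7}   # hypothetical corpora
        self.encoder = nn.Sequential(nn.Linear(feat_dim, hidden), nn.ReLU())
        self.heads = nn.ModuleDict(
            {name: nn.Linear(hidden, n) for name, n in head_sizes.items()}
        )

    def forward(self, x, dataset):
        return self.heads[dataset](self.encoder(x))

model = MultiDatasetSER()
batch = torch.randn(8, 40)
print(model(batch, "corpus_a").shape)        # logits over corpus_a's 4 emotions
print(model(batch, "corpus_b").shape)        # logits over corpus_b's 7 emotions
```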

Reports on the topic "Speech Emotion Recognition (SER)"

1

African Open Science Platform Part 1: Landscape Study. Academy of Science of South Africa (ASSAf), 2019. http://dx.doi.org/10.17159/assaf.2019/0047.

Full text
Abstract:
This report maps the African landscape of Open Science – with a focus on Open Data as a sub-set of Open Science. Data to inform the landscape study were collected through a variety of methods, including surveys, desk research, engagement with a community of practice, networking with stakeholders, participation in conferences, case study presentations, and workshops hosted. Although the majority of African countries (35 of 54) demonstrates commitment to science through its investment in research and development (R&D), academies of science, ministries of science and technology, policies, rec
APA, Harvard, Vancouver, ISO, and other styles