A selection of scholarly literature on the topic "Emotional speech database"
Format your source in APA, MLA, Chicago, Harvard, and other citation styles
Browse lists of up-to-date articles, books, dissertations, abstracts, and other scholarly sources on the topic "Emotional speech database".
Next to each work in the list there is an "Add to bibliography" button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the publication as a .pdf file and read its abstract online, whenever these are available in the metadata.
Journal articles on the topic "Emotional speech database":
Tank, Vishal P., and S. K. Hadia. "Creation of speech corpus for emotion analysis in Gujarati language and its evaluation by various speech parameters." International Journal of Electrical and Computer Engineering (IJECE) 10, no. 5 (October 1, 2020): 4752–58. http://dx.doi.org/10.11591/ijece.v10i5.pp4752-4758.
Byun, Sung-Woo, and Seok-Pil Lee. "A Study on a Speech Emotion Recognition System with Effective Acoustic Features Using Deep Learning Algorithms." Applied Sciences 11, no. 4 (February 21, 2021): 1890. http://dx.doi.org/10.3390/app11041890.
손남호, Hwang Hyosung, and Ho-Young Lee. "Emotional Speech Database and the Acoustic Analysis of Emotional Speech." EONEOHAG, no. 72 (August 2015): 175–99. http://dx.doi.org/10.17290/jlsk.2015..72.175.
Vicsi, Klára, and Dávid Sztahó. "Recognition of Emotions on the Basis of Different Levels of Speech Segments." Journal of Advanced Computational Intelligence and Intelligent Informatics 16, no. 2 (March 20, 2012): 335–40. http://dx.doi.org/10.20965/jaciii.2012.p0335.
Quan, Changqin, Bin Zhang, Xiao Sun, and Fuji Ren. "A combined cepstral distance method for emotional speech recognition." International Journal of Advanced Robotic Systems 14, no. 4 (July 1, 2017): 1729881417719836. http://dx.doi.org/10.1177/1729881417719836.
Shahin, Ismail. "Employing Emotion Cues to Verify Speakers in Emotional Talking Environments." Journal of Intelligent Systems 25, no. 1 (January 1, 2016): 3–17. http://dx.doi.org/10.1515/jisys-2014-0118.
Caballero-Morales, Santiago-Omar. "Recognition of Emotions in Mexican Spanish Speech: An Approach Based on Acoustic Modelling of Emotion-Specific Vowels." Scientific World Journal 2013 (2013): 1–13. http://dx.doi.org/10.1155/2013/162093.
Sultana, Sadia, M. Shahidur Rahman, M. Reza Selim, and M. Zafar Iqbal. "SUST Bangla Emotional Speech Corpus (SUBESCO): An audio-only emotional speech corpus for Bangla." PLOS ONE 16, no. 4 (April 30, 2021): e0250173. http://dx.doi.org/10.1371/journal.pone.0250173.
Keshtiari, Niloofar, Michael Kuhlmann, Moharram Eslami, and Gisela Klann-Delius. "Recognizing emotional speech in Persian: A validated database of Persian emotional speech (Persian ESD)." Behavior Research Methods 47, no. 1 (May 23, 2014): 275–94. http://dx.doi.org/10.3758/s13428-014-0467-x.
Werner, S., and G. N. Petrenko. "Speech Emotion Recognition: Humans vs Machines." Discourse 5, no. 5 (December 18, 2019): 136–52. http://dx.doi.org/10.32603/2412-8562-2019-5-5-136-152.
Dissertations on the topic "Emotional speech database":
Sun, Rui. "The evaluation of the stability of acoustic features in affective conveyance across multiple emotional databases." Diss., Georgia Institute of Technology, 2013. http://hdl.handle.net/1853/49041.
Cheng, Kuan-Jung. "Cross-Lingual Speech Emotion Recognition Based on Speech Recognition Technology in An Emotional Speech Database in Mandarin, Taiwanese, and Hakka." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/6c4m2x.
National Yunlin University of Science and Technology, Department of Information Management, ROC academic year 107.
With the development of artificial intelligence, machine learning, and deep learning, recognition techniques such as image recognition and speech recognition have achieved considerable breakthroughs. Speech recognition in particular is now built into everyday smartphones and popular smart speakers, whose voice assistants provide a convenient voice-interactive interface. Allowing machines to understand human emotions improves human-machine interaction, so speech emotion recognition is an important issue. An important research direction within it is cross-corpus and cross-lingual speech emotion recognition: most existing studies train and test a speech emotion recognition system on the same corpus, and in cross-corpus and cross-lingual settings the effectiveness of such systems drops significantly. To address this problem, this study uses the cascaded normalization approach proposed by previous research to eliminate corpus differences as far as possible, and examines whether an extreme learning machine can improve cross-lingual recognition rates on An Emotional Speech Database in Mandarin, Taiwanese, and Hakka. In addition to that database, the Berlin Database of Emotional Speech (Emo-DB) is added for cross-corpus speech emotion recognition experiments.
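The thesis does not reproduce its code, but the cascaded normalization idea it builds on is straightforward. A minimal sketch in Python, assuming features are already extracted and the cascade consists of per-speaker followed by per-corpus z-normalization (the exact cascade and the extreme-learning-machine classifier are not specified here):

```python
import numpy as np

def z_norm(X):
    """Zero-mean, unit-variance scaling of each feature column."""
    return (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)

def cascaded_normalize(features, speaker_ids, corpus_ids):
    """Z-normalize per speaker, then per corpus, to reduce
    speaker- and corpus-specific shifts before cross-corpus training."""
    X = np.asarray(features, dtype=float).copy()
    speaker_ids = np.asarray(speaker_ids)
    corpus_ids = np.asarray(corpus_ids)
    for spk in np.unique(speaker_ids):
        X[speaker_ids == spk] = z_norm(X[speaker_ids == spk])
    for corp in np.unique(corpus_ids):
        X[corpus_ids == corp] = z_norm(X[corpus_ids == corp])
    return X
```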
Lu, Jhih-Jheng. "Construction and Testing of a Mandarin Emotional Speech Database and Its Application." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/54267308290823890882.
Tatung University, Department of Computer Science and Engineering, ROC academic year 92.
Automatic emotional speech recognition is a hot topic in signal processing. In this thesis, we build a Mandarin emotional speech database covering anger, happiness, sadness, boredom, and neutral utterances. We extract Mel-frequency cepstrum coefficients from each utterance as the emotion feature vector and use a K-nearest-neighbor classifier, obtaining 74.6% recognition accuracy. We also propose a modified K-nearest-neighbor method for emotion evaluation. To train hearing-impaired people to speak naturally, we design an emotion radar chart that presents the intensity of each emotion. With these techniques, we implement a computer-assisted speech training system.
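The MFCC-plus-KNN pipeline described in the abstract is simple to reproduce in outline. A hedged sketch using librosa and scikit-learn, where train_paths, train_labels, test_paths, and test_labels are hypothetical splits of the Mandarin database and k is a guess (the thesis value is not given here):

```python
import numpy as np
import librosa
from sklearn.neighbors import KNeighborsClassifier

def mfcc_vector(wav_path, n_mfcc=13):
    """Average the MFCCs over all frames into one feature vector per utterance."""
    y, sr = librosa.load(wav_path, sr=None)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).mean(axis=1)

# Hypothetical train/test splits of the emotional speech database.
X_train = np.stack([mfcc_vector(p) for p in train_paths])
X_test = np.stack([mfcc_vector(p) for p in test_paths])

knn = KNeighborsClassifier(n_neighbors=5)  # k=5 is an assumption
knn.fit(X_train, train_labels)
print("recognition accuracy:", knn.score(X_test, test_labels))
```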
Manamela, Phuti John. "The automatic recognition of emotions in speech." Thesis, 2020. http://hdl.handle.net/10386/3347.
Speech emotion recognition (SER) refers to technology that enables machines to detect and recognise human emotions from spoken phrases. In the literature, numerous attempts have been made to develop systems that can recognise human emotions from voice; however, little work has been done in the context of South African indigenous languages. The aim of this study was to develop an SER system that can classify and recognise six basic human emotions (sadness, fear, anger, disgust, happiness, and neutral) from speech spoken in Sepedi (one of South Africa's official languages). One of the major challenges encountered in this study was the lack of a proper corpus of emotional speech. Therefore, three different Sepedi emotional speech corpora consisting of acted speech data were developed: a Recorded-Sepedi corpus collected from recruited native speakers (9 participants), a TV-broadcast corpus collected from professional Sepedi actors, and an Extended-Sepedi corpus combining the two. Features were extracted from the speech corpora and assembled into a data file, which was used to train four machine learning (ML) algorithms (SVM, KNN, MLP, and Auto-WEKA) with 10-fold cross-validation. Three experiments were then performed on the developed speech corpora and the performance of the algorithms was compared. The best results were achieved with Auto-WEKA in all experiments. Good results might have been expected for the TV-broadcast corpus, since it was collected from professional actors, but the results showed otherwise. From the findings of this study, one can conclude that there are no precise or exact techniques for the development of SER systems; it is a matter of experimenting and finding the best technique for the study at hand. The study has also highlighted the scarcity of SER resources for South African indigenous languages. The quality of the dataset plays a vital role in the performance of SER systems.
National Research Foundation (NRF) and Telkom Center of Excellence (CoE)
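The study's classifier comparison under 10-fold cross-validation maps directly onto standard tooling. A rough Python sketch with scikit-learn, where X and y stand in for the feature data file and emotion labels built from the Sepedi corpora (Auto-WEKA has no direct scikit-learn equivalent, so only SVM, KNN, and MLP appear):

```python
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

# X (acoustic features) and y (six emotion labels) are placeholders
# for the data file described in the thesis.
models = {
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "MLP": make_pipeline(StandardScaler(), MLPClassifier(max_iter=1000)),
}
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=cv)
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```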
Ferro, Adelino Rafael Mendes. "Speech emotion recognition through statistical classification." Master's thesis, 2017. http://hdl.handle.net/10400.14/22817.
The purpose of this dissertation is to discuss speech emotion recognition. A validated acted Portuguese emotional speech database, named the European Portuguese Emotional Discourse Database (EPEDD), was created, and statistical classification algorithms were applied to it. EPEDD is an acted database featuring 12 utterances (2 single words, 5 short sentences, and 5 long sentences) per actor and per emotion, 8 actors with both genders equally represented, and 9 emotions (anger, joy, disgust, excitement, fear, apathy, surprise, sadness, and neutral), based on Lövheim's emotion model. We had 40% of the database evaluated by inexperienced evaluators, enabling us to produce a validated database by filtering out 60% of the evaluated utterances. The full database contains 718 instances, while the validated one contains 116. The average acting quality of the original database was rated 2.3 on a scale from 1 to 5. The validated database consists of emotional utterances whose emotions were recognized by inexperienced judges at an average rate of 69.6%. Anger had the highest recognition rate (79.7%), while disgust had the lowest (40.5%). Feature extraction and statistical classification were performed with the Opensmile and Weka software, respectively. The statistical classification algorithms were run on both the full and the validated database, with the best results obtained by SVMs: emotion recognition rates of 48.7% and 44.0%, respectively. Apathy had the highest recognition rate (79.0%), while excitement had the lowest (32.9%).
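The Opensmile-plus-classifier pipeline can be sketched with the opensmile Python package in place of the command-line tool and scikit-learn in place of Weka. The eGeMAPS functionals below are an assumption (the dissertation's exact feature configuration is not specified here), and wav_paths and labels are hypothetical EPEDD utterance paths and their annotated emotions:

```python
import opensmile
import pandas as pd
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# eGeMAPS functionals stand in for the unspecified openSMILE config;
# scikit-learn's SVC stands in for Weka's SVM.
smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.eGeMAPSv02,
    feature_level=opensmile.FeatureLevel.Functionals,
)

# wav_paths and labels are placeholders for the EPEDD data.
X = pd.concat([smile.process_file(p) for p in wav_paths])
scores = cross_val_score(SVC(), X.values, labels, cv=10)
print("mean recognition rate:", scores.mean())
```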
Book chapters on the topic "Emotional speech database":
Gajšek, Rok, Vitomir Štruc, Boštjan Vesnicer, Anja Podlesek, Luka Komidar, and France Mihelič. "Analysis and Assessment of AvID: Multi-Modal Emotional Database." In Text, Speech and Dialogue, 266–73. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009. http://dx.doi.org/10.1007/978-3-642-04208-9_38.
Justin, Tadej, Vitomir Štruc, Janez Žibert, and France Mihelič. "Development and Evaluation of the Emotional Slovenian Speech Database - EmoLUKS." In Text, Speech, and Dialogue, 351–59. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-24033-6_40.
Geethashree, A., and D. J. Ravi. "Kannada Emotional Speech Database: Design, Development and Evaluation." In Proceedings of International Conference on Cognition and Recognition, 135–43. Singapore: Springer Singapore, 2017. http://dx.doi.org/10.1007/978-981-10-5146-3_14.
Sapiński, Tomasz, Dorota Kamińska, Adam Pelikant, Cagri Ozcinar, Egils Avots, and Gholamreza Anbarjafari. "Multimodal Database of Emotional Speech, Video and Gestures." In Pattern Recognition and Information Forensics, 153–63. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-030-05792-3_15.
Staroniewicz, Piotr, and Wojciech Majewski. "Polish Emotional Speech Database – Recording and Preliminary Validation." In Cross-Modal Analysis of Speech, Gestures, Gaze and Facial Expressions, 42–49. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009. http://dx.doi.org/10.1007/978-3-642-03320-9_5.
Jokić, Ivan, Stevan Jokić, Vlado Delić, and Zoran Perić. "Impact of Emotional Speech to Automatic Speaker Recognition - Experiments on GEES Speech Database." In Speech and Computer, 268–75. Cham: Springer International Publishing, 2014. http://dx.doi.org/10.1007/978-3-319-11581-8_33.
Navas, Eva, Inmaculada Hernáez, Amaia Castelruiz, and Iker Luengo. "Obtaining and Evaluating an Emotional Database for Prosody Modelling in Standard Basque." In Text, Speech and Dialogue, 393–400. Berlin, Heidelberg: Springer Berlin Heidelberg, 2004. http://dx.doi.org/10.1007/978-3-540-30120-2_50.
Pérez-Espinosa, Humberto, Carlos Alberto Reyes-García, and Luis Villaseñor-Pineda. "EmoWisconsin: An Emotional Children Speech Database in Mexican Spanish." In Affective Computing and Intelligent Interaction, 62–71. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011. http://dx.doi.org/10.1007/978-3-642-24571-8_7.
Gahlawat, Mukta, Amita Malik, and Poonam Bansal. "Phonetic Transcription Comparison for Emotional Database for Speech Synthesis." In Advances in Intelligent Systems and Computing, 187–94. Singapore: Springer Singapore, 2017. http://dx.doi.org/10.1007/978-981-10-6626-9_21.
Atassi, Hicham, Maria Teresa Riviello, Zdeněk Smékal, Amir Hussain, and Anna Esposito. "Emotional Vocal Expressions Recognition Using the COST 2102 Italian Database of Emotional Speech." In Development of Multimodal Interfaces: Active Listening and Synchrony, 255–67. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010. http://dx.doi.org/10.1007/978-3-642-12397-9_21.
Conference papers on the topic "Emotional speech database":
Bansal, Sweeta, and Amita Dev. "Emotional Hindi speech database." In 2013 International Conference Oriental COCOSDA held jointly with 2013 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE). IEEE, 2013. http://dx.doi.org/10.1109/icsda.2013.6709867.
Oflazoglu, Caglar, and Serdar Yildirim. "Turkish emotional speech database." In 2011 IEEE 19th Signal Processing and Communications Applications Conference (SIU). IEEE, 2011. http://dx.doi.org/10.1109/siu.2011.5929860.
Burkhardt, Felix, A. Paeschke, M. Rolfes, Walter F. Sendlmeier, and Benjamin Weiss. "A database of German emotional speech." In Interspeech 2005. ISCA, 2005. http://dx.doi.org/10.21437/interspeech.2005-446.
Zen, Heiga, Tadashi Kitamura, Murtaza Bulut, Shrikanth Narayanan, Ryosuke Tsuzuki, and Keiichi Tokuda. "Constructing emotional speech synthesizers with limited speech database." In Interspeech 2004. ISCA, 2004. http://dx.doi.org/10.21437/interspeech.2004-442.
Sato, Ryota, Ryohei Sasaki, Norisato Suga, and Toshihiro Furukawa. "Creation and Analysis of Emotional Speech Database for Multiple Emotions Recognition." In 2020 23rd Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA). IEEE, 2020. http://dx.doi.org/10.1109/o-cocosda50338.2020.9295041.
Grimm, Michael, Kristian Kroschel, and Shrikanth Narayanan. "The Vera am Mittag German audio-visual emotional speech database." In 2008 IEEE International Conference on Multimedia and Expo (ICME). IEEE, 2008. http://dx.doi.org/10.1109/icme.2008.4607572.
Ko, Youjung, Insuk Hong, Hyunsoon Shin, and Yoonjoong Kim. "Construction of a database of emotional speech using emotion sounds from movies and dramas." In 2017 International Conference on Information and Communications (ICIC). IEEE, 2017. http://dx.doi.org/10.1109/infoc.2017.8001672.
Mustafa, Mumtaz B., Raja N. Ainon, Roziati Zainuddin, Zuraidah M. Don, and Gerry Knowles. "Assessing the naturalness of Malay emotional voice corpora." In 2011 Oriental COCOSDA 2011 - International Conference on Speech Database and Assessments. IEEE, 2011. http://dx.doi.org/10.1109/icsda.2011.6086002.
Pandharipande, Meghna A., Rupayan Chakraborty, and Sunil Kumar Kopparapu. "Methods and challenges for creating an emotional audio-visual database." In 2017 20th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment (O-COCOSDA). IEEE, 2017. http://dx.doi.org/10.1109/icsda.2017.8384466.
Li, Runnan, Zhiyong Wu, Jia Jia, Yaohua Bu, Sheng Zhao, and Helen Meng. "Towards Discriminative Representation Learning for Speech Emotion Recognition." In Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19). California: International Joint Conferences on Artificial Intelligence Organization, 2019. http://dx.doi.org/10.24963/ijcai.2019/703.