Journal articles on the topic "Automatic speech recognition – Statistical methods"

Browse the top 50 scholarly journal articles on the topic "Automatic speech recognition – Statistical methods".

1

Boyer, A., J. Di Martino, P. Divoux, J. P. Haton, J. F. Mari, and K. Smaili. "Statistical methods in multi-speaker automatic speech recognition." Applied Stochastic Models and Data Analysis 6, no. 3 (September 1990): 143–55. http://dx.doi.org/10.1002/asm.3150060302.

2

Kłosowski, Piotr. "A Rule-Based Grapheme-to-Phoneme Conversion System." Applied Sciences 12, no. 5 (March 7, 2022): 2758. http://dx.doi.org/10.3390/app12052758.

Abstract:
This article presents a rule-based grapheme-to-phoneme conversion method and algorithm for Polish. It should be noted that the fundamental grapheme-to-phoneme conversion rules have been developed by Maria Steffen-Batóg and presented in her set of monographs dedicated to the automatic grapheme-to-phoneme conversion of texts in Polish. The author used previously developed rules and independently developed the grapheme-to-phoneme conversion algorithm. The algorithm has been implemented as a software application called TransFon, which allows the user to convert any text in Polish orthography to cor
3

Toth, Laszlo, Ildiko Hoffmann, Gabor Gosztolya, Veronika Vincze, Greta Szatloczki, Zoltan Banreti, Magdolna Pakaski, and Janos Kalman. "A Speech Recognition-based Solution for the Automatic Detection of Mild Cognitive Impairment from Spontaneous Speech." Current Alzheimer Research 15, no. 2 (January 3, 2018): 130–38. http://dx.doi.org/10.2174/1567205014666171121114930.

Abstract:
Background: Even today the reliable diagnosis of the prodromal stages of Alzheimer's disease (AD) remains a great challenge. Our research focuses on the earliest detectable indicators of cognitive decline in mild cognitive impairment (MCI). Since the presence of language impairment has been reported even in the mild stage of AD, the aim of this study is to develop a sensitive neuropsychological screening method which is based on the analysis of spontaneous speech production during performing a memory task. In the future, this can form the basis of an Internet-based interactive screening softwa
4

Gellatly, Andrew W., and Thomas A. Dingus. "Speech Recognition and Automotive Applications: Using Speech to Perform in-Vehicle Tasks." Proceedings of the Human Factors and Ergonomics Society Annual Meeting 42, no. 17 (October 1998): 1247–51. http://dx.doi.org/10.1177/154193129804201715.

Abstract:
An experiment was conducted to investigate the effects of automatic speech recognition (ASR) system design, driver input-modality, and driver age on driving performance during in-vehicle task execution and in-vehicle task usability. Results showed that ASR system design (i.e., recognition accuracy and recognition error type) and driver input-modality (i.e., manual or speech) significantly affected certain dependent measures. However, the differences found were small, suggesting that less than ideal ASR system design/performance can be considered for use in automobiles without substantially imp
5

Seman, Noraini, and Ahmad Firdaus Norazam. "Hybrid methods of brandt’s generalised likelihood ratio and short-term energy for malay word speech segmentation." Indonesian Journal of Electrical Engineering and Computer Science 16, no. 1 (October 1, 2019): 283. http://dx.doi.org/10.11591/ijeecs.v16.i1.pp283-291.

Abstract:
Speech segmentation is an important part for speech recognition, synthesizing and coding. Statistical based approach detects segmentation points via computing spectral distortion of the signal without prior knowledge of the acoustic information proved to be able to give good match, less omission but lot of insertion. In this study the segmentation is done both manually and automatically using Malay words in traditional Malay poetry. This study proposed a hybrid method of Brandt’s generalized likelihood ratio (GLR) and short-term energy algorithm. The Brandt’s algorithm tries to estima
6

Cabral, Frederico Soares, Hidekazu Fukai, and Satoshi Tamura. "Feature Extraction Methods Proposed for Speech Recognition Are Effective on Road Condition Monitoring Using Smartphone Inertial Sensors." Sensors 19, no. 16 (August 9, 2019): 3481. http://dx.doi.org/10.3390/s19163481.

Abstract:
The objective of our project is to develop an automatic survey system for road condition monitoring using smartphone devices. One of the main tasks of our project is the classification of paved and unpaved roads. Assuming recordings will be archived by using various types of vehicle suspension system and speeds in practice, hence, we use the multiple sensors found in smartphones and state-of-the-art machine learning techniques for signal processing. Despite usually not being paid much attention, the results of the classification are dependent on the feature extraction step. Therefore, we have
7

Hai, Yanfei. "Computer-aided teaching mode of oral English intelligent learning based on speech recognition and network assistance." Journal of Intelligent & Fuzzy Systems 39, no. 4 (October 21, 2020): 5749–60. http://dx.doi.org/10.3233/jifs-189052.

Abstract:
The purpose of this paper is to use English specific syllables and prosodic features in spoken speech data to carry out English spoken recognition, and to explore effective methods for the design and application of English speech detection and automatic recognition systems. The method proposed by this study is a combination of SVM_FF based classifier, SVM_IER based classifier and syllable classifier. Compared with the method based on the combination of other phonological characteristics such as phonological rate, intensity, formant and energy statistics and pronunciation rate, and the syllable
8

Markovnikov, Nikita, and Irina Kipyatkova. "Encoder-decoder models for recognition of Russian speech." Information and Control Systems, no. 4 (October 4, 2019): 45–53. http://dx.doi.org/10.31799/1684-8853-2019-4-45-53.

Abstract:
Problem: Classical systems of automatic speech recognition are traditionally built using an acoustic model based on hidden Markov models and a statistical language model. Such systems demonstrate high recognition accuracy, but consist of several independent complex parts, which can cause problems when building models. Recently, an end-to-end recognition method has been spread, using deep artificial neural networks. This approach makes it easy to implement models using just one neural network. End-to-end models often demonstrate better performance in terms of speed and accuracy of speech recognitio
9

Afli, Haithem, Loïc Barrault, and Holger Schwenk. "Building and using multimodal comparable corpora for machine translation." Natural Language Engineering 22, no. 4 (June 15, 2016): 603–25. http://dx.doi.org/10.1017/s1351324916000152.

Abstract:
In recent decades, statistical approaches have significantly advanced the development of machine translation systems. However, the applicability of these methods directly depends on the availability of very large quantities of parallel data. Recent works have demonstrated that a comparable corpus can compensate for the shortage of parallel corpora. In this paper, we propose an alternative to comparable corpora containing text documents as resources for extracting parallel data: a multimodal comparable corpus with audio documents in source language and text document in target language,
10

Kozlova, A. T. "Temporal Characteristics of Prosody in Imperative Utterances and the Phenomenon of Emphatic Length in the English Language." Bulletin of Kemerovo State University, no. 3 (October 27, 2018): 192–96. http://dx.doi.org/10.21603/2078-8975-2018-3-192-196.

Abstract:
The paper focuses on one of the most effective factors of linguistic manipulation, i.e. imperative utterance. The subject of the study was direct contact appeals, whose structures corresponded to the literary norms of the English language. The research determined and described the temporal component of imperative prosody. The author employed electro-acoustic, mathematical and statistical methods. The phonetic experiment revealed four prosodic structures, as well as their inter-structural and inter-style levels, the degree of temporal fluctuation and the phenomenon of emphatic length, the latte
11

Ling, Xufeng, Jie Yang, Jingxin Liang, Huaizhong Zhu, and Hui Sun. "A Deep-Learning Based Method for Analysis of Students’ Attention in Offline Class." Electronics 11, no. 17 (August 25, 2022): 2663. http://dx.doi.org/10.3390/electronics11172663.

Abstract:
Students’ actual learning engagement in class, which we call learning attention, is a major indicator used to measure learning outcomes. Obtaining and analyzing students’ attention accurately in offline classes is important empirical research that can improve teachers’ teaching methods. This paper proposes a method to obtain and measure students’ attention in class by applying a variety of deep-learning models and initiatively divides a whole class into a series of time durations, which are categorized into four states: lecturing, interaction, practice, and transcription. After video and audio
12

Asgari, Meysam, Robert Gale, Katherine Wild, Hiroko Dodge, and Jeffrey Kaye. "Automatic Assessment of Cognitive Tests for Differentiating Mild Cognitive Impairment: A Proof of Concept Study of the Digit Span Task." Current Alzheimer Research 17, no. 7 (November 16, 2020): 658–66. http://dx.doi.org/10.2174/1567205017666201008110854.

Abstract:
Background: Current conventional cognitive assessments are limited in their efficiency and sensitivity, often relying on a single score such as the total correct items. Typically, multiple features of response go uncaptured. Objectives: We aim to explore a new set of automatically derived features from the Digit Span (DS) task that address some of the drawbacks in the conventional scoring and are also useful for distinguishing subjects with Mild Cognitive Impairment (MCI) from those with intact cognition. Methods: Audio-recordings of the DS tests administered to 85 subjects (22 MCI and 63 heal
13

Woo, MinJae, Prabodh Mishra, Ju Lin, Snigdhaswin Kar, Nicholas Deas, Caleb Linduff, Sufeng Niu, et al. "Complete and Resilient Documentation for Operational Medical Environments Leveraging Mobile Hands-free Technology in a Systems Approach: Experimental Study." JMIR mHealth and uHealth 9, no. 10 (October 12, 2021): e32301. http://dx.doi.org/10.2196/32301.

Abstract:
Background: Prehospitalization documentation is a challenging task and prone to loss of information, as paramedics operate under disruptive environments requiring their constant attention to the patients. Objective: The aim of this study is to develop a mobile platform for hands-free prehospitalization documentation to assist first responders in operational medical environments by aggregating all existing solutions for noise resiliency and domain adaptation. Methods: The platform was built to extract meaningful medical information from the real-time audio streaming at the point of injury and tran
14

Levinson, S. E. "Structural methods in automatic speech recognition." Proceedings of the IEEE 73, no. 11 (1985): 1625–50. http://dx.doi.org/10.1109/proc.1985.13344.

15

Sun, Don X., and Frederick Jelinek. "Statistical Methods for Speech Recognition." Journal of the American Statistical Association 94, no. 446 (June 1999): 650. http://dx.doi.org/10.2307/2670189.

16

Rigazio, Luca. "Discriminative clustering methods for automatic speech recognition." Journal of the Acoustical Society of America 114, no. 4 (2003): 1719. http://dx.doi.org/10.1121/1.1627548.

17

Russell, M. J., R. K. Moore, and M. J. Tomlinson. "Dynamic Programming and Statistical Modelling in Automatic Speech Recognition." Journal of the Operational Research Society 37, no. 1 (January 1986): 21. http://dx.doi.org/10.2307/2582543.

18

Russell, M. J., R. K. Moore, and M. J. Tomlinson. "Dynamic Programming and Statistical Modelling in Automatic Speech Recognition." Journal of the Operational Research Society 37, no. 1 (January 1986): 21–30. http://dx.doi.org/10.1057/jors.1986.4.

19

Bourlard, H., and N. Morgan. "Continuous speech recognition by connectionist statistical methods." IEEE Transactions on Neural Networks 4, no. 6 (1993): 893–909. http://dx.doi.org/10.1109/72.286885.

20

Bojanic, Milana, Vlado Delic, and Milan Secujski. "Relevance of the types and the statistical properties of features in the recognition of basic emotions in speech." Facta universitatis - series: Electronics and Energetics 27, no. 3 (2014): 425–33. http://dx.doi.org/10.2298/fuee1403425b.

Abstract:
Due to the advance of speech technologies and their increasing usage in various applications, automatic recognition of emotions in speech represents one of the emerging fields in human-computer interaction. This paper deals with several topics related to automatic emotional speech recognition, most notably with the improvement of recognition accuracy by lowering the dimensionality of the feature space and evaluation of the relevance of particular feature types. The research is focused on the classification of emotional speech into five basic emotional classes (anger, joy, fear, sadness and neu
21

Kundegorski, Mikolaj, Philip J. B. Jackson, and Bartosz Ziółko. "Two-Microphone Dereverberation for Automatic Speech Recognition of Polish." Archives of Acoustics 39, no. 3 (March 1, 2015): 411–20. http://dx.doi.org/10.2478/aoa-2014-0045.

Abstract:
Reverberation is a common problem for many speech technologies, such as automatic speech recognition (ASR) systems. This paper investigates the novel combination of precedence, binaural and statistical independence cues for enhancing reverberant speech, prior to ASR, under these adverse acoustical conditions when two microphone signals are available. Results of the enhancement are evaluated in terms of relevant signal measures and accuracy for both English and Polish ASR tasks. These show inconsistencies between the signal and recognition measures, although in recognition the proposed
22

Schultz, Benjamin G., Venkata S. Aditya Tarigoppula, Gustavo Noffs, Sandra Rojas, Anneke van der Walt, David B. Grayden, and Adam P. Vogel. "Automatic speech recognition in neurodegenerative disease." International Journal of Speech Technology 24, no. 3 (May 4, 2021): 771–79. http://dx.doi.org/10.1007/s10772-021-09836-w.

Abstract:
Automatic speech recognition (ASR) could potentially improve communication by providing transcriptions of speech in real time. ASR is particularly useful for people with progressive disorders that lead to reduced speech intelligibility or difficulties performing motor tasks. ASR services are usually trained on healthy speech and may not be optimized for impaired speech, creating a barrier for accessing augmented assistance devices. We tested the performance of three state-of-the-art ASR platforms on two groups of people with neurodegenerative disease and healthy controls. We further ex
23

O’Shaughnessy, Douglas. "Invited paper: Automatic speech recognition: History, methods and challenges." Pattern Recognition 41, no. 10 (October 2008): 2965–79. http://dx.doi.org/10.1016/j.patcog.2008.05.008.

24

O’Shaughnessy, Douglas D., and T. Nagarajan Li. "Better model and decoding methods for automatic speech recognition." Journal of the Acoustical Society of America 119, no. 5 (May 2006): 3441–42. http://dx.doi.org/10.1121/1.4786938.

25

Debnath, Saswati, and Pinki Roy. "Audio-Visual Automatic Speech Recognition Using PZM, MFCC and Statistical Analysis." International Journal of Interactive Multimedia and Artificial Intelligence 7, no. 2 (2021): 121. http://dx.doi.org/10.9781/ijimai.2021.09.001.

26

Dashtaki, Parnyan Bahrami. "An Investigation into Methodology and Metrics Employed to Evaluate the (Speech-to-Speech) Way in Translation Systems." Modern Applied Science 11, no. 4 (February 8, 2017): 55. http://dx.doi.org/10.5539/mas.v11n4p55.

Abstract:
Speech-to-speech translation is a challenging problem, due to poor sentence planning typically associated with spontaneous speech, as well as errors caused by automatic speech recognition. Based upon a statistically trained speech translation system, in this study, we try to investigate methodologies and metrics employed to assess the (speech-to-speech) way in translation systems. The speech translation is performed incrementally based on generation of partial hypotheses from speech recognition. Speech-input translation can be properly approached as a pattern recognition problem by means of st
27

Stolcke, Andreas, Klaus Ries, Noah Coccaro, Elizabeth Shriberg, Rebecca Bates, Daniel Jurafsky, Paul Taylor, Rachel Martin, Carol Van Ess-Dykema, and Marie Meteer. "Dialogue Act Modeling for Automatic Tagging and Recognition of Conversational Speech." Computational Linguistics 26, no. 3 (September 2000): 339–73. http://dx.doi.org/10.1162/089120100561737.

Abstract:
We describe a statistical approach for modeling dialogue acts in conversational speech, i.e., speech-act-like units such as Statement, Question, Backchannel, Agreement, Disagreement, and Apology. Our model detects and predicts dialogue acts based on lexical, collocational, and prosodic cues, as well as on the discourse coherence of the dialogue act sequence. The dialogue model is based on treating the discourse structure of a conversation as a hidden Markov model and the individual dialogue acts as observations emanating from the model states. Constraints on the likely sequence of dialogue act
28

Steeneken, Herman J. M., and Andrew Varga. "Assessment for automatic speech recognition: I. Comparison of assessment methods." Speech Communication 12, no. 3 (July 1993): 241–46. http://dx.doi.org/10.1016/0167-6393(93)90094-2.

29

Hadiwinoto, P. N., and D. P. Lestari. "Data augmentation on spontaneous Indonesian automatic speech recognition using statistical machine translation." IOP Conference Series: Materials Science and Engineering 803 (May 28, 2020): 012030. http://dx.doi.org/10.1088/1757-899x/803/1/012030.

30

Partila, Pavol, Miroslav Voznak, and Jaromir Tovarek. "Pattern Recognition Methods and Features Selection for Speech Emotion Recognition System." Scientific World Journal 2015 (2015): 1–7. http://dx.doi.org/10.1155/2015/573068.

Abstract:
The impact of the classification method and features selection for the speech emotion recognition accuracy is discussed in this paper. Selecting the correct parameters in combination with the classifier is an important part of reducing the complexity of system computing. This step is necessary especially for systems that will be deployed in real-time applications. The reason for the development and improvement of speech emotion recognition systems is wide usability in nowadays automatic voice controlled systems. Berlin database of emotional recordings was used in this experiment. Classificatio
31

Singh, Satyanand. "High level speaker specific features modeling in automatic speaker recognition system." International Journal of Electrical and Computer Engineering (IJECE) 10, no. 2 (April 1, 2020): 1859. http://dx.doi.org/10.11591/ijece.v10i2.pp1859-1867.

Abstract:
Spoken words convey several levels of information. At the primary level, the speech conveys words or spoken messages, but at the secondary level, the speech also reveals information about the speakers. This work is based on the high-level speaker-specific features on statistical speaker modeling techniques that express the characteristic sound of the human voice. Using Hidden Markov model (HMM), Gaussian mixture model (GMM), and Linear Discriminant Analysis (LDA) models build Automatic Speaker Recognition (ASR) system that are computational inexpensive can recognize speakers regardless of what
32

Skowronski, Mark D., and John G. Harris. "Statistical automatic species identification of microchiroptera from echolocation calls: Lessons learned from human automatic speech recognition." Journal of the Acoustical Society of America 116, no. 4 (October 2004): 2639. http://dx.doi.org/10.1121/1.4808665.

33

Ding, Ing-Jr, and Yen-Ming Hsu. "An HMM-Like Dynamic Time Warping Scheme for Automatic Speech Recognition." Mathematical Problems in Engineering 2014 (2014): 1–8. http://dx.doi.org/10.1155/2014/898729.

Abstract:
In the past, the kernel of automatic speech recognition (ASR) is dynamic time warping (DTW), which is feature-based template matching and belongs to the category technique of dynamic programming (DP). Although DTW is an early developed ASR technique, DTW has been popular in lots of applications. DTW is playing an important role for the known Kinect-based gesture recognition application now. This paper proposed an intelligent speech recognition system using an improved DTW approach for multimedia and home automation services. The improved DTW presented in this work, called HMM-like DTW, is esse
34

Kawahara, Tatsuya. "Transcription System Using Automatic Speech Recognition for the Japanese Parliament (Diet)." Proceedings of the AAAI Conference on Artificial Intelligence 26, no. 2 (July 22, 2012): 2224–28. http://dx.doi.org/10.1609/aaai.v26i2.18962.

Abstract:
This article describes a new automatic transcription system in the Japanese Parliament which deploys our automatic speech recognition (ASR) technology. To achieve high recognition performance in spontaneous meeting speech, we have investigated an efficient training scheme with minimal supervision which can exploit a huge amount of real data. Specifically, we have proposed a lightly-supervised training scheme based on statistical language model transformation, which fills the gap between faithful transcripts of spoken utterances and final texts for documentation. Once this mapping is trained, w
35

Jafari, Ayyoob, and Farshad Almasganj. "Using Nonlinear Modeling of Reconstructed Phase Space and Frequency Domain Analysis to Improve Automatic Speech Recognition Performance." International Journal of Bifurcation and Chaos 22, no. 03 (March 2012): 1250053. http://dx.doi.org/10.1142/s0218127412500538.

Abstract:
This paper introduces a combinational feature extraction approach to improve speech recognition systems. The main idea is to simultaneously benefit from some features obtained from nonlinear modeling applied to speech reconstructed phase space (RPS) and typical Mel frequency Cepstral coefficients (MFCCs) which have a proved role in speech recognition field. With an appropriate dimension, the reconstructed phase space of speech signal is assured to be topologically equivalent to the dynamics of the speech production system, and could therefore include information that may be absent in linear an
36

Rojathai, S., and M. Venkatesulu. "Investigation of ANFIS and FFBNN Recognition Methods Performance in Tamil Speech Word Recognition." International Journal of Software Innovation 2, no. 2 (April 2014): 43–53. http://dx.doi.org/10.4018/ijsi.2014040103.

Abstract:
In speech word recognition systems, feature extraction and recognition plays a most significant role. More number of feature extraction and recognition methods are available in the existing speech word recognition systems. In most recent Tamil speech word recognition system has given high speech word recognition performance with PAC-ANFIS compared to the earlier Tamil speech word recognition systems. So the investigation of speech word recognition by various recognition methods is needed to prove their performance in the speech word recognition. This paper presents the investigation process wi
37

Dua, Mohit, Rajesh Kumar Aggarwal, and Mantosh Biswas. "Optimizing Integrated Features for Hindi Automatic Speech Recognition System." Journal of Intelligent Systems 29, no. 1 (October 1, 2018): 959–76. http://dx.doi.org/10.1515/jisys-2018-0057.

Abstract:
An automatic speech recognition (ASR) system translates spoken words or utterances (isolated, connected, continuous, and spontaneous) into text format. State-of-the-art ASR systems mainly use Mel frequency (MF) cepstral coefficient (MFCC), perceptual linear prediction (PLP), and Gammatone frequency (GF) cepstral coefficient (GFCC) for extracting features in the training phase of the ASR system. Initially, the paper proposes a sequential combination of all three feature extraction methods, taking two at a time. Six combinations, MF-PLP, PLP-MFCC, MF-GFCC, GF-MFCC, GF-PLP, and PLP-GFCC,
38

Liu, Chang, Pengyuan Zhang, Ta Li, and Yonghong Yan. "Semantic Features Based N-Best Rescoring Methods for Automatic Speech Recognition." Applied Sciences 9, no. 23 (November 22, 2019): 5053. http://dx.doi.org/10.3390/app9235053.

Abstract:
In this work, we aim to re-rank the n-best hypotheses of an automatic speech recognition system by punishing the sentences which have words that are semantically different from the context and rewarding the sentences where all words are in semantical harmony. To achieve this, we proposed a topic similarity score that measures the difference between topic distribution of words and the corresponding sentence. We also proposed another word-discourse score that quantifies the likeliness for a word to appear in the sentence by the inner production of word vector and discourse vector. Besides, we us
39

Stern, Richard, and Nelson Morgan. "Hearing Is Believing: Biologically Inspired Methods for Robust Automatic Speech Recognition." IEEE Signal Processing Magazine 29, no. 6 (November 2012): 34–43. http://dx.doi.org/10.1109/msp.2012.2207989.

40

Deng, Li, and Don X. Sun. "A statistical approach to automatic speech recognition using the atomic speech units constructed from overlapping articulatory features." Journal of the Acoustical Society of America 95, no. 5 (May 1994): 2702–19. http://dx.doi.org/10.1121/1.409839.

41

Mamyrbayev, Orken, Keylan Alimhan, Dina Oralbekova, Akbayan Bekarystankyzy, and Bagashar Zhumazhanov. "Identifying the influence of transfer learning method in developing an end-to-end automatic speech recognition system with a low data level." Eastern-European Journal of Enterprise Technologies 1, no. 9(115) (February 28, 2022): 84–92. http://dx.doi.org/10.15587/1729-4061.2022.252801.

Abstract:
Ensuring the best quality and performance of modern speech technologies, today, is possible based on the widespread use of machine learning methods. The idea of this project is to study and implement an end-to-end system of automatic speech recognition using machine learning methods, as well as to develop new mathematical models and algorithms for solving the problem of automatic speech recognition for agglutinative (Turkic) languages. Many research papers have shown that deep learning methods make it easier to train automatic speech recognition systems that use an end-to-end approach. This me
42

Raval, Deepang, Vyom Pathak, Muktan Patel, and Brijesh Bhatt. "Improving Deep Learning based Automatic Speech Recognition for Gujarati." ACM Transactions on Asian and Low-Resource Language Information Processing 21, no. 3 (May 31, 2022): 1–18. http://dx.doi.org/10.1145/3483446.

Abstract:
We present a novel approach for improving the performance of an End-to-End speech recognition system for the Gujarati language. We follow a deep learning-based approach that includes Convolutional Neural Network, Bi-directional Long Short Term Memory layers, Dense layers, and Connectionist Temporal Classification as a loss function. To improve the performance of the system with the limited size of the dataset, we present a combined language model (Word-level language Model and Character-level language model)-based prefix decoding technique and Bidirectional Encoder Representations from Transfo
43

Matveev, Yuri, Anton Matveev, Olga Frolova, Elena Lyakso, and Nersisson Ruban. "Automatic Speech Emotion Recognition of Younger School Age Children." Mathematics 10, no. 14 (July 6, 2022): 2373. http://dx.doi.org/10.3390/math10142373.

Abstract:
This paper introduces the extended description of a database that contains emotional speech in the Russian language of younger school age (8–12-year-old) children and describes the results of validation of the database based on classical machine learning algorithms, such as Support Vector Machine (SVM) and Multi-Layer Perceptron (MLP). The validation is performed using standard procedures and scenarios of the validation similar to other well-known databases of children’s emotional acting speech. Performance evaluation of automatic multiclass recognition on four emotion classes “Neutral (Calm)—
44

Liao, Lyuchao, Francis Afedzie Kwofie, Zhifeng Chen, Guangjie Han, Yongqiang Wang, Yuyuan Lin, and Dongmei Hu. "A Bidirectional Context Embedding Transformer for Automatic Speech Recognition." Information 13, no. 2 (January 29, 2022): 69. http://dx.doi.org/10.3390/info13020069.

Abstract:
Transformers have become popular in building end-to-end automatic speech recognition (ASR) systems. However, transformer ASR systems are usually trained to give output sequences in the left-to-right order, disregarding the right-to-left context. Currently, the existing transformer-based ASR systems that employ two decoders for bidirectional decoding are complex in terms of computation and optimization. The existing ASR transformer with a single decoder for bidirectional decoding requires extra methods (such as a self-mask) to resolve the problem of information leakage in the attention mechanis
45

Garberg, Roger B. "Automatic Speech Recognition Applications: A Study of Methods for Defining Command Vocabularies." Proceedings of the Human Factors and Ergonomics Society Annual Meeting 39, no. 3 (October 1995): 203–7. http://dx.doi.org/10.1177/154193129503900307.

Abstract:
Phoneme-based automatic speech recognition (ASR) technology enables designers to easily create custom command words or phrases that users can employ to request service operations. In this paper, I report results from two experiments concerning important dimensions of these ASR command vocabularies, including command naturalness/appropriateness and command recallability. Ease of recall is a critical dimension for assessing ASR commands used in multi-step applications since service subscribers may be engaged in several different cognitive activities that divide attention. Yet techniques for meas
46

Aggarwal, Rajesh Kumar, and Mayank Dave. "Acoustic modeling problem for automatic speech recognition system: conventional methods (Part I)." International Journal of Speech Technology 14, no. 4 (September 23, 2011): 297–308. http://dx.doi.org/10.1007/s10772-011-9108-2.

47

Haider, Fasih, Pierre Albert, and Saturnino Luz. "User Identity Protection in Automatic Emotion Recognition through Disguised Speech." AI 2, no. 4 (November 25, 2021): 636–49. http://dx.doi.org/10.3390/ai2040038.

Abstract:
Ambient Assisted Living (AAL) technologies are being developed which could assist elderly people to live healthy and active lives. These technologies have been used to monitor people’s daily exercises, consumption of calories and sleep patterns, and to provide coaching interventions to foster positive behaviour. Speech and audio processing can be used to complement such AAL technologies to inform interventions for healthy ageing by analyzing speech data captured in the user’s home. However, collection of data in home settings presents challenges. One of the most pressing challenges concerns ho
48

Pipiras, Laurynas, Rytis Maskeliūnas, and Robertas Damaševičius. "Lithuanian Speech Recognition Using Purely Phonetic Deep Learning." Computers 8, no. 4 (October 18, 2019): 76. http://dx.doi.org/10.3390/computers8040076.

Abstract:
Automatic speech recognition (ASR) has been one of the biggest and hardest challenges in the field. A large majority of research in this area focuses on widely spoken languages such as English. The problems of automatic Lithuanian speech recognition have attracted little attention so far. Due to complicated language structure and scarcity of data, models proposed for other languages such as English cannot be directly adopted for Lithuanian. In this paper we propose an ASR system for the Lithuanian language, which is based on deep learning methods and can identify spoken words purely from their
49

Proksch, Sven-Oliver, Christopher Wratil, and Jens Wäckerle. "Testing the Validity of Automatic Speech Recognition for Political Text Analysis." Political Analysis 27, no. 3 (February 19, 2019): 339–59. http://dx.doi.org/10.1017/pan.2018.62.

Abstract:
The analysis of political texts from parliamentary speeches, party manifestos, social media, or press releases forms the basis of major and growing fields in political science, not least since advances in “text-as-data” methods have rendered the analysis of large text corpora straightforward. However, a lot of sources of political speech are not regularly transcribed, and their on-demand transcription by humans is prohibitively expensive for research purposes. This class includes political speech in certain legislatures, during political party conferences as well as television interviews and t
50

Şchiopu, Daniela. "Using Statistical Methods in a Speech Recognition System for Romanian Language." IFAC Proceedings Volumes 46, no. 28 (2013): 99–103. http://dx.doi.org/10.3182/20130925-3-cz-3023.00078.
