Academic literature on the topic 'Speech synthesis/recognition'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Speech synthesis/recognition.'

Next to every source in the list of references there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.
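The citation-generation step described above can be sketched in a few lines. The record fields and the `format_citation` helper below are illustrative assumptions, not this site's actual API, and the two style templates are highly simplified versions of the real APA and MLA rules:

```python
# Minimal sketch of automatic citation formatting for a journal-article
# record. Field names and style templates are illustrative only; real
# citation styles have many more rules (name inversion, italics, etc.).

def format_citation(rec: dict, style: str = "APA") -> str:
    """Render a journal-article record in a (simplified) citation style."""
    if style == "APA":
        # APA-like: Author (Year). Title. Journal, Volume(Issue), Pages. DOI
        return (f"{rec['author']} ({rec['year']}). {rec['title']}. "
                f"{rec['journal']}, {rec['volume']}({rec['issue']}), "
                f"{rec['pages']}. {rec['doi']}")
    if style == "MLA":
        # MLA-like: Author. "Title." Journal, vol. V, no. N, Year, pp. P. DOI.
        return (f"{rec['author']}. \"{rec['title']}.\" {rec['journal']}, "
                f"vol. {rec['volume']}, no. {rec['issue']}, {rec['year']}, "
                f"pp. {rec['pages']}. {rec['doi']}.")
    raise ValueError(f"unsupported style: {style}")

# Example record, taken from entry 2 of the journal-article list below.
record = {
    "author": "Jassem, Wiktor",
    "year": 1989,
    "title": "Speech Synthesis and Recognition",
    "journal": "Journal of Phonetics",
    "volume": 17, "issue": 3, "pages": "245-47",
    "doi": "http://dx.doi.org/10.1016/s0095-4470(19)30433-4",
}
print(format_citation(record, "APA"))
print(format_citation(record, "MLA"))
```

A real implementation would also handle multiple authors, missing fields, and per-style name ordering, which is why dedicated tools generate these references automatically.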

Journal articles on the topic "Speech synthesis/recognition"

1. Taylor, H. Rosemary. "Book Review: Speech Synthesis and Recognition Systems, Speech Synthesis and Recognition." International Journal of Electrical Engineering & Education 26, no. 4 (1989): 366. http://dx.doi.org/10.1177/002072098902600409.

2. Jassem, Wiktor. "Speech Synthesis and Recognition." Journal of Phonetics 17, no. 3 (1989): 245–47. http://dx.doi.org/10.1016/s0095-4470(19)30433-4.

3. Talkin, David. "Fundamentals of Speech Synthesis and Speech Recognition." Language and Speech 39, no. 1 (1996): 91–94. http://dx.doi.org/10.1177/002383099603900105.

4. Sanjay, Gaikwad Vijayendra. "Dictionary Application with Speech Recognition and Speech Synthesis." International Journal of Advanced Research in Computer Science 9, no. 1 (2018): 27–29. http://dx.doi.org/10.26483/ijarcs.v9i1.5155.

5. Järvinen, Kari. "Digital Speech Processing: Speech Coding, Synthesis, and Recognition." Signal Processing 30, no. 1 (1993): 133–34. http://dx.doi.org/10.1016/0165-1684(93)90056-g.

6. Montlick, Terry. "Combination Speech Synthesis and Recognition Apparatus." Journal of the Acoustical Society of America 85, no. 6 (1989): 2693. http://dx.doi.org/10.1121/1.397292.

7. Rebai, Ilyes, and Yassine BenAyed. "Arabic Speech Synthesis and Diacritic Recognition." International Journal of Speech Technology 19, no. 3 (2016): 485–94. http://dx.doi.org/10.1007/s10772-016-9342-8.

8. Asiedu Asante, Bismark, and Hiroki Imamura. "Speech Recognition and Speech Synthesis Models for Micro Devices." ITM Web of Conferences 27 (2019): 05001. http://dx.doi.org/10.1051/itmconf/20192705001.

Abstract: With the breakthrough of speech-based interaction between humans and electronic devices, many applications now use speech recognition and speech synthesis technology. These applications share a limitation: they typically depend on abundant computing resources and internet connectivity, which makes it difficult to deploy the technology on micro devices or in areas with no internet access. In this article, we develop smaller deep neural network models for automatic speech recognition (ASR) and text-to-speech (TTS) communication on micro devices such as the Raspberry Pi. We tested and evaluated the models; their accuracy and performance indicate that they are suitable for application development on micro devices.

9. Janai, Siddhanna, Shreekanth T., Chandan M., and Ajish K. Abraham. "Speech-to-Speech Conversion." International Journal of Ambient Computing and Intelligence 12, no. 1 (2021): 184–206. http://dx.doi.org/10.4018/ijaci.2021010108.

Abstract: A novel approach to building a speech-to-speech conversion (STSC) system for individuals with the speech impairment dysarthria is described. The STSC system takes impaired speech with inherent disturbance as input and produces synthesized output speech with good pronunciation and noise-free utterance. The system involves two stages: automatic speech recognition (ASR), which transforms speech into text, and automatic speech synthesis (text-to-speech, TTS), which performs the reverse task. At present, the recognition system covers a small vocabulary of 50 words and achieves an accuracy of 94% for normal speakers and 88% for speakers with dysarthria. The output speech of the TTS system achieved a MOS value of 4.5 out of 5, obtained by averaging the responses of 20 listeners. This method of STSC would serve as an augmentative and alternative communication aid for speakers with dysarthria.

10. Pisoni, David B. "A Brief Overview of Speech Synthesis and Recognition Technologies." Proceedings of the Human Factors Society Annual Meeting 30, no. 13 (1986): 1326–30. http://dx.doi.org/10.1177/154193128603001320.

Abstract: An overview of several aspects of speech synthesis and recognition technologies is provided as background for subsequent speakers in this session. Specifically, we discuss speech synthesis by rule using automatic text-to-speech conversion and speaker-dependent isolated word recognition. Both of these speech I/O technologies have been developed to the point where commercial products are now available for a number of applications. Some of the limitations of these devices are described, and suggestions for future research in both synthesis and recognition are outlined.

Dissertations / Theses on the topic "Speech synthesis/recognition"

1. Sun, Felix W. "Speech Representation Models for Speech Synthesis and Multimodal Speech Recognition." M.Eng. thesis, Massachusetts Institute of Technology, 2016. http://hdl.handle.net/1721.1/106378.

Abstract: The field of speech recognition has seen steady advances over the last two decades, leading to the accurate, real-time recognition systems available on mobile phones today. In this thesis, I apply speech modeling techniques developed for recognition to two other speech problems: speech synthesis and multimodal speech recognition with images. In both problems, there is a need to learn a relationship between speech sounds and another source of information. For speech synthesis, I show that using a neural network acoustic model results in a synthesizer that is more tolerant of noisy training data than previous work. For multimodal recognition, I show how information from images can be effectively integrated into the recognition search framework, resulting in improved accuracy when image data is available.

2. Fekkai, Souhila. "Fractal Based Speech Recognition and Synthesis." Thesis, De Montfort University, 2002. https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.269246.

3. Cummings, Kathleen E. "Analysis, Synthesis, and Recognition of Stressed Speech." Diss., Georgia Institute of Technology, 1992. http://hdl.handle.net/1853/15673.

4. McCulloch, Neil Andrew. "Neural Network Approaches to Speech Recognition and Synthesis." Thesis, Keele University, 1990. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.387255.

5. Scott, Simon David. "A Data-Driven Approach to Visual Speech Synthesis." Thesis, University of Bath, 1996. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.307116.

6. Devaney, Jason Wayne. "A Study of Articulatory Gestures for Speech Synthesis." Thesis, University of Liverpool, 1995. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.284254.

7. Haque, Serajul. "Perceptual Features for Speech Recognition." Thesis, University of Western Australia, School of Electrical, Electronic and Computer Engineering, 2008. http://theses.library.uwa.edu.au/adt-WU2008.0187.

Abstract: Automatic speech recognition (ASR) is one of the most important research areas in the field of speech technology and research. It is also known as the recognition of speech by a machine. In spite of focused research in this field over the past several decades, robust speech recognition with high reliability has not been achieved, as performance degrades in the presence of speaker variabilities, channel mismatch conditions, and noisy environments. The superb ability of the human auditory system has motivated researchers to include features of human perception in the speech recognition process. This dissertation investigates the roles of perceptual features of human hearing in automatic speech recognition in clean and noisy environments. Methods of simplified synaptic adaptation and two-tone suppression by companding are introduced by temporal processing of speech using a zero-crossing algorithm. It is observed that a high-frequency enhancement technique such as synaptic adaptation performs better in stationary Gaussian white noise, whereas a low-frequency enhancement technique such as two-tone suppression performs better in non-Gaussian, non-stationary noise types. The effects of static compression on ASR parametrization are investigated as observed in the psychoacoustic input/output (I/O) perception curves. A method of frequency-dependent asymmetric compression, that is, higher compression in the higher frequency regions than in the lower frequency regions, is proposed. By asymmetric compression, degradation of the spectral contrast of the low-frequency formants due to the added compression is avoided. A novel feature extraction method for ASR based on auditory processing in the cochlear nucleus is presented. The processing for synchrony detection, average discharge (mean rate) processing, and two-tone suppression is segregated and handled separately at the feature extraction level according to the differential processing scheme observed in the AVCN, PVCN, and DCN, respectively, of the cochlear nucleus. It is further observed that improved ASR performance can be achieved by separating synchrony detection from synaptic processing. A time-frequency perceptual spectral subtraction method based on several psychoacoustic properties of human audition is developed and evaluated with an ASR front-end. An auditory masking threshold is determined based on these psychoacoustic effects. It is observed that in speech recognition applications, spectral subtraction utilizing psychoacoustics may be used for improved performance in noisy conditions. The performance may be further improved if masking of noise by the tonal components is augmented by spectral subtraction in the masked region.

8. Peters, Richard Alan, II. "A Linear Prediction Coding Model of Speech (Synthesis, LPC, Computer, Electronic)." Thesis, The University of Arizona, 1985. http://hdl.handle.net/10150/291240.

9. Benkrid, A. "Real Time TLM Vocal Tract Modelling." Thesis, University of Nottingham, 1989. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.352958.

10. Hu, Hongwei. "Towards an Improved Model of Dynamics for Speech Recognition and Synthesis." Thesis, University of Birmingham, 2012. http://etheses.bham.ac.uk//id/eprint/3704/.

Abstract: This thesis describes research on the use of non-linear formant trajectories to model speech dynamics under the framework of a multiple-level segmental hidden Markov model (MSHMM). The particular type of intermediate-layer model investigated in this study is based on the 12-dimensional parallel formant synthesiser (PFS) control parameters, which can be directly used to synthesise speech with a formant synthesiser. The non-linear formant trajectories are generated using the speech parameter generation algorithm proposed by Tokuda and colleagues. The performance of the newly developed non-linear trajectory model of dynamics is tested against the piecewise linear trajectory model in both speech recognition and speech synthesis. In the speech synthesis experiments, the 12 PFS control parameters and their time derivatives are used as the feature vectors in the HMM-based text-to-speech system. Listening tests and objective test results show that, despite the low overall quality of the synthetic speech, the non-linear trajectory model of dynamics can significantly improve the intelligibility and naturalness of the synthetic speech. Moreover, the generated non-linear formant trajectories match actual formant trajectories in real human speech fairly well. The N-best list rescoring paradigm is employed for the speech recognition experiments. Both context-independent and context-dependent MSHMMs, based on different formant-to-acoustic mapping schemes, are used to rescore an N-best list. The rescoring results show that the introduction of the non-linear trajectory model of formant dynamics yields statistically significant improvement under certain mapping schemes. In addition, the smoothing in the non-linear formant trajectories has been shown to account for contextual effects such as coarticulation.

Books on the topic "Speech synthesis/recognition"

1. Holmes, Wendy, ed. Speech Synthesis and Recognition. 2nd ed. Taylor & Francis, 2002.

2. Speech Synthesis and Recognition. Taylor & Francis Inc, 2002.

3. Holmes, J. N. Speech Synthesis and Recognition. 2nd ed. Taylor & Francis, 2001.

4. Speech Synthesis and Recognition. Van Nostrand Reinhold, 1987.

5. Yannakoudakis, E. J. Speech Synthesis and Recognition Systems. E. Horwood, 1987.

6. Schroeder, Manfred R. Computer Speech: Recognition, Compression, Synthesis. Springer Berlin Heidelberg, 2004.

7. Schroeder, Manfred R. Computer Speech: Recognition, Compression, Synthesis. Springer Berlin Heidelberg, 1999.

8. United States. Social Security Administration. Technology Assessment and Forecasting Group. ADP Voice Technology: Speech Recognition and Speech Synthesis. U.S. Social Security Administration, 1985.

9. İnce, A. Nejat. Digital Speech Processing: Speech Coding, Synthesis and Recognition. Springer US, 1992.

Book chapters on the topic "Speech synthesis/recognition"

1. Granström, Björn. "Multi-modal Speech Synthesis with Applications." In Speech Processing, Recognition and Artificial Neural Networks. Springer London, 1999. http://dx.doi.org/10.1007/978-1-4471-0845-0_18.

2. Deshmukh, Om D. "Embedded Automatic Speech Recognition and Text-To-Speech Synthesis." In Speech in Mobile and Pervasive Environments. John Wiley & Sons, Ltd, 2012. http://dx.doi.org/10.1002/9781119961710.ch3.

3. Frison, Patrice, and Patrice Quinton. "Systolic Architectures for Connected Speech Recognition." In New Systems and Architectures for Automatic Speech Recognition and Synthesis. Springer Berlin Heidelberg, 1985. http://dx.doi.org/10.1007/978-3-642-82447-0_3.

4. Khomitsevich, Olga, Valentin Mendelev, Natalia Tomashenko, Sergey Rybin, Ivan Medennikov, and Saule Kudubayeva. "A Bilingual Kazakh-Russian System for Automatic Speech Recognition and Synthesis." In Speech and Computer. Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-23132-7_3.

5. Dettweiler, Helmut, and Wolfgang Hess. "Concatenation Rules for Demisyllable Speech Synthesis." In New Systems and Architectures for Automatic Speech Recognition and Synthesis. Springer Berlin Heidelberg, 1985. http://dx.doi.org/10.1007/978-3-642-82447-0_22.

6. Bisiani, Roberto. "Computer Systems for High-Performance Speech Recognition." In New Systems and Architectures for Automatic Speech Recognition and Synthesis. Springer Berlin Heidelberg, 1985. http://dx.doi.org/10.1007/978-3-642-82447-0_4.

7. Siegert, Ingo, Alicia Flores Lotz, Olga Egorow, and Andreas Wendemuth. "Improving Speech-Based Emotion Recognition by Using Psychoacoustic Modeling and Analysis-by-Synthesis." In Speech and Computer. Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-66429-3_44.

8. Suen, Ching Y., and Stephen B. Stein. "Synthesis of Speech by Computers and Chips." In New Systems and Architectures for Automatic Speech Recognition and Synthesis. Springer Berlin Heidelberg, 1985. http://dx.doi.org/10.1007/978-3-642-82447-0_19.

9. Vintsiuk, Taras K. "Generative Models for Automatic Speech Recognition, Understanding and Synthesis." In Speech Processing, Recognition and Artificial Neural Networks. Springer London, 1999. http://dx.doi.org/10.1007/978-1-4471-0845-0_11.

10. Vidal, Enrique, Francisco Casacuberta, Emilio Sanchis, and Jose M. Benedi. "A General Fuzzy-Parsing Scheme for Speech Recognition." In New Systems and Architectures for Automatic Speech Recognition and Synthesis. Springer Berlin Heidelberg, 1985. http://dx.doi.org/10.1007/978-3-642-82447-0_17.

Conference papers on the topic "Speech synthesis/recognition"

1. ElAarag, Hala, and Laura Schindler. "A Speech Recognition and Synthesis Tool." In Proceedings of the 44th Annual Southeast Regional Conference. ACM Press, 2006. http://dx.doi.org/10.1145/1185448.1185459.

2. Takashina, Masashi, Shingo Kuroiwa, Satoru Tsuge, and Fuji Ren. "Speech Bandwidth Extension Method Using Speech Recognition and Speech Synthesis." In 2006 International Conference on Communication Technology. IEEE, 2006. http://dx.doi.org/10.1109/icct.2006.341940.

3. Wan, Moquan, Gilles Degottex, and Mark J. F. Gales. "Integrated Speaker-Adaptive Speech Synthesis." In 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2017. http://dx.doi.org/10.1109/asru.2017.8269006.

4. Bargmann, Robert, Volker Blanz, and Hans-Peter Seidel. "A Nonlinear Viseme Model for Triphone-Based Speech Synthesis." In Gesture Recognition (FG). IEEE, 2008. http://dx.doi.org/10.1109/afgr.2008.4813362.

5. Al Bawab, Ziad, Bhiksha Raj, and Richard M. Stern. "Analysis-by-Synthesis Features for Speech Recognition." In ICASSP 2008, IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2008. http://dx.doi.org/10.1109/icassp.2008.4518577.

6. Benoit, C. "Synthesis and Automatic Recognition of Audio-Visual Speech." In IEE Colloquium on Integrated Audio-Visual Processing for Recognition, Synthesis and Communication. IEE, 1996. http://dx.doi.org/10.1049/ic:19961145.

7. Liu, Alexander H., Tao Tu, Hung-yi Lee, and Lin-shan Lee. "Towards Unsupervised Speech Recognition and Synthesis with Quantized Speech Representation Learning." In ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2020. http://dx.doi.org/10.1109/icassp40776.2020.9053571.

8. Cheng, En, and Jing Huang. "Application of Speech Recognition and Synthesis on Underwater Acoustic Speech Transmission." In Proceedings of the 2003 International Conference on Neural Networks and Signal Processing. IEEE, 2003. http://dx.doi.org/10.1109/icnnsp.2003.1280739.

9. Saito, Tatsuhiko, Takashi Nose, Takao Kobayashi, Yohei Okato, and Akio Horii. "Performance Prediction of Speech Recognition Using Average-Voice-Based Speech Synthesis." In Interspeech 2011. ISCA, 2011. http://dx.doi.org/10.21437/interspeech.2011-366.

10. Matthews, I. A. "Scale Based Features for Audiovisual Speech Recognition." In IEE Colloquium on Integrated Audio-Visual Processing for Recognition, Synthesis and Communication. IEE, 1996. http://dx.doi.org/10.1049/ic:19961152.

Reports on the topic "Speech synthesis/recognition"

1. Ore, Brian M. Speech Recognition, Articulatory Feature Detection, and Speech Synthesis in Multiple Languages. Defense Technical Information Center, 2009. http://dx.doi.org/10.21236/ada519140.