Journal articles on the topic 'Speech synthesis/recognition'

Consult the top 50 journal articles for your research on the topic 'Speech synthesis/recognition.'

1

Taylor, H. Rosemary. "Book Review: Speech Synthesis and Recognition Systems, Speech Synthesis and Recognition." International Journal of Electrical Engineering & Education 26, no. 4 (1989): 366. http://dx.doi.org/10.1177/002072098902600409.

2

Jassem, Wiktor. "Speech Synthesis and Recognition." Journal of Phonetics 17, no. 3 (1989): 245–47. http://dx.doi.org/10.1016/s0095-4470(19)30433-4.

3

Talkin, David. "Fundamentals of Speech Synthesis and Speech Recognition." Language and Speech 39, no. 1 (1996): 91–94. http://dx.doi.org/10.1177/002383099603900105.

4

Sanjay, Gaikwad Vijayendra. "Dictionary Application with Speech Recognition and Speech Synthesis." International Journal of Advanced Research in Computer Science 9, no. 1 (2018): 27–29. http://dx.doi.org/10.26483/ijarcs.v9i1.5155.

5

Järvinen, Kari. "Digital speech processing: Speech coding, synthesis, and recognition." Signal Processing 30, no. 1 (1993): 133–34. http://dx.doi.org/10.1016/0165-1684(93)90056-g.

6

Montlick, Terry. "Combination speech synthesis and recognition apparatus." Journal of the Acoustical Society of America 85, no. 6 (1989): 2693. http://dx.doi.org/10.1121/1.397292.

7

Rebai, Ilyes, and Yassine BenAyed. "Arabic speech synthesis and diacritic recognition." International Journal of Speech Technology 19, no. 3 (2016): 485–94. http://dx.doi.org/10.1007/s10772-016-9342-8.

8

Asiedu Asante, Bismark, and Hiroki Imamura. "Speech Recognition and Speech Synthesis Models for Micro Devices." ITM Web of Conferences 27 (2019): 05001. http://dx.doi.org/10.1051/itmconf/20192705001.

Abstract:
With the advent of speech-based interaction between humans and electronic devices, many applications now use speech recognition and speech synthesis technology. We have identified some limitations of these applications: they typically depend on abundant computing resources and internet connectivity, which makes it difficult to deploy the technology on micro devices and in areas with no internet connectivity. In this article, we develop smaller deep neural network models for automatic speech recognition (ASR) and text-to-speech (TTS) communication on micro devices such as the Raspberry Pi. We tested and evaluated the models, and their accuracy and performance show that they are well suited to application development on micro devices.
9

Janai, Siddhanna, Shreekanth T., Chandan M., and Ajish K. Abraham. "Speech-to-Speech Conversion." International Journal of Ambient Computing and Intelligence 12, no. 1 (2021): 184–206. http://dx.doi.org/10.4018/ijaci.2021010108.

Abstract:
A novel approach to building a speech-to-speech conversion (STSC) system for individuals with the speech impairment dysarthria is described. The STSC system takes impaired speech with inherent disturbance as input and produces synthesized output speech with good pronunciation and noise-free utterance. The system involves two stages, namely automatic speech recognition (ASR) and automatic speech synthesis. ASR transforms speech into text, while automatic speech synthesis (or text-to-speech [TTS]) performs the reverse task. At present, the recognition system is developed for a small vocabulary of 50 words, and accuracy of 94% is achieved for normal speakers and 88% for speakers with dysarthria. The output speech of the TTS system achieved a MOS value of 4.5 out of 5, obtained by averaging the responses of 20 listeners. This method of STSC would serve as an augmentative and alternative communication aid for speakers with dysarthria.
10

Pisoni, David B. "A Brief Overview of Speech Synthesis and Recognition Technologies." Proceedings of the Human Factors Society Annual Meeting 30, no. 13 (1986): 1326–30. http://dx.doi.org/10.1177/154193128603001320.

Abstract:
An overview of several aspects of speech synthesis and recognition technologies is provided as background for subsequent speakers in this session. Specifically, we discuss speech synthesis by rule using automatic text-to-speech conversion and speaker-dependent isolated word recognition. Both of these speech I/O technologies have been developed sufficiently to the point where commercial products are now available for a number of applications. Some of the limitations of these devices are described and suggestions for future research in both synthesis and recognition are outlined.
11

Leyden, Klaske van. "Eric Keller (ed.) Fundamentals of Speech Synthesis and Speech Recognition." Functions of Language 3, no. 1 (1996): 146–47. http://dx.doi.org/10.1075/fol.3.1.14ley.

12

Taylor, H. Rosemary. "Book Review: Microcomputer Speech Synthesis and Recognition." International Journal of Electrical Engineering & Education 22, no. 1 (1985): 20. http://dx.doi.org/10.1177/002072098502200106.

13

King, Robin W. "Speech Signal Analysis, Synthesis and Recognition Exercises Using Matlab." International Journal of Electrical Engineering & Education 34, no. 2 (1997): 161–72. http://dx.doi.org/10.1177/002072099703400206.

Abstract:
Three MATLAB exercises covering speech signal analysis and principles of linear prediction, formant synthesis and speech recognition are described. These exercises, which are assessed components in an elective course on speech and language processing, enable undergraduate electrical engineering students to explore fundamentally important concepts in speech science and signal processing.
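The linear-prediction portion of the exercises above can be sketched compactly. Below is a generic autocorrelation-method LPC routine with the Levinson-Durbin recursion, written in Python rather than MATLAB and not taken from the cited course material:

```python
import numpy as np

def lpc_autocorr(signal, order):
    """Linear-prediction coefficients via the autocorrelation method
    and Levinson-Durbin recursion (textbook formulation)."""
    n = len(signal)
    # Autocorrelation lags r[0..order]
    r = np.array([np.dot(signal[:n - k], signal[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for order i
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        # Update predictor polynomial and residual energy
        a[1:i] = a[1:i] + k * a[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

# Example: a second-order predictor captures a single sinusoid almost exactly
t = np.arange(256)
frame = np.sin(2 * np.pi * 0.1 * t) * np.hamming(256)
coeffs, residual = lpc_autocorr(frame, order=2)
```

For a pure tone the prediction residual is a tiny fraction of the frame energy, and the roots of the predictor polynomial sit near the tone's frequency, which is the basis of formant estimation.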
14

Tatham, M. A. A. "An integrated knowledge base for speech synthesis and automatic speech recognition." Journal of Phonetics 13, no. 2 (1985): 175–88. http://dx.doi.org/10.1016/s0095-4470(19)30745-4.

15

Adam, Eriss Eisa Babikir. "Deep Learning Based NLP Techniques in Text to Speech Synthesis for Communication Recognition." Journal of Soft Computing Paradigm 2, no. 4 (2020): 209–15. http://dx.doi.org/10.36548/jscp.2020.4.002.

Abstract:
Computer systems are developing models for speech synthesis covering various aspects of natural language processing. Speech synthesis has been explored through articulatory, formant and concatenative synthesis; these techniques introduce considerable aperiodic distortion and yield exponentially increasing error rates during processing. Recently, advances in speech synthesis have moved strongly towards deep learning in order to achieve better performance, since leveraging large-scale data yields effective feature representations. The main objective of this research article is to apply deep learning techniques to speech synthesis and to compare their performance, in terms of aperiodic distortion, with prior algorithms in natural language processing.
16

Martinčić–Ipšić, Sanda, Slobodan Ribarić, and Ivo Ipšić. "Acoustic Modelling for Croatian Speech Recognition and Synthesis." Informatica 19, no. 2 (2008): 227–54. http://dx.doi.org/10.15388/informatica.2008.211.

17

Modi, Rohan. "Transcript Anatomization with Multi-Linguistic and Speech Synthesis Features." International Journal for Research in Applied Science and Engineering Technology 9, no. VI (2021): 1755–58. http://dx.doi.org/10.22214/ijraset.2021.35371.

Abstract:
Handwriting detection is the ability of a computer program to collect and analyze comprehensible handwritten input from various media such as photographs, newspapers and paper reports. Handwritten text recognition is a sub-discipline of pattern recognition, which refers to the classification of data or objects into categories or classes. Handwriting recognition is the process of transforming handwritten text in a specific language into its digitally expressible script, represented by a set of symbols known as letters or characters. Speech synthesis is the artificial production of human speech using machine-learning-based software and audio-output computer hardware. While there are many systems that convert normal language text into speech, the aim of this paper is to study optical character recognition with speech synthesis technology and to develop a cost-effective, user-friendly, image-based offline text-to-speech conversion system using a CRNN neural network model and a Hidden Markov Model. The automated interpretation of handwritten text can be very useful wherever large amounts of handwritten data must be processed, such as signature verification, analysis of various types of documents, and recognition of amounts handwritten on bank cheques.
18

Tasbolatov, M., N. Mekebayev, O. Mamyrbayev, M. Turdalyuly, and D. Oralbekova. "Algorithms and architectures of speech recognition systems." Psychology and Education Journal 58, no. 2 (2021): 6497–501. http://dx.doi.org/10.17762/pae.v58i2.3182.

Abstract:
Digital processing of the speech signal and the voice recognition algorithm are very important for fast and accurate automatic scoring in recognition technology. A voice is a signal of infinite information, and the direct analysis and synthesis of a complex speech signal is difficult because the information is contained within the signal.
Speech is the most natural way for people to communicate. The task of speech recognition is to convert speech into a sequence of words using a computer program.
This article presents an algorithm for extracting MFCCs for speech recognition. The MFCC algorithm reduces the required processing power by 53% compared to the conventional algorithm; automatic speech recognition is carried out using Matlab.
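The MFCC front end this abstract refers to follows a standard recipe: pre-emphasis, windowed power spectrum, triangular mel filterbank, log energies, and a DCT. A minimal single-frame sketch in Python/NumPy follows; it is the generic textbook pipeline, not the paper's reduced-complexity variant:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc_frame(frame, sr=16000, n_filters=26, n_ceps=13):
    """MFCCs for one frame: pre-emphasis, windowed power spectrum,
    triangular mel filterbank, log energies, DCT-II."""
    n_fft = len(frame)
    emph = np.append(frame[0], frame[1:] - 0.97 * frame[:-1])  # pre-emphasis
    spec = np.abs(np.fft.rfft(emph * np.hamming(n_fft))) ** 2  # power spectrum
    # Filterbank edge frequencies, equally spaced on the mel scale
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_filters, len(spec)))
    for i in range(n_filters):
        lo, mid, hi = bins[i], bins[i + 1], bins[i + 2]
        for k in range(lo, mid):
            fbank[i, k] = (k - lo) / max(mid - lo, 1)  # rising slope
        for k in range(mid, hi):
            fbank[i, k] = (hi - k) / max(hi - mid, 1)  # falling slope
    energies = np.log(fbank @ spec + 1e-10)  # log mel-band energies
    # DCT-II decorrelates the log energies into cepstral coefficients
    j = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * j + 1) / (2 * n_filters))
    return basis @ energies

# Example: cepstra of one 512-sample frame of a 1 kHz tone at 16 kHz
tone = np.sin(2 * np.pi * 1000.0 * np.arange(512) / 16000.0)
ceps = mfcc_frame(tone)
```

In a full recognizer this runs per overlapping frame, and the resulting coefficient sequence feeds the acoustic model.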
19

Kurematsu, Akira, Kazuya Takeda, Yoshinori Sagisaka, Shigeru Katagiri, Hisao Kuwabara, and Kiyohiro Shikano. "ATR Japanese speech database as a tool of speech recognition and synthesis." Speech Communication 9, no. 4 (1990): 357–63. http://dx.doi.org/10.1016/0167-6393(90)90011-w.

20

Pisoni, David B., and Howard C. Nusbaum. "Developing Methods for Assessing the Performance of Speech Synthesis and Recognition Systems." Proceedings of the Human Factors Society Annual Meeting 30, no. 13 (1986): 1344–48. http://dx.doi.org/10.1177/154193128603001324.

Abstract:
As speech I/O technology develops and improves, there is an increased need for standardized methods to systematically assess the performance of these systems. At the present time, speech synthesis and speech recognition technologies are at different levels of maturation and, accordingly, the procedures for testing the performance of these systems are at different stages of development. In the present paper, we describe the results of testing several text-to-speech systems using traditional intelligibility measures. In addition, we outline the design and philosophy of an automated testing procedure for measuring the performance of isolated utterance speaker-dependent speech recognition systems.
21

Dashtaki, Parnyan Bahrami. "An Investigation into Methodology and Metrics Employed to Evaluate the (Speech-to-Speech) Way in Translation Systems." Modern Applied Science 11, no. 4 (2017): 55. http://dx.doi.org/10.5539/mas.v11n4p55.

Abstract:
Speech-to-speech translation is a challenging problem, due to poor sentence planning typically associated with spontaneous speech, as well as errors caused by automatic speech recognition. Based upon a statistically trained speech translation system, this study investigates the methodologies and metrics employed to assess speech-to-speech translation systems. Speech translation is performed incrementally, based on the generation of partial hypotheses from speech recognition. Speech-input translation can be properly approached as a pattern recognition problem by means of statistical alignment models and stochastic finite-state transducers. Under this general framework, some specific models are presented; one feature of such models is their capability to learn automatically from training examples. The speech translation system consists of three modules: automatic speech recognition, machine translation and text-to-speech synthesis. Many procedures for combining speech recognition and machine translation have been proposed.
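The three-module cascade described here (ASR, machine translation, TTS) can be pictured as a simple function composition. The sketch below uses stand-in stub stages, invented purely for illustration, not any real recognizer or translator:

```python
# The cascade: ASR -> machine translation -> TTS. Each stage is a stub
# (invented for this sketch), but the composition mirrors the
# three-module architecture the abstract describes.

def asr(audio):
    """Speech -> source-language text (stub: the 'audio' dict carries its transcript)."""
    return audio["transcript"]

def translate(text, lexicon):
    """Source text -> target text via a toy word-for-word lexicon."""
    return " ".join(lexicon.get(w, w) for w in text.split())

def tts(text):
    """Target text -> synthetic 'audio' (stub placeholder)."""
    return {"waveform": f"<synthesized: {text}>"}

def speech_to_speech(audio, lexicon):
    return tts(translate(asr(audio), lexicon))

out = speech_to_speech({"transcript": "hello world"},
                       {"hello": "hola", "world": "mundo"})
```

In an incremental system like the one described, the ASR stage would instead emit partial hypotheses that are translated as they arrive.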
22

Terashima, Ryuta, Takayoshi Yoshimura, Toshihiro Wakita, Keiichi Tokuda, and Tadashi Kitamura. "Prediction Method of Speech Recognition Performance Based on HMM-based Speech Synthesis Technique." IEEJ Transactions on Electronics, Information and Systems 130, no. 4 (2010): 557–64. http://dx.doi.org/10.1541/ieejeiss.130.557.

23

Stevens, Kenneth N. "Understanding variability in speech: A requisite for advances in speech synthesis and recognition." Journal of the Acoustical Society of America 100, no. 4 (1996): 2634. http://dx.doi.org/10.1121/1.417755.

24

Pobar, Miran, Sanda Martinčić-Ipšić, and Ivo Ipšić. "Optimization of Cost Function Weights for Unit Selection Speech Synthesis Using Speech Recognition." Neural Network World 22, no. 5 (2012): 429–41. http://dx.doi.org/10.14311/nnw.2012.22.026.

25

Valencia, C. Roncancio, J. Gomez Garcia-Bermejo, and E. Zalama Casanova. "Combined Gesture-Speech Recognition and Synthesis Using Neural Networks." IFAC Proceedings Volumes 41, no. 2 (2008): 2968–73. http://dx.doi.org/10.3182/20080706-5-kr-1001.00499.

26

Singla, S. K., and R. K. Yadav. "Optical Character Recognition Based Speech Synthesis System Using LabVIEW." Journal of Applied Research and Technology 12, no. 5 (2014): 919–26. http://dx.doi.org/10.1016/s1665-6423(14)70598-x.

27

King, Simon. "Dependence and independence in automatic speech recognition and synthesis." Journal of Phonetics 31, no. 3-4 (2003): 407–11. http://dx.doi.org/10.1016/j.wocn.2003.09.002.

28

Asada, Taro, Ruka Adachi, Syuhei Takada, Yasunari Yoshitomi, and Masayoshi Tabuse. "Facial Expression Synthesis Using Vowel Recognition for Synthesized Speech." Proceedings of International Conference on Artificial Life and Robotics 25 (January 13, 2020): 398–402. http://dx.doi.org/10.5954/icarob.2020.os16-3.

29

Vojtech, Jennifer M., Michael D. Chan, Bhawna Shiwani, et al. "Surface Electromyography–Based Recognition, Synthesis, and Perception of Prosodic Subvocal Speech." Journal of Speech, Language, and Hearing Research 64, no. 6S (2021): 2134–53. http://dx.doi.org/10.1044/2021_jslhr-20-00257.

Abstract:
Purpose: This study aimed to evaluate a novel communication system designed to translate surface electromyographic (sEMG) signals from articulatory muscles into speech using a personalized, digital voice. The system was evaluated for word recognition, prosodic classification, and listener perception of synthesized speech. Method: sEMG signals were recorded from the face and neck as speakers with (n = 4) and without (n = 4) laryngectomy subvocally recited (silently mouthed) a speech corpus comprising 750 phrases (150 phrases with variable phrase-level stress). Corpus tokens were then translated into speech via personalized voice synthesis (n = 8 synthetic voices) and compared against phrases produced by each speaker when using their typical mode of communication (n = 4 natural voices, n = 4 electrolaryngeal [EL] voices). Naïve listeners (n = 12) evaluated synthetic, natural, and EL speech for acceptability and intelligibility in a visual sort-and-rate task, as well as phrasal stress discriminability via a classification mechanism. Results: Recorded sEMG signals were processed to translate sEMG muscle activity into lexical content and categorize variations in phrase-level stress, achieving a mean accuracy of 96.3% (SD = 3.10%) and 91.2% (SD = 4.46%), respectively. Synthetic speech was significantly higher in acceptability and intelligibility than EL speech, also leading to greater phrasal stress classification accuracy, whereas natural speech was rated as the most acceptable and intelligible, with the greatest phrasal stress classification accuracy. Conclusion: This proof-of-concept study establishes the feasibility of using subvocal sEMG-based alternative communication not only for lexical recognition but also for prosodic communication in healthy individuals, as well as those living with vocal impairments and residual articulatory function. Supplemental Material: https://doi.org/10.23641/asha.14558481
30

Dhanalakshmi, M., T. A. Mariya Celin, T. Nagarajan, and P. Vijayalakshmi. "Speech-Input Speech-Output Communication for Dysarthric Speakers Using HMM-Based Speech Recognition and Adaptive Synthesis System." Circuits, Systems, and Signal Processing 37, no. 2 (2017): 674–703. http://dx.doi.org/10.1007/s00034-017-0567-9.

31

Yamaguchi, Hiraki, Nobuo Sugi, Hiroshi Masumoto, and Hiroshi Furuya. "Speech Recognition and Speech Synthesis Unit and Application for Lighting Equipment in TV Studio." Journal of the Institute of Television Engineers of Japan 43, no. 6 (1989): 613–19. http://dx.doi.org/10.3169/itej1978.43.613.

32

Ciobanu, Dragoş, Valentina Ragni, and Alina Secară. "Speech Synthesis in the Translation Revision Process: Evidence from Error Analysis, Questionnaire, and Eye-Tracking." Informatics 6, no. 4 (2019): 51. http://dx.doi.org/10.3390/informatics6040051.

Abstract:
Translation revision is a relevant topic for translator training and research. Recent technological developments justify increased focus on embedding speech technologies—speech synthesis (text-to-speech) and speech recognition (speech-to-text)—into revision workflows. Despite some integration of speech recognition into computer-assisted translation (CAT)/translation environment tools (TEnT)/Revision tools, to date we are unaware of any CAT/TEnT/Revision tool that includes speech synthesis. This paper addresses this issue by presenting initial results of a case study with 11 participants exploring if and how the presence of sound, specifically in the source text (ST), affects revisers’ revision quality, preference and viewing behaviour. Our findings suggest an improvement in revision quality, especially regarding Accuracy errors, when sound was present. The majority of participants preferred listening to the ST while revising, but their self-reported gains on concentration and productivity were not conclusive. For viewing behaviour, a subset of eye-tracking data shows that participants focused more on the target text (TT) than the source regardless of the revising condition, though with differences in fixation counts, dwell time and mean fixation duration (MDF). Orientation and finalisation phases were also identified. Finally, speech synthesis appears to increase perceived alertness, and may prompt revisers to consult external resources more frequently.
33

O'Shaughnessy, Douglas, Louis Barbeau, David Bernardi, and Danièle Archambault. "Diphone speech synthesis." Speech Communication 7, no. 1 (1988): 55–65. http://dx.doi.org/10.1016/0167-6393(88)90021-0.

34

Vintsuk, T. K., M. M. Sazhok, R. A. Selukh, D. Ya Fedorin, O. A. Yukhimenko, and V. V. Robeyko. ". Automatic recognition, understanding and synthesis of speech signals in Ukraine." Upravlâûŝie sistemy i mašiny, no. 6 (278) (December 2018): 7–24. http://dx.doi.org/10.15407/usim.2018.06.007.

35

O'Shaughnessy, D. "Interacting with computers by voice: automatic speech recognition and synthesis." Proceedings of the IEEE 91, no. 9 (2003): 1272–305. http://dx.doi.org/10.1109/jproc.2003.817117.

36

Hogden, John, David Nix, Vincent Gracco, and Philip Rubin. "Stochastic word models for articulatorily constrained speech recognition and synthesis." Journal of the Acoustical Society of America 103, no. 5 (1998): 2774. http://dx.doi.org/10.1121/1.421411.

37

Rebai, Ilyes, and Yassine BenAyed. "Text-to-speech synthesis system with Arabic diacritic recognition system." Computer Speech & Language 34, no. 1 (2015): 43–60. http://dx.doi.org/10.1016/j.csl.2015.04.002.

38

Zhou, Feng Yu, Jin Huan Li, Guo Hui Tian, Bao Ye Song, and Cai Hong Li. "Research and Implementation of Embedded Voice Interaction System Based on ARM in Intelligent Space." Advanced Materials Research 433-440 (January 2012): 5620–27. http://dx.doi.org/10.4028/www.scientific.net/amr.433-440.5620.

Abstract:
This paper presents the design and implementation of a voice interaction system based on ARM for a service-robot intelligent space. An ARM Cortex-M3 based STM32F103 is used as the main controller, and the real-time embedded operating system μC/OS-II is used to schedule tasks and manage peripheral devices. LD3320 and XFS4041CN chips are used for speech recognition and speech synthesis, respectively. A dialogue set is established by defining a two-dimensional array, and the intelligent space updates the dialogue set dynamically via a ZigBee wireless network for multiple scenes. When the speech recognition module produces a recognition result, the dialogue management module sends the corresponding text to the speech synthesis module, enabling text-to-speech output; at the same time, the intelligent space decision support system sends corresponding commands to equipment via ZigBee. Extensive experiments and practical applications show that the voice interaction system designed in this paper satisfies the current requirements of voice interaction in service-robot intelligent spaces and has great application value.
39

Kitazume, Yoshiaki, Takeyuki Endo, and Toshio Kamimura. "Word Recognition, Speech Analysis and Synthesis System Using a Large Vocabulary Speech Recognition LSI and High Speed Signal Processor." IEEJ Transactions on Electronics, Information and Systems 108, no. 10 (1988): 850–57. http://dx.doi.org/10.1541/ieejeiss1987.108.10_850.

40

Li, Jing, Ting Xu, and Nan Yan Shen. "Design and Implementation of Voice Control System in Flexible Manufacturing Cell." Applied Mechanics and Materials 415 (September 2013): 9–13. http://dx.doi.org/10.4028/www.scientific.net/amm.415.9.

Abstract:
On the voice development platform of the Microsoft Speech SDK, speech recognition and speech synthesis modules based on a command-control mode are built in this paper. An Ethernet-based remote voice control system for an intelligent flexible manufacturing cell is developed for machine tools and industrial robots. The paper designs an intelligent voice control system in the LabVIEW development environment, which realizes human-machine voice interaction with the flexible manufacturing cell as well as remote voice control. Experimental studies show that the intelligent voice control system achieves a high speech recognition rate and high system reliability in a relatively quiet environment.
41

Li, Wei, Bai Hui Cui, Fa Wei Zhang, and Xing Guo. "A Smart Home System Based on Speech Recognition Technology." Applied Mechanics and Materials 713-715 (January 2015): 2123–25. http://dx.doi.org/10.4028/www.scientific.net/amm.713-715.2123.

Abstract:
In order to enhance the self-care ability of persons with disabilities and satisfy the demand for intelligent control of home appliances, a smart home system based on Microsoft speech synthesis and speech recognition technology is proposed. After system initialization, it receives voice commands sent by users; once the system recognizes the voice signal, it calls the voice feedback module and asks the user to confirm the instruction. After confirmation, the system records the voice command and converts it to an electrical control code that can be recognized by the control systems of common household appliances. Recognition of one voice command took less than 30 ms, and one speech interaction cycle took less than 3 seconds, demonstrating simple and efficient appliance control based on speech recognition technology.
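The confirm-then-dispatch flow this abstract describes can be outlined in a few lines. The command phrases and control codes below are invented for the sketch, not taken from the paper; a real system would emit codes understood by the appliance's control bus:

```python
# Confirm-then-dispatch command handling: a recognized voice command is
# only translated into an appliance control code after the user confirms
# the spoken feedback. Phrases and codes are hypothetical.

APPLIANCE_CODES = {
    "light on": 0x01,
    "light off": 0x02,
    "fan on": 0x11,
}

def handle_command(recognized, confirmed):
    """Return the control code for a confirmed command, else None."""
    if not confirmed:  # user rejected the voice-feedback prompt
        return None
    return APPLIANCE_CODES.get(recognized)  # None for unknown commands

code = handle_command("light on", confirmed=True)       # -> 0x01
rejected = handle_command("light on", confirmed=False)  # -> None
```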
42

Sorin, C., D. Jouvet, C. Gagnoulet, D. Dubois, D. Sadek, and M. Toularhoat. "Operational and experimental French telecommunication services using CNET speech recognition and text-to-speech synthesis." Speech Communication 17, no. 3-4 (1995): 273–86. http://dx.doi.org/10.1016/0167-6393(95)00035-m.

43

Werner, Lauren, Gaojian Huang, and Brandon J. Pitts. "Automated Speech Recognition Systems and Older Adults: A Literature Review and Synthesis." Proceedings of the Human Factors and Ergonomics Society Annual Meeting 63, no. 1 (2019): 42–46. http://dx.doi.org/10.1177/1071181319631121.

Abstract:
The number of older adults is growing significantly worldwide. At the same time, technological developments are rapidly evolving, and older populations are expected to interact more frequently with such sophisticated systems. Automated speech recognition (ASR) systems are one example of a technology that is increasingly present in daily life. However, age-related physical changes may alter speech production and limit the effectiveness of ASR systems for older individuals. The goal of this paper was to summarize the current knowledge on ASR systems and older adults. The PRISMA method was employed, and 17 studies were compared on the basis of word error rate (WER). Overall, WER was found to be influenced by age, gender, and the number of speech samples used to train ASR systems. This work has implications for the development of future human-machine technologies that will be used by a wide range of age groups.
44

Korsun, O. N., and A. V. Poliyev. "Optimal pattern synthesis for speech recognition based on principal component analysis." IOP Conference Series: Materials Science and Engineering 312 (February 2018): 012014. http://dx.doi.org/10.1088/1757-899x/312/1/012014.

45

Hakoda, Kazuo, Mikio Kitai, and Shigeki Sagayama. "Speech recognition and synthesis technology development at NTT for telecommunications services." International Journal of Speech Technology 2, no. 2 (1997): 145–53. http://dx.doi.org/10.1007/bf02208826.

46

Rao, P. V. S. "VOICE: An integrated speech recognition synthesis system for the Hindi language." Speech Communication 13, no. 1-2 (1993): 197–205. http://dx.doi.org/10.1016/0167-6393(93)90071-r.

47

Tejaswi, Ch V. "Virtual Voice Assistant." International Journal for Research in Applied Science and Engineering Technology 9, no. VI (2021): 4097–101. http://dx.doi.org/10.22214/ijraset.2021.35868.

Abstract:
This is a desktop application that can assist people with basic tasks using natural language. Virtual voice assistants can go online and search for an answer to a user's question; actions can be triggered using text or voice, and voice is the key. A virtual voice assistant is a personal assistant that uses natural language processing (NLP), voice recognition and speech synthesis to provide a service through a particular application. NLP is a branch of artificial intelligence that deals with the interaction between computers and human beings using natural language; its main objective is to read, convert, understand and make use of human languages in a manner that is valuable. Voice recognition is a hardware device or computer software program with the ability to decode the human voice; it is usually used to operate a gadget, execute commands, or write without using a mouse, keyboard or buttons. The artificial production of human speech is called speech synthesis; a system used for this purpose is called a speech computer or speech synthesizer and can be implemented in many software and hardware products.
48

Slaby, Christina A., Craig S. Hartley, and Robert Pulliam. "Speech Recognition, Speech Synthesis and Touch-Panel Technology Applied to the Control of Remotely Piloted Spacecraft." Proceedings of the Human Factors Society Annual Meeting 29, no. 12 (1985): 1140–43. http://dx.doi.org/10.1177/154193128502901214.

49

Gupta, Manish, Shambhu Shankar Bharti, and Suneeta Agarwal. "Gender-based speaker recognition from speech signals using GMM model." Modern Physics Letters B 33, no. 35 (2019): 1950438. http://dx.doi.org/10.1142/s0217984919504384.

Abstract:
Speech is a convenient medium of communication among human beings. Speaker recognition is the process of automatically recognizing a speaker by processing the information included in the speech signal. In this paper, a new two-level approach to speaker recognition from the speech signal is proposed. At the first level, the gender of the speaker is recognized; at the second level, the speaker is recognized based on the gender identified at the first level. After the gender is recognized, the search space for the second level is halved, as the speaker recognition system searches only the set of speech signals belonging to the identified gender. To identify gender, the gender-specific features Mel Frequency Cepstral Coefficients (MFCC) and pitch are used; the speaker is then recognized using the speaker-specific features MFCC, pitch and RASTA-PLP. Support Vector Machine (SVM) and Gaussian Mixture Model (GMM) classifiers are used for identifying the gender and recognizing the speaker, respectively. Experiments are performed on speech signals from two databases: "IIT-Madras speech synthesis and recognition" (speech samples spoken in English by eight male and eight female speakers from eight different regions) and "ELSDSR" (speech samples spoken in English by five males and five females). It is observed experimentally that the two-level approach reduces the time taken for speaker recognition by 30–32% compared to recognizing the speaker without identifying the gender (single-level approach), and the accuracy of speaker recognition improves from 99.7% to 99.9% compared to the single-level approach. The experiments also show that a speech signal with a minimum duration of 1.12 (after neglecting silence parts) is sufficient for recognizing the speaker.
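The two-level idea, gating the speaker search by gender before matching, can be illustrated with a toy example. This sketch substitutes a crude pitch threshold and nearest-template matching for the paper's SVM and GMM classifiers; all feature vectors, names and thresholds are invented:

```python
import numpy as np

# Toy two-level recognition: classify gender first (mean-pitch threshold),
# then match the speaker only among enrolled templates of that gender, so
# the second-level search space is roughly halved.

def recognize(features, pitch_hz, enrolled):
    """enrolled maps speaker -> (gender, mean feature vector)."""
    gender = "female" if pitch_hz > 165.0 else "male"  # first level
    # Second level: nearest template, searched within one gender only
    candidates = {s: f for s, (g, f) in enrolled.items() if g == gender}
    return min(candidates, key=lambda s: np.linalg.norm(features - candidates[s]))

enrolled = {
    "alice": ("female", np.array([1.0, 2.0])),
    "bob":   ("male",   np.array([4.0, 1.0])),
    "carol": ("female", np.array([2.5, 0.5])),
}
speaker = recognize(np.array([1.1, 1.9]), pitch_hz=210.0, enrolled=enrolled)
```

Because the gender gate removes half the enrolled models before matching, the second level does proportionally less work, which is the source of the runtime saving the abstract reports.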
50

Delic, Vlado, Darko Pekar, Radovan Obradovic, and Milan Secujski. "Speech signal processing in ASR&TTS algorithms." Facta universitatis - series: Electronics and Energetics 16, no. 3 (2003): 355–64. http://dx.doi.org/10.2298/fuee0303355d.

Abstract:
Speech signal processing and modeling in systems for continuous speech recognition and text-to-speech synthesis in the Serbian language are described in this paper. Both systems were fully developed by the authors and do not use any third-party software. The accuracy of the speech recognizer and the intelligibility of the TTS system are in the range of the best solutions in the world, and all conditions are met for commercial use of these solutions.