Dissertations / Theses on the topic 'Speech-to-Text'

Author: Grafiati

Published: 4 June 2021

Last updated: 26 July 2025

Consult the top 50 dissertations / theses for your research on the topic 'Speech-to-Text.'

1

Read, Ian Harvey. "Approaches to prosody prediction for text-to-text speech synthesis." Thesis, University of East Anglia, 2007. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.436699.

2

陳我智 and Ngor-chi Chan. "Text-to-speech conversion for Putonghua." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1990. http://hub.hku.hk/bib/B31209580.

3

Chan, Ngor-chi. "Text-to-speech conversion for Putonghua /." [Hong Kong : University of Hong Kong], 1990. http://sunzi.lib.hku.hk/hkuto/record.jsp?B12929475.

4

Romsdorfer, Harald. "Polyglot text to speech synthesis text analysis & prosody control." Aachen Shaker, 2009. http://d-nb.info/993448836/04.

5

Mwanyoha, Sadiki Pili 1974. "A speech recognition module for speech-to-text language translation." Thesis, Massachusetts Institute of Technology, 1998. http://hdl.handle.net/1721.1/9862.

Abstract:

Thesis (S.B. and M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1998. Includes bibliographical references (leaves 47-48). By Sadiki Pili Mwanyoha.

6

Watts, Oliver Samuel. "Unsupervised learning for text-to-speech synthesis." Thesis, University of Edinburgh, 2013. http://hdl.handle.net/1842/7982.

Abstract:

This thesis introduces a general method for incorporating the distributional analysis of textual and linguistic objects into text-to-speech (TTS) conversion systems. Conventional TTS conversion uses intermediate layers of representation to bridge the gap between text and speech. Collecting the annotated data needed to produce these intermediate layers is a far from trivial task, possibly prohibitively so for languages in which no such resources are in existence. Distributional analysis, in contrast, proceeds in an unsupervised manner, and so enables the creation of systems using textual data t

7

Vine, Daniel Samuel Gordon. "Time-domain concatenative text-to-speech synthesis." Thesis, Bournemouth University, 1998. http://eprints.bournemouth.ac.uk/351/.

Abstract:

A concatenation framework for time-domain concatenative speech synthesis (TDCSS) is presented and evaluated. In this framework, speech segments are extracted from CV, VC, CVC and CC waveforms, and abutted. Speech rhythm is controlled via a single duration parameter, which specifies the initial portion of each stored waveform to be output. An appropriate choice of segmental durations reduces spectral discontinuity problems at points of concatenation, thus reducing reliance upon smoothing procedures. For text-to-speech considerations, a segmental timing system is described, which predicts segmen

8

SOLEWICZ, JOSE ALBERTO. "TEXT-TO-SPEECH SYNTHESIS FOR BRAZILIAN PORTUGUESE." PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO, 1993. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=8690@1.

Abstract:

This work presents a text-to-speech synthesis system for unrestricted text in Brazilian Portuguese. The system is based on rule-driven concatenation of previously encoded speech units. An extremely small inventory of synthesis units (149 units) is proposed, composed basically of consonant-vowel (CV) transitions, which represent acoustic segments that are crucial in the speech production process. It was shown to be possible to produce highly intelligible speech by concatenating these units. Also proposed is the use of a model

9

Kullmann, Emelie. "Speech to Text for Swedish using KALDI." Thesis, KTH, Optimeringslära och systemteori, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-189890.

Abstract:

The field of speech recognition has during the last decade left the research stage and found its way into the public market. Most computers and mobile phones sold today support dictation and transcription in a number of chosen languages. Swedish is often not one of them. In this thesis, which is executed on behalf of the Swedish Radio, an Automatic Speech Recognition model for Swedish is trained and the performance evaluated. The model is built using the open source toolkit Kaldi. Two approaches to training the acoustic part of the model are investigated. Firstly, using Hidden Markov Model

10

Romsdorfer, Harald [Verfasser]. "Polyglot Text-to-Speech Synthesis : Text Analysis & Prosody Control / Harald Romsdorfer." Aachen : Shaker, 2009. http://d-nb.info/1156517354/34.

11

Garrido, Almiñana Juan María. "Modelling Spanish Intonation for Text-to-Speech Applications." Doctoral thesis, Universitat Autònoma de Barcelona, 1996. http://hdl.handle.net/10803/4885.

12

Micallef, Paul. "A text to speech synthesis system for Maltese." Thesis, University of Surrey, 1997. http://epubs.surrey.ac.uk/842702/.

Abstract:

The subject of this thesis covers a considerably varied multidisciplinary area which needs to be addressed to be able to achieve a text-to-speech synthesis system of high quality, in any language. This is the first time that such a system has been built for Maltese, and therefore, there was the additional problem of no computerised sources or corpora. However many problems and much of the system designs are common to all languages. This thesis focuses on two general problems. The first is that of automatic labelling of phonemic data, since this is crucial for the setting up of Maltese speech c

13

Camargo, Ana Paula Leite de. "A ação vocal nos leitores text-to-speech." Pontifícia Universidade Católica de São Paulo, 2012. https://tede2.pucsp.br/handle/handle/18107.

Abstract:

This paper aims to investigate the main characteristics of the vocal action (intonations of speech) and its possible application in text-to-speech software (screen readers) according to the score of the voice in Gayotto (2002). Thus, we observed: orality in Classical Greece and the transformation that literacy gave the Western world supported by the speech of McLuhan; the evolution of

14

Monaghan, Alexander Ian Campbell. "Intonation in a text-to-speech conversion system." Thesis, University of Edinburgh, 1991. http://hdl.handle.net/1842/20023.

Abstract:

This thesis presents the development and implementation of a set of rules to generate intonational specifications for unrestricted text. The theoretical assumptions which motivate this work are outlined, and the performance of the rules is discussed with reference to various test corpora and formal evaluation experiments. The development of our rules is seen as a cycle involving the implementation of theoretical ideas about intonation in a text-to-speech conversion system, the testing of that implementation against some relevant body of data, and the refinement of the theory on the basis of th

15

Larreategui, Mikel. "High-quality text-to-speech synthesis using sinusoidal techniques." Thesis, Staffordshire University, 1996. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.309790.

16

Guennec, David. "Study of unit selection text-to-speech synthesis algorithms." Thesis, Rennes 1, 2016. http://www.theses.fr/2016REN1S055/document.

Abstract:

Corpus-based speech synthesis (unit selection) is the main subject of this thesis. First, an in-depth analysis and diagnosis of the unit selection algorithm (a search algorithm over the lattice of units) are presented. The importance of the optimality of the solution is discussed, and a new implementation of the selection based on an A* algorithm is presented. Three improvements to the cost function are also presented. The first is a new way, within the target cost, of minimizing spectral differences by selecting sequences

17

Rivera, Perez Jean F. "The Use of Text-to-Speech to Teach Vocabulary to English Language Learners." University of Cincinnati / OhioLINK, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1470753301.

18

Badino, Leonardo. "Identifying prosodic prominence patterns for English text-to-speech synthesis." Thesis, University of Edinburgh, 2010. http://hdl.handle.net/1842/4744.

Abstract:

This thesis proposes to improve and enrich the expressiveness of English Text-to-Speech (TTS) synthesis by identifying and generating natural patterns of prosodic prominence. In most state-of-the-art TTS systems the prediction from text of prosodic prominence relations between words in an utterance relies on features that very loosely account for the combined effects of syntax, semantics, word informativeness and salience, on prosodic prominence. To improve prosodic prominence prediction we first follow up the classic approach in which prosodic prominence patterns are flattened into binary seq

19

Cohen, Aaron Seth 1974. "Automatic generation of fundamental frequency for text-to-speech synthesis." Thesis, Massachusetts Institute of Technology, 1997. http://hdl.handle.net/1721.1/43501.

Abstract:

Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1997. Includes bibliographical references (p. 82-86). By Aaron Seth Cohen.

20

Rousseau, Francois. "Design of an advanced Text-To-Speech system for Afrikaans." Master's thesis, University of Cape Town, 2006. http://hdl.handle.net/11427/5112.

Abstract:

Word processed copy. Includes bibliographical references (leaves 87-92). Afrikaans is the home language to approximately six million people in South Africa. The need for an Afrikaans TTS system comes with the growing interest in integrating speech technology in all eleven languages of the country. The ultimate goal here is to enable communication between man and machine using speech. This can be achieved with the use of speech technology by implementing multilingual technological systems that all the people in South Africa can understand and relate to. Understandability, flexibility, nat

21

Kulkarni, Ajinkya. "Expressivity transfer in deep learning based text-to-speech synthesis." Electronic Thesis or Diss., Université de Lorraine, 2022. http://www.theses.fr/2022LORR0122.

Abstract:

Although text-to-speech synthesis has enjoyed immense success in human-machine interaction in recent years, current systems are perceived as monotonous because they lack expressivity. Expressivity in speech generally refers to the suprasegmental characteristics represented by emotions, speaking styles, gestures and facial expressions, etc. Expressive speech synthesis should considerably improve the user experience with machines. The development of an expressive speech synthesis system

22

Pollard, Matthew Peter. "Waveform interpolation methods for pitch and time-scale modification of speech." Thesis, University of Liverpool, 1997. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.263905.

23

Valentini, Botinhão Cássia. "Intelligibility enhancement of synthetic speech in noise." Thesis, University of Edinburgh, 2013. http://hdl.handle.net/1842/8877.

Abstract:

Speech technology can facilitate human-machine interaction and create new communication interfaces. Text-To-Speech (TTS) systems provide speech output for dialogue, notification and reading applications as well as personalized voices for people that have lost the use of their own. TTS systems are built to produce synthetic voices that should sound as natural, expressive and intelligible as possible and if necessary be similar to a particular speaker. Although naturalness is an important requirement, providing the correct information in adverse conditions can be crucial to certain applications.

24

Sullivan, Kirk Patrick Haig. "Synthesis-by-analogy : a psychologically-motivated approach to computer text-to-speech conversion." Thesis, University of Southampton, 1992. https://eprints.soton.ac.uk/250078/.

25

Muldoon, Paul. "Processing of English text with a view to automatic speech synthesis." Thesis, Queen's University Belfast, 1986. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.329543.

26

Slott, Jordan Matthew. "A general platform and markup language for text to speech synthesis." Thesis, Massachusetts Institute of Technology, 1996. http://hdl.handle.net/1721.1/38811.

27

Falai, Alessio. "Conditioning Text-to-Speech synthesis on dialect accent: a case study." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2022. http://amslaurea.unibo.it/25805/.

Abstract:

Modern text-to-speech systems are modular in many different ways. In recent years, end-users gained the ability to control speech attributes such as degree of emotion, rhythm and timbre, along with other suprasegmental features. More ambitious objectives are related to modelling a combination of speakers and languages, e.g. to enable cross-speaker language transfer. However, no prior work has been done on the more fine-grained analysis of regional accents. To fill this gap, in this thesis we present practical end-to-end solutions to synthesise speech while controlling within-country variations

28

Odéjobí, Odétùnjí A. "A computational model of prosody for Yorùbá text-to-speech synthesis." Thesis, Aston University, 2005. http://publications.aston.ac.uk/10683/.

Abstract:

This work examines prosody modelling for the Standard Yorùbá (SY) language in the context of computer text-to-speech synthesis applications. The thesis of this research is that it is possible to develop a practical prosody model by using appropriate computational tools and techniques which combines acoustic data with an encoding of the phonological and phonetic knowledge provided by experts. Our prosody model is conceptualised around a modular holistic framework. The framework is implemented using the Relational Tree (R-Tree) techniques (Ehrich and Foith, 1976). R-Tree is a sophisticated data

29

Breitenbücher, Mark. "Textvorverarbeitung zur deutschen Version des Festival Text-to-Speech Synthese Systems." [S.l.]: Universität Stuttgart, Fakultät Philosophie, 1997. http://www.bsz-bw.de/cgi-bin/xvms.cgi?SWB6783514.

30

Engell, Trond Bøe. "TaleTUC: Text-to-Speech and Other Enhancements to Existing Bus Route Information Systems." Thesis, Norges teknisk-naturvitenskapelige universitet, Institutt for datateknikk og informasjonsvitenskap, 2012. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-18920.

Abstract:

As smartphone sales increase, the demand for content for these devices also increases. Service providers that want to reach out to as many users as possible need to create smartphone applications that satisfy people that do not fall into the "normal user" category. People that require non-visual feedback, such as visually impaired persons, need output in form of auditory signals. Text-to-speech synthesis provides this functionality, giving the smartphone the ability to convey messages in the form of speech. This thesis describes TaleTUC: Text-to-speech, a proof of concept text-to-speech syste

31

Baloyi, Ntsako. "A text-to-speech synthesis system for Xitsonga using hidden Markov models." Thesis, University of Limpopo (Turfloop Campus), 2012. http://hdl.handle.net/10386/1021.

Abstract:

Thesis (M.Sc. (Computer Science))--University of Limpopo, 2013. This research study focuses on building a general-purpose working Xitsonga speech synthesis system that is, as far as reasonably possible, intelligible, natural sounding, and flexible. The system built has to be able to model some of the desirable speaker characteristics and speaking styles. This research project forms part of the broader national speech technology project that aims at developing spoken language systems for human-machine interaction using the eleven official languages of South Africa (SA). Speech synthesis

32

Cohen, Andrew Dight. "The use of learnable phonetic representations in connectionist text-to-speech system." Thesis, University of Reading, 1997. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.360787.

33

Low, Phuay Hui. "Statistical analysis, modelling and synthesis of voice for text to speech synthesis." Thesis, Brunel University, 2004. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.401342.

34

Odéjobí, Odétúnjí Àjàdí. "A computational model of prosody for Yorùbá text-to-speech synthesis." Thesis, Aston University, 2005. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.420173.

35

Lameris, Harm. "Homograph Disambiguation and Diacritization for Arabic Text-to-Speech Using Neural Networks." Thesis, Uppsala universitet, Institutionen för lingvistik och filologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-446509.

Abstract:

Pre-processing Arabic text for Text-to-Speech (TTS) systems poses major challenges, as Arabic omits short vowels in writing. This omission leads to a large number of homographs, and means that Arabic text needs to be diacritized to disambiguate these homographs, in order to be matched up with the intended pronunciation. Diacritizing Arabic has generally been achieved by using rule-based, statistical, or hybrid methods that combine rule-based and statistical methods. Recently, diacritization methods involving deep learning have shown promise in reducing error rates. These deep-learning methods

36

Igeland, Viktor. "Generating Facial Animation With Emotions In A Neural Text-To-Speech Pipeline." Thesis, Linköpings universitet, Medie- och Informationsteknik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-160535.

Abstract:

This thesis presents the work of incorporating facial animation with emotions into a neural text-to-speech pipeline. The project aims to allow for a digital human to utter sentences given only text, removing the need for video input. Our solution consists of a neural network able to generate blend shape weights from speech which is placed in a neural text-to-speech pipeline. We build on ideas from previous work and implement a recurrent neural network using four LSTM layers and later extend this implementation by incorporating emotions. The emotions are learned by the network itself via the em

37

Shukla, Sunil Ravindra. "Improving High Quality Concatenative Text-to-Speech Using the Circular Linear Prediction Model." Diss., Georgia Institute of Technology, 2007. http://hdl.handle.net/1853/14481.

Abstract:

Current high quality text-to-speech (TTS) systems are based on unit selection from a large database that is both contextually and prosodically rich. These systems, albeit capable of natural voice quality, are computationally expensive and require a very large footprint. Their success is attributed to the dramatic reduction of storage costs in recent times. However, for many TTS applications a smaller footprint is becoming a standard requirement. This thesis presents a new method for representing speech segments that can improve the quality and/or reduce the footprint of current concatenative TTS

38

Schlünz, Georg Isaac. "The effects of part-of-speech tagging on text-to-speech synthesis for resource-scarce languages / G.I. Schlünz." Thesis, North-West University, 2010. http://hdl.handle.net/10394/4944.

Abstract:

In the world of human language technology, resource-scarce languages (RSLs) suffer from the problem of little available electronic data and linguistic expertise. The Lwazi project in South Africa is a large-scale endeavour to collect and apply such resources for all eleven of the official South African languages. One of the deliverables of the project is more natural text-to-speech (TTS) voices. Naturalness is primarily determined by prosody and it is shown that many aspects of prosodic modelling are, in turn, dependent on part-of-speech (POS) information. Solving the POS problem is, therefore,

39

deVille, Camille Rae. "Effect of digital highlighting on reading comprehension given text-to-speech technology for people with aphasia." Miami University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=miami158629144312354.

40

Mhlana, Siphe. "Development of isiXhosa text-to-speech modules to support e-Services in marginalized rural areas." Thesis, University of Fort Hare, 2011. http://hdl.handle.net/10353/495.

Abstract:

Information and Communication Technology (ICT) projects are being initiated and deployed in marginalized areas to help improve the standard of living for community members. This has led to a new field, which is responsible for information processing and knowledge development in rural areas, called Information and Communication Technology for Development (ICT4D). An ICT4D project has been implemented in a marginalized area called Dwesa; this is a rural area situated on the Wild Coast of the former homeland of Transkei, in the Eastern Cape Province of South Africa. In this rural community there

41

Mohasi, Lehlohonolo. "Prosody modelling for a Sesotho text-to-speech system using the Fujisaki model." Thesis, Stellenbosch : Stellenbosch University, 2015. http://hdl.handle.net/10019.1/97050.

42

Swart, Philippa H. "Prosodic features of imperatives in Xhosa : implications for a text-to-speech system." Thesis, Stellenbosch : Stellenbosch University, 2000. http://hdl.handle.net/10019.1/51891.

Abstract:

Thesis (MA)--University of Stellenbosch, 2000. ENGLISH ABSTRACT: This study focuses on the prosodic features of imperatives and the role of prosodies in the development of a text-to-speech (TTS) system for Xhosa, an African tone language. The perception of prosody is manifested in suprasegmental features such as fundamental frequency (pitch), intensity (loudness) and duration (length). Very little experimental research has been done on the prosodic features of any grammatical structures (moods and tenses) in Xhosa, therefore it has not yet been determined how and to what degree the d

43

Mohasi, Lehlohonolo. "Design of an advanced and fluent Sesotho text-to-speech system through intonation." Master's thesis, University of Cape Town, 2006. http://hdl.handle.net/11427/5155.

44

Atterer, Michaela. "Experiments on the prediction of prosodic phrasing for German text to speech synthesis." Stuttgart: Univ., AIMS, 2005. http://swbplus.bsz-bw.de/bsz116719958abs.htm.

45

Lai, Chun Han, and 賴俊翰. "A Python Implementation of Automatic Speech-text Synchronization Using Speech Recognition and Text-to-Speech Technology." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/53806441331969263004.

Abstract:

Master's thesis, Chang Gung University, Department of Computer Science and Information Engineering, academic year 103. With the advent of the global village, language learning has become an important issue. Proficiency in several languages is now an indicator of competitiveness, and listening and speaking abilities are considered especially important. In this study, we establish a method to create audiobooks with synchronized speech and text using "speech recognition" and "cloud text-to-speech" technology. Users can supply arbitrary articles of their own to create learning materials for the "shadowing technique" with this method. Besides, the materials are made by "word-level" spee

46

Tsai, Yu-lin, and 蔡育霖. "Taiwanese Text-to-Speech for Ancient Poems." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/zpe42k.

Abstract:

Master's thesis, National Taiwan Ocean University, Department of Computer Science and Information Engineering, academic year 106. Many studies deal with Taiwanese, but none yet addresses text-to-speech for ancient poems. This thesis proposes a system that can read Chinese ancient poems in Taiwanese. We start from the poems of the Tang Dynasty, because these are the data we found on the Internet and their numbers of words and sentences are more regular. This thesis deals with two major issues in reading Chinese ancient poems. The first issue is choosing a pronunciation, especially when a Chinese character has more than one reading provided in a dictionary. The second issue

47

WANG, SHAO-CHUAN, and 王紹全. "Mandarin End-To-End Text-To-Speech Research." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/budcna.

Abstract:

Master's thesis, National Central University, In-service Master's Program, Department of Computer Science and Information Engineering, academic year 107. Speech synthesis refers to the technique of converting text into speech. In the past, a speech synthesis system usually had multiple processing stages and drew on phonetics, acoustics, or other related domain knowledge, which created a high technical threshold. Owing to advances in hardware in recent years, deep learning methods based on neural network architectures have been widely adopted by researchers. This thesis also applies deep learning technology to a text-to-speech (TTS) system, by using End-To-End speech synthe

48

Huang, Wei Jay, and 黃偉杰. "An Application of Speech-Text Alignment in Speech Recognition to the Creation of Audio-Books with Speech-Text Synchronization." Thesis, 2012. http://ndltd.ncl.edu.tw/handle/56499100456762437913.

Abstract:

Master's thesis, Chang Gung University, Department of Computer Science and Information Engineering, academic year 100. This thesis uses speech-text alignment from speech recognition technology to build audiobooks with speech-text synchronization. The speech processing is handled with HTK, a free toolkit for speech recognition and speech training developed at Cambridge University, but HTK restricts the duration of the speech it can process. It cannot handle the speech of a long article, so we turned to SailAlign, a software package developed at the University of Southern California in January 2011. Stressed that it would deal wit

49

Shih, Wen-Li, and 石文俐. "Prosodic Modeling for Mandarin Text-To-Speech." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/80358440579150855515.

Abstract:

Master's thesis, National Tsing Hua University, Department of Computer Science and Information Engineering, academic year 94. A corpus-based TTS system is likely to suffer degraded naturalness due to the acoustic mismatch between selected synthesis units. Moreover, the collection of the speech corpus is also a labor-intensive task. Therefore, we have developed a carrier-sentence-based TTS system for Mandarin Chinese. Our lab is consistently trying to improve the TTS system such that a balance can be achieved considering synthesis speed, corpus size, and naturalness of the output utterances. In this thesis, several methods that generate the prosodic parameters of a Mandarin TTS s

50

Barry, Betsy L. "Transcription as speech-to-text data transformation." 2008. http://purl.galileo.usg.edu/uga%5Fetd/barry%5Fbetsy%5Fl%5F200812%5Fphd.
