
Dissertations on the topic "Synthesis of texts"


Browse the top 50 dissertations on the topic "Synthesis of texts".


1

Gilmore, Ian. "An abstract configuration of the epistemology of potentiality paradigm therapy : a qualitative meta-synthesis of theoretical texts." Thesis, University of Manchester, 2017. https://www.research.manchester.ac.uk/portal/en/theses/an-abstract-configuration-of-the-epistemology-of-potentiality-paradigm-therapy-a-qualitative-metasynthesis-of-theoretical-texts(cfe211bb-a414-4e27-ae0a-4348efc04aed).html.

Abstract:
The first step that I took in preparing myself to undertake what is in essence a piece of epistemological research was to divide the psychological therapies into two: the potentiality paradigm and the pathology paradigm. The former is based upon the potentiality model articulated by person-centred theorists like Dave Mearns and Brian Thorne, which is essentially a growth model, whilst the latter reflects a form of therapy that recognises people according to what may be considered 'wrong with' or 'deficient about' them, such as operates in the disciplines of medicine and clinical psychology. The main focus of this piece of research was to determine the epistemology that is at work with what actually goes on in the practice of potentiality paradigm therapy. In order to achieve this, I set about identifying, reading, analysing and eventually coding the most epistemologically rich writings that I could find from mainstream authors on potentiality paradigm therapy from the professional and the academic literature. It became clear from this analysis that the heart of what was actually going on in the practice of potentiality paradigm therapy as articulated in these theoretical writings could be coded into three main discourses: an experiential discourse, a relational discourse and a hermeneutic discourse, each of which I have considered to represent an epistemological discourse for the purposes of this piece of research. My next question was to ask myself how these discourses set about articulating the potentiality paradigm with respect to the practice of the psychological therapies, and the answer came back that they articulated the potentiality paradigm best when they worked concertedly rather than discretely. 
Indeed, it soon became apparent that the human brain integrates and synthesises the data that it receives by way of these three central discourses, and so it seemed only appropriate that I should work towards expressing these findings by creating a qualitative meta-synthesis of these three discourses: the experiential, the relational and the hermeneutic, which is exactly what I did. The epistemological mechanism by which these three discourses are integrated and synthesised needs to reflect the way in which the human brain integrates and synthesises the data that it receives, and the name given to this epistemological mechanism is dialectical constructivism. This is included along with the three epistemological discourses - the experiential, the relational and the hermeneutic - in the creative and interpretive synthesis in which this piece of research culminates, and is followed by an illustrative worked example showing how these discourses articulate the potentiality paradigm - concertedly - with respect to the practice of the psychological therapies. One of the advantages of applying this meta-model to the way in which we look at potentiality paradigm therapy is that it may be used to free us up to practice in the more dialogical ways which have been becoming increasingly favoured by practitioners in recent times. With our view of potentiality paradigm therapy mediated by this meta-model, we may find it easier to traverse across what many practitioners have tended to view as theoretical boundaries. It could also be viewed as a move towards a more functional and less structural form of governance or regulation, as expressed by Mearns and Thorne.
2

Read, Ian Harvey. "Approaches to prosody prediction for text-to-text speech synthesis." Thesis, University of East Anglia, 2007. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.436699.

3

Romsdorfer, Harald. "Polyglot text to speech synthesis: text analysis & prosody control." Aachen: Shaker, 2009. http://d-nb.info/993448836/04.

4

Le Goff, Bertrand. "Synthèse à partir du texte de visage 3D parlant français." Grenoble INPG, 1997. http://www.theses.fr/1997INPG0140.

Abstract:
The research presented in this thesis focuses on the bimodality of speech. To provide a research tool for visual speech, a visual speech synthesizer was developed for French. It predicts the time-varying control commands of a face model from a phonetic input. First, we present the face model, which we adapted so that it can be animated by parameters directly measurable on the front and profile views of a reference speaker. The quality of the face modelling was evaluated through a set of perception tests. We then survey the various models that address the central problem of speech: coarticulation. The approach we chose relies on dominance functions, which reproduce over time the influence that the production of each phonetic unit has on its neighbours. A methodology, generalizable to other languages, was developed to determine automatically the characteristic coefficients of these dominance functions from data measured on a reference speaker. This visual synthesis was synchronized with an acoustic synthesizer, enabling audiovisual animation of the face model from arbitrary French text. The audiovisual synthesis was evaluated in several tests. The parameter trajectories produced by the visual synthesizer were compared quantitatively with the trajectories observed on the reference speaker. The visual synthesizer was also evaluated in terms of intelligibility and compared with the intelligibility of the same face model driven by analysis/synthesis. This evaluation showed that the intelligibility of the model animated by the visual synthesizer is equivalent to that of the model animated by analysis/synthesis.
5

Watts, Oliver Samuel. "Unsupervised learning for text-to-speech synthesis." Thesis, University of Edinburgh, 2013. http://hdl.handle.net/1842/7982.

Abstract:
This thesis introduces a general method for incorporating the distributional analysis of textual and linguistic objects into text-to-speech (TTS) conversion systems. Conventional TTS conversion uses intermediate layers of representation to bridge the gap between text and speech. Collecting the annotated data needed to produce these intermediate layers is a far from trivial task, possibly prohibitively so for languages in which no such resources are in existence. Distributional analysis, in contrast, proceeds in an unsupervised manner, and so enables the creation of systems using textual data that are not annotated. The method therefore aids the building of systems for languages in which conventional linguistic resources are scarce, but is not restricted to these languages. The distributional analysis proposed here places the textual objects analysed in a continuous-valued space, rather than specifying a hard categorisation of those objects. This space is then partitioned during the training of acoustic models for synthesis, so that the models generalise over objects' surface forms in a way that is acoustically relevant. The method is applied to three levels of textual analysis: to the characterisation of sub-syllabic units, word units and utterances. Entire systems for three languages (English, Finnish and Romanian) are built with no reliance on manually labelled data or language-specific expertise. Results of a subjective evaluation are presented.
6

Vine, Daniel Samuel Gordon. "Time-domain concatenative text-to-speech synthesis." Thesis, Bournemouth University, 1998. http://eprints.bournemouth.ac.uk/351/.

Abstract:
A concatenation framework for time-domain concatenative speech synthesis (TDCSS) is presented and evaluated. In this framework, speech segments are extracted from CV, VC, CVC and CC waveforms, and abutted. Speech rhythm is controlled via a single duration parameter, which specifies the initial portion of each stored waveform to be output. An appropriate choice of segmental durations reduces spectral discontinuity problems at points of concatenation, thus reducing reliance upon smoothing procedures. For text-to-speech considerations, a segmental timing system is described, which predicts segmental durations at the word level, using a timing database and a pattern matching look-up algorithm. The timing database contains segmented words with associated duration values, and is specific to an actual inventory of concatenative units. Segmental duration prediction accuracy improves as the timing database size increases. The problem of incomplete timing data has been addressed by using 'default duration' entries in the database, which are created by re-categorising existing timing data according to articulation manner. If segmental duration data are incomplete, a default duration procedure automatically categorises the missing speech segments according to segment class. The look-up algorithm then searches the timing database for duration data corresponding to these re-categorised segments. The timing database is constructed using an iterative synthesis/adjustment technique, in which a 'judge' listens to synthetic speech and adjusts segmental durations to improve naturalness. This manual technique for constructing the timing database has been evaluated. Since the timing data is linked to an expert judge's perception, an investigation examined whether the expert judge's perception of speech naturalness is representative of people in general.
Listening experiments revealed marked similarities between an expert judge's perception of naturalness and that of the experimental subjects. It was also found that the expert judge's perception remains stable over time. A synthesis/adjustment experiment found a positive linear correlation between segmental durations chosen by an experienced expert judge and duration values chosen by subjects acting as expert judges. A listening test confirmed that between 70% and 100% intelligibility can be achieved with words synthesised using TDCSS. In a further test, a TDCSS synthesiser was compared with five well-known text-to-speech synthesisers, and was ranked fifth most natural out of six. An alternative concatenation framework (TDCSS2) was also evaluated, in which duration parameters specify both the start point and the end point of the speech to be extracted from a stored waveform and concatenated. In a similar listening experiment, TDCSS2 stimuli were compared with five well-known text-to-speech synthesisers, and were ranked fifth most natural out of six.
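The default-duration fallback described in this abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the segment names, the manner classes, the duration values and the function name `lookup_duration` are hypothetical and are not taken from the thesis.

```python
# Hypothetical data, for illustration only (not from the thesis).
# Measured durations in milliseconds for specific concatenative segments.
TIMING_DB = {"ka": 120, "at": 95}

# Hypothetical mapping from a segment to its articulation-manner class.
SEGMENT_CLASS = {
    "ka": "plosive-vowel",
    "ta": "plosive-vowel",
    "at": "vowel-plosive",
}

# Hypothetical 'default duration' entries, keyed by manner class.
DEFAULT_DB = {"plosive-vowel": 110, "vowel-plosive": 100}


def lookup_duration(segment):
    """Return (duration_ms, source) for a segment.

    An exact timing entry is preferred; if none exists, the segment is
    re-categorised by manner class and the class default is used.
    """
    if segment in TIMING_DB:
        return TIMING_DB[segment], "exact"
    manner = SEGMENT_CLASS.get(segment)
    if manner is not None and manner in DEFAULT_DB:
        return DEFAULT_DB[manner], "default"
    raise KeyError(f"no timing data for segment {segment!r}")
```

With this toy data, `lookup_duration("ka")` uses the measured entry, while `lookup_duration("ta")` has no measured entry and falls back to the default for its class.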
7

Solewicz, Jose Alberto. "Text-to-speech synthesis for Brazilian Portuguese." Pontifícia Universidade Católica do Rio de Janeiro, 1993. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=8690@1.

Abstract:
This work presents an unrestricted text-to-speech synthesis system for Brazilian Portuguese. The system is based on the rule-based concatenation of previously coded speech units. An extremely reduced set of synthesis units (149) is proposed. This set consists mostly of consonant-vowel (CV) transitions, which represent crucial acoustic segments in the speech production process. Production of highly intelligible speech is shown to be possible through concatenation of these units. A CELP model is also proposed as a compression and synthesis structure for the unit inventory, including the adaptations needed to modify the speech prosody during the decoding phase. Subjective listening tests showed that speech synthesized through the proposed CELP model is judged superior to that obtained through an LPC vocoder (mono-pulse/noise excited), which is traditionally used in text-to-speech synthesis systems.
8

Romsdorfer, Harald. "Polyglot Text-to-Speech Synthesis: Text Analysis & Prosody Control." Aachen: Shaker, 2009. http://d-nb.info/1156517354/34.

9

Low, Phuay Hui. "Statistical analysis, modelling and synthesis of voice for text to speech synthesis." Thesis, Brunel University, 2004. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.401342.

10

Muldoon, Paul. "Processing of English text with a view to automatic speech synthesis." Thesis, Queen's University Belfast, 1986. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.329543.

11

Micallef, Paul. "A text to speech synthesis system for Maltese." Thesis, University of Surrey, 1997. http://epubs.surrey.ac.uk/842702/.

Abstract:
The subject of this thesis covers a considerably varied multidisciplinary area which needs to be addressed to be able to achieve a text-to-speech synthesis system of high quality, in any language. This is the first time that such a system has been built for Maltese, and therefore there was the additional problem of no computerised sources or corpora. However, many problems and much of the system design are common to all languages. This thesis focuses on two general problems. The first is that of automatic labelling of phonemic data, since this is crucial for the setting up of Maltese speech corpora, which in turn can be used to improve the system. A novel way of achieving such automatic segmentation was investigated. This uses a mixed parameter model with maximum likelihood training of the first derivative of the features across a set of phonetic class boundaries. It was found that this gives good results even for continuous speech provided that a phonemic labelling of the text is available. A second general problem is that of segment concatenation, since the end and beginning of subsequent diphones can have mismatches in amplitude, frequency, phase and spectral envelope. The use of intermediate frames, built up from the last and first frames of two concatenated diphones, to achieve a smoother continuity was analysed. The analysis was done both in time and in frequency. The use of wavelet theory for the separation of the spectral envelope from the excitation was also investigated. The linguistic system modules have been built for this thesis. In particular, a rule-based grapheme-to-phoneme conversion system that is serial and not hierarchical was developed. The morphological analysis required the design of a system which allowed two dissimilar lexical structures (Semitic and Romance) to be integrated into one overall morphological analyser. Appendices at the back are included with detailed rules of the linguistic modules developed.
The present system, while giving satisfactory intelligibility, with capability of modifying duration, does not include as yet a prosodic module.
12

Sullivan, Kirk Patrick Haig. "Synthesis-by-analogy : a psychologically-motivated approach to computer text-to-speech conversion." Thesis, University of Southampton, 1992. https://eprints.soton.ac.uk/250078/.

13

Holt, Jay. "1,8-Diarylanthracenes as reagents for asymmetric synthesis." Diss., Georgia Institute of Technology, 1992. http://hdl.handle.net/1853/29902.

14

Baloyi, Ntsako. "A text-to-speech synthesis system for Xitsonga using hidden Markov models." Thesis, University of Limpopo (Turfloop Campus), 2012. http://hdl.handle.net/10386/1021.

Abstract:
Thesis (M.Sc. (Computer Science)), University of Limpopo, 2013. This research study focuses on building a general-purpose working Xitsonga speech synthesis system that is, as far as possible, reasonably intelligible, natural sounding, and flexible. The system built has to be able to model some of the desirable speaker characteristics and speaking styles. This research project forms part of the broader national speech technology project that aims at developing spoken language systems for human-machine interaction using the eleven official languages of South Africa (SA). Speech synthesis is the reverse of automatic speech recognition (which receives speech as input and converts it to text) in that it receives text as input and produces synthesized speech as output. It is generally accepted that most people find listening to spoken utterances better than reading the equivalent of such utterances. The Xitsonga speech synthesis system has been developed using a hidden Markov model (HMM) speech synthesis method. The HMM-based speech synthesis (HTS) system synthesizes speech that is intelligible and natural sounding. This method can synthesize speech on a footprint of only a few megabytes of training speech data. The HTS toolkit is applied as a patch to the HTK toolkit, which is a hidden Markov model toolkit primarily designed for use in speech recognition to build and manipulate hidden Markov models.
15

Mohamadi, Tayeb. "Synthèse à partir du texte de visages parlants : réalisation d'un prototype et mesures d'intelligibilité bimodale." Grenoble INPG, 1993. http://www.theses.fr/1993INPG0010.

Abstract:
The aim of this study is the geometric analysis of the different lip shapes in French, their audiovisual intelligibility, and the construction of a prototype synthesizer of a French-speaking talking face. In this manuscript, we first review the role of the lips in speech production and the contribution of seeing them to the intelligibility of degraded speech (a phonetic analysis of the confusions among the selected vowels and consonants was carried out in parallel). We then present the results of a study of lip geometry and movement, which identified about twenty basic lip shapes called visemes. Next, we present a prototype audiovisual text-to-speech synthesizer built from this set of visemes, and its evaluation in terms of intelligibility. Finally, we evaluate the contribution to the intelligibility of degraded natural speech of two synthetic lip models built at the ICP, in comparison with the natural case.
16

Larreategui, Mikel. "High-quality text-to-speech synthesis using sinusoidal techniques." Thesis, Staffordshire University, 1996. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.309790.

17

Guennec, David. "Study of unit selection text-to-speech synthesis algorithms." Thesis, Rennes 1, 2016. http://www.theses.fr/2016REN1S055/document.

Abstract:
This PhD thesis focuses on automatic speech synthesis, and more specifically on corpus-based unit selection. An in-depth analysis and diagnosis of the unit selection algorithm (the lattice search algorithm) is provided. The importance of solution optimality is discussed, and a new unit selection implementation based on an A* algorithm is presented. Three cost function enhancements are also presented. The first is a new way, within the target cost, to minimize important spectral differences by selecting sequences of candidate units that minimize a mean cost instead of units that each minimize an absolute target cost. This cost is tested on a phonemic duration distance but can be applied to other distances. Our second proposal is a target sub-cost addressing intonation, based on coefficients extracted through a generalized version of Fujisaki's command-response model. This model features gamma functions, called atoms, that model F0; the parameters of these functions are used within a target cost. Finally, our third contribution is a penalty system that aims to improve the concatenation cost. It penalizes units according to classes that rank the risk that a concatenation artifact occurs when concatenating on a phone of that class. This system differs from others in the literature in that it is tempered by a fuzzy function that softens the penalties for units whose concatenation costs are among the lowest of their distribution.
18

Shukla, Sunil Ravindra. "Improving High Quality Concatenative Text-to-Speech Using the Circular Linear Prediction Model." Diss., Georgia Institute of Technology, 2007. http://hdl.handle.net/1853/14481.

Abstract:
Current high quality text-to-speech (TTS) systems are based on unit selection from a large database that is both contextually and prosodically rich. These systems, albeit capable of natural voice quality, are computationally expensive and require a very large footprint. Their success is attributed to the dramatic reduction of storage costs in recent times. However, for many TTS applications a smaller footprint is becoming a standard requirement. This thesis presents a new method for representing speech segments that can improve the quality and/or reduce the footprint of current concatenative TTS systems. The circular linear prediction (CLP) model is revisited and combined with the constant pitch transform (CPT) to provide a robust representation of speech signals that allows for limited prosodic movements without a perceivable loss in quality. The CLP model assumes that each frame of voiced speech is an infinitely periodic signal. This assumption allows for LPC modeling using the covariance method, with the efficiency of the autocorrelation method. The CPT is combined with this model to provide a database that is uniform in pitch for matching the target prosody during synthesis. With this representation, limited prosody modifications and unit concatenation can be performed without causing audible artifacts. For resolving artifacts caused by pitch modifications in voicing transitions, a method has been introduced for reducing peakiness in the LP spectra by constraining the line spectral frequencies. Two experiments have been conducted to demonstrate the capabilities of the CLP/CPT method. The first is a listening test to determine the ability of this model to realize prosody modifications without perceivable degradation. Utterances are resynthesized using the CLP/CPT method with emphasized prosodics to increase intelligibility in harsh environments.
The second experiment compares the quality of utterances synthesized by unit-selection based limited-domain TTS against the CLP/CPT method. The results demonstrate that the CLP/CPT representation, applied to current concatenative TTS systems, can reduce the size of the database and increase the prosodic richness without noticeable degradation in voice quality.
19

Badino, Leonardo. "Identifying prosodic prominence patterns for English text-to-speech synthesis." Thesis, University of Edinburgh, 2010. http://hdl.handle.net/1842/4744.

Abstract:
This thesis proposes to improve and enrich the expressiveness of English Text-to-Speech (TTS) synthesis by identifying and generating natural patterns of prosodic prominence. In most state-of-the-art TTS systems the prediction from text of prosodic prominence relations between words in an utterance relies on features that very loosely account for the combined effects of syntax, semantics, word informativeness and salience on prosodic prominence. To improve prosodic prominence prediction we first follow up the classic approach in which prosodic prominence patterns are flattened into binary sequences of pitch accented and pitch unaccented words. We propose and motivate statistical and syntactic dependency based features that are complementary to the most predictive features proposed in previous works on automatic pitch accent prediction and show their utility on both read and spontaneous speech. Different accentuation patterns can be associated with the same sentence. Such variability raises the question of how to evaluate pitch accent predictors when more than one pattern is allowed. We carry out a study on the variability of prosodic symbols on a speech corpus where different speakers read the same text and propose an information-theoretic definition of optionality of symbolic prosodic events that leads to a novel evaluation metric in which prosodic variability is incorporated as a factor affecting prediction accuracy. We additionally propose a method to take advantage of the optionality of prosodic events in unit-selection speech synthesis. To better account for the tight links between the prosodic prominence of a word and the discourse/sentence context, part of this thesis goes beyond the accent/no-accent dichotomy and is devoted to a novel task, the automatic detection of contrast, where contrast is meant as an (Information Structure) relation that ties two words that explicitly contrast with each other.
This task is mainly motivated by the fact that contrastive words tend to be prosodically marked with particularly prominent pitch accents. The identification of contrastive word pairs is achieved by combining lexical information, syntactic information (which mainly aims to identify the syntactic parallelism that often activates contrast) and semantic information (mainly drawn from the WordNet semantic lexicon), within a Support Vector Machines classifier. Once we have identified patterns of prosodic prominence we propose methods to incorporate such information in TTS synthesis and test its impact on synthetic speech naturalness through some large-scale perceptual experiments. The results of these experiments cast some doubts on the utility of a simple accent/no-accent distinction in Hidden Markov Model based speech synthesis while highlighting the importance of contrastive accents.
20

Cohen, Aaron Seth. "Automatic generation of fundamental frequency for text-to-speech synthesis." Thesis, Massachusetts Institute of Technology, 1997. http://hdl.handle.net/1721.1/43501.

Abstract:
Thesis (M.Eng.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1997. Includes bibliographical references (p. 82-86).
21

Zheng, Yilin. "Text-Based Speech Video Synthesis from a Single Face Image." The Ohio State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=osu1572168353691788.

22

Kulkarni, Ajinkya. "Expressivity transfer in deep learning based text-to-speech synthesis." Electronic Thesis or Diss., Université de Lorraine, 2022. http://www.theses.fr/2022LORR0122.

Abstract:
Recently, text-to-speech (TTS) synthesis has gained immense success in the human-computer interaction domain. However, current TTS systems are perceived as monotonous due to the absence of expressivity. Expressivity in speech generally refers to suprasegmental speech characteristics represented by emotions, speaking styles, and the relationship between speech and gestures, facial expressions, etc. Expressive speech synthesis is likely to improve the user experience with machines considerably. The development of an expressive TTS system relies heavily on the speech data used in training the system.
The thesis aims at developing an expressive TTS system in a speaker's voice for which only neutral speech data is available. The main focus of the thesis is to investigate deep learning approaches for exploring the disentanglement of speaker information and expressivity in a multispeaker TTS setting. The scope of the work incorporates expressivity as an emotion attribute with well-defined emotion classes. We present various deep neural network architectures to create latent representations of speaker and expressivity in multispeaker TTS settings. During the expressivity transfer phase, representations from expressivity and speaker are used to interpolate for synthesizing expressive speech in desired speaker's voice. We present a deep metric learning framework for improving the latent representation of expressivity in a multispeaker TTS system setting, which results in improved expressivity transfer. The thesis work also investigates the expressivity transfer capability of probability density estimation based on deep generative models. The usage of deep generative models provides scalable modeling of complex, high-dimensional speech data and tractability of the system, resulting in high-quality speech synthesis. The evaluation of the proposed systems is a challenging aspect of the thesis, as no reference expressive speech data was available in the target speaker's voice. Therefore, we propose two subjective evaluation metrics, speaker MOS and expressive MOS, which indicate the performance of the framework to transfer the expressivity and the retention of the target speaker's voice. As it is a time-consuming process to conduct a subjective evaluation each time system is developed, we propose a cosine similarity-based evaluation metric to measure the strength of expressivity and the speaker's voice. 
The obtained results demonstrate the ability of the proposed work to transfer the expressivity with maintaining the overall quality of synthesized expressive speech in the target speaker's voice. It is hard to identify which neural network parameters represent the attributes of speaker characteristics and expressivity. Moreover, expressivity and speaker characteristics are bounded aspects of prosody parameters
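The cosine similarity-based metric mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which embeddings are compared, so the vectors below are invented for illustration, and the idea of scoring a synthesized utterance against a reference speaker embedding is an assumption rather than the thesis's actual procedure.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, invented for illustration only.
synthesized = [0.9, 0.1, 0.3]
target_speaker = [1.0, 0.0, 0.2]

# A value close to 1 would suggest the synthesized voice stays near the target.
speaker_score = cosine_similarity(synthesized, target_speaker)
```

The same function could be applied to expressivity embeddings to obtain a second score, mirroring the two subjective MOS measures.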
23

Learmonth, Robin Alec. "Studies in asymmetric synthesis." Thesis, Rhodes University, 1991. http://hdl.handle.net/10962/d1005018.

Abstract:
The concept of combining two well-established areas of organic chemistry, viz., organosilicon chemistry and the use of chiral auxiliaries, into a viable, alternative method of asymmetric synthesis has only very recently begun to receive attention. At the outset of this investigation, no asymmetric reactions of silyl enol ethers, chiral by virtue of optically active substituents on the silicon, had been reported. A range of novel chiral silyl enol ethers has thus been prepared from a variety of ketones, including pinacolone, cyclohexanone, and α-tetralone, and employing menthol, borneol, and cholesterol as chiral auxiliaries. These preparations have been achieved via several distinct routes, including a novel convergent approach involving the isolation of either the chloro(menthyloxy)dimethylsilane or the (bornyloxy)chlorodimethylsilane. The MS and NMR spectra of these silyl enol ethers were examined in detail and, in the case of the crystalline cholesteryloxy silyl enol ether, the X-ray structure has been determined. The potential of chloroalkoxysilanes to act as general, chiral derivatizing agents has been established by the preparation of diastereomeric silyl acetal mixtures of racemic secondary alcohols (e.g. 1-phenylethanol and 2-octanol). The experimental diastereomeric ratios, obtained by GLC and ¹H NMR spectroscopy, approached the expected value of unity, confirming the potential of the alkoxychlorosilanes as chiral probes. The chiral silyl enol ethers have been successfully oxidized to the corresponding α-siloxy ketones employing MCPBA, MMPP, and 2-(phenylsulphonyl)-3-phenyloxaziridine as oxidizing agents and the diastereomeric excesses obtained, which varied from 0 to 16%, indicated some potential for stereochemical control.
Alkylation and hydroxyalkylation reactions of the silyl enol ethers have yielded the expected α-tert-butyl and β-hydroxy ketones in good to excellent material yields, with the enantiomeric excesses, as determined by chiral shift reagent studies, reaching 14%. To improve the stereocontrol in these reactions, attempts have been made to prepare chiral silyl enol ethers with auxiliaries possessing the potential for transition state complex co-ordination in the reactions under consideration. The preparation of such silyl enol ethers, incorporating the proline-derived auxiliaries N-methyl-2-hydroxymethylpyrrolidine and 2-methoxymethylpyrrolidine, met with only limited success. In an alternative approach, three derivatives of 2,3-dihydroxybornane have been prepared. However, two of these auxiliaries, viz., 3-exo-benzyloxy-2-exo-hydroxybornane and 3-exo-(1-methoxyethoxy)-2-exo-hydroxybornane, failed to form silyl enol ethers, even under considerably more vigorous conditions than normally employed. The third derivative, 3,3-ethylenedioxy-2-hydroxybornane, has been successfully utilized in the preparation of a pinacolone-derived chiral silyl enol ether. Hydroxyalkylation of this compound with benzaldehyde has yielded the β-hydroxyketone with significantly improved enantiomeric excess (26%) and a transition state complex has been proposed to rationalize this improvement.
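To put the excess figures quoted in this abstract in perspective, the standard definition of enantiomeric excess can be written out. The symbols [R] and [S] (amounts of the two enantiomers) and the resulting 63:37 ratio are illustrative and do not appear in the thesis abstract; diastereomeric excess is defined analogously for a pair of diastereomers.

```latex
\[
\mathrm{ee} = \frac{\bigl|[R] - [S]\bigr|}{[R] + [S]} \times 100\%
\]
% Worked example for the best value reported above (26% ee):
\[
\mathrm{ee} = 26\% \;\Longrightarrow\;
\text{major} = \frac{100\% + 26\%}{2} = 63\%, \qquad
\text{minor} = \frac{100\% - 26\%}{2} = 37\%
\]
```

On this reading, the improvement from 14% to 26% ee corresponds to moving from roughly a 57:43 to a 63:37 mixture of enantiomers.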
24

Goddard-Borger, Ethan D. "Some synthetic carbohydrate chemistry : natural product synthesis, rational inhibitor design and the development of a new reagent." University of Western Australia. School of Biomedical, Biomolecular and Chemical Sciences, 2008. http://theses.library.uwa.edu.au/adt-WU2009.0043.

Abstract:
Earnest carbohydrate research was initiated in the nineteenth century by several talented organic chemists. Carbohydrates, now known to play essential roles in a range of fundamental biological processes, are presently studied by a throng of scientists from many fields, including: biochemistry, molecular biology, immunology, structural biology, medicine, agriculture, pharmacology and, of course, chemistry. Organic chemistry remains as relevant to carbohydrate research as it has ever been; its practitioners, with their skills in synthesis and fundamental understanding of molecules, are truly indispensable. This thesis details various synthetic endeavours within the field of carbohydrate chemistry. It describes four projects with goals as diverse as natural product synthesis, rational inhibitor design and the development of new reagents in organic synthesis. The first chapter provides an account of the synthesis of compound 1, a potent germination stimulant present in smoke, from D-xylose. Many analogues of 1 were prepared from carbohydrates and evaluated as germination stimulants, which permitted the dissemination of several structure-activity relationships. Subsequent chapters describe the design and preparation of inhibitors for various carbohydrate-processing enzymes. Compounds 55 and 56 were sought after as putative synergistic inhibitors of a Vitis vinifera (grape) uridine diphospho-glucose:flavonoid 3-O-glucosyltransferase (VvGT1). It was hoped that crystallographic investigations of VvGT1-UDP-2/3 complexes by a collaborator, structural biologist Professor Gideon Davies, would aid in clarifying mechanistic aspects of this enzyme. Compounds 114, 115 and 118 were prepared as putative arabinanase inhibitors. Once again, this work was undertaken to assist in crystallographic studies that might provide a better understanding of how these enzymes operate.
The thesis concludes by describing the development of compound 152.HCl, a novel reagent for the diazotransfer reaction. Previously, this reaction utilised trifluoromethanesulfonyl azide (TfN3), an expensive and explosive liquid with a poor shelf-life, to convert a primary amine directly into an azide. Reagent 152.HCl was developed to replace TfN3 in this useful synthetic transformation. A one-pot procedure enabled the simple and inexpensive preparation of 152.HCl, which was demonstrated to be shelf-stable, crystalline and, crucially, effective in the diazotransfer reaction.
25

Gordon, Jane S. "Use of synthetic speech in tests of speech discrimination." PDXScholar, 1985. https://pdxscholar.library.pdx.edu/open_access_etds/3443.

Abstract:
The purpose of this study was to develop two tape-recorded synthetic speech discrimination test tapes and assess their intelligibility, in order to determine whether synthetic speech was intelligible and whether it would prove useful in speech discrimination testing. Four scramblings of the second NU-6 monosyllable word list were generated by the ECHO l C speech synthesizer using two methods of generating synthetic speech called TEXTALKER and SPEAKEASY. These stimuli were presented in one ear to forty normal-hearing adult subjects, 36 females and 4 males, at 60 dB HL under headphones. Each subject listened to two different scramblings of the 50-word monosyllable list, one scrambling generated by TEXTALKER and the other generated by SPEAKEASY. The order in which the TEXTALKER and SPEAKEASY modes of presentation occurred, as well as which ear to test for each subject, was randomly determined.
26

Cohen, Andrew Dight. "The use of learnable phonetic representations in connectionist text-to-speech system." Thesis, University of Reading, 1997. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.360787.

27

Milligan, Peter. "The synthesis of parallel programs : with specific application to text processing." Thesis, Queen's University Belfast, 1986. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.317085.

28

Slott, Jordan Matthew. "A general platform and markup language for text to speech synthesis." Thesis, Massachusetts Institute of Technology, 1996. http://hdl.handle.net/1721.1/38811.

29

Falai, Alessio. "Conditioning Text-to-Speech synthesis on dialect accent: a case study." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2022. http://amslaurea.unibo.it/25805/.

Abstract:
Modern text-to-speech systems are modular in many different ways. In recent years, end-users gained the ability to control speech attributes such as degree of emotion, rhythm and timbre, along with other suprasegmental features. More ambitious objectives are related to modelling a combination of speakers and languages, e.g. to enable cross-speaker language transfer. However, no prior work has been done on the more fine-grained analysis of regional accents. To fill this gap, in this thesis we present practical end-to-end solutions to synthesise speech while controlling within-country variations of the same language, and we do so for six different dialects of the British Isles. In particular, we first conduct an extensive study of the speaker verification field and tweak state-of-the-art embedding models to work with dialect accents. Then, we adapt standard acoustic models and voice conversion systems by conditioning them on dialect accent representations and finally compare our custom pipelines with a cutting-edge end-to-end architecture from the multi-lingual world. Results show that the adopted models are suitable and have enough capacity to accomplish the task of regional accent conversion. Indeed, we are able to produce speech closely resembling the selected speaker and dialect accent, where the most accurate synthesis is obtained via careful fine-tuning of the multi-lingual model to the multi-dialect case. Finally, we delineate limitations of our multi-stage approach and propose practical mitigations, to be explored in future work.
30

Baraheem, Samah Saeed. "Text to Image Synthesis via Mask Anchor Points and Aesthetic Assessment." University of Dayton / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=dayton158800567702413.

31

Odéjobí, Odétùnjí A. "A computational model of prosody for Yorùbá text-to-speech synthesis." Thesis, Aston University, 2005. http://publications.aston.ac.uk/10683/.

Abstract:
This work examines prosody modelling for the Standard Yorùbá (SY) language in the context of computer text-to-speech synthesis applications. The thesis of this research is that it is possible to develop a practical prosody model by using appropriate computational tools and techniques which combine acoustic data with an encoding of the phonological and phonetic knowledge provided by experts. Our prosody model is conceptualised around a modular holistic framework. The framework is implemented using the Relational Tree (R-Tree) techniques (Ehrich and Foith, 1976). R-Tree is a sophisticated data structure that provides a multi-dimensional description of a waveform. A Skeletal Tree (S-Tree) is first generated using algorithms based on the tone phonological rules of SY. Subsequent steps update the S-Tree by computing the numerical values of the prosody dimensions. To implement the intonation dimension, fuzzy control rules were developed based on data from native speakers of Yorùbá. The Classification And Regression Tree (CART) and the Fuzzy Decision Tree (FDT) techniques were tested in modelling the duration dimension. The FDT was selected based on its better performance. An important feature of our R-Tree framework is its flexibility, in that it facilitates the independent implementation of the different dimensions of prosody, i.e. duration and intonation, using different techniques and their subsequent integration. Our approach provides us with a flexible and extendible model that can also be used to implement, study and explain the theory behind aspects of the phenomena observed in speech prosody.
32

Rizzo, Michelin Linda L. "Concept mapping in evaluation practice and theory, a synthesis of current empirical research." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk2/tape15/PQDD_0003/MQ36724.pdf.

33

Odéjobí, Odétúnjí Àjàdí. "A computational model of prosody for Yorúbá text-to-speech synthesis." Thesis, Aston University, 2005. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.420173.

34

Pouget, Maël. "Synthèse incrémentale de la parole à partir du texte." Thesis, Université Grenoble Alpes (ComUE), 2017. http://www.theses.fr/2017GREAT008/document.

Abstract:
In this thesis, we investigate a new paradigm for text-to-speech synthesis (TTS) that delivers synthetic speech while the text is being entered: incremental text-to-speech synthesis. In contrast to conventional TTS systems, which trigger synthesis only after a whole sentence has been typed, incremental TTS devices deliver speech in a "piecemeal" fashion (i.e. word after word) while aiming to preserve the speech quality achievable by conventional TTS systems. By reducing the waiting time between two speech outputs while maintaining good speech quality, such a system should improve the quality of interaction for speech-impaired people who use TTS devices to express themselves. The main challenge posed by incremental TTS is synthesizing a word, or a group of words, with the same segmental and suprasegmental quality as conventional TTS, but without knowing the end of the sentence to be synthesized. We propose to adapt the two main modules of a TTS system (natural language processing and speech synthesis) to the incremental paradigm.
For the natural language processing module, we focused on part-of-speech tagging, which is a key step for phonetization and prosody generation. We propose an "adaptive latency" algorithm for part-of-speech tagging that estimates whether the part of speech inferred for a given word (by a standard n-gram-based tagger) is likely to change when one or more words are added. If the part of speech is considered likely to change, synthesis of the word is delayed. Otherwise, the word may be synthesized without risking degradation of the segmental or suprasegmental quality of the synthetic speech. The method relies on a set of binary decision trees trained on a large text corpus. We achieve 92.5% precision on the incremental part-of-speech tagging task, with a mean delay of 1.4 words.
For the speech synthesis module, in the context of HMM-based speech synthesis, we propose a training method that takes into account the uncertainty about contextual features that cannot be computed at synthesis time (namely, contextual features related to the following words, which have not yet been entered). We compare the proposed method to two other strategies (baselines) described in the literature. Objective and subjective evaluations show that the proposed method outperforms the baselines for French. Finally, we describe a complete prototype, developed during this thesis, that implements the proposed solutions for incremental part-of-speech tagging and speech synthesis. A perceptual evaluation of the word grouping derived from the adaptive latency algorithm, as well as of the segmental quality of the synthetic speech, tends to show that our system reaches a good trade-off between reactivity (minimizing the waiting time between the input and the synthesis of a word) and speech quality (at both the segmental and suprasegmental levels).
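The adaptive latency idea in this abstract (delay a word while its part of speech may still change, release it otherwise) can be sketched as a small simulation. This is only a sketch under stated assumptions: `incremental_chunks`, `toy_is_stable` and the `AMBIGUOUS` word set are hypothetical names, and the toy look-ahead rule stands in for the trained binary decision trees, which the abstract does not specify in detail.

```python
from typing import Callable, List, Sequence

# Hypothetical set of words whose tag is assumed to need more right context.
AMBIGUOUS = {"can", "fish"}


def toy_is_stable(prefix: Sequence[str], i: int) -> bool:
    """Toy stand-in for the decision trees: a word counts as stable once
    enough following words have been typed (2 if ambiguous, else 1)."""
    lookahead = len(prefix) - 1 - i
    needed = 2 if prefix[i] in AMBIGUOUS else 1
    return lookahead >= needed


def incremental_chunks(
    words: Sequence[str],
    is_stable: Callable[[Sequence[str], int], bool],
) -> List[List[str]]:
    """Simulate typing one word at a time and return the groups of words
    released for synthesis after each keystroke."""
    chunks: List[List[str]] = []
    next_to_emit = 0
    for n in range(1, len(words) + 1):
        prefix = words[:n]
        chunk: List[str] = []
        # Release pending words in order, stopping at the first unstable one.
        while next_to_emit < n and is_stable(prefix, next_to_emit):
            chunk.append(words[next_to_emit])
            next_to_emit += 1
        if chunk:
            chunks.append(chunk)
    # End of sentence: everything still pending can be synthesized.
    if next_to_emit < len(words):
        chunks.append(list(words[next_to_emit:]))
    return chunks


groups = incremental_chunks(["the", "cat", "can", "fish", "today"], toy_is_stable)
```

With this toy rule, unambiguous words are released one word late and ambiguous ones two words late, which is the kind of variable delay the reported mean latency of 1.4 words summarizes.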
35

Anbinderis, Tomas. "Mathematical modelling of some aspects of stressing a Lithuanian text." Doctoral thesis, Lithuanian Academic Libraries Network (LABT), 2010. http://vddb.laba.lt/obj/LT-eLABa-0001:E.02~2010~D_20100702_105219-07956.

Abstract:
The present dissertation deals with one component of a speech synthesizer, automatic stressing of text, and two related tasks: disambiguation of homographs (words that are spelled the same but can be stressed in several ways) and the search for clitics (unstressed words that attach to an adjacent word). A method that uses decision trees to find letter sequences that unambiguously determine word stress was applied to stress Lithuanian text. The decision trees were built from a large corpus of stressed words. Stressing rules based on letter sequences at the beginning, at the end and in the middle of a word were formulated. The proposed algorithm reaches an accuracy of about 95.5%. The homograph disambiguation algorithm proposed by the author is based on frequencies of lexemes and morphological features obtained from a corpus of about one million words. Such methods had not previously been used for Lithuanian. The work shows that the frequencies of morphological features matter more than the frequencies of lexemes. The proposed algorithm selects the correct stress variant with an accuracy of 85.01%. In addition, the author proposes four types of methods for finding clitics in Lithuanian text, based on: 1) recognition of combinational forms, 2) the statistical stressed/unstressed frequency of a word, 3) certain grammar rules, and 4) the stress distribution of adjacent words (rhythm). It is explained how to unite all the methods into a single algorithm. On the test data, errors amounted to 4.1% of all words and to 18.8% of unstressed words.
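A frequency-based choice between stress variants of a homograph, as described in this abstract, might look like the sketch below. The `Variant` type, the placeholder forms and all counts are invented for illustration; ranking by morphological-feature frequency first and lexeme frequency second is one possible way to reflect the reported finding that the former matters more, not the dissertation's actual algorithm.

```python
from typing import Dict, List, NamedTuple


class Variant(NamedTuple):
    stressed_form: str  # the word with its stress mark
    lexeme: str         # dictionary form the variant belongs to
    morph_tag: str      # morphological features of the variant


def pick_variant(
    variants: List[Variant],
    morph_freq: Dict[str, int],
    lexeme_freq: Dict[str, int],
) -> Variant:
    """Choose the variant with the most frequent morphological tag,
    using lexeme frequency only to break ties."""
    if not variants:
        raise ValueError("no stress variants to choose from")
    return max(
        variants,
        key=lambda v: (morph_freq.get(v.morph_tag, 0), lexeme_freq.get(v.lexeme, 0)),
    )


# Placeholder data, not real Lithuanian corpus counts.
candidates = [
    Variant("form-A", "lexeme-1", "noun.sg.nom"),
    Variant("form-B", "lexeme-2", "verb.3.prs"),
]
morph_counts = {"noun.sg.nom": 120, "verb.3.prs": 45}
lexeme_counts = {"lexeme-1": 3, "lexeme-2": 30}

best = pick_variant(candidates, morph_counts, lexeme_counts)
```

Here "form-A" wins despite its rarer lexeme, because its morphological tag is more frequent in the toy counts.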
36

Valentini, Botinhão Cássia. "Intelligibility enhancement of synthetic speech in noise." Thesis, University of Edinburgh, 2013. http://hdl.handle.net/1842/8877.

Abstract:
Speech technology can facilitate human-machine interaction and create new communication interfaces. Text-To-Speech (TTS) systems provide speech output for dialogue, notification and reading applications as well as personalized voices for people that have lost the use of their own. TTS systems are built to produce synthetic voices that should sound as natural, expressive and intelligible as possible and if necessary be similar to a particular speaker. Although naturalness is an important requirement, providing the correct information in adverse conditions can be crucial to certain applications. Speech that adapts or reacts to different listening conditions can in turn be more expressive and natural. In this work we focus on enhancing the intelligibility of TTS voices in additive noise. For that we adopt the statistical parametric paradigm for TTS in the shape of a hidden Markov model (HMM-) based speech synthesis system that allows for flexible enhancement strategies. Little is known about which human speech production mechanisms actually increase intelligibility in noise and how the choice of mechanism relates to noise type, so we approached the problem from another perspective: using mathematical models for hearing speech in noise. To find which models are better at predicting intelligibility of TTS in noise we performed listening evaluations to collect subjective intelligibility scores which we then compared to the models’ predictions. In these evaluations we observed that modifications performed on the spectral envelope of speech can increase intelligibility significantly, particularly if the strength of the modification depends on the noise and its level. We used these findings to inform the decision of which of the models to use when automatically modifying the spectral envelope of the speech according to the noise. We devised two methods, both involving cepstral coefficient modifications. 
The first was applied during extraction while training the acoustic models and the other when generating a voice using pre-trained TTS models. The latter has the advantage of being able to address fluctuating noise. To increase intelligibility of synthetic speech at generation time we proposed a method for Mel cepstral coefficient modification based on the glimpse proportion measure, the most promising of the models of speech intelligibility that we evaluated. An extensive series of listening experiments demonstrated that this method brings significant intelligibility gains to TTS voices while not requiring additional recordings of clear or Lombard speech. To further improve intelligibility we combined our method with noise-independent enhancement approaches based on the acoustics of highly intelligible speech. This combined solution was as effective for stationary noise as for the challenging competing speaker scenario, obtaining up to 4 dB of equivalent intensity gain. Finally, we proposed an extension to the speech enhancement paradigm to account for not only energetic masking of signals but also for linguistic confusability of words in sentences. We found that word level confusability, a challenging value to predict, can be used as an additional prior to increase intelligibility even for simple enhancement methods like energy reallocation between words. These findings motivate further research into solutions that can tackle the effect of energetic masking on the auditory system as well as on higher levels of processing.
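The glimpse proportion measure named in this abstract is commonly described as the fraction of time-frequency regions in which the speech is locally louder than the noise by some margin. The sketch below follows that general description only: the function name, the nested-list representation of levels in dB, and the 3 dB default threshold are assumptions, and the thesis's own auditory front end is not reproduced.

```python
from typing import Sequence


def glimpse_proportion(
    speech_db: Sequence[Sequence[float]],
    noise_db: Sequence[Sequence[float]],
    threshold_db: float = 3.0,  # assumed local SNR margin
) -> float:
    """Fraction of time-frequency cells where speech exceeds noise
    by more than threshold_db. Inputs are [channel][frame] levels in dB."""
    total = 0
    glimpsed = 0
    for speech_row, noise_row in zip(speech_db, noise_db):
        for s, n in zip(speech_row, noise_row):
            total += 1
            if s - n > threshold_db:
                glimpsed += 1
    if total == 0:
        raise ValueError("empty time-frequency representation")
    return glimpsed / total


# Toy 2-channel, 2-frame example with invented levels.
speech = [[60.0, 50.0], [40.0, 70.0]]
noise = [[50.0, 50.0], [45.0, 60.0]]
gp = glimpse_proportion(speech, noise)
```

In this toy grid two of the four cells clear the margin, so half of the speech is "glimpsed"; an enhancement method of the kind described would modify the speech spectrum to raise that fraction for a given noise.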
37

Carvalho, Sarah Negreiros de 1985. "Estudo de um sistema de conversão texto-fala baseado em HMM." [s.n.], 2013. http://repositorio.unicamp.br/jspui/handle/REPOSIP/259046.

Abstract:
Advisor: Fábio Violaro
Dissertation (master's), Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação
With the continuous development of technology, there is a growing demand for text-to-speech systems that are able to speak like humans, so that they can be integrated into the most diverse applications, whether in the field of automation and robotics, for accessibility of people with disabilities, or for culture and leisure activities. Speech synthesis based on hidden Markov models (HMM) shows promise in addressing this need. Its statistical and parametric nature makes it a flexible system capable of adapting artificial voices, inserting emotions into speech and producing synthetic speech of good quality using a limited amount of speech data for HMM training. This dissertation presents a study of the HMM-based speech synthesis system (HTS), describing the steps involved in training the HMMs and generating the speech signal. The spectral, pitch and duration models that make up the context-dependent phoneme HMMs are presented, along with the various techniques for structuring them. Some of the problems encountered in HTS, such as the muffled and monotonous character of the artificial speech, are analyzed together with some of the techniques proposed to improve the final quality of the synthesized speech signal.
Degree: Master's (Mestra em Engenharia Elétrica), Telecommunications and Telematics
Style APA, Harvard, Vancouver, ISO itp.
38

Yoon, Kyuchul. "Building a prosodically sensitive diphone database for a Korean text-to-speech synthesis system." Connect to this title online, 2005. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1119010941.

Abstract:
Thesis (Ph.D.), Ohio State University, 2005. Title from first page of PDF file. Document formatted into pages; contains xxii, 291 p.; also includes graphics (some col.). Includes bibliographical references (p. 210-216). Available online via OhioLINK's ETD Center.
39

Atterer, Michaela. "Experiments on the prediction of prosodic phrasing for German text to speech synthesis." Stuttgart: Univ., AIMS, 2005. http://swbplus.bsz-bw.de/bsz116719958abs.htm.

40

Soltani, Omid. "Photochemical preparations of salicylate/resorcylate esters/amides asymmetric synthesis of SCH 351448." Access to abstract only; dissertation is embargoed until after 5/16/2007, 2006. http://www4.utsouthwestern.edu/library/ETD/etdDetails.cfm?etdID=168.

41

Dall, Rasmus. "Statistical parametric speech synthesis using conversational data and phenomena." Thesis, University of Edinburgh, 2017. http://hdl.handle.net/1842/29016.

Abstract:
Statistical parametric text-to-speech synthesis currently relies on predefined and highly controlled prompts read in a “neutral” voice. This thesis presents work on utilising recordings of free conversation for the purpose of filled pause synthesis and as an inspiration for improved general modelling of speech for text-to-speech synthesis purposes. A corpus of both standard prompts and free conversation is presented, and the potential usefulness of conversational speech as the basis for text-to-speech voices is validated. Additionally, psycholinguistic experimentation shows that filled pauses can have subconscious benefits for the listener, but that current text-to-speech voices cannot replicate these effects. A method for pronunciation variant forced alignment is presented in order to obtain a more accurate automatic speech segmentation, which is particularly poor for spontaneously produced speech. This pronunciation variant alignment is utilised not only to create a more accurate underlying acoustic model, but also as the driving force behind more natural pronunciation prediction at synthesis time. While this improves both the standard and spontaneous voices, the naturalness of voices based on spontaneous speech still lags behind the quality of voices based on standard read prompts. Thus, the synthesis of filled pauses is investigated in relation to specific phonetic modelling of filled pauses and through techniques for mixing standard prompts with spontaneous utterances, in order to retain the higher quality of voices based on standard speech while still utilising the spontaneous speech for filled pause modelling. A method for predicting where to insert filled pauses in the speech stream is also developed and presented, relying on an analysis of human filled pause usage and a mix of language modelling methods. The method achieves an insertion accuracy in close agreement with human usage. The various approaches are evaluated and their improvements documented throughout the thesis. At the end, the resulting filled pause quality is assessed through a repetition of the psycholinguistic experiments and an evaluation of the compilation of all developed methods.
42

Lambert, Tanya. "Databases for concatenative text-to-speech synthesis systems : unit selection and knowledge-based approach." Thesis, University of East Anglia, 2005. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.421192.

43

Schlünz, Georg Isaac. "Advanced natural language processing for improved prosody in text-to-speech synthesis." Thesis, North-West University, 2014. http://hdl.handle.net/10394/10634.

Abstract:
Text-to-speech synthesis enables the speech-impeded user of an augmentative and alternative communication system to partake in any conversation on any topic, because it can produce dynamic content. Current synthetic voices do not sound very natural, however, lacking in the areas of emphasis and emotion. These qualities are furthermore important to convey meaning and intent beyond that which can be achieved by the vocabulary of words only. Put differently, speech synthesis requires a more comprehensive analysis of its text input beyond the word level to infer the meaning and intent that elicit emphasis and emotion. The synthesised speech then needs to imitate the effects that these textual factors have on the acoustics of human speech. This research addresses these challenges by commencing with a literature study on the state of the art in the fields of natural language processing, text-to-speech synthesis and speech prosody. It is noted that the higher linguistic levels of discourse, information structure and affect are necessary for the text analysis to shape the prosody appropriately for more natural synthesised speech. Discourse and information structure account for meaning, intent and emphasis, and affect formalises the modelling of emotion. The OCC model is shown to be a suitable point of departure for a new model of affect that can leverage the higher linguistic levels. The audiobook is presented as a text and speech resource for the modelling of discourse, information structure and affect because its narrative structure is prosodically richer than the random constitution of a traditional text-to-speech corpus. A set of audiobooks is selected and phonetically aligned for subsequent investigation. The new model of discourse, information structure and affect, called e-motif, is developed to take advantage of the audiobook text. It is a subjective model that does not specify any particular belief system in order to appraise its emotions, but defines only anonymous affect states. Its cognitive and social features rely heavily on the coreference resolution of the text, but this process is found not to be accurate enough to produce usable feature values. The research concludes with an experimental investigation of the influence of the e-motif features on human speech and synthesised speech. The aligned audiobook speech is inspected for prosodic correlates of the cognitive and social features, revealing that some activity occurs in the intonational domain. However, when the aligned audiobook speech is used in the training of a synthetic voice, the e-motif effects are overshadowed by those of structural features that come standard in the voice building framework.
PhD (Information Technology), North-West University, Vaal Triangle Campus, 2014
44

Evrard, Marc. "Synthèse de parole expressive à partir du texte : Des phonostyles au contrôle gestuel pour la synthèse paramétrique statistique." Thesis, Paris 11, 2015. http://www.theses.fr/2015PA112202.

Abstract:
The subject of this thesis was the study and design of a platform for expressive speech synthesis. The LIPS3 text-to-speech system, developed in the context of this thesis, includes a linguistic module and a statistical parametric synthesis module (built upon HTS and STRAIGHT). The system was based on a new single-speaker corpus, designed, recorded and annotated for this purpose. The first study analyzed the influence of the precision of the phonetic labeling of the training corpus on synthesis quality. It showed that statistical parametric synthesis is robust to labeling and alignment errors. This addresses the issue of variation in phonetic realizations in expressive speech. The second study presents an acoustico-phonetic analysis of the corpus, characterizing the expressive space used by the speaker to realize the instructions that described the different expressive conditions. Voice source parameters and articulatory settings were analyzed according to their phonetic classes, which allowed for a fine phonostylistic characterization. The third study focused on intonation and rhythm. Calliphony 2.0 is a real-time chironomic interface that controls the f0 and rhythmic parameters of prosody, using drawing/writing hand gestures with a stylus and a graphic tablet. These hand-controlled modulations are used to enhance the TTS output, producing more realistic speech without degradation, because they are applied directly to the vocoder parameters. Intonation and rhythm stylization using this interface brings significant improvement to the prototypicality of expressivity, as well as to the general quality of the synthetic speech, in comparison with statistical modelling of prosody. These studies show that statistical parametric synthesis, combined with a chironomic interface, offers an efficient solution for expressive speech synthesis, as well as a powerful tool for the study of prosody.
45

Malatji, Promise Tshepiso. "The development of accented English synthetic voices." Thesis, University of Limpopo, 2019. http://hdl.handle.net/10386/2917.

Abstract:
Thesis (M.Sc. (Computer Science)), University of Limpopo, 2019
A text-to-speech (TTS) synthesis system is a software system that receives text as input and produces speech as output. A TTS synthesis system can be used by native and non-native speakers of the target language for, amongst other things, language learning and reading out text for people living with different disabilities, e.g., physically challenged or visually impaired people. Most people relate easily to a second language spoken by a non-native speaker with whom they share a native language. Most online English TTS synthesis systems are developed using native speakers of English. This research study focuses on developing accented English synthetic voices as spoken by non-native speakers in the Limpopo province of South Africa. The Modular Architecture for Research on speech sYnthesis (MARY) TTS engine is used in developing the synthetic voices. The Hidden Markov Model (HMM) method was used to train the synthetic voices. A secondary training text corpus is used to develop the training speech corpus by recording six speakers reading the text corpus. The quality of the developed synthetic voices is measured in terms of their intelligibility, similarity and naturalness using a listening test. The results of the study are classified by evaluators' occupation and gender, alongside the overall results. The subjective listening test indicates that the developed synthetic voices have a high level of acceptance in terms of similarity and intelligibility. Speech analysis software is used to compare the recorded synthesised speech and the human recordings. There is no significant difference in voice pitch between the speakers and the synthetic voices, except for one synthetic voice.
46

Breitenbücher, Mark. "Textvorverarbeitung zur deutschen Version des Festival Text-to-Speech Synthese Systems." [S.l.]: Universität Stuttgart, Fakultät Philosophie, 1997. http://www.bsz-bw.de/cgi-bin/xvms.cgi?SWB6783514.

47

Anbinderis, Tomas. "Kai kurių lietuvių kalbos teksto kirčiavimo aspektų matematinis modeliavimas." Doctoral thesis, Lithuanian Academic Libraries Network (LABT), 2010. http://vddb.laba.lt/obj/LT-eLABa-0001:E.02~2010~D_20100702_105309-68172.

Abstract:
This dissertation deals with one component of a speech synthesizer, the automatic stressing of text, and with two related tasks: the disambiguation of homographs (words that are spelled identically but stressed differently) and the search for clitics (unstressed words that attach to an adjacent word). To stress Lithuanian text, a method was applied that uses decision trees to find sequences of letters that unambiguously define the stressing of a word. The decision trees were built from a large corpus of stressed words. Stressing rules were formulated on the basis of letter sequences at the beginning, at the end and in the middle of a word. The proposed algorithm reaches an accuracy of about 95.5%. The proposed homograph disambiguation algorithm is based on the usage frequencies of lexemes and morphological tags, obtained from a corpus of about one million words. Such methods had not previously been applied to Lithuanian. The work shows that the frequencies of morphological tags are more important than the frequencies of lexemes. The proposed algorithm selects the correct stress variant with an accuracy of 85.01%. In addition, four types of methods are proposed to search for clitics in Lithuanian text, based on: 1) recognition of combinational forms, 2) the statistical frequency with which a word is stressed or unstressed, 3) certain grammar rules and 4) the distribution of stress over adjacent words (rhythm). It is explained how to combine all the methods into a single algorithm. When this algorithm was applied to the test data, the ratio of errors to all words was 4.1%, and the ratio of errors to unstressed words was 18.8%.
48

Coogan, Melinda Ann. "Bioaccumulation of Triclocarban, Triclosan, and Methyl-triclosan in a North Texas Wastewater Treatment Plant Receiving Stream and Effects of Triclosan on Algal Lipid Synthesis." Thesis, University of North Texas, 2007. https://digital.library.unt.edu/ark:/67531/metadc3986/.

Abstract:
Triclosan (TCS) and triclocarban (TCC), widely used antimicrobial agents found in numerous consumer products, are incompletely removed by wastewater treatment plant (WWTP) processing. Methyl-triclosan (M-TCS) is a more lipophilic metabolite of its parent compound, TCS. The focus of this study was to quantify bioaccumulation factors (BAFs) for TCS, M-TCS, and TCC in Pecan Creek, the receiving stream for the City of Denton, Texas WWTP, by using field samples mostly composed of the alga Cladophora sp. and the caged snail Helisoma trivolvis as test species. Additionally, TCS has been shown to reduce fatty acid biosynthesis and total lipid content in E. coli and Arabidopsis by inhibiting the trans-2 enoyl-ACP reductase. The effects of TCS on the lipid synthesis pathway in field samples of Cladophora spp. were also investigated in this study by using [2-14C]acetate radiolabeling procedures. Preliminary results indicate that high TCS concentrations are toxic to lipid biosynthesis and reduce [2-14C]acetate incorporation into total lipids. These results have led to the concern that chronic exposure of algae in receiving streams to environmentally relevant TCS concentrations might affect their nutrient value. If consumer growth is limited, trophic cascade strength may be affected and serve to limit population growth and reproduction of herbivores in these riparian systems.
49

Pearson, Nicholas John. "Experimental Snap Loading of Synthetic Fiber Ropes." Thesis, Virginia Tech, 2002. http://hdl.handle.net/10919/30925.

Abstract:
Energy is lost when a rope transfers from a slack state to a taut state. This transfer is called a snap load and can be very violent. It is proposed to use synthetic fiber ropes as a type of passive control device in new or existing structures to mitigate seismic response. Experimental static and snap load (dynamic) tests were conducted on various synthetic fiber ropes. An eleven-foot-tall drop tower was built in the Virginia Tech Structures and Materials Laboratory in order to conduct these tests. Force and acceleration of the drop plate, which slides vertically within the drop tower, were measured with respect to time for all dynamic tests. Acceleration data was integrated using the trapezoidal or midpoint rule to obtain velocity and displacement values. Plots were made for each test in order to give a better representation of the results. These plots include representations of force and acceleration vs. time, force vs. absolute displacement, force vs. velocity, and force, acceleration, velocity, and displacement vs. time (during the initial taut phase only). Test results show that energy was dissipated in all of the dynamic drop tests, which was expected. Also, the displacement of each rope did not return to zero at the same time that the force returned to zero after the initial snap load. This indicates that the ropes undergo some permanent elongation under load. The stiffness of each rope increased with continued testing. As more tests are conducted on each rope, the strands are pulled tighter into the braided configuration, which causes the rope to become stiffer.
Master of Science
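The integration step mentioned in this abstract (acceleration to velocity to displacement by the trapezoidal rule) can be illustrated with a minimal sketch. The function name `cumulative_trapezoid`, the sample values and the time step are hypothetical and chosen only for illustration; they are not taken from the thesis, which does not describe its implementation here.

```python
def cumulative_trapezoid(samples, dt, initial=0.0):
    """Cumulatively integrate uniformly sampled values with the trapezoidal rule.

    Returns a list of the same length as `samples`, starting at `initial`.
    """
    result = [initial]
    for previous, current in zip(samples, samples[1:]):
        # Area of one trapezoid: mean of two neighbouring samples times the step.
        result.append(result[-1] + 0.5 * (previous + current) * dt)
    return result


# Hypothetical record: constant acceleration of 2.0 (arbitrary units),
# sampled every 0.5 s, with the plate starting at rest at position zero.
acceleration = [2.0, 2.0, 2.0, 2.0, 2.0]
dt = 0.5

velocity = cumulative_trapezoid(acceleration, dt)   # first integration
displacement = cumulative_trapezoid(velocity, dt)   # second integration

print(velocity)
print(displacement)
```

For constant acceleration the velocity is linear in time, so the trapezoidal rule reproduces it and the resulting displacement without discretization error; measured drop-test records would in general carry some integration error that depends on the sampling rate.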
50

Boula de Mareüil, Philippe. "Étude linguistique appliquée à la synthèse de la parole à partir du texte." Paris 11, 1997. http://www.theses.fr/1997PA112371.

Abstract:
This thesis is devoted to a linguistic study applied to text-to-speech synthesis. It is divided into two parts: grapheme-to-phoneme conversion, and syntactic analysis, notably for the automatic generation of prosody. In languages such as French, grapheme-to-phoneme conversion is highly context-dependent: the emphasis here was placed on morpho-phonological ambiguities, glides and schwa, liaisons and proper names. Numbers and abbreviations, problems that can be described as extra-lexical, are preprocessed upstream. Since a text-to-speech system requires syntactic analysis, a chunk grammar was developed that segments the sentence into non-recursive sequences. These make it possible to define potential prosodic boundaries (minor, major or intermediate major). We endeavoured to proceed by intension: grapheme-to-phoneme conversion by rules rather than by a lexicon of exceptions (even in the handling of acronyms), and non-lexicalist part-of-speech tagging. The structural approach was also preferred to probabilistic models, both for the pronunciation of proper names and the resolution of ambiguities in French spelling (where a criterion of the more general rule was brought out), and for morpho-syntactic tagging and bracketing (where a principle of sets of possible categories was applied). This automatic processing (preprocessing, grapheme-to-phoneme conversion, syntactic analysis and syntactic-prosodic rules) was integrated into a text-to-speech synthesis system. It was extensively evaluated, and the results are very positive.