Dissertations / Theses on the topic 'Parole, Systèmes de traitement de la'
Consult the top 50 dissertations / theses for your research on the topic 'Parole, Systèmes de traitement de la.'
Choumane, Ali. "Traitement générique des références dans le cadre multimodal parole-image-tactile." Rennes 1, 2008. ftp://ftp.irisa.fr/techreports/theses/2008/choumane.pdf.
We are interested in multimodal human-computer communication systems that use the following modes: speech, gesture and vision. The user communicates with the system through spoken natural-language utterances and/or gestures. The user's request contains his/her goal and the designation of the objects (referents) required to achieve that goal. The system should identify the designated objects precisely and unambiguously. In this context, we aim to improve the understanding of multimodal requests. Hence, we propose a generic set of processes for handling the modalities, for fusion and for reference resolution. The main aspects of the implementation consist in modeling natural language processing in a speech environment, gesture processing and the visual context (use of visual salience), while taking into account the difficulties of the multimodal context: speech recognition errors, natural language ambiguity, gesture imprecision due to the user's performance, and designation ambiguity due to the perception of the displayed objects or to the display topology. To complete the interpretation of the user's request, we propose a method for fusing and verifying the results of modality processing in order to find the objects designated by the user.
Mauclair, Julie. "Mesures de confiance en traitement automatique de la parole et applications." Le Mans, 2006. http://cyberdoc.univ-lemans.fr/theses/2006/2006LEMA1027.pdf.
Leboeuf, Jérôme. "Un système connexionniste appliqué au traitement automatique de la parole." Paris 11, 1988. http://www.theses.fr/1988PA112276.
The adaptive, dynamic and associative model ADAM is aimed at processing patterns that involve a temporal dimension. The design of a software simulation allowed us to study its behavior and to show the role of its parameters. The speech signal is transformed into a set of events, each event corresponding to an energy gap within a frequency channel. The high variability of the resulting input patterns leads us to propose a mechanism of global comparison, the architecture of which is derived from the initial model. The recognition tests showed the advantage of our approach in the treatment of speech signals disturbed by added speech.
Bazillon, Thierry. "Transcription et traitement manuel de la parole spontanée pour sa reconnaissance automatique." Phd thesis, Université du Maine, 2011. http://tel.archives-ouvertes.fr/tel-00598427.
Tihoni, Jacqueline. "Geph : un générateur phonologique expert. Applications au traitement automatique de la parole." Toulouse 3, 1991. http://www.theses.fr/1991TOU30186.
Veloz, Guerrero Arturo. "Un système de compréhension de parole continue sur microprocesseur." Paris 11, 1985. http://www.theses.fr/1985PA112240.
This thesis describes the implementation of a speech understanding system on a microprocessor. The system is designed to accept continuous speech from one speaker and to work within the context of a limited task situation and small vocabularies. The system uses phonetic recognition at the phonetic level and an optimal one-pass dynamic programming algorithm at the lexical and syntactic levels. The system has an interactive program for defining grammars for a given task-specific language and an orthographic-phonetic translation program that takes into account some phonological variations of words.
Spalanzani, Anne. "Algorithmes évolutionnaires pour l'étude de la robustesse des systèmes de reconnaissance automatique de la parole." Phd thesis, Université Joseph Fourier (Grenoble), 1999. http://tel.archives-ouvertes.fr/tel-00004850.
Loiselle, Stéphane. "Traitement bio-inspiré de la parole pour système de reconnaissance vocale." Thèse, Université de Sherbrooke, 2010. http://savoirs.usherbrooke.ca/handle/11143/1952.
Ahafhaf, Mohamed. "Evaluation des systèmes de dialogue oral homme-machine : quelques éléments linguistiques appliqués au paradigme DCR." Grenoble 3, 2004. http://www.theses.fr/2004GRE39048.
Villaneau, Jeanne. "Contribution au traitement syntaxico-pragmatique de la langue naturelle parlée : approche logique pour la compréhension de la parole." Lorient, 2003. http://www.theses.fr/2003LORIS026.
Goulian, Jerome. "Stratégie d'analyse détaillée pour la compréhension automatique robuste de la parole." Lorient, 2002. http://www.theses.fr/2002LORIS021.
This PhD focuses on speech understanding in man-machine communication. We discuss how a speech understanding system can be made robust against spontaneous speech phenomena while still achieving a detailed analysis of spoken French. We argue that a detailed linguistic analysis (with both syntax and semantics) is essential to process spoken utterances correctly and is also a necessary condition for developing applications that are not entirely dedicated to a very specific task but are sufficiently generic. The system presented (ROMUS) implements speech understanding as a two-stage process. The first stage performs finite-state shallow parsing, which consists in segmenting the utterance into basic units (chunks adapted to spoken language). This stage is generic and is motivated by the regularities observed in spoken French. The second stage, a Link Grammar parser, looks for inter-chunk dependencies in order to build a rich representation of the semantic structure of the utterance.
Cotto, Daniel. "Traitement automatique des textes en vue de la synthèse vocale." Toulouse 3, 1992. http://www.theses.fr/1992TOU30225.
Lecorvé, Gwénolé. "Adaptation thématique non supervisée d'un système de reconnaissance automatique de la parole." Phd thesis, INSA de Rennes, 2010. http://tel.archives-ouvertes.fr/tel-00566824.
Mathieu, François-Arnould. "Prise en compte de contraintes pragmatiques pour guider un système de reconnaissance de la parole : le système COMPPA [S. l.]." Nancy 1, 1997. http://www.theses.fr/1997NAN10022.
In order to develop a robust and efficient speech recognition system, the number of possible hypotheses corresponding to a spoken utterance has to be drastically reduced. In the specific framework of vocal command systems, this often leads to the design of languages that are difficult to learn. As a consequence, the use of these systems is limited to environments where neither keyboard nor mouse can be used (chapter 1). Our purpose is therefore to take into account the application and dialogue context to define the accepted language dynamically. For example, the command "erase the green cube" will not be considered at first if there is no instance of a green cube in the application at the time the sentence is uttered. Similarly, the utterance "erase it" will be eliminated if the pronoun "it" is irrelevant in the current dialogue context. Constraining the recognition process by means of such pragmatic considerations allows us to accept a more natural language (chapter 2). [...]
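The "green cube" example in the abstract above can be illustrated with a minimal sketch. It is not the COMPPA system: the `SceneObject` type, the word-matching rule and the `has_antecedent` flag are hypothetical simplifications, assumed here only to show how application and dialogue context can prune recognition hypotheses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneObject:
    """Hypothetical description of an object currently shown by the application."""
    color: str
    shape: str


def is_admissible(command, scene, has_antecedent):
    """Return True if the command can refer to something in the current context."""
    words = command.lower().split()
    if "it" in words:
        # A pronoun is only relevant if the dialogue offers an antecedent.
        return has_antecedent
    # Otherwise require an object whose color and shape both occur in the command.
    return any(obj.color in words and obj.shape in words for obj in scene)


def filter_hypotheses(hypotheses, scene, has_antecedent):
    """Keep only the hypotheses that are compatible with the context."""
    return [h for h in hypotheses if is_admissible(h, scene, has_antecedent)]


scene = [SceneObject("red", "cube"), SceneObject("green", "sphere")]
hypotheses = ["erase the green cube", "erase the red cube", "erase it"]
kept = filter_hypotheses(hypotheses, scene, has_antecedent=False)
print(kept)  # ['erase the red cube']
```

With no green cube on display and no antecedent for "it", only the red-cube command survives. A real system would apply such constraints inside the recogniser's language model rather than as a filter over finished sentences.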
Camelin, Nathalie. "Stratégies robustes de compréhension de la parole basées sur des méthodes de classification automatique." Avignon, 2007. http://www.theses.fr/2007AVIG0149.
The work presented in this PhD thesis deals with the automatic Spoken Language Understanding (SLU) problem in multiple-speaker applications which accept spontaneous speech. The study consists in integrating automatic classification methods into the speech decoding and understanding processes. My work consists in adapting methods which have already shown good performance in the text domain to the particularities of the outputs of an Automatic Speech Recognition system. The main difficulty in processing this type of data is the uncertainty in the input parameters of the classifiers. Among all existing automatic classification methods, we chose to use three. The first is based on Semantic Classification Trees; the other two, considered among the best-performing in the machine learning community, are large-margin methods based on boosting and support vector machines. A sequence labelling method, Conditional Random Fields (CRF), is also studied and used. Two application frameworks are investigated: - PlanResto is a tourism application of human-computer dialogue. It enables users to ask for information about a restaurant in Paris in natural language. The real-time speech understanding process consists in building a request for a database. Within this framework, the consensual agreement of the different classifiers, considered as semantic experts, is used as a confidence measure; - SCOrange is a spoken telephone survey corpus. The purpose is to collect messages from mobile users expressing their opinion about the customer service. The off-line speech understanding process consists in evaluating proportions of opinions about a topic and a polarity. Classifiers enable the extraction of users' opinions in a strategy that can reliably evaluate the distribution of opinions and their temporal evolution.
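The idea of using classifier agreement as a confidence measure can be sketched as follows. This is an assumed simplification, not the measure defined in the thesis: confidence is taken here to be the share of classifiers that vote for the majority label, and the label names are invented for illustration.

```python
from collections import Counter


def consensus(labels):
    """Return the majority label and the fraction of classifiers that chose it.

    `labels` holds one predicted label per classifier. On a tie, the label
    that was encountered first wins (Counter.most_common keeps insertion order).
    """
    if not labels:
        raise ValueError("at least one classifier output is required")
    label, votes = Counter(labels).most_common(1)[0]
    return label, votes / len(labels)


# Hypothetical outputs of three classifiers for one user utterance.
predictions = ["price", "price", "location"]
label, confidence = consensus(predictions)
print(label, round(confidence, 2))  # price 0.67
```

A dialogue system could accept the interpretation when the confidence is high and ask the user for confirmation otherwise; the threshold would have to be tuned on held-out data.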
Chevelu, Jonathan. "Production de paraphrases pour les systèmes vocaux humain-machine." Phd thesis, Université de Caen, 2011. http://tel.archives-ouvertes.fr/tel-00603750.
Pouteau, Xavier. "Dialogue de commande multimodal en milieu opérationnel : une communication naturelle pour l'utilisateur ?" Nancy 1, 1995. http://www.theses.fr/1995NAN10419.
Barbier, Vincent. "Utilisation de connaissances sémantiques pour l’analyse de justifications de réponses à des questions." Paris 11, 2009. http://www.theses.fr/2009PA112127.
This thesis belongs to the domain of question-answering systems. These systems receive a question in natural language from the user and search for the answer in a collection of documents. This work relies on the notion of justification, which is formalised as a mapping between the pieces of linguistic information in the question and the corresponding elements in the answer passage. That model takes into account three categories of linguistic phenomena: paradigmatic (local) variations of terms (semantic, morphological, inferential), syntagmatic links between sentence constituents, and a component of enunciative semantics linking together remote elements (by anaphora, coreference, thematisation) in a multi-sentence context, within one or several documents. In this work, I first describe the semi-automatic extraction of a corpus of question-answer pairs. That corpus brings together pairs of a question and an answering passage in which the aforementioned structure of the justification has been annotated. On the corpus, we measure the justifications' conformation in terms of semantic variation and spatial extension. Then, I describe and evaluate a program for extracting and weighting the justifications located in the newspaper article passages returned by a question-answering processing chain. My program aims at preserving the system's ability to produce a structured justification, while making it possible to integrate a large variety of heterogeneous linguistic processes of various nature, granularity level and reliability.
Husson, Jean-Luc. "Une approche hiérarchique de la segmentation du signal de parole." Nancy 1, 1998. http://www.theses.fr/1998NAN10292.
Servan, Christophe. "Apprentissage automatique et compréhension dans le cadre d'un dialogue homme-machine téléphonique à initiative mixte." Phd thesis, Université d'Avignon, 2008. http://tel.archives-ouvertes.fr/tel-00591997.
Frath, Pierre. "Semantique, reference et acquisition automatique de connaissances a partir de textes." Strasbourg 2, 1997. http://www.theses.fr/1997STR20079.
Automatic knowledge acquisition from text ideally consists in generating a structured representation of a corpus, which a human or a machine should be able to query. Designing and realising such a system raises a number of difficulties, both theoretical and practical, which we intend to look into. The first part of this dissertation studies the two main approaches to the problem: automatic terminology retrieval and model-driven knowledge acquisition. The second part studies the mostly implicit theoretical foundations of natural language processing, i.e. logical positivism and componential lexical semantics. We offer an alternative inspired by the work of Charles Sanders Peirce, Ludwig Wittgenstein and Georges Kleiber, i.e. a semantics based on the notions of sign, usage and reference. The third part is devoted to a detailed semantic analysis of a medical corpus. Reference is studied through two notions, denomination and denotation. Denominations allow for arbitrary, preconstructed and opaque reference; denotations, for discursive, constructed and transparent reference. In the fourth part, we manually construct a detailed representation of a fragment of the corpus. The aim is to study the relevance of the theoretical analysis and to set precise objectives for the system. The fifth part focuses on implementation. It is devoted to the construction of a terminological knowledge base capable of representing a domain corpus, and sufficiently structured for use by applications in terminology or domain modelling, for example. In a nutshell, this dissertation examines automatic knowledge acquisition from text from a theoretical and technical point of view, with the technology setting the guidelines for the theoretical discussions.
Milhorat, Pierrick. "Une plate-forme ouverte pour la conception et l'implémentation de systèmes de dialogue vocaux en langage naturel." Thesis, Paris, ENST, 2014. http://www.theses.fr/2014ENST0087/document.
Recently, global tech companies released so-called virtual intelligent personal assistants. This thesis has a bi-directional approach to the domain of spoken dialog systems. On the one hand, parts of the work emphasize increasing the reliability and the intuitiveness of such interfaces. On the other hand, it also focuses on the design and development side, providing a platform made of independent specialized modules and tools to support the implementation and testing of prototypical spoken dialog system technologies. The topics covered by this thesis are centered around an open-source framework for supporting the design and implementation of natural-language spoken dialog systems. Continuous listening, where users are not required to signal their intent before speaking, has been and still is an active research area. Two methods are proposed here, analyzed and compared. In line with the two directions taken in this work, the natural language understanding subsystem of the platform has been designed to be intuitive to use, allowing natural language interaction. Finally, on the dialog management side, this thesis argues in favor of the deterministic modeling of dialogs. However, such an approach requires intense human labor, is prone to error and does not ease the maintenance, update or modification of the models. A new paradigm, the linked-form filling language, offers to facilitate the design and maintenance tasks by shifting the modeling to an application specification formalism.
Bobillet, William. "Contribution à l'étude des modèles à erreurs dans les variables : application au traitement de la parole et à l'estimation de canaux de propagation." Bordeaux 1, 2007. http://www.theses.fr/2007BOR13391.
Wu, Zong Liang. "Peut-on entendre des événements articulatoires ? : traitement temporel de la parole dans un modèle du système auditif." Grenoble INPG, 1990. http://www.theses.fr/1990INPG0093.
Fohr, Dominique. "Aphodex : Un système expert en décodage acoustico-phonétique de la parole continue." Nancy 1, 1986. http://docnum.univ-lorraine.fr/public/SCD_T_1986_0416_FOHR.pdf.
Grisvard, Olivier. "Modélisation et gestion du dialogue oral homme-machine de commande." Nancy 1, 2000. http://www.theses.fr/2000NAN10011.
Designing a spoken man-machine command dialogue system to be used by the largest number of people, including people who are not specialists in interacting with computers, is not an easy task. On the one hand, it requires taking into account some characteristics of human conversation in general, in order to provide the system with natural means of interacting with the user. On the other hand, it implies respecting constraints specific to task-based dialogue, that is, dialogue used to manage a definite computer task. Given such a framework, we propose a model for this class of dialogues. Although the model's main purpose is to be implemented in a real command system, its definition is based on an in-depth study of the principles and mechanisms of man-man dialogue. More precisely, our dialogue model comprises a structured representation formalism for task and dialogue data, which is based on the notion of eventuality, as well as a dialogue management procedure. This procedure includes pragmatic analysis of user utterances, effective management of the event-based dialogue representation, application management, and system utterance production. The model we propose is intended to be generic enough to be independent of the application.
Le, Maguer Sébastien. "Évaluation expérimentale d'un système statistique de synthèse de la parole, HTS, pour la langue française." Phd thesis, Université Rennes 1, 2013. http://tel.archives-ouvertes.fr/tel-00934060.
Béchet, Frédéric. "Système de traitement de connaissances phonétiques et lexicales : application à la reconnaissance de mots isolés sur de grands vocabulaires et à la recherche de mots cibles dans un discours continu." Avignon, 1994. http://www.theses.fr/1994AVIG0106.
Ben, Jannet Mohamed Amer. "Évaluation adaptative des systèmes de transcription en contexte applicatif." Thesis, Université Paris-Saclay (ComUE), 2015. http://www.theses.fr/2015SACLS041/document.
It is important to regularly assess the products of technological innovation in order to estimate the level of maturity reached by the technology and to study the application frameworks in which they can be used. Natural language processing (NLP) aims at developing modules and applications that automatically process human language. That makes the field relevant to both research and technological innovation. For years, the different technological modules of NLP were developed separately. Therefore, the existing evaluation methods are mostly modular. They evaluate only one module at a time, while today many applications need to combine several NLP modules to solve complex tasks. The new challenge in terms of evaluation is then to evaluate the different modules while taking into account the application context. Our work addresses the evaluation of Automatic Speech Recognition (ASR) systems according to the application context. We focus on the case of Named Entity Recognition (NER) from automatically transcribed spoken documents. In the first part, we address the issue of evaluating ASR systems according to the application context through a study of the state of the art. We describe the ASR and NER tasks proposed during several evaluation campaigns and we discuss the protocols established for their evaluation. We also point out the limitations of modular evaluation approaches and present the alternative measures proposed in the literature. In the second part, we describe the studied task of named entity detection, classification and decomposition, and we propose a new metric, ETER (Entity Tree Error Rate), which takes into account the specificity of the task and the application context during the evaluation. ETER also eliminates the biases observed with the existing metrics.
In the third part, we define a new measure, ATENE (Automatic Transcriptions Evaluation for Named Entities), that evaluates the quality of ASR systems and the impact of their errors on NER systems applied downstream. Rather than directly comparing reference and hypothesis transcriptions, ATENE measures how much harder it becomes to identify entities given the differences between hypothesis and reference, by comparing an estimated likelihood of the presence of entities. It is composed of two elementary measurements. The first aims to assess the risk of entity deletions and substitutions, and the second aims to assess the risk of entity insertions caused by ASR errors. Our validation experiments show that the measurements given by ATENE correlate better than other state-of-the-art measures with the performance of NER systems.
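For orientation, the direct comparison of reference and hypothesis transcriptions that the abstract contrasts with ATENE is commonly computed as word error rate (WER). The sketch below implements standard WER only; it is not ETER or ATENE, whose definitions are not given in this listing, and the example sentences are invented.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance (substitutions, deletions, insertions)
    divided by the number of words in the reference."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the processed reference prefix and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Both hypotheses contain exactly one wrong word, so WER scores them equally,
# although only the first one damages a named entity.
print(word_error_rate("meeting in paris tomorrow", "meeting in parish tomorrow"))  # 0.25
print(word_error_rate("meeting in paris tomorrow", "meeting on paris tomorrow"))   # 0.25
```

The two identical scores illustrate why a metric that ignores the downstream task cannot tell harmless errors from errors that break entity recognition.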
Mignot, Christophe. "Usage de la parole et du geste dans les interfaces multimodales : étude expérimentale et modélisation." Nancy 1, 1995. http://www.theses.fr/1995NAN10229.
Gong, Yifan. "Contribution à l'interprétation automatique des signaux en présence d'incertitude." Nancy 1, 1988. http://www.theses.fr/1988NAN10035.
Tichon, Jacques. "Conception et réalisation d'un système de communication pour handicapés, utilisant des techniques d'accès à un dictionnaire." Lille 1, 1985. http://www.theses.fr/1985LIL10078.
Séjourné, Kévin. "Questions réponses et interactions." Phd thesis, Université Paris Sud - Paris XI, 2009. http://tel.archives-ouvertes.fr/tel-00618412.
Cabana, Antoine. "Contribution à l'évaluation opérationnelle des systèmes biométriques multimodaux." Thesis, Normandie, 2018. http://www.theses.fr/2018NORMC249/document.
The development and spread of connected devices, in particular smartphones, requires the implementation of authentication methods. For ergonomic reasons, manufacturers integrate biometric systems in order to deal with logical access control issues. These biometric systems grant access to critical data and applications (payment, e-banking, privacy-sensitive content such as emails, etc.). Evaluation processes therefore make it possible to estimate the suitability of the systems for these uses. In order to improve recognition performance, manufacturers are likely to perform multimodal fusion. In this thesis, the evaluation of operational biometric systems has been studied, and an implementation is presented. A second contribution studies the quality estimation of speech samples, in order to predict recognition performance.
Maurel, Fabrice. "Transmodalité et multimodalité écrit/oral : modélisation, traitement automatique et évaluation de stratégies de présentation des structures "visuo-architecturale" des textes." Toulouse 3, 2004. http://www.theses.fr/2004TOU30256.
We are interested in the utility and, if the need arises, the usability of the visual structure of texts, within the framework of their oral transposition. We propose the overall design of an oralisation system that produces a text representation directly interpretable by Text-To-Speech systems. We partially implemented the module specific to the oralisation strategies, in order to render some meaningful parts of the text often “forgotten” by synthesis systems. The first results of this study led to specifications that are being integrated by an industrial partner. Predictive hypotheses, related to the impact on memorizing/understanding of two strategies coming from our Reformulation-based Oralisation Model for Texts Written to be Silently Read (MORTELS), have been formulated and tested. This work shows that cognitive functions were lost. Prototypes exploiting the “Page Reflection” notion have been designed through interfaces in which multimodality is used to fill these gaps.
Meurs, Marie-Jean. "Approche stochastique bayésienne de la composition sémantique pour les modules de compréhension automatique de la parole dans les systèmes de dialogue homme-machine." Phd thesis, Université d'Avignon, 2009. http://tel.archives-ouvertes.fr/tel-00634269.
Morris, Andrew Cameron. "Analyse informationnelle du traitement de la parole dans le système auditif périphérique et le noyau cochléaire : application à la reconnaissance des occlusives voisées du français." Grenoble INPG, 1992. http://www.theses.fr/1992INPG0140.
Mangeol, Bernard. "La composante lexicale dans les systèmes de dialogue oral homme-machine du CRIN." Nancy 1, 1988. http://www.theses.fr/1988NAN10178.
Hueber, Thomas. "Reconstitution de la parole par imagerie ultrasonore et vidéo de l'appareil vocal : vers une communication parlée silencieuse." Phd thesis, Université Pierre et Marie Curie - Paris VI, 2009. http://pastel.archives-ouvertes.fr/pastel-00005707.
Nguyen, Roselyne. "Un système multi-agent pour la machine à dicter vocale MAUD : conception et intégration d'une source de connaissances phonologiques." Nancy 1, 1996. http://www.theses.fr/1996NAN10321.
Abdallah, Nassib. "Interprétation des signaux cérébraux pour l’autonomie des handicapés : Système de reconnaissance de mots imaginés." Thesis, Angers, 2018. http://www.theses.fr/2018ANGE0038.
Brain-machine interfaces represent a solution for restoring several human functions such as movement, speech, etc. The construction of a BCI consists of four main phases: "Data Recording", "Signal Preprocessing", "Extraction and Selection of Characteristics", and "Classification". In this report we present a new imagery recognition system based on a non-invasive (EEG) and portable acquisition technique to facilitate communication with the outside world for people with specific disabilities. This thesis includes a system called FEASR for the construction of a relevant and optimized database. This database has been tested with several classification methods to obtain a maximum recognition rate of 83.4% for five words imagined in Arabic. In addition, we discuss the impact of optimization algorithms (Wernicke sensor selection, the principal component analysis algorithm and the selection of subbands resulting from the discrete wavelet transform decomposition) on recognition percentages according to the size of our database and its reduction.
Lepauloux, Ludovick. "Prise de son distante par système multimicrophone. Application à la communication parlée en environnement bruyant." Phd thesis, Université Rennes 1, 2010. http://tel.archives-ouvertes.fr/tel-00636256.
Le, Bigot Ludovic. "La recherche d'informations avec un système de dialogue en langage naturel." Poitiers, 2004. http://www.theses.fr/2004POIT5020.
This study focused on the advantages and disadvantages of two different modes of dialogue with an artificial system: the spoken and written modes. The purpose was to study the influence exerted by modality on users' verbal behaviour and to interpret the results in the light of psychological theories of interactive and non-interactive discourse. In the five experiments conducted, the tasks given to the subjects consisted in looking for information by using a real dialogue system and scenarios. All the results suggest that each interaction mode has its own specificities. Text makes the control of activity and task easier, at the expense of the dialogue component. Speech promotes collaborative behaviours but is constrained by the amount of information to process.
Bernard, Guillaume. "Réordonnancement de candidats reponses pour un système de questions-réponses." Phd thesis, Université Paris Sud - Paris XI, 2011. http://tel.archives-ouvertes.fr/tel-00606025.
Spriet, Thierry. "Traitements formels de connaissances linguistiques dans un système de reconnaissance automatique de la parole continue : syrapac." Avignon, 1993. http://www.theses.fr/1993AVIG0104.
Wu, Qin. "Élaboration d'algorithmes de la reconnaissance vocale à bord de véhicule." Paris 11, 1987. http://www.theses.fr/1987PA112293.
This dissertation deals principally with the problem of recognizing isolated words pronounced inside a vehicle. In this particular application, the noise injected into the recognition system has a high and variable level with respect to the speech signal. The different chapters deal with: the localisation of the speech phrase within the noise, the discrimination of noise with respect to speech, the adaptation of the system to the ambient environment, and noise subtraction. Algorithms for speech recognition are also discussed and developed. The last chapter describes a speech recognition system designed around a single-chip microprocessor (INTEL 8096).
Roussanaly, Azim. "Dial, la composante dialogue d'un système de communication orale homme-machine finalisée en langage naturel." Nancy 1, 1988. http://www.theses.fr/1988NAN10461.
Khouzaimi, Hatim. "Turn-taking enhancement in spoken dialogue systems with reinforcement learning." Thesis, Avignon, 2016. http://www.theses.fr/2016AVIG0213/document.
Incremental dialogue systems are able to process the user’s speech as it is spoken (without waiting for the end of a sentence before starting to process it). This makes them able to take the floor whenever they decide to (the user can also speak whenever she wants, even if the system is still holding the floor). As a consequence, they are able to perform a richer set of turn-taking behaviours compared to traditional systems. Several contributions are described in this thesis with the aim of showing that dialogue systems’ turn-taking capabilities can be automatically improved from data. First, human-human dialogue is analysed and a new taxonomy of turn-taking phenomena in human conversation is established. Based on this work, the different phenomena are analysed and some of them are selected for replication in a human-machine context (the ones that are more likely to improve a dialogue system’s efficiency). Then, a new architecture for incremental dialogue systems is introduced with the aim of transforming a traditional dialogue system into an incremental one at a low cost (also separating the turn-taking manager from the dialogue manager). To be able to perform the first tests, a simulated environment has been designed and implemented. Unlike existing simulators, it is able to replicate user and ASR behaviours that are specific to incremental processing. Combined together, these contributions led to the establishment of a rule-based incremental dialogue strategy that is shown to improve dialogue efficiency in a task-oriented situation and in simulation. A new reinforcement learning strategy has also been proposed. It is able to autonomously learn optimal turn-taking behaviours throughout the interactions. The simulated environment has been used for training and for a first evaluation, where the new data-driven strategy is shown to outperform both the non-incremental and rule-based incremental strategies.
In order to validate these results in real dialogue conditions, a prototype through which the users can interact in order to control their smart home has been developed. At the beginning of each interaction, the turn-taking strategy is randomly chosen among the non-incremental, the rule-based incremental and the reinforcement learning strategy (learned in simulation). A corpus of 206 dialogues has been collected. The results show that the reinforcement learning strategy significantly improves the dialogue efficiency without hurting the user experience (slightly improving it, in fact)
Ouayoun, Michel-Christian. "Traitement phonetique de la parole pour implants cochleaires." Paris 11, 1997. http://www.theses.fr/1997PA112454.
Full textBougares, Fethi. "Attelage de systèmes de transcription automatique de la parole." Phd thesis, Université du Maine, 2012. http://tel.archives-ouvertes.fr/tel-00839990.
Full text