To see the other types of publications on this topic, follow the link: Spoken Dialog System.

Dissertations / Theses on the topic 'Spoken Dialog System'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Select a source type:

Consult the top 44 dissertations / theses for your research on the topic 'Spoken Dialog System.'

Next to every source in the list of references there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses in a wide variety of disciplines and organise your bibliography correctly.

1

Takeda, Kazuya, Norihide Kitaoka, and Sunao Hara. "Automatic detection of task-incompleted dialog for spoken dialog system based on dialog act N-gram." ISCA (International Speech Communication Association), 2010. http://hdl.handle.net/2237/15498.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Intilisano, Antonio Rosario. "Spoken dialog systems: from automatic speech recognition to spoken language understanding." Doctoral thesis, Università di Catania, 2016. http://hdl.handle.net/10761/3920.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Acosta, Jaime Cesar. "Using emotion to gain rapport in a spoken dialog system." To access this resource online via ProQuest Dissertations and Theses @ UTEP, 2009. http://0-proquest.umi.com.lib.utep.edu/login?COPT=REJTPTU0YmImSU5UPTAmVkVSPTI=&clientId=2515.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Carter, Teresa G. "Five-Factor Model as a Predictor for Spoken Dialog Systems." NSUWorks, 2016. http://nsuworks.nova.edu/gscis_etd/990.

Full text
Abstract:
Human behavior varies widely, as does the design of spoken dialog systems (SDS). The focus of this research was the search for predictors that match a user's preference and efficiency to a specific dialog interface type in an SDS. Using personality as described by the Five-Factor Model (FFM) and the Wizard of Oz technique for delivering three system initiatives of the SDS, participants interacted with each of the SDS initiatives to schedule an airline flight. The three system initiatives were constructed as a strict system, which did not allow the user control of the interaction; a mixed system, which allowed the user some control of the interaction but with a system override; and a user system, which allowed the user control of the interaction. To eliminate gender bias in using the FFM as the instrument, participants were matched by gender and age. Participants were 18 to 70 years old, passed a hearing test, had no disability that prohibited the use of the SDS, and were native English speakers. Participants completed an adult consent form, a 50-question personality assessment as described by the FFM, and the interaction with the SDS. Participants also completed a system preference indication form at the end of the interaction. Observations for efficiency were recorded on paper by the researcher. Although the findings did not show a definitive predictor for an SDS due to the small population sample, a multinomial regression approach to the statistical analysis yielded odds ratios that support certain personality factors as playing important roles in a user's preference and efficiency in choosing and using an SDS. This suggests an area for future research. Also, the presumption that preference and efficiency always match was not supported by the results from two of the three systems. An additional area for future research was discovered in the gender data.
Although not an initial part of the research, the data show promise in predicting preference and efficiency for certain SDS. Future research is indicated.
APA, Harvard, Vancouver, ISO, and other styles
5

Takeda, Kazuya, Norihide Kitaoka, and Sunao Hara. "Detection of task-incomplete dialogs based on utterance-and-behavior tag N-gram for spoken dialog systems." ISCA (International Speech Communication Association), 2011. http://hdl.handle.net/2237/15499.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Milhorat, Pierrick. "Une plate-forme ouverte pour la conception et l'implémentation de systèmes de dialogue vocaux en langage naturel." Thesis, Paris, ENST, 2014. http://www.theses.fr/2014ENST0087/document.

Full text
Abstract:
Voice interaction with automated systems has, in recent years, attracted growing interest from both the general public and the research community. This thesis addresses the subject from two complementary points of view. On the one hand, the user-facing one: the reliability, efficiency, and usability of these interfaces. On the other hand, design and implementation aspects are studied in order to provide development tools to designers of such systems, whatever their level of expertise. Building on existing tools and advances in the field, a modular spoken dialogue platform was assembled. Continuous interaction, based on permanent "listening" by the system, raises problems of segmentation, denoising, sound capture, selection of the segments addressed to the system, and so on. A simple method, based on comparing the results of parallel processing pipelines, proved both its effectiveness and its limits for continuous interaction with the user. The language understanding modules form an interconnected subsystem within the platform; they are adaptations of state-of-the-art algorithms as well as original ideas. The choice of dialogue management based on hierarchical task models, as adopted by the platform, is argued for. This formalism relies on manual construction and therefore raises obstacles to designing, implementing, maintaining, and evolving the models. To address these, a new formalism is proposed, which is transformed into a task hierarchy by the associated tools.
Recently, global tech companies have released so-called virtual intelligent personal assistants. This thesis takes a bi-directional approach to the domain of spoken dialog systems. On the one hand, parts of the work focus on increasing the reliability and intuitiveness of such interfaces. On the other hand, it also addresses the design and development side, providing a platform made of independent specialized modules and tools to support the implementation and testing of prototypical spoken dialog system technologies. The topics covered by this thesis center on an open-source framework for supporting the design and implementation of natural-language spoken dialog systems. Continuous listening, where users are not required to signal their intent before speaking, has been and remains an active research area; two methods are proposed here, analyzed, and compared. In line with the two directions taken in this work, the natural language understanding subsystem of the platform was designed to be intuitive to use, allowing natural language interaction. Finally, on the dialog management side, this thesis argues in favor of deterministic modeling of dialogs. However, such an approach requires intensive human labor, is error-prone, and does not ease the maintenance, updating, or modification of the models. A new paradigm, the linked-form-filling language, facilitates the design and maintenance tasks by shifting the modeling to an application specification formalism.
APA, Harvard, Vancouver, ISO, and other styles
7

Figueiredo, Sara Cristina Albuquerque. "Development of a dialog system for interaction with robots." Master's thesis, Universidade de Aveiro, 2014. http://hdl.handle.net/10773/14030.

Full text
Abstract:
Master's in Computer and Telematics Engineering
Service robots operate in the same environment as humans and perform actions that a human usually performs. These robots must be able to operate autonomously in unknown and dynamic environments, as well as to maneuver among several people and know how to deal with them. By complying with these requirements, they are able to successfully address humans and fulfill their requests whenever assistance with a certain task is needed. Natural language communication, including speech, the most natural form of communication between humans, becomes relevant in the field of Human-Robot Interaction (HRI). Endowing service robots with intuitive spoken interfaces facilitates the specification of the tasks required by humans. However, this is a complicated goal to achieve because of the resources involved in creating a sufficiently intuitive spoken interface and the difficulty of deploying it on different robots. The main objective of this thesis is the definition, implementation, and evaluation of a dialogue system that can be easily integrated into any robotic platform and that functions as a flexible base for the creation of any conversational scenario in the Portuguese language. The system must meet the basic requirements for intuitive and natural communication, namely the characteristics of human-human conversations. A system was developed that serves as a base for future work on spoken dialog systems. The system follows a client-server architecture in which the client runs on the robot and captures what the user says. The client relies on external dialogue management services, executed by the server, which processes the captured audio and returns a response appropriate to the context of the dialogue. The development was based on a critical analysis of the state of the art so that the system would be as faithful as possible to existing work.
During the evaluation phase of the system, feedback from a small group of volunteers supported the conclusion that the main objective was accomplished: a base system was created that is flexible enough to explore different conversational contexts, such as interacting with children or providing information in a university environment.
Service robots operate in the same environment as humans and perform actions that a human would normally perform. These robots must be able to operate autonomously in unknown and dynamic environments, as well as to maneuver in environments with several people and know how to deal with them. By meeting these requirements, they will be able to successfully approach humans and fulfill their requests whenever assistance with some task is needed. Natural language communication, namely speech, the most widespread form of communication between humans, becomes relevant in the area of human-robot interaction (HRI). Endowing service robots with intuitive voice systems facilitates the specification of the tasks to be performed. However, this is a complicated task to accomplish because of the resources involved in creating a sufficiently intuitive interaction and the difficulty of making it work on different robots. The main objective of this work is the definition, implementation, and evaluation of a dialogue system that is easy to integrate into any robotic system and that works as a flexible base for any conversational scenario in the Portuguese language. It must meet basic requirements of intuitive and natural communication, namely the characteristics of conversations between humans. A system was developed that works as a base for continuing future work on dialogue systems. The system follows a client-server architecture in which the client runs on the robot and captures what the user says. The client makes use of dialogue management services external to the robot, executed by the server, which processes the captured audio and returns to the client a response appropriate to the context of the dialogue. The development was based on a critical analysis of the state of the art, both to stay faithful to what has already been done and to guide the main decisions made during implementation.
Through the evaluation phase of the system, covering both the interaction and the developer perspective, feedback from several volunteers indicated that the main objective was achieved: a sufficiently flexible base was created to explore different conversational contexts, namely interacting with children or providing information in a university environment.
APA, Harvard, Vancouver, ISO, and other styles
9

Vaienti, Andrea. "Assistenti Vocali: Una Panoramica." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2019.

Find full text
Abstract:
Voice assistants are one of the main recent changes in human-machine interaction, and they enable many uses that simplify and automate potentially tedious and repetitive activities. Today, with the continuous development of technology, voice-interaction systems are becoming an ever larger part of our daily lives, opening up a new world in which it is possible to communicate with a machine as if it were a human being. This is made possible by sophisticated technologies such as natural language processing and machine learning, which allow the system to understand the command issued by the user and generate a response in real time, thereby sustaining a complex dialogue and carrying out the user's various requests. The many low-cost devices developed in recent years have brought voice assistants into everyday use for everyone, enabling all kinds of actions, from asking simple informational questions to playing music or controlling smart home appliances by voice.
APA, Harvard, Vancouver, ISO, and other styles
10

Meurs, Marie-Jean. "Approche stochastique bayésienne de la composition sémantique pour les modules de compréhension automatique de la parole dans les systèmes de dialogue homme-machine." PhD thesis, Université d'Avignon, 2009. http://tel.archives-ouvertes.fr/tel-00634269.

Full text
Abstract:
Human-machine dialogue systems aim to enable an efficient and user-friendly spoken exchange between a human user and a computer. Their application domains are varied, ranging from the management of commercial transactions to tutoring and personal assistance. However, the communication capabilities of these systems are currently limited by their ability to understand spontaneous speech. Our work focuses on the spoken language understanding module and presents an approach based entirely on stochastic methods, allowing a complete semantic hypothesis to be built. Our approach relies on a hierarchical representation of the meaning of a sentence based on semantic frames. The first part of the work consisted in building a semantic knowledge base adapted to the domain of the MEDIA experimental corpus (tourist information and hotel booking). We used the FrameNet formalism to give our semantic representation maximum genericity. The development of a system based on rules and logical inference then allowed us to annotate the corpus automatically. The second part concerns the study of the semantic composition module itself. Building on a first literal-interpretation step that produces basic (unconnected) conceptual units, we propose to generate semantic fragments (subtrees) using dynamic Bayesian networks. The generated semantic fragments provide a partial semantic representation of the user's message. To reach the complete global semantic representation, we propose and evaluate a tree-composition algorithm in two variants. The first is based on a heuristic that builds a tree of minimum size and weight. The second relies on a classification method based on support vector machines to decide which composition operations to perform. The understanding module built in this work can be adapted to the processing of any type of dialogue. It relies on a rich semantic representation, and the models used can provide lists of scored semantic hypotheses. The results obtained on the experimental data confirm the robustness of the proposed approach to uncertain data and its ability to produce a consistent semantic representation.
APA, Harvard, Vancouver, ISO, and other styles
11

Ma, Yi. "Learning for Spoken Dialog Systems with Discriminative Graphical Models." The Ohio State University, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=osu1440166760.

Full text
APA, Harvard, Vancouver, ISO, and other styles
12

Kawaguchi, Nobuo, Makoto Nagamori, Shigeki Matsubara, and Yasuyoshi Inagaki. "複数の音声対話システムの統合制御機構とその評価 [An Integrated Control Mechanism for Multiple Spoken Dialogue Systems and Its Evaluation]." Information Processing Society of Japan, 2001. http://hdl.handle.net/2237/6886.

Full text
APA, Harvard, Vancouver, ISO, and other styles
13

Kanda, Naoyuki. "Open-ended Spoken Language Technology: Studies on Spoken Dialogue Systems and Spoken Document Retrieval Systems." 京都大学 (Kyoto University), 2014. http://hdl.handle.net/2433/188874.

Full text
APA, Harvard, Vancouver, ISO, and other styles
14

Vlasenko, Bogdan [Verfasser], Andreas [Akademischer Betreuer] Wendemuth, and Dietmar [Akademischer Betreuer] Rösner. "Emotion recognition within spoken dialog systems / Bogdan Vlasenko. Betreuer: Andreas Wendemuth ; Dietmar Rösner." Magdeburg : Universitätsbibliothek, 2011. http://d-nb.info/1047595907/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
15

Yasavur, Ugan. "Statistical Dialog Management for Health Interventions." FIU Digital Commons, 2014. http://digitalcommons.fiu.edu/etd/1550.

Full text
Abstract:
Research endeavors on spoken dialogue systems in the 1990s and 2000s led to the deployment of commercial spoken dialogue systems (SDS) in microdomains such as customer service automation, reservation/booking, and question answering systems. Recent research in SDS has focused on the development of applications in different domains (e.g. virtual counseling, personal coaches, social companions), which require more sophistication than the previous generation of commercial SDS. The focus of this research project is the delivery of behavior change interventions based on the brief intervention counseling style via spoken dialogue systems. Brief interventions (BI) are evidence-based, short, well-structured, one-on-one counseling sessions. Many challenges are involved in delivering BIs to people in need, such as finding the time to administer them in busy doctors' offices, obtaining the extra training that helps staff become comfortable providing these interventions, and managing the cost of delivering the interventions. Fortunately, recent developments in spoken dialogue systems make it possible to build systems that can deliver brief interventions. The overall objective of this research is to develop a data-driven, adaptable dialogue system for brief interventions for problematic drinking behavior, based on reinforcement learning methods. The implications of this research project include, but are not limited to, assessing the feasibility of delivering structured brief health interventions with a data-driven spoken dialogue system. Furthermore, while the experimental system targets harmful alcohol drinking in this project, the knowledge and experience produced may also lead to the implementation of similarly structured health interventions and assessments in domains other than alcohol (e.g. obesity, drug use, lack of exercise), using statistical machine learning approaches.
Beyond the design of the dialog system itself, the semantic and emotional meaning of user utterances has a high impact on the interaction. To perform domain-specific reasoning and recognize concepts in user utterances, a named-entity recognizer and an ontology are designed and evaluated. To understand affective information conveyed through text, lexicons and a sentiment analysis module are developed and tested.
APA, Harvard, Vancouver, ISO, and other styles
16

Henderson, Matthew S. "Discriminative methods for statistical spoken dialogue systems." Thesis, University of Cambridge, 2015. https://www.repository.cam.ac.uk/handle/1810/249015.

Full text
Abstract:
Dialogue promises a natural and effective method for users to interact with and obtain information from computer systems. Statistical spoken dialogue systems are able to disambiguate in the presence of errors by maintaining probability distributions over what they believe to be the state of a dialogue. However, traditionally these distributions have been derived using generative models, which do not directly optimise for the criterion of interest and cannot easily exploit arbitrary information that may potentially be useful. This thesis presents how discriminative methods can overcome these problems in Spoken Language Understanding (SLU) and Dialogue State Tracking (DST). A robust method for SLU is proposed, based on features extracted from the full posterior distribution of recognition hypotheses encoded in the form of word confusion networks. This method uses discriminative classifiers, trained on unaligned input/output pairs. Performance is evaluated on both an off-line corpus, and on-line in a live user trial. It is shown that a statistical discriminative approach to SLU operating on the full posterior ASR output distribution can substantially improve performance in terms of both accuracy and overall dialogue reward. Furthermore, additional gains can be obtained by incorporating features from the system's output. For DST, a new word-based tracking method is presented that maps directly from the speech recognition results to the dialogue state without using an explicit semantic decoder. The method is based on a recurrent neural network structure that is capable of generalising to unseen dialogue state hypotheses, and requires very little feature engineering. The method is evaluated in the second and third Dialog State Tracking Challenges, as well as in a live user trial. The results demonstrate consistently high performance across all of the off-line metrics and a substantial increase in the quality of the dialogues in the live trial. 
The proposed method is shown to be readily applied to expanding dialogue domains, by exploiting robust features and a new method for online unsupervised adaptation. It is shown how the neural network structure can be adapted to output structured joint distributions, giving an improvement over estimating the dialogue state as a product of marginal distributions.
APA, Harvard, Vancouver, ISO, and other styles
17

Li, William (William Pui Lum). "Understanding user state and preferences for robust spoken dialog systems and location-aware assistive technology." Thesis, Massachusetts Institute of Technology, 2012. http://hdl.handle.net/1721.1/72938.

Full text
Abstract:
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science; and, (S.M. in Technology and Policy)--Massachusetts Institute of Technology, Engineering Systems Division, Technology and Policy Program, 2012.
Cataloged from PDF version of thesis.
Includes bibliographical references (p. 119-125).
This research focuses on improving the performance of spoken dialog systems (SDS) in the domain of assistive technology for people with disabilities. Automatic speech recognition (ASR) has compelling potential applications as a means of enabling people with physical disabilities to enjoy greater levels of independence and participation. This thesis describes the development and evaluation of a spoken dialog system modeled as a partially observable Markov decision process (SDS-POMDP). The SDS-POMDP can understand commands related to making phone calls and providing information about weather, activities, and menus in a specialized-care residence setting. Labeled utterance data was used to train observation and utterance confidence models. With a user simulator, the SDS-POMDP reward function parameters were optimized, and the SDS-POMDP is shown to outperform simpler threshold-based dialog strategies. These simulations were validated in experiments with human participants, with the SDS-POMDP resulting in more successful dialogs and faster dialog completion times, particularly for speakers with high word-error rates. This thesis also explores the social and ethical implications of deploying location-based assistive technology in specialized-care settings. These technologies could have substantial potential benefit to residents and caregivers in such environments, but they may also raise issues related to user safety, independence, autonomy, or privacy. As one example, location-aware mobile devices are potentially useful for increasing the safety of individuals in a specialized-care setting who may be at risk of unknowingly wandering, but they raise important questions about privacy and informed consent. This thesis provides a survey of U.S. legislation related to the participation in research studies of individuals who have questionable capacity to provide informed consent. Overall, it seeks to precisely describe and define the key issues that arise from new, unforeseen technologies that may have both benefits and costs for the elderly and people with disabilities.
by William Li.
S.M. in Technology and Policy
S.M.
APA, Harvard, Vancouver, ISO, and other styles
18

Gustafson, Joakim. "Developing Multimodal Spoken Dialogue Systems : Empirical Studies of Spoken Human–Computer Interaction." Doctoral thesis, KTH, Tal, musik och hörsel, 2002. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-3460.

Full text
Abstract:
This thesis presents work done during the last ten years on developing five multimodal spoken dialogue systems, and the empirical user studies that have been conducted with them. The dialogue systems have been multimodal, giving information both verbally with animated talking characters and graphically on maps and in text tables. To be able to study a wider range of user behaviour, each new system has been in a new domain and with a new set of interactional abilities. The five systems presented in this thesis are: the Waxholm system, where users could ask about the boat traffic in the Stockholm archipelago; the Gulan system, where people could retrieve information from the Yellow Pages of Stockholm; the August system, a publicly available system where people could get information about the author Strindberg, KTH, and Stockholm; the AdApt system, which allowed users to browse apartments for sale in Stockholm; and the Pixie system, where users could help an animated agent fix things in a visionary apartment publicly available at the Telecom museum in Stockholm. Some of the dialogue systems have been used in controlled experiments in laboratory environments, while others have been placed in public environments where members of the general public have interacted with them. All spoken human-computer interactions have been transcribed and analyzed to increase our understanding of how people interact verbally with computers, and to obtain knowledge on how spoken dialogue systems can utilize the regularities found in these interactions. This thesis summarizes the experiences from building these five dialogue systems and presents some of the findings from the analyses of the collected dialogue corpora.
APA, Harvard, Vancouver, ISO, and other styles
19

Cuayáhuitl, Heriberto. "Hierarchical reinforcement learning for spoken dialogue systems." Thesis, University of Edinburgh, 2009. http://hdl.handle.net/1842/2750.

Full text
Abstract:
This thesis focuses on the problem of scalable optimization of dialogue behaviour in speech-based conversational systems using reinforcement learning. Most previous investigations in dialogue strategy learning have proposed flat reinforcement learning methods, which are more suitable for small-scale spoken dialogue systems. This research formulates the problem in terms of Semi-Markov Decision Processes (SMDPs), and proposes two hierarchical reinforcement learning methods to optimize sub-dialogues rather than full dialogues. The first method uses a hierarchy of SMDPs, where every SMDP ignores irrelevant state variables and actions in order to optimize a sub-dialogue. The second method extends the first one by constraining every SMDP in the hierarchy with prior expert knowledge. The latter method proposes a learning algorithm called 'HAM+HSMQ-Learning', which combines two existing algorithms in the literature of hierarchical reinforcement learning. Whilst the first method generates fully-learnt behaviour, the second one generates semi-learnt behaviour. In addition, this research proposes a heuristic dialogue simulation environment for automatic dialogue strategy learning. Experiments were performed on simulated and real environments based on a travel planning spoken dialogue system. Experimental results provided evidence to support the following claims: First, both methods scale well at the cost of near-optimal solutions, resulting in slightly longer dialogues than the optimal solutions. Second, dialogue strategies learnt with coherent user behaviour and conservative recognition error rates can outperform a reasonable hand-coded strategy. Third, semi-learnt dialogue behaviours are a better alternative (because of their higher overall performance) than hand-coded or fully-learnt dialogue behaviours. 
Last, hierarchical reinforcement learning dialogue agents are feasible and promising for the (semi) automatic design of adaptive behaviours in larger-scale spoken dialogue systems. This research makes the following contributions to spoken dialogue systems which learn their dialogue behaviour. First, the Semi-Markov Decision Process (SMDP) model was proposed to learn spoken dialogue strategies in a scalable way. Second, the concept of 'partially specified dialogue strategies' was proposed for integrating simultaneously hand-coded and learnt spoken dialogue behaviours into a single learning framework. Third, an evaluation with real users of hierarchical reinforcement learning dialogue agents was essential to validate their effectiveness in a realistic environment.
APA, Harvard, Vancouver, ISO, and other styles
20

Khouzaimi, Hatim. "Turn-taking enhancement in spoken dialogue systems with reinforcement learning." Thesis, Avignon, 2016. http://www.theses.fr/2016AVIG0213/document.

Full text
Abstract:
Incremental dialogue systems are able to process the user's speech as it is spoken (without waiting for the end of a sentence before starting to process it). This makes them able to take the floor whenever they decide to (the user can also speak whenever she wants, even if the system is still holding the floor). As a consequence, they are able to perform a richer set of turn-taking behaviours compared to traditional systems. Several contributions are described in this thesis with the aim of showing that dialogue systems' turn-taking capabilities can be automatically improved from data. First, human-human dialogue is analysed and a new taxonomy of turn-taking phenomena in human conversation is established. Based on this work, the different phenomena are analysed and some of them are selected for replication in a human-machine context (the ones most likely to improve a dialogue system's efficiency). Then, a new architecture for incremental dialogue systems is introduced with the aim of transforming a traditional dialogue system into an incremental one at a low cost (also separating the turn-taking manager from the dialogue manager). To be able to perform the first tests, a simulated environment was designed and implemented. Unlike existing simulators, it is able to replicate user and ASR behaviour specific to incremental processing. Combined together, these contributions led to the establishment of a rule-based incremental dialogue strategy that is shown to improve dialogue efficiency in a task-oriented situation in simulation. A new reinforcement learning strategy has also been proposed. It is able to autonomously learn optimal turn-taking behaviour throughout the interactions. The simulated environment was used for training and for a first evaluation, where the new data-driven strategy is shown to outperform both the non-incremental and rule-based incremental strategies.
In order to validate these results in real dialogue conditions, a prototype through which users can interact in order to control their smart home was developed. At the beginning of each interaction, the turn-taking strategy is randomly chosen among the non-incremental, the rule-based incremental and the reinforcement learning strategy (learned in simulation). A corpus of 206 dialogues was collected. The results show that the reinforcement learning strategy significantly improves dialogue efficiency without hurting the user experience (slightly improving it, in fact).
APA, Harvard, Vancouver, ISO, and other styles
21

Mrkšić, Nikola. "Data-driven language understanding for spoken dialogue systems." Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/276689.

Full text
Abstract:
Spoken dialogue systems provide a natural conversational interface to computer applications. In recent years, the substantial improvements in the performance of speech recognition engines have helped shift the research focus to the next component of the dialogue system pipeline: the one in charge of language understanding. The role of this module is to translate user inputs into accurate representations of the user goal in the form that can be used by the system to interact with the underlying application. The challenges include the modelling of linguistic variation, speech recognition errors and the effects of dialogue context. Recently, the focus of language understanding research has moved to making use of word embeddings induced from large textual corpora using unsupervised methods. The work presented in this thesis demonstrates how these methods can be adapted to overcome the limitations of language understanding pipelines currently used in spoken dialogue systems. The thesis starts with a discussion of the pros and cons of language understanding models used in modern dialogue systems. Most models in use today are based on the delexicalisation paradigm, where exact string matching supplemented by a list of domain-specific rephrasings is used to recognise users' intents and update the system's internal belief state. This is followed by an attempt to use pretrained word vector collections to automatically induce domain-specific semantic lexicons, which are typically hand-crafted to handle lexical variation and account for a plethora of system failure modes. The results highlight the deficiencies of distributional word vectors which must be overcome to make them useful for downstream language understanding models. The thesis next shifts focus to overcoming the language understanding models' dependency on semantic lexicons. 
To achieve that, the proposed Neural Belief Tracking (NBT) model forsakes the use of standard one-hot n-gram representations used in Natural Language Processing in favour of distributed representations of user utterances, dialogue context and domain ontologies. The NBT model makes use of external lexical knowledge embedded in semantically specialised word vectors, obviating the need for domain-specific semantic lexicons. Subsequent work focuses on semantic specialisation, presenting an efficient method for injecting external lexical knowledge into word vector spaces. The proposed Attract-Repel algorithm boosts the semantic content of existing word vectors while simultaneously inducing high-quality cross-lingual word vector spaces. Finally, NBT models powered by specialised cross-lingual word vectors are used to train multilingual belief tracking models. These models operate across many languages at once, providing an efficient method for bootstrapping language understanding models for lower-resource languages with limited training data.
APA, Harvard, Vancouver, ISO, and other styles
22

Toney, Dave. "Evolutionary reinforcement learning of spoken dialogue strategies." Thesis, University of Edinburgh, 2007. http://hdl.handle.net/1842/1769.

Full text
Abstract:
From a system developer's perspective, designing a spoken dialogue system can be a time-consuming and difficult process. A developer may spend a lot of time anticipating how a potential user might interact with the system and then deciding on the most appropriate system response. These decisions are encoded in a dialogue strategy, essentially a mapping between anticipated user inputs and appropriate system outputs. To reduce the time and effort associated with developing a dialogue strategy, recent work has concentrated on modelling the development of a dialogue strategy as a sequential decision problem. Using this model, reinforcement learning algorithms have been employed to generate dialogue strategies automatically. These algorithms learn strategies by interacting with simulated users. Some progress has been made with this method but a number of important challenges remain. For instance, relatively little success has been achieved with the large state representations that are typical of real-life systems. Another crucial issue is the time and effort associated with the creation of simulated users. In this thesis, I propose an alternative to existing reinforcement learning methods of dialogue strategy development. More specifically, I explore how XCS, an evolutionary reinforcement learning algorithm, can be used to find dialogue strategies that cover large state spaces. Furthermore, I suggest that hand-coded simulated users are sufficient for the learning of useful dialogue strategies. I argue that the use of evolutionary reinforcement learning and hand-coded simulated users is an effective approach to the rapid development of spoken dialogue strategies. Finally, I substantiate this claim by evaluating a learned strategy with real users. Both the learned strategy and a state-of-the-art hand-coded strategy were integrated into an end-to-end spoken dialogue system. 
The dialogue system allowed real users to make flight enquiries using a live database for an Edinburgh-based airline. The performance of the learned and hand-coded strategies were compared. The evaluation results show that the learned strategy performs as well as the hand-coded one (81% and 77% task completion respectively) but takes much less time to design (two days instead of two weeks). Moreover, the learned strategy compares favourably with previous user evaluations of learned strategies.
APA, Harvard, Vancouver, ISO, and other styles
23

Dahlgren, Karl. "Context-dependent voice commands in spoken dialogue systems for home environments : A study on the effect of introducing context-dependent voice commands to a spoken dialogue system for home environments." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-128170.

Full text
Abstract:
This thesis aims to investigate the effect context could have on the interaction between a user and a spoken dialogue system. It was assumed that using context-dependent voice commands instead of absolute semantic voice commands would make the dialogue more natural and also increase usability. This thesis also investigates whether introducing context could affect the user's privacy and whether, from a user perspective, it could pose a threat. Based on an extended literature review covering spoken dialogue systems, voice recognition, ambient intelligence, human-computer interaction and privacy, a spoken dialogue system was designed and implemented to test the assumption. The test study included two steps: experiment and interview. The participants carried out different scenarios in which a spoken dialogue system could be used with both context-dependent commands and absolute semantic commands. Based on these studies, qualitative results regarding naturalness, usability and privacy validated the author's hypothesis to some extent. The results indicated that the interaction between users and spoken dialogue systems was more natural, and usability increased, when using context. The participants did not feel more monitored by the spoken dialogue system when using context. Some participants stated that there could be theoretical privacy issues, but only if the security measures were not met. The thesis concludes with suggestions for future work in the scientific area.
APA, Harvard, Vancouver, ISO, and other styles
24

Janarthanam, Srinivasan Chandrasekaran. "Learning user modelling strategies for adaptive referring expression generation in spoken dialogue systems." Thesis, University of Edinburgh, 2011. http://hdl.handle.net/1842/5033.

Full text
Abstract:
We address the problem of dynamic user modelling for referring expression generation in spoken dialogue systems, i.e. how a spoken dialogue system should choose referring expressions to refer to domain entities for users with different levels of domain expertise, whose domain knowledge is initially unknown to the system. We approach this problem using a statistical planning framework: Reinforcement Learning techniques in Markov Decision Processes (MDPs). We present a new reinforcement learning framework to learn user modelling strategies for adaptive referring expression generation (REG) in resource-scarce domains (i.e. where no large corpus exists for learning). As part of the framework, we present novel user simulation models that are sensitive to the referring expressions used by the system and are able to simulate users with different levels of domain knowledge. Such models are shown to simulate real user behaviour more closely than baseline user simulation models. In contrast to previous approaches to user-adaptive systems, we do not assume that the user's domain knowledge is available to the system before the conversation starts. We show that using a small corpus of non-adaptive dialogues it is possible to learn an adaptive user modelling policy in resource-scarce domains using our framework. We also show that the learned user modelling strategies performed better in terms of adaptation than hand-coded baseline policies on both simulated and real users. With real users, the learned policy produced around a 20% increase in adaptation in comparison to the best performing hand-coded adaptive baseline. We also show that adaptation to the user's domain knowledge results in improved task success (99.47% for the learned policy vs 84.7% for the hand-coded baseline) and reduced dialogue time (11% relative difference). This is because users found it easier to identify domain objects when the system used adaptive referring expressions during the conversations.
APA, Harvard, Vancouver, ISO, and other styles
25

Yoshino, Koichiro. "Spoken Dialogue System for Information Navigation based on Statistical Learning of Semantic and Dialogue Structure." 京都大学 (Kyoto University), 2014. http://hdl.handle.net/2433/192214.

Full text
APA, Harvard, Vancouver, ISO, and other styles
26

Inoue, Koji. "Engagement Recognition based on Multimodal Behaviors for Human-Robot Dialogue." Kyoto University, 2018. http://hdl.handle.net/2433/235112.

Full text
APA, Harvard, Vancouver, ISO, and other styles
27

Frampton, Matthew. "Using Dialogue Acts in dialogue strategy learning : optimising repair strategies." Thesis, University of Edinburgh, 2008. http://hdl.handle.net/1842/2381.

Full text
Abstract:
A Spoken Dialogue System's (SDS's) dialogue strategy specifies which action it will take depending on its representation of the current dialogue context. Designing it by hand involves anticipating how users will interact with the system, and/or repeated testing and refining, and so can be a difficult, time-consuming task. Since SDSs inevitably make understanding errors, a particularly important issue is how to design ``repair strategies'', the parts of the dialogue strategy which attempt to get the dialogue ``back-on-track'' following these errors. To try to produce better dialogue strategies with less time and effort, previous researchers have modelled a dialogue strategy as a sequential decision problem called a Markov Decision Process (MDP), and then applied Reinforcement Learning (RL) algorithms to example training dialogues to generate dialogue strategies automatically. More recent research has used training dialogues conducted with simulated rather than real users and learned which action to take in all dialogue contexts, (a ``full'' as opposed to a ``partial'' dialogue strategy) - simulated users allow more training dialogues to be generated, and the exploration of new dialogue contexts not present in an original dataset. As yet however, limited insight has been provided as to which dialogue contextual features are important to include in the MDP and why. Indeed, a full dialogue strategy has not been learned from training dialogues with a realistic probabilistic user simulation derived from real user data, and then shown to work well with real users. This thesis investigates the value of adding new linguistically-motivated contextual features to the MDP when using RL to learn full dialogue strategies for SDSs. These new features are recent Dialogue Acts (DAs). DAs indicate the role or intention of an utterance in a dialogue e.g. ``provide-information'', an utterance being a complete unit of a speaker's speech, often bounded by silence. 
An accurate probabilistic user simulation learned from real user data is used for generating training dialogues, and the recent DAs are shown to improve performance in testing in simulation and with real users. With real users, performance is also better than other competing learned and hand-crafted strategies. Analysis of the strategies, and further simulation experiments show how the DAs improve performance through better repair strategies. The main findings are expected to apply to SDSs in general - indeed our strategies are learned and tested on real users in different domains, (flight-booking versus tourist information). Comparisons are also made to recent research which focuses on handling understanding errors in SDSs, but which does not use RL or user simulations.
APA, Harvard, Vancouver, ISO, and other styles
28

Asri, Layla El. "Learning the Parameters of Reinforcement Learning from Data for Adaptive Spoken Dialogue Systems." Thesis, Université de Lorraine, 2016. http://www.theses.fr/2016LORR0350/document.

Full text
Abstract:
This document proposes to learn the behaviour of the dialogue manager of a spoken dialogue system from a set of rated dialogues. This learning is performed through reinforcement learning. Our method does not require the definition of a representation of the state space nor a reward function. These two high-level parameters are learnt from the corpus of rated dialogues. It is shown that the spoken dialogue designer can optimise dialogue management by simply defining the dialogue logic and a criterion to maximise (e.g. user satisfaction). The methodology suggested in this thesis first considers the dialogue parameters that are necessary to compute a representation of the state space relevant to the criterion to be maximised. For instance, if the chosen criterion is user satisfaction, then it is important to account for parameters such as dialogue duration and the average speech recognition confidence score. The state space is represented as a sparse distributed memory. The Genetic Sparse Distributed Memory for Reinforcement Learning (GSDMRL) accommodates many dialogue parameters and selects, through genetic evolution, the parameters which are the most important for learning. The resulting state space and the policy learnt on it are easily interpretable by the system designer. Secondly, the rated dialogues are used to learn a reward function which teaches the system to optimise the criterion. Two algorithms, reward shaping and distance minimisation, are proposed to learn the reward function. These two algorithms consider the criterion to be the return for the entire dialogue. These functions are discussed and compared on simulated dialogues, and it is shown that the resulting functions enable faster learning than using the criterion directly as the final reward.
A spoken dialogue system for appointment scheduling was designed during this thesis, based on previous systems, and a corpus of rated dialogues with this system was collected. This corpus illustrates the scaling capability of the state space representation and is a good example of an industrial spoken dialogue system to which the methodology could be applied.
APA, Harvard, Vancouver, ISO, and other styles
29

Bell, Linda. "Linguistic Adaptations in Spoken Human-Computer Dialogues - Empirical Studies of User Behavior." Doctoral thesis, Stockholm : KTH, 2003. http://www.speech.kth.se/~bell/linda_bell.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
30

Chandramohan, Senthilkumar. "Revisiting user simulation in dialogue systems : do we still need them ? : will imitation play the role of simulation ?" Phd thesis, Université d'Avignon, 2012. http://tel.archives-ouvertes.fr/tel-00875229.

Full text
Abstract:
Recent advancements in the area of spoken language processing and the wide acceptance of portable devices have attracted significant interest in spoken dialogue systems. These conversational systems are man-machine interfaces which use natural language (speech) as the medium of interaction. In order to conduct dialogues, computers must have the ability to decide when and what information has to be exchanged with the users. The dialogue management module is responsible for making these decisions so that the intended task (such as ticket booking or appointment scheduling) can be achieved. Thus learning a good strategy for dialogue management is a critical task. In recent years, reinforcement learning-based dialogue management optimization has evolved to be the state of the art. A majority of the algorithms used for this purpose need vast amounts of training data. However, data generation in the dialogue domain is an expensive and time-consuming process. In order to cope with this, and also to evaluate the learnt dialogue strategies, user modelling in dialogue systems was introduced. These models simulate real users in order to generate synthetic data. Being computational models, they introduce some degree of modelling error. In spite of this, system designers are forced to employ user models because of the data requirements of conventional reinforcement learning algorithms. Sample-efficient reinforcement learning algorithms, by contrast, can learn optimal dialogue strategies from limited amounts of training data. As a consequence, user models are no longer required for the purpose of optimization, yet they continue to provide a fast and easy means of quantifying the quality of dialogue strategies. Since existing methods for user modelling are relatively less realistic compared to real user behaviour, the focus is shifted towards user modelling by means of inverse reinforcement learning.
Using experimental results, the proposed method's ability to learn a computational model with qualities resembling those of real users is showcased as part of this work.
APA, Harvard, Vancouver, ISO, and other styles
31

Alfenas, Daniel Assis. "Aplicações da tecnologia adaptativa no gerenciamento de diálogo falado em sistemas computacionais." Universidade de São Paulo, 2014. http://www.teses.usp.br/teses/disponiveis/3/3141/tde-13012015-111240/.

Full text
Abstract:
This work presents a study of how adaptive technology can be used to improve existing dialogue management methods. Dialogue management is the central activity of a spoken dialogue system, responsible for choosing the communicative actions sent to the user. In order to identify aspects that could be improved with adaptive technology, a broad literature review of dialogue management is presented. This review also allows us to collect existing criteria, and to create new ones, for evaluating dialogue managers. An adaptive management model based on state machines, named Adaptalker, is then proposed and used to build a framework for developing dialogue managers, which is exercised through the illustrative development of a simple pizza-ordering application. Analysis of this example shows how adaptivity is used to improve the model, making it capable, for example, of handling both dialogue repair and user initiative more efficiently. Adaptalker organises its management rules into submachines that work concurrently to decide the next communicative action.
APA, Harvard, Vancouver, ISO, and other styles
32

Wirström, Li, and Mattias Huledal. "Den svenska callcenterbranschen och de tekniska lösningar som används : Branschanalys samt identifiering av problematiska dialogsystemsyttranden med hjälp av maskininlärning." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-175360.

Full text
Abstract:
This work consists of two parts. The first part aims to describe and analyze the call center industry in Sweden and the factors that affect the industry and its development. The analysis is based on two models: Porter's five forces and PEST. The focus is mainly on the part of the industry that consists of customer service operations, in order to connect with the second part of the work. The analysis shows that the industry is mainly affected by high competition and by the choice that businesses needing to provide customer service face between internal and external customer service operations. The analysis also indicates that the industry will continue to grow and that there is a trend of companies increasingly choosing to outsource their customer service. The development of the technological solutions used in call centers, such as dialogue systems, is requested by companies, as these are important tools for creating a well-functioning customer service. Today's digital systems have obvious areas for improvement. It is often large international companies or international teams that develop these digital systems. However, the area of use for these systems extends far beyond the call center industry. The second part involves identifying problematic dialogue system utterances using machine learning and is inspired by SpeDial, an EU project aimed at improving dialogue systems. Dialogue system utterances can be considered problematic when, for example, the system misinterprets the user's intention. The aim of the second part is to investigate which machine learning method or methods in the WEKA tool are best suited to identifying problematic dialogue system utterances. The data used in this work comes from a customer service entry point based on free speech, which means that users are asked to describe their case in their own words so that they can be transferred to the right department within the customer service.
Our data was provided by the company Voice Provider, which develops, implements, and maintains customer service systems. We came in contact with Voice Provider through the Department of Speech, Music and Hearing (TMH) at the Royal Institute of Technology, which is involved in the SpeDial project. The work initially consisted of preparing the supplied data so that it could be used by the machine learning tool WEKA's built-in classifiers, after which six classifiers were selected for further evaluation. The results show that none of the classifiers managed to accomplish the task in a fully satisfactory manner; the most successful method, however, was Random Forest. It is difficult to draw any further conclusions from the results.
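As a rough illustration of the classification task in this abstract, the following is a self-contained sketch (not the thesis's WEKA setup; the features, data, and stump ensemble are all invented for illustration) of a tiny random-forest-style classifier that flags problematic dialogue-system utterances:

```python
# Minimal sketch of the idea behind the Random Forest result above: an
# ensemble of randomized decision stumps votes on whether a dialogue-system
# utterance is "problematic". Features and labels are invented.
import random

# Each row: (ASR confidence, number of reprompts), label 1 = problematic
DATA = [
    ((0.92, 0), 0), ((0.31, 2), 1), ((0.85, 1), 0),
    ((0.22, 3), 1), ((0.78, 0), 0), ((0.40, 2), 1),
]

def train_stump(sample):
    """Pick the single-feature threshold with the fewest errors on `sample`."""
    best = None
    for feat in (0, 1):
        for thresh in {row[0][feat] for row in sample}:
            for sign in (1, -1):
                errs = sum(
                    1 for x, y in sample
                    if (1 if sign * (x[feat] - thresh) > 0 else 0) != y
                )
                if best is None or errs < best[0]:
                    best = (errs, feat, thresh, sign)
    return best[1:]

def train_forest(data, n_trees=25, seed=0):
    # Bootstrap-resample the data for each stump, as a random forest would.
    rng = random.Random(seed)
    return [train_stump([rng.choice(data) for _ in data]) for _ in range(n_trees)]

def predict(forest, x):
    votes = sum(1 if s * (x[f] - t) > 0 else 0 for f, t, s in forest)
    return 1 if votes > len(forest) / 2 else 0

forest = train_forest(DATA)
print(predict(forest, (0.25, 2)))  # a low-confidence, re-prompted exchange
```

A real experiment would, as in the thesis, extract far richer features from logged calls and compare several classifiers rather than a single hand-rolled ensemble.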
APA, Harvard, Vancouver, ISO, and other styles
33

Griol, Barres David. "Desarrollo y evaluación de diferentes metodologías para la gestión automática del diálogo." Doctoral thesis, Universitat Politècnica de València, 2008. http://hdl.handle.net/10251/1956.

Full text
Abstract:
The main objective of this thesis is the study and development of different methodologies for dialogue management in spoken dialogue systems. The main challenge addressed is the development of purely statistical methodologies for dialogue management, based on learning a model from a corpus of labeled dialogues. In this field, different approaches are presented for carrying out the management, improving the statistical model, and evaluating the dialogue system. For the practical implementation of these methodologies within a specific task, it was necessary to acquire and label a corpus of dialogues. The availability of a large dialogue corpus facilitated the training and evaluation of the developed management model. Likewise, a complete dialogue system was implemented, which makes it possible to evaluate the practical behavior of the management methodologies under real conditions of use. To evaluate the dialogue management techniques, different approaches are proposed: evaluation with real users; evaluation on the acquired corpus, for which training and test partitions were defined; and the use of user simulation techniques. The developed user simulator statistically models the complete dialogue process. In the presented approach, both obtaining the system response and generating the user turn are modeled as a classification problem, whose input encodes a set of variables representing the current dialogue state and whose output yields the probabilities of selecting each of the responses (sequences of dialogue acts) defined for the user and the system, respectively.
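The abstract models dialogue management as classification: a set of variables describing the current dialogue state is the input, and the output is a probability for each candidate system response. A minimal sketch of that idea follows (the toy corpus, state encoding, and dialogue acts are invented, not taken from the thesis):

```python
# Minimal sketch of statistical dialogue management as classification: the
# system response is chosen from P(dialogue act | state), here estimated as
# a simple conditional frequency model over a toy labeled corpus.
from collections import Counter, defaultdict

# State: (origin_known, destination_known, date_known)
CORPUS = [
    ((0, 0, 0), "ask_origin"),
    ((0, 0, 0), "ask_origin"),
    ((0, 0, 0), "ask_destination"),
    ((1, 0, 0), "ask_destination"),
    ((1, 0, 0), "ask_destination"),
    ((1, 1, 0), "ask_date"),
    ((1, 1, 1), "give_timetable"),
    ((1, 1, 1), "give_timetable"),
]

counts = defaultdict(Counter)
for state, act in CORPUS:
    counts[state][act] += 1

def response_distribution(state):
    """P(system dialogue act | state), estimated from the corpus."""
    c = counts[state]
    total = sum(c.values())
    return {act: n / total for act, n in c.items()}

print(response_distribution((0, 0, 0)))
```

The thesis's models condition on richer state encodings and are learned with proper classifiers, but the interface is the same: state in, a distribution over dialogue-act sequences out.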
Griol Barres, D. (2007). Desarrollo y evaluación de diferentes metodologías para la gestión automática del diálogo [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/1956
APA, Harvard, Vancouver, ISO, and other styles
34

Su, Pei-Hao. "Reinforcement learning and reward estimation for dialogue policy optimisation." Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/275649.

Full text
Abstract:
Modelling dialogue management as a reinforcement learning task enables a system to learn to act optimally by maximising a reward function. This reward function is designed to induce the system behaviour required for goal-oriented applications, which usually means fulfilling the user’s goal as efficiently as possible. However, in real-world spoken dialogue systems, the reward is hard to measure, because the goal of the conversation is often known only to the user. Certainly, the system can ask the user if the goal has been satisfied, but this can be intrusive. Furthermore, in practice, the reliability of the user’s response has been found to be highly variable. In addition, due to the sparsity of the reward signal and the large search space, reinforcement learning-based dialogue policy optimisation is often slow. This thesis presents several approaches to address these problems. To better evaluate a dialogue for policy optimisation, two methods are proposed. First, a recurrent neural network-based predictor pre-trained from off-line data is proposed to estimate task success during subsequent on-line dialogue policy learning to avoid noisy user ratings and problems related to not knowing the user’s goal. Second, an on-line learning framework is described where a dialogue policy is jointly trained alongside a reward function modelled as a Gaussian process with active learning. This mitigates the noisiness of user ratings and minimises user intrusion. It is shown that both off-line and on-line methods achieve practical policy learning in real-world applications, while the latter provides a more general joint learning system directly from users. To enhance the policy learning speed, the use of reward shaping is explored and shown to be effective and complementary to the core policy learning algorithm. Furthermore, as deep reinforcement learning methods have the potential to scale to very large tasks, this thesis also investigates the application to dialogue systems. 
Two sample-efficient algorithms, trust region actor-critic with experience replay (TRACER) and episodic natural actor-critic with experience replay (eNACER), are introduced. In addition, a corpus of demonstration data is utilised to pre-train the models prior to on-line reinforcement learning to handle the cold start problem. Combining these two methods, a practical approach is demonstrated to effectively learn deep reinforcement learning-based dialogue policies in a task-oriented information seeking domain. Overall, this thesis provides solutions which allow truly on-line and continuous policy learning in spoken dialogue systems.
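One of the speed-up techniques mentioned above, reward shaping, can be sketched in a toy setting. The following is a hypothetical tabular Q-learning example with a potential-based shaping term (the slot-filling task, rewards, and hyperparameters are invented; the thesis works with far richer policies and learning algorithms):

```python
# Sketch of potential-based reward shaping on a toy slot-filling dialogue:
# the agent gets the real reward only at the end, plus a shaping term
# F = gamma * phi(s') - phi(s) that rewards filling slots earlier.
import random
from collections import defaultdict

GAMMA, ALPHA = 0.99, 0.1
N_SLOTS = 3
ACTIONS = list(range(N_SLOTS))  # action i = ask for slot i

def phi(state):
    return sum(state)  # potential: number of filled slots

def episode(q, rng):
    state = (0,) * N_SLOTS
    while not all(state):
        if rng.random() > 0.2:  # epsilon-greedy action selection
            a = max(ACTIONS, key=lambda b: q[(state, b)])
        else:
            a = rng.choice(ACTIONS)
        nxt = tuple(1 if i == a else s for i, s in enumerate(state))
        done = all(nxt)
        r = 20.0 if done else -1.0               # sparse task reward
        r += GAMMA * phi(nxt) - phi(state)       # potential-based shaping
        target = r if done else r + GAMMA * max(q[(nxt, b)] for b in ACTIONS)
        q[(state, a)] += ALPHA * (target - q[(state, a)])
        state = nxt

q = defaultdict(float)
rng = random.Random(1)
for _ in range(200):
    episode(q, rng)

# After training, from the empty state the policy asks for an unfilled slot.
best = max(ACTIONS, key=lambda b: q[((0, 0, 0), b)])
print(best)
```

Because the shaping term is potential-based, it changes how quickly the policy is found without changing which policy is optimal, which is the property the thesis exploits.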
APA, Harvard, Vancouver, ISO, and other styles
35

Ferreira, Arikleyton de Oliveira. "MHNSS: um Middleware para o Desenvolvimento de Aplicações Móveis com Interações Baseada na Fala." Universidade Federal do Maranhão, 2014. http://tedebc.ufma.br:8080/jspui/handle/tede/522.

Full text
Abstract:
Conselho Nacional de Desenvolvimento Científico e Tecnológico
Applications for mobile computing environments usually have several accessibility limitations because they depend on interacting with the user through the device display, which hinders their use by people who have difficulty reading or writing (typing) and/or have little fluency in the use of technology. In this master's thesis we propose a middleware that supports the development of mobile applications with accessibility features based on spoken dialogue systems. These systems are able to hold a conversation with the user, providing a natural interaction interface that does not require prior learning. Mobile applications can thus use the middleware to offer users accessibility that overcomes the need for physical or visual contact. The proposed middleware was developed in the context of the MobileHealthNet project, where it will help mobile applications in the health domain reach users with different profiles, with particular attention to underserved and remote communities. To evaluate the middleware, we used a case study based on a mobile application for monitoring the health condition of patients with atrial fibrillation. The evaluation involved 10 individuals, and the results obtained were very positive.
APA, Harvard, Vancouver, ISO, and other styles
36

Ke, Shin-Cheng, and 柯欣成. "Spoken Dialog Based Automobile Information System." Thesis, 2001. http://ndltd.ncl.edu.tw/handle/89913284805839373174.

Full text
Abstract:
Master's thesis
National Cheng Kung University
Department of Electrical Engineering
89
A Mobile Information System (MIS) can provide drivers with many kinds of information, such as the locations of gasoline stations, traffic conditions, and the shortest path to a destination. Most current MIS use a screen and touch pad as the communication interface; however, this mode is neither convenient nor safe while driving. A better interface uses spontaneous speech to communicate with the system. In this thesis, common errors of a Spoken Dialog System (SDS) are first analyzed, and several practicable strategies are proposed to avoid and recover from these errors. These strategies focus on three components of the SDS: the user, the recognizer, and the dialog manager. To overcome the defect of the traditional guided dialog strategy, which spends many turns to complete a dialog, this thesis proposes the Semi-Guided Dialog System (SGDS) to reduce the vocabulary size and shorten the dialog. The method classifies all landmarks by analyzing them literally: words with the same components are placed in the same category. During recognition, a two-pass technique is adopted: the first pass recognizes the category, and the second pass recognizes landmarks within the recognized category. Using this method, the number of dialog turns is reduced to one or two, greatly improving the efficiency of the interaction. In the experiments, the keyword recognition rate rises by 5% when the SGDS is applied to large landmark retrieval. In addition, about 12% of recognition-rate improvement is gained by applying the proposed SDS error-recovery strategies. Finally, the dialog completion rate was tested and found to be greater than 89%.
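The two-pass recognition idea in this abstract (first recognize a landmark category, then recognize only within that category's reduced vocabulary) can be sketched as follows. The landmark list is invented, and string similarity stands in for an actual speech recognizer:

```python
# Sketch of two-pass recognition over a categorized landmark vocabulary:
# pass 1 picks a category, pass 2 searches only that category's entries,
# so each pass works with a much smaller vocabulary.
import difflib

LANDMARKS = {
    "train station": ["Tainan Train Station", "Kaohsiung Train Station"],
    "university": ["National Cheng Kung University", "National Taiwan University"],
    "hospital": ["NCKU Hospital", "Veterans General Hospital"],
}

def best_match(hypothesis, vocabulary):
    # Stand-in for an ASR pass restricted to `vocabulary`.
    return max(vocabulary, key=lambda w: difflib.SequenceMatcher(
        None, hypothesis.lower(), w.lower()).ratio())

def two_pass_recognize(utterance):
    category = best_match(utterance, list(LANDMARKS))            # pass 1
    return category, best_match(utterance, LANDMARKS[category])  # pass 2

print(two_pass_recognize("cheng kung university"))
```

Restricting the second pass to one category is what lets the real system keep recognition accurate while the total landmark list stays large.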
APA, Harvard, Vancouver, ISO, and other styles
37

Jing-MinChen and 陳敬閔. "Spoken Dialog Summarization System with HAPPINESS/SUFFERING Factor Recognition." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/k86see.

Full text
APA, Harvard, Vancouver, ISO, and other styles
38

Mengistu, Kinfe Tadesse [Verfasser]. "Robust acoustic and semantic modeling in a telephone-based spoken dialog system / Kinfe Tadesse Mengistu." 2009. http://d-nb.info/995555265/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

Wang, Nick Jui-Chang. "Relevant Technologies For Improved Chinese Spoken Dialog Systems." 2007. http://www.cetd.com.tw/ec/thesisdetail.aspx?etdun=U0001-2607200718203600.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Wang, Nick Jui-Chang, and 王瑞璋. "Relevant Technologies For Improved Chinese Spoken Dialog Systems." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/06575877680801601286.

Full text
Abstract:
Doctoral thesis
National Taiwan University
Graduate Institute of Communication Engineering
95
The application of automatic speech recognition technology in spoken dialogue systems involves important technologies from several areas, including digital signal processing, robust speech feature extraction, acoustic-phonetic modeling, speaker adaptation, language modeling, language understanding, dialogue management, and language generation with speech synthesis. All of these technologies contribute to the performance of the spoken dialogue system, which accomplishes communication between humans and machines. This dissertation covers three relevant technologies for improved Chinese spoken dialogue systems: speaker adaptation and speaker identification, speech understanding, and an interactive open-vocabulary Chinese name input system. The speaker adaptation technology of the first topic adapts acoustic models to a speaker's voice characteristics to improve recognition accuracy. The proposed Eigen-MLLR approach constructs a subspace of the MLLR parameter space by Principal Component Analysis (PCA), and is therefore more robust than MLLR when only a small amount of enrollment data is available. Compared with the Eigenvoice approach, it requires less memory for model-adaptation estimates, making it more practical for speaker-independent spoken dialogue systems. The author compared Eigen-MLLR with MLLR and Eigenvoice, developed a fast Eigen-MLLR coefficient estimation algorithm, and applied Eigen-MLLR coefficients to speaker identification.
The second topic concerns speech understanding. Most speech understanding systems with medium to large vocabularies adopt a two-stage approach: a speech recognition component as the first stage, followed by a natural language understanding component as the second. Speech understanding performance is usually constrained by speech recognition errors and out-of-grammar problems, so robust speech understanding is necessary. The proposed approach integrates a concept layer, the Key Semantic Chunk, into the two-stage system. The Key Semantic Chunk is a language unit between sentence and word; it is integrated into both the speech recognition and language understanding components and serves as the interface between them. Not only does the language model of the speech recognizer become more robust to data sparseness, but language understanding on the speech recognition output also works more robustly. The improved system achieved about a 30% reduction in understanding errors, and the effort of building and maintaining language understanding grammars and speech recognition n-gram models is reduced.
The third topic is an interactive open-vocabulary Chinese name input system with an error correction mechanism. The motivation came from experience with the 104 directory-assistance service of Chunghwa Telecom, the largest commercial telephony service in Taiwan, frequently used by a very large group of consumers, yet offering a clear and simple service: the telephone number of a person, a company, or a branch of a company. The difficulty of open-vocabulary Chinese name input is the huge vocabulary: from less than two seconds of speech, the system must recognize the target name among billions of candidates, so high accuracy cannot be achieved by speech recognition alone. The experimental system therefore uses an intelligent and friendly dialogue strategy incorporating an error correction mechanism: as human operators do in actual 104-service interactions, it may ask the caller to describe ambiguous characters again. Both character confirmation and character input mechanisms were designed into the experimental system, which achieved a high success rate of 86.7%. Although the dissertation focuses on Chinese spoken dialogue systems, the first two technologies can also be applied to other languages. Through these research topics, the author aims at a deeper understanding of spoken dialogue systems and improved overall system performance, in the hope that speech recognition and dialogue system technologies will be widely and successfully applied.
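The Eigen-MLLR idea described above (a PCA subspace over speakers' MLLR transform parameters, so that a new speaker is characterized by only a few coefficients) can be sketched with synthetic data. All numbers below are invented; real transform vectors would come from MLLR estimation on enrollment speech:

```python
# Sketch of the Eigen-MLLR idea: collect per-speaker adaptation parameter
# vectors (flattened MLLR transforms), build a low-dimensional basis with
# PCA (via SVD), and describe a new speaker by a few coefficients in it.
import numpy as np

rng = np.random.default_rng(0)

# 20 training speakers, each with a flattened 20-dim transform vector,
# synthesized as noisy mixtures of two underlying "voice factors".
factors = rng.normal(size=(2, 20))
weights = rng.normal(size=(20, 2))
W = weights @ factors + 0.05 * rng.normal(size=(20, 20))

# PCA on the speaker transforms: mean plus leading right-singular vectors.
mean = W.mean(axis=0)
U, S, Vt = np.linalg.svd(W - mean, full_matrices=False)
K = 2                     # keep a 2-dimensional Eigen-MLLR subspace
basis = Vt[:K]

# A new speaker's transform is estimated from only K coefficients, which is
# why far less adaptation data is needed than for full MLLR estimation.
new_speaker = weights[0] @ factors + 0.05 * rng.normal(size=20)
coeffs = basis @ (new_speaker - mean)
reconstruction = mean + coeffs @ basis
print(np.linalg.norm(new_speaker - reconstruction))  # residual of the 2-D fit
```

In the real method the coefficients are estimated to maximize likelihood of the adaptation speech rather than by projection, but the dimensionality argument is the same.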
APA, Harvard, Vancouver, ISO, and other styles
41

"An evaluation paradigm for spoken dialog systems based on crowdsourcing and collaborative filtering." 2011. http://library.cuhk.edu.hk/record=b5894830.

Full text
Abstract:
Yang, Zhaojun.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2011.
Includes bibliographical references (p. 92-99).
Abstracts in English and Chinese.
Chapter 1: Introduction
1.1 SDS Architecture
1.2 Dialog Model
1.3 SDS Evaluation
1.4 Thesis Outline
Chapter 2: Previous Work
2.1 Approaches to Dialog Modeling
2.1.1 Handcrafted Dialog Modeling
2.1.2 Statistical Dialog Modeling
2.2 Evaluation Metrics
2.2.1 Subjective User Judgments
2.2.2 Interaction Metrics
2.3 The PARADISE Framework
2.4 Chapter Summary
Chapter 3: Implementation of a Dialog System based on POMDP
3.1 Partially Observable Markov Decision Processes (POMDPs)
3.1.1 Formal Definition
3.1.2 Value Iteration
3.1.3 Point-based Value Iteration
3.1.4 A Toy Example of POMDP: The NaiveBusInfo System
3.2 The SDS-POMDP Model
3.3 Composite Summary Point-based Value Iteration (CSPBVI)
3.4 Application of the SDS-POMDP Model: The BusInfo System
3.4.1 System Description
3.4.2 Demonstration Description
3.5 Chapter Summary
Chapter 4: Collecting User Judgments on Spoken Dialogs with Crowdsourcing
4.1 Dialog Corpus and Automatic Dialog Classification
4.2 User Judgments Collection with Crowdsourcing
4.2.1 HITs on Dialog Evaluation
4.2.2 HITs on Inter-rater Agreement
4.2.3 Approval of Ratings
4.3 Collected Results and Analysis
4.3.1 Approval Rates and Comments from MTurk Workers
4.3.2 Consistency between Automatic Dialog Classification and Manual Ratings
4.3.3 Inter-rater Agreement Among Workers
4.4 Comparing Experts to Non-experts
4.4.1 Inter-rater Agreement on the Let's Go! System
4.4.2 Consistency between Expert and Non-expert Annotations on SDC Systems
4.5 Chapter Summary
Chapter 5: Collaborative Filtering for Performance Prediction
5.1 Item-Based Collaborative Filtering
5.2 CF Model for User Satisfaction Prediction
5.2.1 ICFM for User Satisfaction Prediction
5.2.2 Extended ICFM for User Satisfaction Prediction
5.3 Extraction of Interaction Features
5.4 Experimental Results and Analysis
5.4.1 Prediction of User Satisfaction
5.4.2 Analysis of Prediction Results
5.5 Verifying the Generalizability of the CF Model
5.6 Evaluation of the BusInfo System
5.7 Chapter Summary
Chapter 6: Conclusions and Future Work
6.1 Thesis Summary
6.2 Future Work
Bibliography
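Chapter 5 of this outline predicts user satisfaction with item-based collaborative filtering over interaction features. A minimal sketch of that prediction step follows (the features, ratings, and weighting scheme are invented for illustration, not taken from the thesis):

```python
# Sketch of item-based collaborative filtering for user-satisfaction
# prediction: each dialog is an "item" described by interaction features;
# an unrated dialog's score is a similarity-weighted average of rated ones.
import math

# (interaction features: [ASR accuracy, #turns / 10, task success], rating 1-5)
RATED = [
    ([0.9, 0.5, 1.0], 5.0),
    ([0.8, 0.7, 1.0], 4.0),
    ([0.4, 1.2, 0.0], 2.0),
    ([0.3, 1.5, 0.0], 1.0),
]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def predict(features):
    sims = [(cosine(features, f), r) for f, r in RATED]
    return sum(s * r for s, r in sims) / sum(s for s, _ in sims)

# A dialog with high ASR accuracy, few turns, and task success should land
# near the well-rated dialogs.
print(round(predict([0.85, 0.6, 1.0]), 2))
```

The appeal of this framing, as in the thesis, is that satisfaction for a new dialog can be predicted from logged interaction features alone, without asking the user to rate it.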
APA, Harvard, Vancouver, ISO, and other styles
42

Dušek, Ondřej. "Nové metody generování promluv v dialogových systémech." Doctoral thesis, 2017. http://www.nusl.cz/ntk/nusl-364953.

Full text
Abstract:
Title: Novel Methods for Natural Language Generation in Spoken Dialogue Systems. Author: Ondřej Dušek. Department: Institute of Formal and Applied Linguistics. Supervisor: Ing. Mgr. Filip Jurčíček, Ph.D., Institute of Formal and Applied Linguistics. Abstract: This thesis explores novel approaches to natural language generation (NLG) in spoken dialogue systems (i.e., generating the system responses to be presented to the user), aiming to simplify the adaptivity of NLG in three respects: domain portability, language portability, and user-adaptive outputs. Our generators improve over the state of the art in all three: First, our generators, which are based on statistical methods (A* search with perceptron ranking and sequence-to-sequence recurrent neural network architectures), can be trained on data without fine-grained semantic alignments, thus simplifying the process of retraining the generator for a new domain compared to previous approaches. Second, we enhance the neural-network-based generator so that it takes the preceding dialogue context into account (i.e., the user's way of speaking), thus producing user-adaptive outputs. Third, we evaluate several extensions to the neural-network-based generator designed for producing output in morphologically rich languages, showing improvements in Czech generation. In...
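The thesis's generators are trained sequence-to-sequence models; as a small illustration of one preprocessing step common in this line of NLG work (and not code from the thesis), the following shows naive delexicalization, where slot values are replaced by placeholders so that one learned template generalizes across entities:

```python
# Naive delexicalization/relexicalization sketch for NLG training data:
# slot values are swapped for placeholders before training and restored
# after generation. Real pipelines handle overlapping values and
# morphological agreement, which this toy version ignores.
def delexicalize(text, slots):
    for name, value in slots.items():
        text = text.replace(value, f"<{name}>")
    return text

def relexicalize(template, slots):
    for name, value in slots.items():
        template = template.replace(f"<{name}>", value)
    return template

slots = {"name": "Golden Dragon", "food": "Chinese"}
template = delexicalize("Golden Dragon serves Chinese food.", slots)
print(template)                       # <name> serves <food> food.
print(relexicalize(template, {"name": "Pizza Hut", "food": "Italian"}))
```

Morphologically rich languages such as Czech are exactly where this naive string substitution breaks down, which motivates the generator extensions the abstract mentions.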
APA, Harvard, Vancouver, ISO, and other styles
43

Vejman, Martin. "Development of an English public transport information dialogue system." Master's thesis, 2015. http://www.nusl.cz/ntk/nusl-331758.

Full text
Abstract:
This thesis presents the development of an English spoken dialogue system based on the Alex dialogue system framework. The work describes the adaptation of the framework's components to a different domain and language. The system provides public transport information for New York. The work involves creating a statistical language model and deploying a custom Kaldi speech recognizer, whose performance was better than that of the Google Speech API in a comparison based on subjective user satisfaction acquired through crowdsourcing.
APA, Harvard, Vancouver, ISO, and other styles
44

Rato, João Pedro Cordeiro. "Conversação homem-máquina. Caracterização e avaliação do estado actual das soluções de speech recognition, speech synthesis e sistemas de conversação homem-máquina." Master's thesis, 2016. http://hdl.handle.net/10400.8/2375.

Full text
Abstract:
Human verbal communication is bidirectional: both parties understand each other, which leads to certain conclusions. This kind of communication, also called dialogue, may involve not only human agents but also humans and machines. Interaction between humans and machines through natural language plays an important role in improving communication between the two. In order to better understand human-machine communication, this document presents background on human-machine conversation systems, including their modules and operation, dialogue strategies, and challenges to take into account in their implementation. In addition, several speech recognition and speech synthesis systems are presented, as well as systems that use human-machine conversation. Finally, performance tests are carried out on some speech recognition systems and, to put some of the concepts presented in this work into practice, the implementation of a human-machine conversation system is presented. Several conclusions were drawn from this work, including the high complexity of human-machine conversation systems, the low performance of speech recognition in noisy environments, and the barriers that can be encountered when implementing these systems.
APA, Harvard, Vancouver, ISO, and other styles
