Dissertations / Theses on the topic 'Spoken Dialog System'
Consult the top 44 dissertations / theses for your research on the topic 'Spoken Dialog System.'
Takeda, Kazuya, Norihide Kitaoka, and Sunao Hara. "Automatic detection of task-incompleted dialog for spoken dialog system based on dialog act N-gram." ISCA (International Speech Communication Association), 2010. http://hdl.handle.net/2237/15498.
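The work above flags task-incomplete dialogs by comparing dialog-act N-gram statistics of completed and incompleted conversations. A toy sketch of the idea — the corpora, act labels, and add-one smoothing below are invented for illustration, not the authors' actual models:

```python
import math
from collections import Counter

def ngrams(acts, n=2):
    """All n-grams of a dialog-act sequence, padded with boundary tags."""
    seq = ["<s>"] + acts + ["</s>"]
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]

def train(dialogs, n=2):
    """Count dialog-act n-grams over a corpus of act sequences."""
    counts = Counter()
    for d in dialogs:
        counts.update(ngrams(d, n))
    return counts

def loglik(acts, counts, n=2, alpha=1.0):
    """Add-one-smoothed log-likelihood of a dialog under a count model."""
    total = sum(counts.values())
    vocab = len(counts) + 1
    return sum(math.log((counts[g] + alpha) / (total + alpha * vocab))
               for g in ngrams(acts, n))

# Toy corpora: completed dialogs end with a confirmation act.
completed = [["greet", "request", "inform", "confirm"],
             ["greet", "request", "confirm"]]
incompleted = [["greet", "request", "request", "hangup"],
               ["greet", "hangup"]]
c_ok, c_bad = train(completed), train(incompleted)

def is_incomplete(acts):
    """Classify by comparing likelihoods under the two N-gram models."""
    return loglik(acts, c_bad) > loglik(acts, c_ok)
```

A dialog is flagged when the "incompleted" model explains its act sequence better than the "completed" model.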
Intilisano, Antonio Rosario. "Spoken dialog systems: from automatic speech recognition to spoken language understanding." Doctoral thesis, Università di Catania, 2016. http://hdl.handle.net/10761/3920.
Acosta, Jaime Cesar. "Using emotion to gain rapport in a spoken dialog system." To access this resource online via ProQuest Dissertations and Theses @ UTEP, 2009. http://0-proquest.umi.com.lib.utep.edu/login?COPT=REJTPTU0YmImSU5UPTAmVkVSPTI=&clientId=2515.
Carter, Teresa G. "Five-Factor Model as a Predictor for Spoken Dialog Systems." NSUWorks, 2016. http://nsuworks.nova.edu/gscis_etd/990.
Takeda, Kazuya, Norihide Kitaoka, and Sunao Hara. "Detection of task-incomplete dialogs based on utterance-and-behavior tag N-gram for spoken dialog systems." ISCA (International Speech Communication Association), 2011. http://hdl.handle.net/2237/15499.
Milhorat, Pierrick. "Une plate-forme ouverte pour la conception et l'implémentation de systèmes de dialogue vocaux en langage naturel." Thesis, Paris, ENST, 2014. http://www.theses.fr/2014ENST0087/document.
Recently, global tech companies have released so-called virtual intelligent personal assistants. This thesis takes a bi-directional approach to the domain of spoken dialog systems. On the one hand, parts of the work emphasize increasing the reliability and intuitiveness of such interfaces. On the other hand, it also focuses on the design and development side, providing a platform made of independent specialized modules and tools to support the implementation and testing of prototypical spoken dialog system technologies. The topics covered by this thesis centre on an open-source framework for supporting the design and implementation of natural-language spoken dialog systems. Continuous listening, where users are not required to signal their intent before speaking, has been and remains an active research area; two methods are proposed here, analyzed and compared. In line with the two directions taken in this work, the natural language understanding subsystem of the platform was designed to be intuitive to use, allowing natural language interaction. Finally, on the dialog management side, this thesis argues in favor of deterministic modeling of dialogs. However, such an approach requires intense human labor, is prone to error, and does not ease the maintenance, update, or modification of the models. A new paradigm, the linked-form filling language, facilitates the design and maintenance tasks by shifting the modeling to an application specification formalism.
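The linked-form filling paradigm above builds on deterministic, form-based dialog models. A minimal sketch of such a deterministic form-filling manager — the slot names, prompts, and booking scenario are invented for illustration, not the thesis's formalism:

```python
# A form whose slots are filled in order; unfilled slots drive the prompts.
FORM = {"origin": "Where are you leaving from?",
        "destination": "Where are you going?",
        "date": "What day do you want to travel?"}

class FormFillingManager:
    """Deterministic dialog manager: prompt for the first missing slot,
    confirm once every slot is filled."""

    def __init__(self, form):
        self.form = form
        self.values = {}

    def next_prompt(self):
        for slot, prompt in self.form.items():
            if slot not in self.values:
                return prompt
        return ("Booking from {origin} to {destination} on {date}. "
                "Correct?".format(**self.values))

    def observe(self, slot, value):
        """Record a slot/value pair extracted by the NLU component."""
        if slot in self.form:
            self.values[slot] = value

dm = FormFillingManager(FORM)
```

The appeal of such models is exactly what the abstract notes: the behaviour is fully predictable, at the cost of hand-written forms that are laborious to maintain.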
Figueiredo, Sara Cristina Albuquerque. "Development of a dialog system for interaction with robots." Master's thesis, Universidade de Aveiro, 2014. http://hdl.handle.net/10773/14030.
Service robots operate in the same environment as humans and perform actions that a human usually performs. These robots must be able to operate autonomously in unknown and dynamic environments, to maneuver among several people, and to know how to deal with them. By complying with these requirements, they can successfully address humans and fulfill their requests whenever assistance is needed in a certain task. Natural language communication, including speech, the most natural form of communication between humans, becomes relevant in the field of Human-Robot Interaction (HRI). Endowing service robots with intuitive spoken interfaces facilitates the specification of the tasks required by humans. However, this is a complicated goal to achieve because of the resources involved in creating a sufficiently intuitive spoken interface and the difficulty of deploying it on different robots. The main objective of this thesis is the definition, implementation, and evaluation of a dialogue system that can be easily integrated into any robotic platform and that functions as a flexible base for the creation of any conversational scenario in the Portuguese language. The system must meet the basic requirements for intuitive and natural communication, namely the characteristics of human-human conversations. A system was developed that serves as a base for future work on spoken dialog systems. It uses a client-server architecture, where the client runs on the robot and captures what the user says. The client relies on external dialogue management services, executed by the server, which processes the captured audio and returns a response appropriate to the context of the dialogue. The development was based on a critical analysis of the state of the art so that the system would remain as faithful as possible to existing work.
In the evaluation phase, feedback from a small group of volunteers supported the conclusion that the main objective was accomplished: the base system created is flexible enough to explore different conversational contexts, such as interacting with children or providing information in a university environment.
Vaienti, Andrea. "Assistenti Vocali: Una Panoramica." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2019.
Meurs, Marie-Jean. "Approche stochastique bayésienne de la composition sémantique pour les modules de compréhension automatique de la parole dans les systèmes de dialogue homme-machine." PhD thesis, Université d'Avignon, 2009. http://tel.archives-ouvertes.fr/tel-00634269.
Ma, Yi. "Learning for Spoken Dialog Systems with Discriminative Graphical Models." The Ohio State University, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=osu1440166760.
Kawaguchi, Nobuo, Makoto Nagamori, Shigeki Matsubara, and Yasuyoshi Inagaki. "複数の音声対話システムの統合制御機構とその評価" [An integrated control mechanism for multiple spoken dialogue systems and its evaluation]. 情報処理学会 (Information Processing Society of Japan), 2001. http://hdl.handle.net/2237/6886.
Kanda, Naoyuki. "Open-ended Spoken Language Technology: Studies on Spoken Dialogue Systems and Spoken Document Retrieval Systems." 京都大学 (Kyoto University), 2014. http://hdl.handle.net/2433/188874.
Vlasenko, Bogdan [Verfasser], Andreas [Akademischer Betreuer] Wendemuth, and Dietmar [Akademischer Betreuer] Rösner. "Emotion recognition within spoken dialog systems / Bogdan Vlasenko. Betreuer: Andreas Wendemuth ; Dietmar Rösner." Magdeburg : Universitätsbibliothek, 2011. http://d-nb.info/1047595907/34.
Yasavur, Ugan. "Statistical Dialog Management for Health Interventions." FIU Digital Commons, 2014. http://digitalcommons.fiu.edu/etd/1550.
Henderson, Matthew S. "Discriminative methods for statistical spoken dialogue systems." Thesis, University of Cambridge, 2015. https://www.repository.cam.ac.uk/handle/1810/249015.
Li, William (William Pui Lum). "Understanding user state and preferences for robust spoken dialog systems and location-aware assistive technology." Thesis, Massachusetts Institute of Technology, 2012. http://hdl.handle.net/1721.1/72938.
Full textCataloged from PDF version of thesis.
Includes bibliographical references (p. 119-125).
This research focuses on improving the performance of spoken dialog systems (SDS) in the domain of assistive technology for people with disabilities. Automatic speech recognition (ASR) has compelling potential applications as a means of enabling people with physical disabilities to enjoy greater levels of independence and participation. This thesis describes the development and evaluation of a spoken dialog system modeled as a partially observable Markov decision process (SDS-POMDP). The SDSPOMDP can understand commands related to making phone calls and providing information about weather, activities, and menus in a specialized-care residence setting. Labeled utterance data was used to train observation and utterance confidence models. With a user simulator, the SDS-POMDP reward function parameters were optimized, and the SDS-POMDP is shown to out-perform simpler threshold-based dialog strategies. These simulations were validated in experiments with human participants, with the SDS-POMDP resulting in more successful dialogs and faster dialog completion times, particularly for speakers with high word-error rates. This thesis also explores the social and ethical implications of deploying location based assistive technology in specialized-care settings. These technologies could have substantial potential benefit to residents and caregivers in such environments, but they may also raise issues related to user safety, independence, autonomy, or privacy. As one example, location-aware mobile devices are potentially useful to increase the safety of individuals in a specialized-care setting who may be at risk of unknowingly wandering, but they raise important questions about privacy and informed consent. This thesis provides a survey of U.S. legislation related to the participation of individuals who have questionable capacity to provide informed consent in research studies. 
Overall, it seeks to precisely describe and define the key issues that arise as a result of new, unforeseen technologies that may have both benefits and costs for the elderly and people with disabilities.
by William Li. S.M. in Technology and Policy.
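The SDS-POMDP described above maintains a probability distribution (belief) over hidden user goals and updates it from noisy ASR observations. A toy sketch of that Bayes update — the goals, keywords, and observation probabilities are invented stand-ins, not the thesis's trained models:

```python
# Hidden user goals the system tracks a belief over.
GOALS = ["call", "weather", "menu"]

# P(observed keyword | true goal): rows are goals, columns observed keywords.
OBS_MODEL = {
    "call":    {"call": 0.8, "weather": 0.1, "menu": 0.1},
    "weather": {"call": 0.1, "weather": 0.8, "menu": 0.1},
    "menu":    {"call": 0.1, "weather": 0.1, "menu": 0.8},
}

def belief_update(belief, observation):
    """Bayes update: b'(g) is proportional to P(o | g) * b(g)."""
    unnorm = {g: OBS_MODEL[g][observation] * belief[g] for g in GOALS}
    z = sum(unnorm.values())
    return {g: p / z for g, p in unnorm.items()}

belief = {g: 1.0 / len(GOALS) for g in GOALS}  # uniform prior
belief = belief_update(belief, "weather")      # noisy ASR says "weather"
```

Keeping a full distribution rather than a single hypothesis is what lets a POMDP dialog manager stay robust to high word-error rates, as the experiments above report.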
Gustafson, Joakim. "Developing Multimodal Spoken Dialogue Systems : Empirical Studies of Spoken Human–Computer Interaction." Doctoral thesis, KTH, Tal, musik och hörsel, 2002. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-3460.
Cuayáhuitl, Heriberto. "Hierarchical reinforcement learning for spoken dialogue systems." Thesis, University of Edinburgh, 2009. http://hdl.handle.net/1842/2750.
Khouzaimi, Hatim. "Turn-taking enhancement in spoken dialogue systems with reinforcement learning." Thesis, Avignon, 2016. http://www.theses.fr/2016AVIG0213/document.
Incremental dialogue systems are able to process the user's speech as it is spoken, without waiting for the end of a sentence before starting to process it. This makes them able to take the floor whenever they decide to (the user can also speak whenever she wants, even if the system is still holding the floor). As a consequence, they can perform a richer set of turn-taking behaviours than traditional systems. Several contributions are described in this thesis with the aim of showing that a dialogue system's turn-taking capabilities can be automatically improved from data. First, human-human dialogue is analysed and a new taxonomy of turn-taking phenomena in human conversation is established. Based on this work, the different phenomena are analysed and some of them are selected for replication in a human-machine context (the ones most likely to improve a dialogue system's efficiency). Then, a new architecture for incremental dialogue systems is introduced with the aim of transforming a traditional dialogue system into an incremental one at low cost (also separating the turn-taking manager from the dialogue manager). To enable the first tests, a simulated environment was designed and implemented; unlike existing simulators, it can replicate user and ASR behaviour specific to incremental processing. Combined, these contributions led to the establishment of a rule-based incremental dialogue strategy that is shown to improve dialogue efficiency in a task-oriented situation and in simulation. A new reinforcement learning strategy has also been proposed; it autonomously learns optimal turn-taking behaviour throughout the interactions. The simulated environment was used for training and for a first evaluation, where the new data-driven strategy is shown to outperform both the non-incremental and the rule-based incremental strategies.
To validate these results in real dialogue conditions, a prototype through which users can control their smart home was developed. At the beginning of each interaction, the turn-taking strategy is randomly chosen among the non-incremental, the rule-based incremental, and the reinforcement learning strategy (learned in simulation). A corpus of 206 dialogues was collected. The results show that the reinforcement learning strategy significantly improves dialogue efficiency without hurting the user experience (in fact, slightly improving it).
Mrkšić, Nikola. "Data-driven language understanding for spoken dialogue systems." Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/276689.
Toney, Dave. "Evolutionary reinforcement learning of spoken dialogue strategies." Thesis, University of Edinburgh, 2007. http://hdl.handle.net/1842/1769.
Full textDahlgren, Karl. "Context-dependent voice commands in spoken dialogue systems for home environments : A study on the effect of introducing context-dependent voice commands to a spoken dialogue system for home environments." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-128170.
This thesis aims to investigate the effect context can have on the interaction between a user and a spoken dialogue system. It was hypothesized that usability would increase by using context-dependent voice commands instead of absolute semantic voice commands. The thesis also examines whether context can affect the user's privacy and whether, from a user perspective, it can pose a threat. Based on an extended literature study of spoken dialogue systems, speech recognition, ambient intelligence, human-computer interaction, and privacy, a spoken dialogue system was designed and implemented to test this hypothesis. The study consisted of two steps: experiments and interviews. Participants performed various scenarios in which a spoken dialogue system could be used with context-dependent and with absolute semantic voice commands. Qualitative results regarding naturalness, usability, and privacy validated the author's hypothesis to a certain degree. The results indicated that the interaction between the user and a spoken dialogue system was more natural and more usable with context-dependent voice commands than with absolute semantic ones. Participants did not feel more monitored by a spoken dialogue system when using context-dependent voice commands. Some participants stated that privacy problems existed in theory, but only if not all security measures were in place. The thesis concludes with suggestions for future studies in this research area.
Janarthanam, Srinivasan Chandrasekaran. "Learning user modelling strategies for adaptive referring expression generation in spoken dialogue systems." Thesis, University of Edinburgh, 2011. http://hdl.handle.net/1842/5033.
Yoshino, Koichiro. "Spoken Dialogue System for Information Navigation based on Statistical Learning of Semantic and Dialogue Structure." 京都大学 (Kyoto University), 2014. http://hdl.handle.net/2433/192214.
Inoue, Koji. "Engagement Recognition based on Multimodal Behaviors for Human-Robot Dialogue." Kyoto University, 2018. http://hdl.handle.net/2433/235112.
Frampton, Matthew. "Using Dialogue Acts in dialogue strategy learning : optimising repair strategies." Thesis, University of Edinburgh, 2008. http://hdl.handle.net/1842/2381.
Asri, Layla El. "Learning the Parameters of Reinforcement Learning from Data for Adaptive Spoken Dialogue Systems." Thesis, Université de Lorraine, 2016. http://www.theses.fr/2016LORR0350/document.
This document proposes to learn the behaviour of the dialogue manager of a spoken dialogue system from a set of rated dialogues. This learning is performed through reinforcement learning. Our method requires neither the definition of a state space representation nor a reward function: these two high-level parameters are learnt from the corpus of rated dialogues. It is shown that the spoken dialogue designer can optimise dialogue management by simply defining the dialogue logic and a criterion to maximise (e.g. user satisfaction). The methodology suggested in this thesis first considers the dialogue parameters that are necessary to compute a representation of the state space relevant for the criterion to be maximised. For instance, if the chosen criterion is user satisfaction, then it is important to account for parameters such as dialogue duration and the average speech recognition confidence score. The state space is represented as a sparse distributed memory. The Genetic Sparse Distributed Memory for Reinforcement Learning (GSDMRL) accommodates many dialogue parameters and, through genetic evolution, selects the parameters that are most important for learning. The resulting state space and the policy learnt on it are easily interpretable by the system designer. Secondly, the rated dialogues are used to learn a reward function which teaches the system to optimise the criterion. Two algorithms, reward shaping and distance minimisation, are proposed to learn the reward function; both treat the criterion as the return for the entire dialogue. These functions are discussed and compared on simulated dialogues, and it is shown that they enable faster learning than using the criterion directly as the final reward. A spoken dialogue system for appointment scheduling was designed during this thesis, based on previous systems, and a corpus of rated dialogues with this system was collected.
This corpus illustrates the scaling capability of the state space representation and is a good example of an industrial spoken dialogue system to which the methodology could be applied.
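One of the two algorithms named above is reward shaping. As a generic illustration of the classical potential-based form (the potential function over dialog states below is invented; the thesis learns its reward from rated dialogues rather than hand-coding a potential), the shaped reward adds gamma * Phi(s') - Phi(s) to the raw reward, which is known to leave the optimal policy unchanged:

```python
GAMMA = 0.99  # discount factor (toy value)

def potential(state):
    """Hypothetical potential: higher when more slots are confirmed."""
    return float(state["confirmed_slots"])

def shaped_reward(reward, state, next_state, gamma=GAMMA):
    """Potential-based shaping: r' = r + gamma * Phi(s') - Phi(s)."""
    return reward + gamma * potential(next_state) - potential(state)

# Confirming one more slot yields a small positive shaping signal
# even when the raw (end-of-dialogue) reward is still zero.
s = {"confirmed_slots": 1}
s_next = {"confirmed_slots": 2}
r = shaped_reward(0.0, s, s_next)
```

The point of shaping, echoed in the abstract's finding, is exactly this denser intermediate signal: it speeds up learning compared with handing out the criterion only at the end of the dialogue.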
Bell, Linda. "Linguistic Adaptations in Spoken Human-Computer Dialogues - Empirical Studies of User Behavior." Doctoral thesis, Stockholm : KTH, 2003. http://www.speech.kth.se/~bell/linda_bell.pdf.
Chandramohan, Senthilkumar. "Revisiting user simulation in dialogue systems : do we still need them ? : will imitation play the role of simulation ?" PhD thesis, Université d'Avignon, 2012. http://tel.archives-ouvertes.fr/tel-00875229.
Alfenas, Daniel Assis. "Aplicações da tecnologia adaptativa no gerenciamento de diálogo falado em sistemas computacionais." Universidade de São Paulo, 2014. http://www.teses.usp.br/teses/disponiveis/3/3141/tde-13012015-111240/.
This work presents a study on how to apply adaptive technologies to improve existing dialog management methodologies. Dialog management is the central activity of a spoken dialog system, responsible for choosing the communicative actions sent to the system user. In order to identify parts that can be improved with adaptive technology, an extensive review of dialog management is presented. This review allows us to point out existing criteria, and to create new ones, for evaluating dialog managers. An adaptive management model based on finite-state spoken dialog systems, Adaptalker, is proposed and used to build a development framework for dialog managers, illustrated by a pizza-ordering application. Analysis of this example shows how adaptivity improves the model, allowing it to handle both dialog repair and user initiative more efficiently. Adaptalker groups its management rules into submachines that work concurrently to choose the next communicative action.
Wirström, Li, and Mattias Huledal. "Den svenska callcenterbranschen och de tekniska lösningar som används : Branschanalys samt identifiering av problematiska dialogsystemsyttranden med hjälp av maskininlärning." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-175360.
This work consists of two parts. The first part aims to describe and analyze the call center industry in Sweden and the factors that affect the industry and its development. The analysis is based on two models, Porter's five forces and PEST, and focuses mainly on the part of the industry that consists of customer service operations. The analysis shows that the industry is mainly shaped by high competition and by the choice companies face between running customer service internally and outsourcing it. It also indicates that the industry will continue to grow and that companies increasingly choose to outsource their customer service. Development of the technological solutions used in call centers, such as dialogue systems, is requested by companies, as these are important tools for a well-functioning customer service, and today's systems have obvious areas for improvement. It is often large international companies or international teams that develop the systems used, although the area of use for these systems extends far beyond the call center industry. The second part involves identifying problematic dialogue system utterances using machine learning and is inspired by SpeDial, an EU project aimed at improving dialogue systems. An utterance can be considered problematic when, for example, the system misinterprets the user's intention. The aim of the second part is to investigate which machine learning methods in the WEKA tool are best suited to identifying problematic dialogue system utterances. The data used comes from a customer service entrance based on free speech, meaning that the user is asked to describe their case in order to be transferred to the right department within the customer service.
Our data was provided by the company Voice Provider, which develops, implements, and maintains customer service systems. We came in contact with Voice Provider through the Department of Speech, Music and Hearing (TMH) at the Royal Institute of Technology, which is involved in the SpeDial project. The work initially consisted of preparing the supplied data so that it could be used by the machine learning tool WEKA's built-in classifiers, after which six classifiers were selected for further evaluation. The results show that none of the classifiers managed to accomplish the task in a fully satisfactory manner; the most successful was the Random Forest method. It is difficult to draw any further conclusions from the results.
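WEKA's built-in classifiers read data in its ARFF text format, so the preparation step described above amounts to turning utterance records into such a file. A sketch of that conversion — the attribute names, values, and relation name below are invented for illustration, not Voice Provider's actual schema:

```python
def to_arff(records, relation="utterances"):
    """Serialize utterance records into WEKA's ARFF format:
    an @relation header, @attribute declarations, then @data rows."""
    lines = [f"@relation {relation}", "",
             "@attribute asr_confidence numeric",
             "@attribute n_reprompts numeric",
             "@attribute problematic {yes,no}", "",
             "@data"]
    for r in records:
        lines.append(f"{r['asr_confidence']},{r['n_reprompts']},{r['label']}")
    return "\n".join(lines)

# Hypothetical labeled utterances: low ASR confidence plus repeated
# re-prompts is a plausible signal of a problematic exchange.
records = [{"asr_confidence": 0.42, "n_reprompts": 2, "label": "yes"},
           {"asr_confidence": 0.91, "n_reprompts": 0, "label": "no"}]
arff = to_arff(records)
```

Once in this form, the file can be loaded directly into WEKA and handed to any of its classifiers, including the Random Forest method the study found most successful.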
Griol, Barres David. "Desarrollo y evaluación de diferentes metodologías para la gestión automática del diálogo." Doctoral thesis, Universitat Politècnica de València, 2008. http://hdl.handle.net/10251/1956.
Griol Barres, D. (2007). Desarrollo y evaluación de diferentes metodologías para la gestión automática del diálogo [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/1956
Su, Pei-Hao. "Reinforcement learning and reward estimation for dialogue policy optimisation." Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/275649.
Ferreira, Arikleyton de Oliveira. "MHNSS: um Middleware para o Desenvolvimento de Aplicações Móveis com Interações Baseada na Fala." Universidade Federal do Maranhão, 2014. http://tedebc.ufma.br:8080/jspui/handle/tede/522.
Full textConselho Nacional de Desenvolvimento Científico e Tecnológico
Applications for mobile computing environments usually have several accessibility limitations because they depend on interaction with the user through the device display, which hinders their use by people who have difficulty reading, writing (typing), or who have little fluency with technology. In this master's thesis we propose a middleware that supports the development of mobile applications with accessibility features through spoken dialogue systems. These systems are able to hold a conversation with the user, providing a natural interaction interface that does not require prior learning. Thus, mobile applications can use the middleware to provide accessibility that goes beyond the need for physical or visual contact. The proposed middleware was developed in the context of the MobileHealthNet project, where it will help mobile applications in the health domain reach users with different profiles, with particular attention to underserved and remote communities. To evaluate the middleware, we used a case study based on a mobile application for assessing the health condition of patients with atrial fibrillation. The evaluation involved 10 individuals, and the results obtained were very positive.
Ke, Shin-Cheng (柯欣成). "Spoken Dialog Based Automobile Information System." Thesis, 2001. http://ndltd.ncl.edu.tw/handle/89913284805839373174.
National Cheng Kung University (國立成功大學), Department of Electrical Engineering, academic year 89 (2000-2001).
A Mobile Information System (MIS) can provide drivers with many kinds of information, such as the locations of gasoline stations, traffic conditions, and the shortest path to a destination. Most current MIS use a screen and touch pad as the communication interface, but this model is neither convenient nor safe while driving. A better interface communicates with the system through spontaneous speech. In this thesis, common errors of Spoken Dialog Systems (SDS) are first analyzed, and several practicable strategies are proposed to avoid and recover from these errors. These strategies focus on three components of the SDS: the user, the recognizer, and the dialog manager. To overcome the drawback of the traditional guided dialog strategy, which spends many turns to complete a dialog, this thesis proposes the Semi-Guided Dialog System (SGDS) to reduce the vocabulary size and shorten the dialog. The method classifies all landmarks by analyzing them literally: words with the same components are placed in the same category. Recognition then proceeds in two passes: the first pass recognizes the category, and the second recognizes landmarks within the recognized category. Using this method, the number of dialog turns is reduced to one or two, greatly improving the efficiency of the interaction. In the experiments, the keyword recognition rate rises by 5% when SGDS is applied to large landmark retrieval. In addition, about 12% recognition rate improvement is gained by applying the proposed SDS error recovery strategies. Finally, the dialog completion rate was tested and found to be greater than 89%.
List of Figures iii
List of Tables v
Chapter 1 Introduction 1
1.1 Research Background 1
1.2 Motivation and Objectives 2
1.3 Methods and Procedures 3
1.4 System Architecture 4
1.5 Chapter Overview 5
Chapter 2 Error Analysis of Spoken Dialog Systems and Feasible Recovery Strategies 1
2.1 Error Definitions for Spoken Dialog Systems 1
2.2 Error Analysis and Recovery Strategies for Each Subsystem 1
2.2.1 User Input the System Cannot Handle 2
2.2.2 Speech Recognition Errors 5
2.2.3 Dialog Management Errors 8
Chapter 3 The Semi-Guided Dialog System 13
3.1 Overview 13
3.2 Query Flow of the Traditional Guided Dialog Architecture 13
3.3 The Semi-Guided Dialog System 16
3.3.1 Incremental Extraction of User Input 17
3.3.2 Landmark Vocabulary Classification 17
3.3.3 Internal Multiple Recognition 18
3.4 Construction of the Semi-Guided Dialog System 20
3.4.1 Analysis of Landmark Characteristics 20
3.4.2 Building the Landmark Vocabulary Classification 21
3.4.3 Landmark Query Flow of the Semi-Guided Dialog System 25
Chapter 4 The Dialog-Based Mobile Information System 27
4.1 Dialog System Development 27
4.1.1 Data Collection Module 28
4.1.2 Sentence Construction Module 34
4.1.3 Semantic Understanding Module 34
4.1.4 Data Query Module 38
4.1.5 Dialog Management Module 38
4.1.6 Text-to-Speech Module 41
4.2 Mobile Information System Architecture 42
4.3 Improvements to the Dialog Management System 43
4.4 Integration of the Subsystems 44
Chapter 5 Experimental Results 47
5.1 Experimental Environment 47
5.2 Experiments on Keyword Count and Speech Recognition Accuracy 47
5.3 SGDS Tests and Error Recovery Experiments 48
5.4 MIS Dialog Completion Experiments 49
Chapter 6 Conclusions and Future Work 53
6.1 Conclusions 53
6.2 Future Work 53
References 55
Appendix 57
Jing-MinChen and 陳敬閔. "Spoken Dialog Summarization System with HAPPINESS/SUFFERING Factor Recognition." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/k86see.
Full text
Mengistu, Kinfe Tadesse [Verfasser]. "Robust acoustic and semantic modeling in a telephone-based spoken dialog system / Kinfe Tadesse Mengistu." 2009. http://d-nb.info/995555265/34.
Full textWang, Nick Jui-Chang. "Relevant Technologies For Improved Chinese Spoken Dialog Systems." 2007. http://www.cetd.com.tw/ec/thesisdetail.aspx?etdun=U0001-2607200718203600.
Full textWang, Nick Jui-Chang, and 王瑞璋. "Relevant Technologies For Improved Chinese Spoken Dialog Systems." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/06575877680801601286.
Full text
National Taiwan University
Graduate Institute of Communication Engineering
95
The application of automatic speech recognition technology in spoken dialogue systems draws on important technologies from several areas, including digital signal processing, robust speech feature extraction, acoustic-phonetic modeling, speaker adaptation, language modeling, language understanding, dialogue management, and language generation and speech synthesis. All of these technologies contribute to the performance of the spoken dialogue system, which mediates the communication between humans and machines. This dissertation covers three technologies relevant to improved Chinese spoken dialogue systems: the first topic is speaker adaptation and speaker identification, the second is speech understanding, and the third is an interactive open-vocabulary Chinese name input system. The speaker adaptation technology of the first topic adapts acoustic models to a speaker's voice characteristics to improve speech recognition accuracy. The Eigen-MLLR approach was proposed to construct a subspace of the MLLR parameter space with the Principal Component Analysis (PCA) technique; it is therefore more robust than the MLLR approach when only a small amount of enrollment data is available. Compared with the Eigenvoice approach, it requires less memory for model-adaptation estimates, which makes it more practical for speaker-independent spoken dialogue systems. The author compared Eigen-MLLR with MLLR and Eigenvoice, developed a fast Eigen-MLLR coefficient estimation algorithm, and applied Eigen-MLLR coefficients to speaker identification. The second topic is speech understanding. Most speech understanding systems with medium to large vocabularies adopt a two-stage approach: a speech recognition component as the first stage, followed by a natural language understanding component as the second. Speech understanding performance is usually constrained by speech recognition errors and out-of-grammar problems.
Robust speech understanding is therefore necessary. The proposed novel approach integrates a concept layer, the Key Semantic Chunk, into the two-stage system. The Key Semantic Chunk is a language unit between sentence and word; it is integrated into both the speech recognition and language understanding components and serves as the interface between them. Not only is the robustness of the speech recognition language model to data sparseness improved, but language understanding on the speech recognition output also works more robustly. As a result, the improved system achieved about a 30% reduction in understanding errors, and the effort of building and maintaining language understanding grammars and speech recognition n-gram models can be reduced. The last topic is to build an interactive open-vocabulary Chinese name input system with an error correction mechanism. The motivation came from experience with the 104 directory-assistance service of Chunghwa Telecom, the biggest commercial telephony service in Taiwan. It has the largest group of consumers and is frequently used by telephone users, yet the service itself is clear and simple: the telephone number of a person, a company, or a branch of a company. The difficulty of an open-vocabulary Chinese name input task is its huge vocabulary: from a very short utterance, less than two seconds of speech, the system must recognize the target name among billions of names, and high recognition accuracy cannot be achieved by the speech recognition technique alone. The experimental system therefore adopts an intelligent and friendly dialogue strategy incorporating an error correction mechanism to achieve a reasonably high success rate. In actual 104-service interactions, the human operator may likewise ask the caller to describe the ambiguous characters again.
Finally, both character confirmation and character input mechanisms were designed into the experimental system, which achieved a high success rate of 86.7%. This dissertation has presented several technologies relevant to improved Chinese spoken dialogue systems, although the first two can also be applied to other languages. Through these research topics, the author seeks a deeper understanding of spoken dialogue systems and an improvement of overall system performance, with the wish of seeing speech recognition and dialogue system technologies widely and successfully applied.
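The Eigen-MLLR idea summarized above can be sketched in a few lines of NumPy: collect the MLLR transform parameters of many training speakers, apply PCA to obtain a low-dimensional basis, and represent a new speaker's transform with a handful of coefficients in that basis. The dimensions and data here are synthetic, and the actual method estimates the coefficients from enrollment speech by maximum likelihood rather than by the projection shown:

```python
# Minimal PCA-subspace sketch of Eigen-MLLR with synthetic data.
import numpy as np

rng = np.random.default_rng(0)
n_speakers, dim = 50, 40 * 41           # e.g. a 40x41 affine MLLR transform, flattened
W = rng.normal(size=(n_speakers, dim))  # per-speaker MLLR parameters (synthetic)

# PCA via SVD of the mean-centered parameter matrix.
mean = W.mean(axis=0)
_, _, Vt = np.linalg.svd(W - mean, full_matrices=False)
k = 5                                   # keep a small eigen-MLLR basis
basis = Vt[:k]                          # (k, dim), rows are orthonormal

# With little enrollment data, only k coefficients need to be estimated
# instead of the full transform; here we simply project a new speaker.
w_new = rng.normal(size=dim)
coeffs = basis @ (w_new - mean)         # (k,) speaker coefficients
w_hat = mean + coeffs @ basis           # reconstructed transform in the subspace
```

The k-dimensional coefficient vector is also a compact speaker representation, which is how the dissertation reuses Eigen-MLLR coefficients for speaker identification.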
"An evaluation paradigm for spoken dialog systems based on crowdsourcing and collaborative filtering." 2011. http://library.cuhk.edu.hk/record=b5894830.
Full text
Thesis (M.Phil.)--Chinese University of Hong Kong, 2011.
Includes bibliographical references (p. 92-99).
Abstracts in English and Chinese.
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- SDS Architecture --- p.1
Chapter 1.2 --- Dialog Model --- p.3
Chapter 1.3 --- SDS Evaluation --- p.4
Chapter 1.4 --- Thesis Outline --- p.7
Chapter 2 --- Previous Work --- p.9
Chapter 2.1 --- Approaches to Dialog Modeling --- p.9
Chapter 2.1.1 --- Handcrafted Dialog Modeling --- p.9
Chapter 2.1.2 --- Statistical Dialog Modeling --- p.12
Chapter 2.2 --- Evaluation Metrics --- p.16
Chapter 2.2.1 --- Subjective User Judgments --- p.17
Chapter 2.2.2 --- Interaction Metrics --- p.18
Chapter 2.3 --- The PARADISE Framework --- p.19
Chapter 2.4 --- Chapter Summary --- p.22
Chapter 3 --- Implementation of a Dialog System based on POMDP --- p.23
Chapter 3.1 --- Partially Observable Markov Decision Processes (POMDPs) --- p.24
Chapter 3.1.1 --- Formal Definition --- p.24
Chapter 3.1.2 --- Value Iteration --- p.26
Chapter 3.1.3 --- Point-based Value Iteration --- p.27
Chapter 3.1.4 --- A Toy Example of POMDP: The NaiveBusInfo System --- p.27
Chapter 3.2 --- The SDS-POMDP Model --- p.31
Chapter 3.3 --- Composite Summary Point-based Value Iteration (CSPBVI) --- p.33
Chapter 3.4 --- Application of SDS-POMDP Model: The BusInfo System --- p.35
Chapter 3.4.1 --- System Description --- p.35
Chapter 3.4.2 --- Demonstration Description --- p.39
Chapter 3.5 --- Chapter Summary --- p.42
Chapter 4 --- Collecting User Judgments on Spoken Dialogs with Crowdsourcing --- p.46
Chapter 4.1 --- Dialog Corpus and Automatic Dialog Classification --- p.47
Chapter 4.2 --- User Judgments Collection with Crowdsourcing --- p.50
Chapter 4.2.1 --- HITs on Dialog Evaluation --- p.51
Chapter 4.2.2 --- HITs on Inter-rater Agreement --- p.53
Chapter 4.2.3 --- Approval of Ratings --- p.54
Chapter 4.3 --- Collected Results and Analysis --- p.55
Chapter 4.3.1 --- Approval Rates and Comments from Mturk Workers --- p.55
Chapter 4.3.2 --- Consistency between Automatic Dialog Classification and Manual Ratings --- p.57
Chapter 4.3.3 --- Inter-rater Agreement Among Workers --- p.60
Chapter 4.4 --- Comparing Experts to Non-experts --- p.64
Chapter 4.4.1 --- Inter-rater Agreement on the Let's Go! System --- p.65
Chapter 4.4.2 --- Consistency Between Expert and Non-expert Annotations on SDC Systems --- p.66
Chapter 4.5 --- Chapter Summary --- p.68
Chapter 5 --- Collaborative Filtering for Performance Prediction --- p.70
Chapter 5.1 --- Item-Based Collaborative Filtering --- p.71
Chapter 5.2 --- CF Model for User Satisfaction Prediction --- p.72
Chapter 5.2.1 --- ICFM for User Satisfaction Prediction --- p.72
Chapter 5.2.2 --- Extended ICFM for User Satisfaction Prediction --- p.73
Chapter 5.3 --- Extraction of Interaction Features --- p.74
Chapter 5.4 --- Experimental Results and Analysis --- p.76
Chapter 5.4.1 --- Prediction of User Satisfaction --- p.76
Chapter 5.4.2 --- Analysis of Prediction Results --- p.79
Chapter 5.5 --- Verifying the Generalizability of CF Model --- p.81
Chapter 5.6 --- Evaluation of The BusInfo System --- p.86
Chapter 5.7 --- Chapter Summary --- p.87
Chapter 6 --- Conclusions and Future Work --- p.89
Chapter 6.1 --- Thesis Summary --- p.89
Chapter 6.2 --- Future Work --- p.90
Bibliography --- p.92
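The item-based collaborative filtering model outlined in Chapter 5 predicts a rater's satisfaction score for a dialog from that rater's scores on similar dialogs. A minimal sketch of item-based CF follows; the rating matrix and cosine-similarity weighting are illustrative stand-ins for the thesis's actual interaction features and extended ICFM:

```python
# Item-based collaborative filtering: predict a missing rating for a
# dialog ("item") from its cosine similarity to dialogs the same rater
# has already judged. The rating matrix is invented for illustration.
import numpy as np

R = np.array([           # rows: raters, cols: dialogs; 0 = missing
    [5, 4, 0],
    [4, 4, 2],
    [5, 5, 1],
], dtype=float)

def cosine(a, b):
    mask = (a > 0) & (b > 0)            # compare only co-rated entries
    if not mask.any():
        return 0.0
    return float(a[mask] @ b[mask] / (np.linalg.norm(a[mask]) * np.linalg.norm(b[mask])))

def predict(rater, item):
    num, sims = 0.0, 0.0
    for j in range(R.shape[1]):
        if j == item or R[rater, j] == 0:
            continue                    # use only dialogs this rater scored
        s = cosine(R[:, item], R[:, j])
        num += s * R[rater, j]
        sims += abs(s)
    return num / sims if sims else 0.0

print(round(predict(0, 2), 2))  # → 4.5
```

Comparing such predictions against held-out crowdsourced judgments is the evaluation paradigm the thesis proposes.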
Dušek, Ondřej. "Nové metody generování promluv v dialogových systémech." Doctoral thesis, 2017. http://www.nusl.cz/ntk/nusl-364953.
Full textVejman, Martin. "Development of an English public transport information dialogue system." Master's thesis, 2015. http://www.nusl.cz/ntk/nusl-331758.
Full textRato, João Pedro Cordeiro. "Conversação homem-máquina. Caracterização e avaliação do estado actual das soluções de speech recognition, speech synthesis e sistemas de conversação homem-máquina." Master's thesis, 2016. http://hdl.handle.net/10400.8/2375.
Full text