Dissertations / Theses on the topic 'Data-to-text'
Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles
Consult the top 50 dissertations / theses for your research on the topic 'Data-to-text.'
Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as a PDF and read its abstract online, whenever these are available in the metadata.
Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.
Kyle, Cameron. "Data to information to text summaries of financial data." Master's thesis, University of Cape Town, 2018. http://hdl.handle.net/11427/29643.
Full text
Gkatzia, Dimitra. "Data-driven approaches to content selection for data-to-text generation." Thesis, Heriot-Watt University, 2015. http://hdl.handle.net/10399/3003.
Full text
Turner, Ross. "Georeferenced data-to-text techniques and application /." Thesis, Available from the University of Aberdeen Library and Historic Collections Digital Resources, 2009. http://digitool.abdn.ac.uk:80/webclient/DeliveryManager?application=DIGITOOL-3&owner=resourcediscovery&custom_att_2=simple_viewer&pid=56243.
Full text
Štajner, Sanja. "New data-driven approaches to text simplification." Thesis, University of Wolverhampton, 2015. http://hdl.handle.net/2436/554413.
Full text
Jones, Greg 1963-2017. "RADIX 95n: Binary-to-Text Data Conversion." Thesis, University of North Texas, 1991. https://digital.library.unt.edu/ark:/67531/metadc500582/.
Full text
Štajner, Sanja. "New data-driven approaches to text simplification." Thesis, University of Wolverhampton, 2016. http://hdl.handle.net/2436/601113.
Full textRose, Øystein. "Text Mining in Health Records : Classification of Text to Facilitate Information Flow and Data Overview." Thesis, Norwegian University of Science and Technology, Department of Computer and Information Science, 2007. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-9629.
Full textThis project consists of two parts. In the first part we apply techniques from the field of text mining to classify sentences in encounter notes of the electronic health record (EHR) into classes of {it subjective}, {it objective} and {it plan} character. This is a simplification of the {it SOAP} standard, and is applied due to the way GPs structure the encounter notes. Structuring the information in a subjective, objective, and plan way, may enhance future information flow between the EHR and the personal health record (PHR). In the second part of the project we seek to use apply the most adequate to classify encounter notes from patient histories of patients suffering from diabetes. We believe that the distribution of sentences of a subjective, objective, and plan character changes according to different phases of diseases. In our work we experiment with several preprocessing techniques, classifiers, and amounts of data. Of the classifiers considered, we find that Complement Naive Bayes (CNB) produces the best result, both when the preprocessing of the data has taken place and not. On the raw dataset, CNB yields an accuracy of 81.03%, while on the preprocessed dataset, CNB yields an accuracy of 81.95%. The Support Vector Machines (SVM) classifier algorithm yields results comparable to the results obtained by use of CNB, while the J48 classifier algorithm performs poorer. Concerning preprocessing techniques, we find that use of techniques reducing the dimensionality of the datasets improves the results for smaller attribute sets, but worsens the result for larger attribute sets. The trend is opposite for preprocessing techniques that expand the set of attributes. However, finding the ratio between the size of the dataset and the number of attributes, where the preprocessing techniques improve the result, is difficult. Hence, preprocessing techniques are not applied in the second part of the project. 
From the results of the classification of the patient histories we have extracted graphs that show the sentence class distribution after the first diagnosis of diabetes is made. Although no empirical research is carried out, we believe that such graphs may, through further research, facilitate the recognition of points of interest in a patient history. From the same results we also create graphs that show the average distribution of sentences of subjective, objective, and plan character for 429 patients after the first diagnosis of diabetes. From these graphs we find evidence of an overrepresentation of subjective sentences in encounter notes where the diagnosis of diabetes is first made. We believe that similar experiments for several diseases may uncover patterns or trends concerning the diseases in focus.
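The sentence-classification setup described in the abstract above can be sketched with a small from-scratch Complement Naive Bayes, which down-weights classes whose *complement* (all other classes) contains the sentence's tokens. The clinical-style sentences below are invented illustrations, not the thesis corpus:

```python
# From-scratch sketch of Complement Naive Bayes (CNB) for classifying
# clinical-note sentences into subjective / objective / plan classes.
# The toy sentences below are invented illustrations, not thesis data.
import math
from collections import Counter, defaultdict

def tokenize(text):
    return text.lower().replace(",", " ").replace(".", " ").split()

def train_cnb(sentences, labels):
    """Collect per-class token counts and turn them into complement
    log-weights: for class c, counts come from all *other* classes."""
    class_counts = defaultdict(Counter)
    vocab = set()
    for sent, lab in zip(sentences, labels):
        toks = tokenize(sent)
        class_counts[lab].update(toks)
        vocab.update(toks)
    weights = {}
    for c in class_counts:
        comp = Counter()
        for other, counts in class_counts.items():
            if other != c:
                comp.update(counts)
        total = sum(comp.values())
        weights[c] = {t: math.log((comp[t] + 1) / (total + len(vocab)))
                      for t in vocab}
    return weights, vocab

def predict_cnb(weights, vocab, sentence):
    """Pick the class whose complement matches the sentence *least*."""
    toks = [t for t in tokenize(sentence) if t in vocab]
    scores = {c: sum(w[t] for t in toks) for c, w in weights.items()}
    return min(scores, key=scores.get)

train_sentences = [
    "Patient reports feeling dizzy and tired.",       # subjective
    "Patient complains of chest pain since Monday.",  # subjective
    "Blood pressure 140/90, pulse 82.",               # objective
    "HbA1c measured at 8.2 percent.",                 # objective
    "Start metformin 500 mg twice daily.",            # plan
    "Schedule follow-up appointment in two weeks.",   # plan
]
train_labels = ["subjective", "subjective", "objective",
                "objective", "plan", "plan"]

weights, vocab = train_cnb(train_sentences, train_labels)
pred = predict_cnb(weights, vocab, "Blood pressure 150/95 recorded today.")
print(pred)  # → objective
```

CNB's complement weighting is what makes it more robust to class imbalance than standard multinomial Naive Bayes, which is presumably why it performed best on the thesis's skewed sentence classes.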
Ma, Yimin. "Text classification on imbalanced data: Application to systematic reviews automation." Thesis, University of Ottawa (Canada), 2007. http://hdl.handle.net/10393/27532.
Full textSalah, Aghiles. "Von Mises-Fisher based (co-)clustering for high-dimensional sparse data : application to text and collaborative filtering data." Thesis, Sorbonne Paris Cité, 2016. http://www.theses.fr/2016USPCB093/document.
Full textCluster analysis or clustering, which aims to group together similar objects, is undoubtedly a very powerful unsupervised learning technique. With the growing amount of available data, clustering is increasingly gaining in importance in various areas of data science for several reasons such as automatic summarization, dimensionality reduction, visualization, outlier detection, speed up research engines, organization of huge data sets, etc. Existing clustering approaches are, however, severely challenged by the high dimensionality and extreme sparsity of the data sets arising in some current areas of interest, such as Collaborative Filtering (CF) and text mining. Such data often consists of thousands of features and more than 95% of zero entries. In addition to being high dimensional and sparse, the data sets encountered in the aforementioned domains are also directional in nature. In fact, several previous studies have empirically demonstrated that directional measures—that measure the distance between objects relative to the angle between them—, such as the cosine similarity, are substantially superior to other measures such as Euclidean distortions, for clustering text documents or assessing the similarities between users/items in CF. This suggests that in such context only the direction of a data vector (e.g., text document) is relevant, not its magnitude. It is worth noting that the cosine similarity is exactly the scalar product between unit length data vectors, i.e., L 2 normalized vectors. Thus, from a probabilistic perspective using the cosine similarity is equivalent to assuming that the data are directional data distributed on the surface of a unit-hypersphere. 
Despite the substantial empirical evidence that certain high dimensional sparse data sets, such as those encountered in the above domains, are better modeled as directional data, most existing models in text mining and CF are based on popular assumptions such as Gaussian, Multinomial or Bernoulli distributions, which are inadequate for L2-normalized data. In this thesis, we focus on the two challenging tasks of text document clustering and item recommendation, which are still attracting a lot of attention in the domains of text mining and CF, respectively. In order to address the above limitations, we propose a suite of new models and algorithms which rely on the von Mises-Fisher (vMF) assumption that arises naturally for directional data lying on a unit hypersphere.
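The equivalence this abstract leans on, that cosine similarity is exactly the dot product of the L2-normalized vectors and is therefore invariant to magnitude, can be checked numerically (toy vectors, purely illustrative):

```python
# Numerical check: cosine similarity equals the dot product of the
# L2-normalized vectors, so only the direction of a vector matters,
# not its magnitude. The vectors are toy "term count" examples.
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def l2_normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def cosine(u, v):
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

# Sparse-ish toy vectors with some zero entries.
u = [3.0, 0.0, 1.0, 2.0]
v = [1.0, 0.0, 2.0, 0.0]

lhs = cosine(u, v)
rhs = dot(l2_normalize(u), l2_normalize(v))
scaled = cosine([10 * a for a in u], v)  # rescaling u has no effect

print(round(lhs, 4))  # → 0.5976
```

This invariance is the probabilistic bridge the thesis uses: once every vector is projected to the unit hypersphere, a directional distribution such as the von Mises-Fisher becomes the natural modeling choice.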
Natarajan, Jeyakumar. "Text mining of biomedical literature and its applications to microarray data analysis and interpretation." Thesis, University of Ulster, 2006. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.445041.
Full textMalatji, Promise Tshepiso. "The development of accented English synthetic voices." Thesis, University of Limpopo, 2019. http://hdl.handle.net/10386/2917.
Full textA Text-to-speech (TTS) synthesis system is a software system that receives text as input and produces speech as output. A TTS synthesis system can be used for, amongst others, language learning, and reading out text for people living with different disabilities, i.e., physically challenged, visually impaired, etc., by native and non-native speakers of the target language. Most people relate easily to a second language spoken by a non-native speaker they share a native language with. Most online English TTS synthesis systems are usually developed using native speakers of English. This research study focuses on developing accented English synthetic voices as spoken by non-native speakers in the Limpopo province of South Africa. The Modular Architecture for Research on speech sYnthesis (MARY) TTS engine is used in developing the synthetic voices. The Hidden Markov Model (HMM) method was used to train the synthetic voices. Secondary training text corpus is used to develop the training speech corpus by recording six speakers reading the text corpus. The quality of developed synthetic voices is measured in terms of their intelligibility, similarity and naturalness using a listening test. The results in the research study are classified based on evaluators’ occupation and gender and the overall results. The subjective listening test indicates that the developed synthetic voices have a high level of acceptance in terms of similarity and intelligibility. A speech analysis software is used to compare the recorded synthesised speech and the human recordings. There is no significant difference in the voice pitch of the speakers and the synthetic voices except for one synthetic voice.
Odd, Joel, and Emil Theologou. "Utilize OCR text to extract receipt data and classify receipts with common Machine Learning algorithms." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-148350.
Full textBergh, Adrienne. "A Machine Learning Approach to Predicting Alcohol Consumption in Adolescents From Historical Text Messaging Data." Chapman University Digital Commons, 2019. https://digitalcommons.chapman.edu/cads_theses/2.
Full textShokat, Imran. "Computational Analyses of Scientific Publications Using Raw and Manually Curated Data with Applications to Text Visualization." Thesis, Linnéuniversitetet, Institutionen för datavetenskap och medieteknik (DM), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-78995.
Full textHill, Geoffrey. "Sensemaking in Big Data: Conceptual and Empirical Approaches to Actionable Knowledge Generation from Unstructured Text Streams." Kent State University / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=kent1433597354.
Full textZhou, Wubai. "Data Mining Techniques to Understand Textual Data." FIU Digital Commons, 2017. https://digitalcommons.fiu.edu/etd/3493.
Full textPereira, José Casimiro. "Natural language generation in the context of multimodal interaction in Portuguese : Data-to-text based in automatic translation." Doctoral thesis, Universidade de Aveiro, 2017. http://hdl.handle.net/10773/21767.
Full textResumo em português não disponivel
To enable interaction by text and/or speech, it is essential that we devise systems capable of translating internal data into sentences or texts that can be shown on screen or heard by users. In this context, it is essential that these natural language generation (NLG) systems provide sentences in the native languages of the users (in our case European Portuguese) and enable an easy development and integration process while providing output that is perceived as natural. The creation of high quality NLG systems is not an easy task, even for a small domain. The main difficulties arise from: classic approaches being very demanding in know-how and development time; a lack of variability in the sentences generated by most generation methods; the difficulty of easily accessing complete tools; a shortage of resources, such as large corpora; and support being available in only a limited number of languages. The main goal of this work was to propose, develop and test a method to convert Data-to-Portuguese which can be developed with the smallest possible amount of time and resources, while being capable of generating utterances with variability and quality. The thesis defended argues that this goal can be achieved by adopting data-driven language generation, more precisely generation based on language translation, and following an Engineering Research Methodology. In this thesis, two Data2Text NLG systems are presented. They were designed to provide a way to quickly develop an NLG system which can generate sentences of good quality. The proposed systems use tools that are freely available and can be developed by people with low linguistic skills. One important characteristic is the use of statistical machine translation techniques, an approach that requires only a small natural language corpus, resulting in easier and cheaper development when compared to more common approaches.
The main result of this thesis is the demonstration that, by following the proposed approach, it is possible to create systems capable of translating information/data into good quality sentences in Portuguese. This is done without major effort in resource creation and with the common knowledge of an experienced application developer. The systems created, particularly the hybrid system, are capable of providing a good solution for problems in data-to-text conversion.
Thorstensson, Niklas. "A knowledge-based grapheme-to-phoneme conversion for Swedish." Thesis, University of Skövde, Department of Computer Science, 2002. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-731.
Full textA text-to-speech system is a complex system consisting of several different modules such as grapheme-to-phoneme conversion, articulatory and prosodic modelling, voice modelling etc.
This dissertation is aimed at the creation of the initial part of a text-to-speech system, i.e. the grapheme-to-phoneme conversion, designed for Swedish. The problem area at hand is the conversion of orthographic text into a phonetic representation that can be used as a basis for a future complete text-to-speech system.
The central issue of the dissertation is the grapheme-to-phoneme conversion and the elaboration of rules and algorithms required to achieve this task. The dissertation aims to prove that it is possible to make such a conversion by a rule-based algorithm with reasonable performance. Another goal is to find a way to represent phonotactic rules in a form suitable for parsing. It also aims to find and analyze problematic structures in written text compared to phonetic realization.
This work proposes a knowledge-based grapheme-to-phoneme conversion system for Swedish. The system suggested here is implemented, tested, evaluated and compared to other existing systems. The results achieved are promising, and show that the system is fast, with a high degree of accuracy.
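A rule-based grapheme-to-phoneme conversion of the kind this abstract describes can be sketched as an ordered, longest-match-first rewrite over graphemes with a fallback letter map. The rules below are invented stand-ins for a few Swedish-like spellings, not the thesis's actual rule set:

```python
# Toy rule-based grapheme-to-phoneme (G2P) converter: multi-letter
# grapheme rules are tried in order before falling back to a default
# letter-to-phoneme map. Illustrative rules, not the thesis's rules.

RULES = [
    ("stj", "ɧ"),   # as in "stjärna"
    ("skj", "ɧ"),
    ("sj",  "ɧ"),
    ("tj",  "ɕ"),
    ("ck",  "k"),
]
DEFAULT = {"a": "a", "e": "e", "i": "i", "o": "uː", "r": "r",
           "n": "n", "s": "s", "t": "t", "k": "k", "ä": "ɛ"}

def g2p(word):
    phones = []
    i = 0
    while i < len(word):
        for graph, phone in RULES:        # longest/most specific first
            if word.startswith(graph, i):
                phones.append(phone)
                i += len(graph)
                break
        else:                             # no rule matched: letter map
            phones.append(DEFAULT.get(word[i], word[i]))
            i += 1
    return "".join(phones)

print(g2p("stjärna"))  # → ɧɛrna
print(g2p("tack"))     # → tak
```

Ordering the rules so that longer graphemes are tested first is the crucial design choice: it is what lets "stj" win over "sj" and "tj", mirroring the kind of rule-precedence issues a full Swedish rule set must resolve.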
Yu, Shuren. "How to Leverage Text Data in a Decision Support System? : A Solution Based on Machine Learning and Qualitative Analysis Methods." Thesis, Umeå universitet, Institutionen för informatik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-163899.
Full textPrice, Sarah Jane. "What are we missing by ignoring text records in the Clinical Practice Research Datalink? : using three symptoms of cancer as examples to estimate the extent of data in text format that is hidden to research." Thesis, University of Exeter, 2016. http://hdl.handle.net/10871/21692.
Full textStojanovic, Milan. "Teknik för dokumentering avmöten och konferenser." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-247247.
Full textDokumentering av möten och konferenser utförs på de flesta företag av en eller flera personer som sitter vid en dator och antecknar det som har sagts under mötet. Det kan medföra att det som skrivs ner inte stämmer med det som har sagts eller att det uppfattades felaktigt av personen som antecknar. Den mänskliga faktorn är ganska stor. Detta arbete kommer att fokusera på att ta fram förslag på nya tekniker som minskar eller eliminerar den mänskliga faktorn, och därmed förbättrar dokumenteringen av möten och konferenser. Det föreställer ett problem för många företag och institutioner, däribland för Seavus Stockholm, där denna studie utförs. Det antas att de flesta företag inte dokumenterar deras möten och konferenser i video eller ljudformat, och därmed kommer denna studie bara att handla om dokumentering i textformat.Målet med denna studie var att undersöka hur man, med hjälp av moderna tekniker och nya tillämpningar, kan implementera nya funktioner och bygga ett modernt konferenssystem, för att förbättra dokumenteringen av möten och konferenser. Tal till text i kombination med talarigenkänning är något som ännu inte har implementerats för ett sådant ändamål, och det kan underlätta dokumenteringen av möten och konferenser.För att slutföra studien kombinerades flera metoder för att uppnå de önskade målen.Först definierades projektens omfattning och mål. Därefter, baserat på analys och observationer av företagets dokumenteringsprocess, skapades ett designförslag. Därefter genomfördes intervjuer med intressenterna där förslagen presenterades och en kravspecifikation skapades. Då studerades teorin för att skapa förståelse för hur olika tekniker arbetar, för att sedan designa och skapa ett förslag till arkitekturen.Resultatet av denna studie innehåller ett förslag till arkitektur, som visar att det är möjligt att implementera dessa tekniker för att förbättra dokumentationsprocessen. 
Dessutom presenteras möjliga användningsfall och interaktionsdiagram som visar hur systemet kan fungera.Även om beviset av konceptet anses vara tillfredsställande, ytterligare arbete och test behövs för att fullt ut implementera och integrera konceptet i verkligheten.
Gerrish, Charlotte. "European Copyright Law and the Text and Data Mining Exceptions and Limitations : With a focus on the DSM Directive, is the EU Approach a Hindrance or Facilitator to Innovation in the Region?" Thesis, Uppsala universitet, Juridiska institutionen, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-385195.
Full textEriksson, Ruth, and Miranda Luis Galaz. "Ett digitalt läromedel för barn med lässvårigheter." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-189205.
Full textThe digital age is changing society. New technology provides opportunities to produce and organize knowledge in new ways. The technology available in schools today can also be used to optimize literacy training for students with reading difficulties. This thesis examines how a digital teaching material for literacy training for children with reading difficulties can be designed and implemented, and shows that this is possible to achieve. A digital learning material of good quality should be based on a scientifically accepted method of literacy training. This thesis uses Gunnel Wendick’s training model which is already used by many special education teachers. The training model is used with word lists, without computers, tablets or the like. We analyze Wendick’s training model and employ it, in a creative way, to design a digital equivalent to the original model. Our goal is to create a digital learning material that implements Wendick’s training model, and thus make it possible to use in various smart devices. With this we hope to facilitate the work of both the special education teachers and children with reading difficulties and to make the procedures more appealing and creative. In our study, we examine various technical possibilities to implement Wendick’s training model. We choose to create a prototype of a web application, with suitable functionality for both administrators, special education teachers and students. The prototype’s functionality can be divided into two parts, the administrative part and the exercise part. The administrative part covers the user interface and functionality for handling students and other relevant data. The exercise part includes training views and their functionality. 
The functionality of the exercises is intended to train the auditory channel, the phonological awarenesswith the goal of reading accurately, and the orthographic decoding - with the goal that students should automate their decoding, that is, to perceive the words as an image. In the development of the digital teaching material, we used proven principles in software technologies and proven implementation techniques. It compiles high-level requirements, the domain model and defines the appropriate use cases. To implement the application, we used the Java EE platform, Web Speech API, Prime Faces specifications, and more. Our prototype is a good start to inspire further development, with the hope that a full web application will be created, that will transform the practices in our schools.
陳我智 and Ngor-chi Chan. "Text-to-speech conversion for Putonghua." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1990. http://hub.hku.hk/bib/B31209580.
Full textAlmqvist, Daniel, and Magnus Jansson. "Förbättrat informationsflöde med hjälp av Augmented Reality." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-177463.
Full textAugmented Reality is a technology where an object is introduced in front of a picture or a similar media using the camera on a mobile device. There are several different ways to use the Augmented Reality technology, research in the field has therefore been made. An example of an area where the technology can be used is advertisement. Since advertisement is something everyone is confronted with daily, but usually the advertisement can be seen as boring or is something many do not even notice. Through a Augmented Reality prototype, users can register both patterns and speech and get the required data from a database. It can create an interactive event that displays the information in a unique way, where everyone, even people with disabilities can take part of the information they usually can not take part of. This interactive event gives life to the previously tedious advertisement or information posters. The result of the report is a prototype on the mobile platform Android using Augmented Reality technology and the prototype has many features. It can use voice recognition and keywords to access additional information about the keyword. The testing of this prototype shows that many are in favour of the use of the prototype and they see it as an interesting way to get the information. That is why they are willing use the application themselves to get their own advertising in a unique and appealing way.
Luffarelli, Marta. "A text mining approach to materiality assessment." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/23127/.
Full textWu, Qinyi. "Partial persistent sequences and their applications to collaborative text document editing and processing." Diss., Georgia Institute of Technology, 2011. http://hdl.handle.net/1853/44916.
Full textPilipiec, Patrick. "Using Machine Learning to Understand Text for Pharmacovigilance: A Systematic Review." Thesis, Luleå tekniska universitet, Institutionen för system- och rymdteknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-83458.
Full textPafilis, Evangelos. "Web-based named entity recognition and data integration to accelerate molecular biology research." [S.l. : s.n.], 2008. http://nbn-resolving.de/urn:nbn:de:bsz:16-opus-89706.
Full textKaterattanakul, Nitsawan. "A pilot study in an application of text mining to learning system evaluation." Diss., Rolla, Mo. : Missouri University of Science and Technology, 2010. http://scholarsmine.mst.edu/thesis/pdf/Katerattanakul_09007dcc807b614f.pdf.
Full textVita. The entire thesis text is included in file. Title from title screen of thesis/dissertation PDF file (viewed June 19, 2010) Includes bibliographical references (p. 72-75).
Adapa, Supriya. "TensorFlow Federated Learning: Application to Decentralized Data." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021.
Find full textStrandberg, Per Erik. "On text mining to identify gene networks with a special reference to cardiovascular disease." Thesis, Linköping University, The Department of Physics, Chemistry and Biology, 2005. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-2810.
Full textThe rate at which articles gets published grows exponentially and the possibility to access texts in machine-readable formats is also increasing. The need of an automated system to gather relevant information from text, text mining, is thus growing.
The goal of this thesis is to find a biologically relevant gene network for atherosclerosis, the main cause of cardiovascular disease, by inspecting gene co-occurrences in abstracts from PubMed. In addition, gene nets for yeast were generated to evaluate the validity of text mining as a method.
The nets found were validated in many ways; for example, they were found to have the well-known power-law link distribution. They were also compared to gene nets generated by other, often microbiological, methods from different sources. In addition to classic measures of similarity such as overlap, precision, recall and f-score, a new way to measure similarity between nets is proposed and used. The method uses an urn approximation and measures, in standard deviations, the distance from what a comparison of two unrelated nets would give. The validity of this approximation is supported both analytically and with simulations, for both Erdős-Rényi nets and nets having a power-law link distribution. The new method explains how very poor overlap, precision, recall and f-score can still be very far from random, and also how much overlap one could expect at random. The cutoff was also investigated.
Results typically show an overlap of only about 1%, but at the remarkable distance of 100 standard deviations from what one could have expected at random. Of particular interest is that one can only expect an overlap of 2 edges, with a variance of 2, when comparing two trees with the same set of nodes. The use of a cutoff of one for co-occurrence graphs is discussed and motivated by, for example, the observation that it eliminates about 60-70% of the false positives but only 20-30% of the overlapping edges. This thesis shows that text mining of PubMed can be used to generate a biologically relevant subnet of the human gene net. A reasonable extension of this work is to combine the nets with gene expression data to find a more reliable gene net.
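The co-occurrence step this abstract rests on, linking two genes whenever they appear in the same abstract and then applying a count cutoff, can be sketched as follows. The abstracts and the gene list are toy data; the gene symbols are real but the sentences are invented:

```python
# Build a gene co-occurrence graph from text: two genes get an edge
# weighted by the number of abstracts mentioning both, and edges at
# or below a cutoff are discarded. Toy abstracts, invented sentences.
from itertools import combinations
from collections import Counter

GENES = {"APOE", "LDLR", "PCSK9", "TP53"}

abstracts = [
    "APOE and LDLR variants are linked to atherosclerosis risk.",
    "LDLR regulation involves PCSK9 in hepatic cells.",
    "APOE interacts with LDLR in lipid transport.",
    "TP53 is a tumour suppressor unrelated to this pathway.",
]

def cooccurrence_net(texts, genes, cutoff=1):
    counts = Counter()
    for text in texts:
        mentioned = sorted(g for g in genes if g in text)
        for pair in combinations(mentioned, 2):
            counts[pair] += 1
    # Keep only edges seen strictly more than `cutoff` times.
    return {pair: n for pair, n in counts.items() if n > cutoff}

net = cooccurrence_net(abstracts, GENES, cutoff=1)
print(net)  # → {('APOE', 'LDLR'): 2}
```

With a cutoff of one, the single-abstract pair (LDLR, PCSK9) is dropped while the repeated (APOE, LDLR) edge survives, which is exactly the false-positive filtering effect the thesis reports for its cutoff.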
Thun, Anton. "Matching Job Applicants to Free Text Job Ads Using Semantic Networks and Natural Language Inference." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-281250.
Full textAutomatiserade e-rekryteringssystem har varit ett forskningsfokus det senaste årtioendet på grund av hur mycket arbete som krävs för att sålla passande jobbsökande till en jobbpost, vars CV och jobbannons vanligen skickas som fritext. Medan rekryteringsorganisationer har data omfattande sökandes CV:n och jobbannonsbeskrivningar, är CV:n oftast konfidentiella, vilket begränsar direktanvändande av djupinlärningsmetoder. Detta leder till ett problem där traditionella data-agnostiska metoder har större potential att uppnå bra resultat i att bestämma hur lämplig en jobbsökande är för en given jobbpost. Dock är det möjligt att träna språkmodeller oberoende av det faktiska problemet– och därav oberoende av tillgänglig data – med tillkomsten av inlärningsöverföring. I den här rapporten används en språkmodell som finjusterats på naturlig språkinferens (NSI) via tvärspråklig inlärningsöverföring för jobbmatchningsproblemet. Den jämförs mot en semantisk metod som använder svenska taxonomier för att konstruera nätverk med hierarki- och synonymrelationer. Då NSI kan appliceras på godtyckliga meningspar, undersöks även textsegmentering för att förbättra metodernas prestanda. Resultaten visar att NSI-metoden är signifikant bättre än en slumpmässig lämplighetsklassificerare, men överträffas av den semantiska metoden som hade 34% bättre prestanda på datasetet som användes. Användandet av textsegmentering hade försumbar effekt för den samlade prestandan, men visades uppnå bättre rankning av de mest lämpande jobbsökande relativt expertbedömning av deras relevans.
Kongthon, Alisa. "A text mining framework for discovering technological intelligence to support science and technology management." Diss., Available online, Georgia Institute of Technology, 2004:, 2004. http://etd.gatech.edu/theses/available/etd-04052004-162415/unrestricted/kongthon%5Falisa%5F200405%5Fphd.pdf.
Full textZhu, Donghua, Committee Member ; Cozzens, Susan, Committee Member ; Huo, Xiaoming, Committee Member ; Porter, Alan, Committee Chair ; Lu, Jye-Chyi, Committee Member. Vita. Includes bibliographical references (leaves 191-195).
Courseault, Cherie Renee. "A Text Mining Framework Linking Technical Intelligence from Publication Databases to Strategic Technology Decisions." Diss., Georgia Institute of Technology, 2004. http://hdl.handle.net/1853/5214.
Full textWigington, Curtis Michael. "End-to-End Full-Page Handwriting Recognition." BYU ScholarsArchive, 2018. https://scholarsarchive.byu.edu/etd/7099.
Full textChiarella, Andrew Francesco 1971. "Enabling the collective to assist the individual : a self-organising systems approach to social software and the creation of collaborative text signals." Thesis, McGill University, 2008. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=115618.
Full textForty undergraduate students read two texts on topics from psychology using CoREAD. Students were asked to read each text in order to write a summary of it. After each new student read the text, the text signals were changed to reflect the current group of students. As such, each student read the text with different text signals presented.
The data were analysed for each text to determine whether the text signals that emerged were stable and valid representations of the semantic content of the text. In addition, the students' summaries were analysed to determine whether students who read the text after the text signals had stabilised produced better summaries. Three methods demonstrated that CoREAD was capable of generating stable typographical text signals. The high-importance text signals also appeared to capture the semantic content of the texts: for both texts, a summary made of the high-importance signals performed as well as a benchmark summary. The results did not suggest, however, that the stable text signals helped readers produce better summaries. Readers may not respond to these collaborative text signals as they would to authorial text signals, which previous research has shown to be beneficial (Lorch, 1989). The CoREAD project has demonstrated that readers can produce stable and valid text signals through an unplanned, self-organising process.
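The aggregation behind such collaborative text signals can be sketched as follows: each reader's sentence selections are pooled, and the share of readers selecting a sentence determines its typographical emphasis. The thresholds and the emphasis mapping here are invented for illustration, not CoREAD's actual scheme:

```python
# Toy sketch of aggregating readers' sentence selections into
# collaborative "text signals": sentences selected by a larger share
# of readers get stronger typographical emphasis. Thresholds invented.
from collections import Counter

def signal_levels(selections, n_readers, high=0.6, low=0.3):
    """selections: one set of selected sentence ids per reader."""
    counts = Counter()
    for chosen in selections:
        counts.update(chosen)
    levels = {}
    for sid, n in counts.items():
        share = n / n_readers
        if share >= high:
            levels[sid] = "bold"      # high-importance signal
        elif share >= low:
            levels[sid] = "italic"    # medium-importance signal
        else:
            levels[sid] = "plain"
    return levels

# Five readers, each selecting a few sentence ids from the text.
readers = [{1, 2}, {1, 3}, {1, 2, 4}, {1}, {2}]
levels = signal_levels(readers, n_readers=len(readers))
print(sorted(levels.items()))
```

Because the signals are recomputed after each new reader, early selections can be overturned by later ones, which is the self-organising behaviour whose stabilisation the thesis measures.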
Levefeldt, Christer. "Evaluation of NETtalk as a means to extract phonetic features from text for synchronization with speech." Thesis, University of Skövde, Department of Computer Science, 1998. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-173.
Full textThe background for this project is a wish to automate synchronization of text and speech. The idea is to present speech through speakers synchronized word-for-word with text appearing on a monitor.
The solution decided upon is to use artificial neural networks (ANNs) to convert both text and speech into streams made up of sets of phonetic features, and then to match these two streams against each other. Several text-to-feature ANN designs based on the NETtalk system are implemented and evaluated. The extraction of phonetic features from speech and the synchronization itself are not implemented, but some assessments are made of their likely performance. The performance of a finished system cannot be determined here, but a NETtalk-based ANN is believed to be suitable for such a system using phonetic features for synchronization.
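NETtalk-style text-to-feature networks take as input a fixed-width window of characters centred on the letter being pronounced, encoded as one-hot units. The following is a minimal, simplified sketch of that input representation (the real NETtalk uses a seven-letter window feeding a trained network; the functions and alphabet here are illustrative assumptions):

```python
def char_windows(text, radius=3, pad="_"):
    """Yield fixed-width character windows centred on each letter,
    the input representation used by NETtalk-style text-to-phoneme ANNs."""
    padded = pad * radius + text.lower() + pad * radius
    for i in range(len(text)):
        yield padded[i : i + 2 * radius + 1]

def one_hot(window, alphabet="_abcdefghijklmnopqrstuvwxyz"):
    """Encode a window as a flat binary vector: one unit per symbol per slot."""
    vec = []
    for ch in window:
        unit = [0] * len(alphabet)
        unit[alphabet.index(ch)] = 1
        vec.extend(unit)
    return vec

windows = list(char_windows("text"))
print(windows)   # ['___text', '__text_', '_text__', 'text___']
vec = one_hot(windows[0])
print(len(vec))  # 7 slots x 27 symbols = 189 input units
```

Each 189-unit vector would then be fed to the network, whose output layer encodes the phonetic features of the centre letter.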
Dall, Rasmus. "Statistical parametric speech synthesis using conversational data and phenomena." Thesis, University of Edinburgh, 2017. http://hdl.handle.net/1842/29016.
Full text
Abade, André da Silva. "Uma abordagem de teste estrutural de transformações M2T baseada em hipergrafos." Universidade Federal de São Carlos, 2016. https://repositorio.ufscar.br/handle/ufscar/8721.
Full text
No funding was received.
Context: MDD (Model-Driven Development) is a software development paradigm in which the main artefacts are models, from which source code or other artefacts are generated. Although MDD allows different views of how to decompose a problem and how to design software to solve it, the paradigm introduces new challenges related to the input models, the transformations, and the output artefacts. Problem Statement: Software testing is therefore a fundamental activity for revealing defects and improving confidence in software products developed in this context. Several testing techniques and criteria have been proposed and investigated. Among them, functional testing has been explored extensively, primarily for M2M (Model-to-Model) transformations, whereas structural testing of M2T (Model-to-Text) transformations still poses challenges and lacks appropriate approaches. Objective: This work presents a proposal for the structural testing of M2T transformations through the characterisation of the input models as complex data, the templates, and the output artefacts involved in this process. Method: The proposed approach is organised into five phases. Its strategy represents the complex data (grammars and metamodels) as directed hypergraphs, so that a combinatorial traversal algorithm can create subsets of the input models to serve as test cases for the M2T transformations. From this perspective, we carried out two exploratory studies to analyse the feasibility of the proposed approach. Results and Conclusion: The evaluation of the exploratory studies, through the analysis of several testing coverage criteria, demonstrated the relevance and feasibility of the approach for characterising complex data for M2T transformation testing. Moreover, structuring the testing strategy in phases enables activities to be revised and adjusted, and assists in replicating the approach in different applications that use the MDD paradigm.
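The test-generation idea in the abstract above, representing metamodel elements as nodes of a directed hypergraph and traversing it combinatorially to derive input-model subsets, can be illustrated with a minimal sketch. The node names, hyperedges, and coverage rule below are invented for illustration; the thesis's actual algorithm and data structures are richer.

```python
from itertools import combinations

# A directed hypergraph: each hyperedge maps a set of tail nodes to head nodes.
# Nodes stand for metamodel elements; hyperedges for structural relations.
hyperedges = {
    "e1": ({"Class"}, {"Attribute", "Operation"}),
    "e2": ({"Attribute"}, {"Type"}),
    "e3": ({"Operation"}, {"Parameter", "Type"}),
}

def covering_subsets(hyperedges, size=2):
    """Enumerate node subsets of a given size that exercise a hyperedge
    (full tail plus part of its head), as candidate test-model fragments."""
    nodes = set()
    for tails, heads in hyperedges.values():
        nodes |= tails | heads
    cases = []
    for subset in combinations(sorted(nodes), size):
        s = set(subset)
        for name, (tails, heads) in hyperedges.items():
            if tails <= s and s & heads:
                cases.append((name, subset))
    return cases

cases = covering_subsets(hyperedges, size=2)
for edge, fragment in cases:
    print(edge, fragment)  # each fragment is one candidate M2T test input
```

Each emitted fragment would be materialised as a small input model and fed through the M2T transformation, with the covered hyperedges serving as a structural coverage measure.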
Mhlana, Siphe. "Development of isiXhosa text-to-speech modules to support e-Services in marginalized rural areas." Thesis, University of Fort Hare, 2011. http://hdl.handle.net/10353/495.
Full text
Dol, Zulkifli. "A strategy for a systematic approach to biomarker discovery validation : a study on lung cancer microarray data set." Thesis, University of Manchester, 2015. https://www.research.manchester.ac.uk/portal/en/theses/a-strategy-for-a-systematic-approach-to-biomarker-discovery-validation--a-study-on-lung-cancer-microarray-data-set(8e439385-27d1-44ac-8b20-259b4a8f6716).html.
Full text
Alshaer, Mohammad. "An Efficient Framework for Processing and Analyzing Unstructured Text to Discover Delivery Delay and Optimization of Route Planning in Realtime." Thesis, Lyon, 2019. http://www.theses.fr/2019LYSE1105/document.
Full text
Internet of Things (IoT) is leading to a paradigm shift within the logistics industry. The advent of IoT has been changing the logistics service management ecosystem. Logistics service providers today use sensor technologies such as GPS or telemetry to collect data in real time while a delivery is in progress. Real-time data collection enables service providers to track and manage their shipment processes efficiently; its key advantage is that it allows providers to act proactively and prevent outcomes such as delivery delays caused by unexpected or unknown events. Furthermore, providers increasingly use data from external sources such as Twitter, Facebook, and Waze, because these sources provide critical information about events such as traffic, accidents, and natural disasters. Data from such external sources enrich the dataset and add value to the analysis, and collecting them in real time makes on-the-fly analysis possible, so that unexpected outcomes (such as delivery delays) can be prevented at run time. However, the data are collected raw and must be processed before effective analysis. Collecting and processing data in real time is an enormous challenge, mainly because the data stem from heterogeneous sources at very high speed. This speed and variety make complex processing operations such as cleansing, filtering, and handling incorrect data difficult, and the mix of structured, semi-structured, and unstructured data complicates processing in both batch style and real time, since different types of data may require different techniques. A technical framework that enables the processing of such heterogeneous data is heavily challenging and not currently available.
In addition, performing data processing operations in real time is heavily challenging; efficient techniques are required to handle high-speed data, which cannot be done with conventional logistics information systems. Therefore, to exploit Big Data in logistics service processes, an efficient solution for collecting and processing data in both real time and batch style is critically important. In this thesis, we developed and experimented with two data processing solutions: SANA and IBRIDIA. SANA is built on a Multinomial Naïve Bayes classifier, whereas IBRIDIA relies on Johnson's hierarchical clustering (HCL) algorithm, a hybrid technology that enables data collection and processing in both batch style and real time. SANA is a service-based solution that deals with unstructured data; it serves as a multi-purpose system for extracting relevant events, including their context (place, location, time, etc.), and can also perform text analysis over the targeted events. IBRIDIA was designed to process unknown data stemming from external sources and to cluster them on the fly in order to gain an understanding of the data, which assists in extracting events that may lead to delivery delay. According to our experiments, both approaches show a unique ability to process logistics data. However, SANA is more promising, since its underlying technology (the Naïve Bayes classifier) outperformed IBRIDIA in our performance measurements. SANA generates a knowledge graph from events as they are collected in real time, without any need to wait, thus drawing maximum benefit from those events, whereas IBRIDIA is valuable within the logistics domain for identifying the most influential category of events affecting delivery. Unfortunately, IBRIDIA must wait for a minimum number of events to arrive and always suffers a cold start. Because we are interested in re-optimising the route on the fly, we adopted SANA as our data processing framework.
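As a rough illustration of the kind of classifier SANA builds on, the sketch below trains a Multinomial Naïve Bayes model with add-one smoothing to label short free-text reports as delay-related or not. The example messages, labels, and function names are made up for this sketch; this is not the thesis's actual pipeline.

```python
import math
from collections import Counter, defaultdict

def train_nb(messages, labels):
    """Train a Multinomial Naive Bayes text classifier (add-one smoothing)."""
    word_counts = defaultdict(Counter)   # per-class word frequencies
    class_counts = Counter(labels)
    vocab = set()
    for text, label in zip(messages, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, class_counts, vocab

def predict_nb(model, text):
    """Return the class with the highest smoothed log posterior."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # class prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in text.lower().split():
            if token in vocab:           # ignore out-of-vocabulary words
                score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best, best_score = label, score
    return best

# Hypothetical delay-related (1) vs. normal (0) logistics messages.
messages = [
    "heavy traffic jam on the highway",
    "accident blocking two lanes near the depot",
    "road closed due to flooding",
    "package picked up on schedule",
    "delivery completed without incident",
    "driver departed the warehouse on time",
]
labels = [1, 1, 1, 0, 0, 0]

model = train_nb(messages, labels)
print(predict_nb(model, "major accident reported on the route"))  # 1
print(predict_nb(model, "shipment delivered on time"))            # 0
```

In a SANA-like pipeline, each incoming message would be classified this way as it arrives, and the extracted delay events would then feed the route re-optimisation step.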
Shatnawi, Safwan. "A data mining approach to ontology learning for automatic content-related question-answering in MOOCs." Thesis, Robert Gordon University, 2016. http://hdl.handle.net/10059/2122.
Full text
Spens, Henrik, and Johan Lindgren. "Using cloud services and machine learning to improve customer support : Study the applicability of the method on voice data." Thesis, Uppsala universitet, Avdelningen för systemteknik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-340639.
Full text
Milosevic, Nikola. "A multi-layered approach to information extraction from tables in biomedical documents." Thesis, University of Manchester, 2018. https://www.research.manchester.ac.uk/portal/en/theses/a-multilayered-approach-to-information-extraction-from-tables-in-biomedical-documents(c2edce9c-ae7f-48fa-81c2-14d4bb87423e).html.
Full text
Zhang, Zhuo. "A planning approach to migrating domain-specific legacy systems into service oriented architecture." Thesis, De Montfort University, 2012. http://hdl.handle.net/2086/9020.
Full text
Van Niekerk, Daniel Rudolph. "Automatic speech segmentation with limited data / by D.R. van Niekerk." Thesis, North-West University, 2009. http://hdl.handle.net/10394/3978.
Full text
Thesis (M.Ing. (Computer Engineering))--North-West University, Potchefstroom Campus, 2009.
Green, Charles A. "An empirical study on the effects of a collaboration-aware computer system and several communication media alternatives on product quality and time to complete in a co-authoring environment." Thesis, Virginia Tech, 1992. http://hdl.handle.net/10919/40617.
Full text
Master of Science
Munnecom, Lorenna, and Miguel Chaves de Lemos Pacheco. "Exploration of an Automated Motivation Letter Scoring System to Emulate Human Judgement." Thesis, Högskolan Dalarna, Mikrodataanalys, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:du-34563.
Full text