
Dissertations / Theses on the topic 'Discovery Learning model'


Consult the top 37 dissertations / theses for your research on the topic 'Discovery Learning model.'


1

Hedden, Chet. "A guided exploration model of problem-solving discovery learning." Thesis, UW restricted, 1998. http://hdl.handle.net/1773/7683.

2

Miller, Chreston. "Structural Model Discovery in Temporal Event Data Streams." Diss., Virginia Tech, 2013. http://hdl.handle.net/10919/19341.

Abstract:
This dissertation presents a unique approach to human behavior analysis based on expert guidance and intervention through interactive construction and modification of behavior models. Our focus is to introduce the research area of behavior analysis, the challenges faced by this field, and the approaches currently available, and to present a new analysis approach: Interactive Relevance Search and Modeling (IRSM). More intelligent ways of conducting data analysis have been explored in recent years. Machine learning and data mining systems that utilize pattern classification and discovery in non-textual data promise to bring new generations of powerful "crawlers" for knowledge discovery, e.g., face detection and crowd surveillance. Many aspects of data can be captured by such systems, e.g., temporal information and extractable visual information such as color, contrast, and shape. However, these captured aspects may not uncover all salient information in the data or provide adequate models/patterns of phenomena of interest. This is a challenging problem for social scientists who are trying to identify high-level, conceptual patterns of human behavior from observational data (e.g., media streams). The presented research addresses how social scientists may derive patterns of human behavior captured in media streams. Currently, media streams are being segmented into sequences of events describing the actions captured in the streams, such as the interactions among humans. This segmentation creates a challenging data space to search, characterized by non-numerical, temporal, descriptive data, e.g., "Person A walks up to Person B at time T". This dissertation presents an approach that allows one to interactively search, identify, and discover temporal behavior patterns within such a data space. The research therefore addresses supporting exploration and discovery in behavior analysis through a formalized method of assisted exploration. The model evolution presented supports the refinement of the observer's behavior models into representations of their understanding. The benefit of the new approach is shown through experiments on its identification accuracy and through work with fellow researchers to verify the approach's legitimacy in the analysis of their data.
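Purely to make the underlying search problem concrete (this is not the IRSM system itself, and all event names are invented), a minimal Python sketch of finding a temporal pattern such as "A walks up to B, then A speaks to B within ten seconds" in a time-ordered event sequence:

```python
from dataclasses import dataclass

@dataclass
class Event:
    time: float      # seconds into the media stream
    actor: str
    action: str
    target: str

def match_sequence(events, pattern, max_gap=10.0):
    """Find occurrences of `pattern` (a list of (actor, action, target)
    templates) as a subsequence of time-ordered `events`, with at most
    `max_gap` seconds between consecutive matched events."""
    matches = []

    def extend(idx, step, last_time, matched):
        if step == len(pattern):
            matches.append(list(matched))
            return
        for i in range(idx, len(events)):
            ev = events[i]
            if step > 0 and ev.time - last_time > max_gap:
                break  # events are time-ordered, so the gap only grows
            if (ev.actor, ev.action, ev.target) == pattern[step]:
                matched.append(ev)
                extend(i + 1, step + 1, ev.time, matched)
                matched.pop()

    extend(0, 0, 0.0, [])
    return matches

stream = [Event(1.0, "A", "walks_up_to", "B"),
          Event(4.5, "A", "speaks_to", "B"),
          Event(30.0, "A", "walks_up_to", "B")]
print(match_sequence(stream, [("A", "walks_up_to", "B"),
                              ("A", "speaks_to", "B")]))
```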
3

Kayesh, Humayun. "Deep Learning for Causal Discovery in Texts." Thesis, Griffith University, 2022. http://hdl.handle.net/10072/415822.

Abstract:
Causality detection in text data is a challenging natural language processing task. It is a trivial task for human beings, who acquire vast background knowledge throughout their lifetimes: a human knows from experience that heavy rain may cause floods or that plane accidents may cause deaths. However, it is challenging to automatically detect such causal relationships in texts due to the limited contextual information available and the unstructured nature of texts. The task is even more challenging for social media short texts such as tweets, which are often informal, short, and grammatically incorrect. Generating hand-crafted linguistic rules is an option, but it is not always effective for detecting causal relationships in text because such rules are rigid and require grammatically correct sentences. Also, the rules are often domain-specific and not always portable to another domain. Therefore, supervised learning techniques are more appropriate in this scenario. Traditional machine learning-based models also suffer from the high-dimensional features of texts. This is why deep learning-based approaches are becoming increasingly popular for natural language processing tasks such as causality detection. However, deep learning models often require large datasets with high-quality features to perform well. Extracting deeply-learnable causal features and applying them to a carefully designed deep learning model is important. Also, preparing a large human-labeled training dataset is expensive and time-consuming. Even if a large training dataset is available, it is computationally expensive to train a deep learning model due to the complex structure of neural networks. We focus on addressing the following challenges: (i) extracting high-quality causal features, (ii) designing an effective deep learning model to learn from the causal features, and (iii) reducing the dependency on large training datasets. Our main goals in this thesis are as follows: (i) to study in depth the different aspects of causality and causal discovery in text; (ii) to develop strategies to model causality in text; and (iii) to develop frameworks for designing effective and efficient deep neural network structures to discover causality in texts.
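As a hedged illustration of the supervised formulation argued for above, here is a toy bag-of-words baseline of the kind deep models are typically compared against (not the author's architecture; the labelled examples are invented):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labelled data: 1 = sentence expresses causality, 0 = it does not.
texts = ["heavy rain caused the flood",
         "the plane crash led to several deaths",
         "flood caused by days of heavy rain",
         "we met at the airport yesterday",
         "the rain was heavy all week",
         "many people attended the ceremony"]
labels = [1, 1, 1, 0, 0, 0]

# TF-IDF features feed a linear classifier; deep models replace both stages.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(texts, labels)
print(clf.predict(["the storm led to massive flooding"]))
```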
4

Rijn, Dirk Hendrik van. "Exploring the limited effect of inductive discovery learning: computational models and model-based analyses." Amsterdam: EPOS, Experimenteel-Psychologische Onderzoekschool; Universiteit van Amsterdam, 2003. http://dare.uva.nl/document/68567.

5

Tuovinen, L. (Lauri). "From machine learning to learning with machines: remodeling the knowledge discovery process." Doctoral thesis, Oulun yliopisto, 2014. http://urn.fi/urn:isbn:9789526205243.

Abstract:
Knowledge discovery (KD) technology is used to extract knowledge from large quantities of digital data in an automated fashion. The established process model represents the KD process in a linear and technology-centered manner, as a sequence of transformations that refine raw data into more and more abstract and distilled representations. Any actual KD process, however, has aspects that are not adequately covered by this model. In particular, some of the most important actors in the process are not technological but human, and the operations associated with these actors are interactive rather than sequential in nature. This thesis proposes an augmentation of the established model that addresses this neglected dimension of the KD process. The proposed process model is composed of three sub-models: a data model, a workflow model, and an architectural model. Each sub-model views the KD process from a different angle: the data model examines the process from the perspective of different states of data and the transformations that convert data from one state to another, the workflow model describes the actors of the process and the interactions between them, and the architectural model guides the design of software for the execution of the process. For each of the sub-models, the thesis first defines a set of requirements, then presents the solution designed to satisfy the requirements, and finally re-examines the requirements to show how they are accounted for by the solution. The principal contribution of the thesis is a broader perspective on the KD process than the current mainstream view. The augmented KD process model proposed by the thesis makes use of the established model, but expands it by gathering data management and knowledge representation, KD workflow, and software architecture under a single unified model. Furthermore, the proposed model considers issues that are usually either overlooked or treated as separate from the KD process, such as the philosophical aspect of KD. The thesis also discusses a number of technical solutions to individual sub-problems of the KD process, including two software frameworks and four case-study applications that serve as concrete implementations and illustrations of several key features of the proposed process model.
6

Zhang, Xuan. "Product Defect Discovery and Summarization from Online User Reviews." Diss., Virginia Tech, 2018. http://hdl.handle.net/10919/85581.

Abstract:
Product defects concern various groups of people, such as customers, manufacturers, and government officials. Thus, defect-related knowledge and information are essential. In keeping with the growth of social media, online forums, and Internet commerce, people post a vast amount of feedback on products, which forms a good source for the automatic acquisition of knowledge about defects. However, considering the vast volume of online reviews, how to automatically identify critical product defects and summarize the related information from the huge number of user reviews is challenging, even when we target only the negative reviews. As a kind of opinion mining research, existing defect discovery methods focus mainly on classifying the type of product issue, which is not enough for users. People expect to see defect information in multiple facets, such as product model, component, and symptom, which are necessary to understand the defects and quantify their influence. In addition, people are eager to seek problem resolutions once they spot defects. These challenges cannot be solved by existing aspect-oriented opinion mining models, which seldom consider the defect entities mentioned above. Furthermore, users also want to better capture the semantics of review text, and to summarize product defects more accurately in the form of natural language sentences. However, existing text summarization models, including neural networks, can hardly generalize to user review summarization due to the lack of labeled data. In this research, we explore topic models and neural network models for product defect discovery and summarization from user reviews. Firstly, a generative Probabilistic Defect Model (PDM) is proposed, which models the generation process of user reviews from key defect entities including product Model, Component, Symptom, and Incident Date. Using the joint topics in these aspects, which are produced by PDM, people can discover defects represented by those entities. Secondly, we devise a Product Defect Latent Dirichlet Allocation (PDLDA) model, which describes how negative reviews are generated from defect elements like Component, Symptom, and Resolution. The interdependency between these entities is modeled by PDLDA as well. PDLDA answers not only what the defects look like, but also how to address them using the crowd wisdom hidden in user reviews. Finally, the problem of how to summarize user reviews more accurately, and better capture the semantics in them, is studied using deep neural networks, especially Hierarchical Encoder-Decoder Models. For each of the research topics, comprehensive evaluations are conducted on heterogeneous datasets to justify the effectiveness and accuracy of the proposed models. Further, on the theoretical side, this research contributes to the research streams on product defect discovery, opinion mining, probabilistic graphical models, and deep neural network models. Regarding impact, these techniques will benefit related users such as customers, manufacturers, and government officials.
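PDM and PDLDA are custom graphical models defined in the thesis; as a rough generic stand-in, a plain LDA sketch shows how topic modelling can surface defect-like themes from toy negative reviews (all review text invented):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

reviews = ["engine stalls when idling, dealer could not fix it",
           "engine stalls at low speed after cold start",
           "brake pedal feels soft and the car pulls left",
           "soft brake pedal, pads replaced twice already",
           "screen freezes and the radio reboots randomly",
           "radio screen freezes during navigation"]

vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(reviews)                      # bag-of-words counts
lda = LatentDirichletAllocation(n_components=3, random_state=0).fit(X)

# Each topic's highest-weight terms sketch a candidate defect theme.
terms = vec.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    top = topic.argsort()[-4:][::-1]
    print(f"defect topic {k}:", ", ".join(terms[i] for i in top))
```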
7

Liang, Wen. "Integrated feature, neighbourhood, and model optimization for personalised modelling and knowledge discovery." Thesis, 2009. http://hdl.handle.net/10292/749.

Abstract:
“Machine learning is the process of discovering and interpreting meaningful information, such as new correlations, patterns and trends by sifting through large amounts of data stored in repositories, using pattern recognition technologies as well as statistical and mathematical techniques” (Larose, 2005). From my understanding, machine learning is a process of using different analysis techniques to observe previously unknown, potentially meaningful information, and to discover strong patterns and relationships in a large dataset. Professor Kasabov (2007b) classified computational models into three categories (global, local, and personalised), which have been widely used in the areas of data analysis and decision support in general, and in the areas of medicine and bioinformatics in particular. Most recently, the concept of personalised modelling has been widely applied to various disciplines such as personalised medicine and personalised drug design for known diseases (e.g. cancer, diabetes, brain disease), as well as to other modelling problems in ecology, business, finance, crime prevention, and so on. The philosophy behind the personalised modelling approach is that every person is different, and will thus benefit from having a personalised model and treatment. However, personalised modelling is not without issues, such as defining the correct number of neighbours or an appropriate number of features. As a result, the principal goal of this research is to study and address these issues and to create a novel framework and system for personalised modelling. The framework would allow users to select and optimise the most important features and nearest neighbours for a new input sample in relation to a certain problem, based on a weighted variable distance measure, in order to obtain more precise prognostic accuracy and personalised knowledge when compared with global modelling and local modelling approaches.
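A minimal sketch of the weighted-distance nearest-neighbour idea behind personalised modelling: the feature weights and neighbour count are exactly the quantities the thesis proposes to optimise per sample, and the values below are illustrative only:

```python
import numpy as np

def personalised_predict(x_new, X, y, w, k=3):
    """Predict for one sample from its k nearest neighbours under a
    feature-weighted Euclidean distance (majority vote)."""
    d = np.sqrt(((X - x_new) ** 2 * w).sum(axis=1))
    nearest = np.argsort(d)[:k]
    return np.bincount(y[nearest]).argmax(), nearest

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = (X[:, 0] + 0.1 * rng.normal(size=50) > 0).astype(int)
w = np.array([1.0, 0.1, 0.1, 0.1])   # up-weight the one informative feature
label, neighbours = personalised_predict(np.array([0.8, 0, 0, 0]), X, y, w)
print(label, neighbours)
```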
8

Prokopp, Christian Werner. "Semantic service discovery in the service ecosystem." Thesis, Queensland University of Technology, 2011. https://eprints.qut.edu.au/50872/1/Christian_Prokopp_Thesis.pdf.

Abstract:
Electronic services are a leitmotif in 'hot' topics like Software as a Service, Service Oriented Architecture (SOA), Service-oriented Computing, Cloud Computing, application markets, and smart devices. We propose to consider these in what has been termed the Service Ecosystem (SES). The SES encompasses all levels of electronic services and their interaction, with human consumption and initiation on its periphery, in much the same way the 'Web' describes a plethora of technologies that eventuate to connect information and expose it to humans. Presently, the SES is heterogeneous, fragmented, and confined to semi-closed systems. A key issue hampering the emergence of an integrated SES is Service Discovery (SD). A SES will be dynamic, with areas of structured and unstructured information within which service providers and 'lay' human consumers interact; until now the two have been disjointed, e.g., SOA-enabled organisations, industries, and domains are choreographed by domain experts or 'hard-wired' to smart device application markets and web applications. In a SES, services are accessible, comparable, and exchangeable for human consumers, closing the gap to the providers. This requires a new form of SD with which humans can discover services transparently and effectively, without special knowledge or training. We propose two modes of discovery: directed search, which follows an agenda, and explorative search, which speculatively expands knowledge of an area of interest by means of categories. Inspired by conceptual space theory from cognitive science, we propose to implement the modes of discovery using concepts to map a lay consumer's service need to terminologically sophisticated descriptions of services. To this end, we reframe SD as an information retrieval task on the information attached to services, such as descriptions, reviews, documentation, and web sites: the Service Information Shadow. The Semantic Space model transforms the shadow's unstructured semantic information into a geometric, concept-like representation. We introduce an improved and extended Semantic Space including categorization, calling it the Semantic Service Discovery model. We evaluate our model with a highly relevant, service-related corpus simulating a Service Information Shadow, including manually constructed complex service agendas as well as manual groupings of services. We compare our model against state-of-the-art information retrieval systems and clustering algorithms. By means of an extensive series of empirical evaluations, we establish optimal parameter settings for the semantic space model. The evaluations demonstrate the model's effectiveness for SD in terms of retrieval precision over state-of-the-art information retrieval models (directed search) and the meaningful, automatic categorization of service-related information, which shows potential to form the basis of a useful, cognitively motivated map of the SES for exploratory search.
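The Semantic Space model is related to latent semantic analysis; as a hedged sketch of directed search over a toy Service Information Shadow (plain LSA plus cosine ranking, not the thesis's exact model, with invented service descriptions):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

services = ["send and track parcels to international destinations",
            "online invoicing and bookkeeping for small businesses",
            "stream films and television series on demand",
            "book short-term parcel storage lockers in the city"]

vec = TfidfVectorizer()
X = vec.fit_transform(services)
svd = TruncatedSVD(n_components=2, random_state=0)   # the "semantic space"
S = svd.fit_transform(X)

# A lay query is mapped into the same space and services are ranked by
# geometric proximity, rather than by exact keyword overlap.
query = svd.transform(vec.transform(["ship a package abroad"]))
scores = cosine_similarity(query, S)[0]
for i in scores.argsort()[::-1]:
    print(f"{scores[i]:.2f}  {services[i]}")
```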
9

Kučera, Petr. "Meta-učení v oblasti dolování dat [Meta-learning in the field of data mining]." Master's thesis, Vysoké učení technické v Brně, Fakulta informačních technologií, 2013. http://www.nusl.cz/ntk/nusl-236213.

Abstract:
This work describes the use of meta-learning in the area of data mining. It describes the problems and tasks of data mining where meta-learning can be applied, with a focus on classification. It provides an overview of meta-learning techniques and their possible application in data mining, especially for model selection. It describes the design and implementation of a meta-learning system to support classification tasks in data mining. The system uses statistics and information theory to characterize data sets stored in a meta-knowledge base. A meta-classifier is created from this base and predicts the most suitable model for a new data set. The conclusion discusses the results of experiments with more than 20 data sets representing classification tasks from different areas, and suggests possible extensions of the project.
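A minimal sketch of the meta-learning loop described above: characterise datasets with simple statistics, record which base model wins cross-validation, then train a meta-classifier to recommend a model for an unseen dataset (the meta-features and base learners here are illustrative choices, not the thesis's exact set):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def meta_features(X, y):
    """Simple statistical/information-theoretic dataset characterisation."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    class_entropy = -(p * np.log2(p)).sum()
    mean_abs_corr = np.mean(np.abs(np.corrcoef(X, rowvar=False)))
    return [X.shape[0], X.shape[1], class_entropy, mean_abs_corr]

base = {"logreg": LogisticRegression(max_iter=500),
        "tree": DecisionTreeClassifier()}

M, best = [], []                        # the meta-knowledge base
for seed in range(10):
    X, y = make_classification(n_samples=200, n_features=8, random_state=seed)
    M.append(meta_features(X, y))
    scores = {n: cross_val_score(m, X, y, cv=3).mean() for n, m in base.items()}
    best.append(max(scores, key=scores.get))

meta_clf = RandomForestClassifier(random_state=0).fit(M, best)
X_new, y_new = make_classification(n_samples=300, n_features=8, random_state=99)
print("recommended model:", meta_clf.predict([meta_features(X_new, y_new)])[0])
```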
10

Akpakpan, Nsikak Etim. "Analytic Extensions to the Data Model for Management Analytics and Decision Support in the Big Data Environment." ScholarWorks, 2018. https://scholarworks.waldenu.edu/dissertations/5538.

Abstract:
From 2006 to 2016, an estimated average of 50% of big data analytics and decision support projects failed to deliver acceptable and actionable outputs to business users. The resulting management inefficiency came at high cost, with wasted investments estimated at $2.7 trillion in 2016 for companies in the United States. The purpose of this quantitative descriptive study was to examine the data model of a typical data analytics project in a big data environment for opportunities to improve the information created for management problem-solving. The research questions focused on finding artifacts within enterprise data with which to model key business scenarios for management action. The foundations of the study were information and decision sciences theories, especially information entropy and high-dimensional utility theories. Design-based research in a nonexperimental format was used to examine the data model for the functional forms that mapped the available data to the conceptual formulation of the management problem, by combining ontology learning, data engineering, and analytic formulation methodologies. Semantic, symbolic, and dimensional extensions emerged as key functional forms of analytic extension of the data model. The data-modeling approach was applied to a 15-terabyte secondary data set from a multinational medical product distribution company with a profit-growth problem. The extended data model simplified the composition of acceptable analytic insights, the derivation of business solutions, and the design of programs to address the ill-defined management problem. The implication for positive social change was the potential for overall improvement in management efficiency and increased participation in advocacy and sponsorship of social initiatives.
11

Gross, Tadeu Junior. "Structure learning of Bayesian networks via data perturbation." Universidade de São Paulo, 2018. http://www.teses.usp.br/teses/disponiveis/18/18153/tde-19022019-134517/.

Abstract:
Structure learning of Bayesian Networks (BNs) is an NP-hard problem, and the use of sub-optimal strategies is essential in domains involving many variables. One such strategy is to generate multiple approximate structures and then reduce the ensemble to a single representative structure. It is possible to use the occurrence frequency (over the ensemble of structures) as the criterion for accepting a dominant directed edge between two nodes and thus obtaining the single structure. In this doctoral research, an analogy with an adapted one-dimensional random walk was used to analytically deduce an appropriate decision threshold for this occurrence frequency. The closed-form expression obtained was validated across benchmark datasets using the Matthews Correlation Coefficient as the performance metric. In experiments on a recent medical dataset, the BN resulting from the analytical cutoff frequency captured the expected associations among nodes and also achieved better prediction performance than BNs learned with thresholds neighbouring the computed one. In the literature, the feature counted across the perturbed structures has been the edges, not the directed edges (arcs) as in this thesis. This modified strategy was also applied to a dataset of elderly individuals to identify potential relationships between variables of medical interest, but using a threshold higher than the one predicted by the proposed formula; such prudence is due to the possible social implications of the findings. The motivation behind this application is that, although the proportion of elderly individuals in the population has increased substantially in the last few decades, the risk factors that should be managed in advance to ensure a natural process of mental decline due to ageing remain unknown. In the learned structural model, the probabilistic dependence mechanism between two variables of medical interest was investigated graphically: the suspected risk factor known as Metabolic Syndrome and the indicator of mental decline referred to as Cognitive Impairment. In this investigation, the concept known in the context of BNs as d-separation was employed. The results of the study revealed that the dependence between Metabolic Syndrome and cognitive variables does exist and depends on both Body Mass Index and age.
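A hedged sketch of the ensemble-reduction step described above: learn many structures from perturbed (here, bootstrap-resampled) data, count how often each arc occurs, and accept arcs whose frequency clears a decision threshold. The toy structure learner and the 0.5 cutoff are placeholders; the thesis derives the cutoff analytically:

```python
import numpy as np
from collections import Counter
from itertools import combinations

def toy_learn_structure(X, corr_threshold=0.3):
    """Stand-in for a real BN structure learner (e.g. score-based hill
    climbing): adds an arc i -> j between strongly correlated variables,
    oriented towards the higher-index node purely for illustration."""
    C = np.corrcoef(X, rowvar=False)
    return {(i, j) for i, j in combinations(range(X.shape[1]), 2)
            if abs(C[i, j]) > corr_threshold}

rng = np.random.default_rng(1)
n = 500
x0 = rng.normal(size=n)
x1 = x0 + 0.5 * rng.normal(size=n)          # ground truth: x0 -> x1
x2 = rng.normal(size=n)                     # independent variable
data = np.column_stack([x0, x1, x2])

arc_counts, n_structures = Counter(), 200
for _ in range(n_structures):
    sample = data[rng.integers(0, n, size=n)]      # perturbed dataset
    arc_counts.update(toy_learn_structure(sample))

threshold = 0.5   # placeholder for the analytically derived cutoff
print({a for a, c in arc_counts.items() if c / n_structures > threshold})
```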
12

Balfer, Jenny [author]. "Development and Interpretation of Machine Learning Models for Drug Discovery." Bonn: Universitäts- und Landesbibliothek Bonn, 2015. http://d-nb.info/1080561374/34.

13

Palmieri, Elena. "Learning declarative process models from positive and negative traces." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/22501/.

Abstract:
In recent years, the growing number of recorded events has expanded interest in process mining techniques. These techniques make it possible to learn the model of a process, to compare a recent event log with an existing model, or to enhance a process model using the information extracted from the log. Most existing process mining algorithms make use only of positive examples of a business process in order to extract its model; however, negative ones can bring major benefits. In this work, a discovery algorithm inspired by the one presented by Mooney in 1995, which takes advantage of both positive and negative sequences of actions, is presented in two versions that return declarative models expressed as disjunctive and conjunctive logic formulas, respectively.
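A minimal sketch of how negative traces can prune a declarative model: generate candidate Declare-style response(a, b) constraints and keep those satisfied by every positive trace and violated by at least one negative one (the constraint template and acceptance rule are illustrative, not the thesis's exact algorithm):

```python
from itertools import permutations

def holds_response(trace, a, b):
    """Declare-style response(a, b): every occurrence of a is
    eventually followed by an occurrence of b."""
    return all(b in trace[i + 1:] for i, ev in enumerate(trace) if ev == a)

positive = [["register", "pay", "ship"], ["register", "check", "pay", "ship"]]
negative = [["register", "ship"], ["pay", "register"]]

activities = {ev for t in positive + negative for ev in t}
model = [(a, b) for a, b in permutations(activities, 2)
         if all(holds_response(t, a, b) for t in positive)
         and any(not holds_response(t, a, b) for t in negative)]
print(model)
```

Note how the negatives discard constraints that the positive traces alone would satisfy only vacuously.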
14

Bacciu, Davide. "A perceptual learning model to discover the hierarchical latent structure of image collections." Thesis, IMT Alti Studi Lucca, 2008. http://e-theses.imtlucca.it/7/1/bacciu_phdthesis_final.pdf.

Abstract:
Biology has been an unparalleled source of inspiration for the work of researchers in several scientific and engineering fields, including computer vision. The starting point of this thesis is the neurophysiological properties of the human early visual system, in particular the cortical mechanism that mediates learning by exploiting information about stimuli repetition. Repetition has long been considered a fundamental correlate of skill acquisition and memory formation in biological as well as computational learning models. However, recent studies have shown that biological neural networks have different ways of exploiting repetition in forming memory maps. The thesis focuses on a perceptual learning mechanism called repetition suppression, which exploits the temporal distribution of neural activations to drive an efficient neural allocation for a set of stimuli. This explores the neurophysiological hypothesis that repetition suppression serves as an unsupervised perceptual learning mechanism that can drive efficient memory formation by reducing the overall size of stimuli representation while strengthening the responses of the most selective neurons. This interpretation of repetition differs from its traditional role in computational learning models, where it is mainly used to induce convergence and reach training stability, without using this information to provide focus for the neural representations of the data. The first part of the thesis introduces a novel computational model with repetition suppression, which forms an unsupervised competitive system termed CoRe, for Competitive Repetition-suppression learning. The model is applied to general problems in the fields of computational intelligence and machine learning. Particular emphasis is placed on validating the model as an effective tool for the unsupervised exploration of bio-medical data. In particular, it is shown that the repetition suppression mechanism efficiently addresses the issues of automatically estimating the number of clusters within the data, as well as filtering noise and irrelevant input components in highly dimensional data, e.g. gene expression levels from DNA microarrays. The CoRe model produces relevance estimates for each covariate, which is useful, for instance, to discover the best discriminating bio-markers. The description of the model includes a theoretical analysis using Huber's robust statistics to show that the model is robust to outliers and noise in the data. The convergence properties of the model are also studied. It is shown that, besides its biological underpinning, the CoRe model has useful properties in terms of asymptotic behavior. By exploiting a kernel-based formulation for the CoRe learning error, a theoretically sound motivation is provided for the model's ability to avoid local minima of its loss function. To do this, a necessary and sufficient condition for global error minimization in vector quantization is generalized by extending it to distance metrics in generic Hilbert spaces. This leads to the derivation of a family of kernel-based algorithms that address the local minima issue of unsupervised vector quantization in a principled way. The experimental results show that the algorithm can achieve a consistent performance gain compared with state-of-the-art learning vector quantizers, while retaining a lower computational complexity (linear with respect to the dataset size).
Bridging the gap between the low-level representation of the visual content and the underlying high-level semantics is a major research issue of current interest. The second part of the thesis focuses on this problem by introducing a hierarchical and multi-resolution approach to visual content understanding. On a spatial level, CoRe learning is used to pool together the local visual patches by organizing them into perceptually meaningful intermediate structures. On the semantic level, it provides an extension of the probabilistic Latent Semantic Analysis (pLSA) model that allows discovery and organization of the visual topics into a hierarchy of aspects. The proposed hierarchical pLSA model is shown to effectively address the unsupervised discovery of relevant visual classes from pictorial collections, at the same time learning to segment the image regions containing the discovered classes. Furthermore, by drawing on a recent pLSA-based image annotation system, the hierarchical pLSA model is extended to process and represent multi-modal collections comprising textual and visual data. The results of the experimental evaluation show that the proposed model learns to attach textual labels (available only at the level of the whole image) to the discovered image regions, while increasing the precision/recall performance with respect to the flat pLSA annotation model.
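A loose illustration of the repetition-suppression intuition, not the CoRe equations: a winner-take-all competitive layer in which units that rarely win are pruned, so the overall representation shrinks while the most selective units keep strengthening on the repeated stimuli:

```python
import numpy as np

rng = np.random.default_rng(0)
# Three repeated stimulus clusters in 2-D.
data = np.concatenate([rng.normal(loc, 0.1, size=(200, 2))
                       for loc in ([0.0, 0.0], [3.0, 3.0], [0.0, 3.0])])
rng.shuffle(data)

prototypes = rng.normal(1.5, 1.5, size=(10, 2))  # deliberately too many units
lr = 0.05
for epoch in range(5):
    wins = np.zeros(len(prototypes))
    for x in data:
        j = np.argmin(((prototypes - x) ** 2).sum(axis=1))  # competition
        prototypes[j] += lr * (x - prototypes[j])            # winner adapts
        wins[j] += 1
    # Suppression-style pruning: units that rarely win are removed,
    # shrinking the representation around the most selective units.
    prototypes = prototypes[wins / wins.sum() > 0.02]

print(len(prototypes), "units survive:\n", prototypes.round(2))
```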
15

Kamper, Herman. "Unsupervised neural and Bayesian models for zero-resource speech processing." Thesis, University of Edinburgh, 2017. http://hdl.handle.net/1842/25432.

Abstract:
Zero-resource speech processing is a growing research area which aims to develop methods that can discover linguistic structure and representations directly from unlabelled speech audio. Such unsupervised methods would allow speech technology to be developed in settings where transcriptions, pronunciation dictionaries, and text for language modelling are not available. Similar methods are required for cognitive models of language acquisition in human infants, and for developing robotic applications that are able to automatically learn language in a novel linguistic environment. There are two central problems in zero-resource speech processing: (i) finding frame-level feature representations which make it easier to discriminate between linguistic units (phones or words), and (ii) segmenting and clustering unlabelled speech into meaningful units. The claim of this thesis is that both top-down modelling (using knowledge of higher-level units to learn, discover and gain insight into their lower-level constituents) as well as bottom-up modelling (piecing together lower-level features to give rise to more complex higher-level structures) are advantageous in tackling these two problems. The thesis is divided into three parts. The first part introduces a new autoencoder-like deep neural network for unsupervised frame-level representation learning. This correspondence autoencoder (cAE) uses weak top-down supervision from an unsupervised term discovery system that identifies noisy word-like terms in unlabelled speech data. In an intrinsic evaluation of frame-level representations, the cAE outperforms several state-of-the-art bottom-up and top-down approaches, achieving a relative improvement of more than 60% over the previous best system. This shows that the cAE is particularly effective in using top-down knowledge of longer-spanning patterns in the data; at the same time, we find that the cAE is only able to learn useful representations when it is initialized using bottom-up pretraining on a large set of unlabelled speech. The second part of the thesis presents a novel unsupervised segmental Bayesian model that segments unlabelled speech data and clusters the segments into hypothesized word groupings. The result is a complete unsupervised tokenization of the input speech in terms of discovered word types: the system essentially performs unsupervised speech recognition. In this approach, a potential word segment (of arbitrary length) is embedded in a fixed-dimensional vector space. The model, implemented as a Gibbs sampler, then builds a whole-word acoustic model in this embedding space while jointly performing segmentation. We first evaluate the approach in a small-vocabulary multi-speaker connected digit recognition task, where we report unsupervised word error rates (WER) by mapping the unsupervised decoded output to ground truth transcriptions. The model achieves around 20% WER, outperforming a previous HMM-based system by about 10% absolute. To achieve this performance, the acoustic word embedding function (which maps variable-duration segments to single vectors) is refined in a top-down manner by using terms discovered by the model in an outer loop of segmentation. The third and final part of the study extends the small-vocabulary system in order to handle larger vocabularies in conversational speech data. To our knowledge, this is the first full-coverage segmentation and clustering system to be applied to large-vocabulary multi-speaker data. To improve efficiency, the system incorporates a bottom-up syllable boundary detection method to eliminate unlikely word boundaries. We compare the system on English and Xitsonga datasets to several state-of-the-art baselines. We show that by imposing a consistent top-down segmentation while also using bottom-up knowledge from detected syllable boundaries, both single-speaker and multi-speaker versions of our system outperform a purely bottom-up single-speaker syllable-based approach. We also show that the discovered clusters can be made less speaker- and gender-specific by using features from the cAE (which incorporates both top-down and bottom-up learning). The system's discovered clusters are still less pure than those of two multi-speaker unsupervised term discovery systems, but provide far greater coverage. In summary, the different models and systems presented in this thesis show that both top-down and bottom-up modelling can improve representation learning, segmentation and clustering of unlabelled speech data.
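A hedged sketch of the embedding idea at the heart of the segmental model: map variable-duration segments to fixed-dimensional vectors (here by naive uniform downsampling, a common baseline in this literature, rather than the learned embeddings) and cluster them into hypothesised word types. All data below is synthetic:

```python
import numpy as np
from sklearn.cluster import KMeans

def embed(segment, n_frames=4):
    """Fixed-dimensional embedding of a variable-length sequence of
    feature frames by uniform downsampling and flattening."""
    idx = np.linspace(0, len(segment) - 1, n_frames).astype(int)
    return np.asarray(segment)[idx].ravel()

rng = np.random.default_rng(0)
def fake_word(template, length):
    """Noisy, variable-duration realisation of a 'word' template."""
    idx = np.linspace(0, len(template) - 1, length).astype(int)
    return np.asarray(template)[idx] + 0.05 * rng.normal(size=(length, 2))

tpl_a = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], float)
tpl_b = np.array([[2, 2], [2, 3], [3, 3], [3, 2]], float)
segments = [fake_word(tpl_a, rng.integers(5, 12)) for _ in range(10)] + \
           [fake_word(tpl_b, rng.integers(5, 12)) for _ in range(10)]

E = np.stack([embed(s) for s in segments])
print(KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(E))
```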
16

Sun, Feng-Tso. "Nonparametric Discovery of Human Behavior Patterns from Multimodal Data." Research Showcase @ CMU, 2014. http://repository.cmu.edu/dissertations/359.

Abstract:
Recent advances in sensor technologies and the growing interest in context-aware applications, such as targeted advertising and location-based services, have led to a demand for understanding human behavior patterns from sensor data. People engage in routine behaviors. Automatic routine discovery goes beyond low-level activity recognition, such as sitting or standing, and analyzes human behaviors at a higher level (e.g., commuting to work). The goal of the research presented in this thesis is to automatically discover high-level semantic human routines from low-level sensor streams. One recent line of research is to mine human routines from sensor data using parametric topic models. The main shortcoming of parametric models is that they assume a fixed, pre-specified parameter regardless of the data. Choosing an appropriate parameter usually requires an inefficient trial-and-error model selection process. Furthermore, it is even more difficult to find optimal parameter values in advance for personalized applications. The research presented in this thesis offers a novel nonparametric framework for human routine discovery that can infer high-level routines without knowing the number of latent low-level activities beforehand. More specifically, the framework automatically finds the size of the low-level feature vocabulary from sensor feature vectors at the vocabulary extraction phase. At the routine discovery phase, the framework further automatically selects the appropriate number of latent low-level activities and discovers latent routines. Moreover, we propose a new generative graphical model to incorporate multimodal sensor streams for the human activity discovery task. The hypothesis and approaches presented in this thesis are evaluated on public datasets in two routine domains: two daily-activity datasets and a transportation-mode dataset. Experimental results show that our nonparametric framework can automatically learn the appropriate model parameters from multimodal sensor data without any form of manual model selection procedure, and can outperform traditional parametric approaches for human routine discovery tasks.
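A hedged sketch of the nonparametric idea using a Dirichlet-process mixture (sklearn's truncated variational version over toy 2-D "sensor features", not the thesis's model): the number of latent activities is inferred from the data rather than fixed in advance:

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
# Toy feature vectors from 3 latent "activities".
X = np.concatenate([rng.normal(m, 0.15, size=(150, 2))
                    for m in ([0, 0], [2, 2], [4, 0])])

dpgmm = BayesianGaussianMixture(
    n_components=10,                                   # upper bound only
    weight_concentration_prior_type="dirichlet_process",
    random_state=0).fit(X)

# Components whose posterior weight stays negligible are effectively unused.
active = dpgmm.weights_ > 0.01
print("activities inferred:", active.sum())            # expected: 3
```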
17

Zhou, Linfei [author], and Christian Böhm [academic supervisor]. "Indexing and knowledge discovery of Gaussian mixture models and multiple-instance learning." München: Universitätsbibliothek der Ludwig-Maximilians-Universität, 2018. http://d-nb.info/1152210807/34.

18

Godard, Pierre. "Unsupervised word discovery for computational language documentation." Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLS062/document.

Abstract:
Language diversity is under considerable pressure: half of the world's languages could disappear by the end of this century. This realization has sparked many initiatives in documentary linguistics in the past two decades, and 2019 has been proclaimed the International Year of Indigenous Languages by the United Nations, to raise public awareness of the issue and foster initiatives for language documentation and preservation. Yet documentation and preservation are time-consuming processes, and the supply of field linguists is limited. Consequently, the emerging field of computational language documentation (CLD) seeks to assist linguists by providing them with automatic processing tools. The Breaking the Unwritten Language Barrier (BULB) project, for instance, constitutes one of the efforts defining this new field, bringing together linguists and computer scientists. This thesis examines the particular problem of discovering words in an unsegmented stream of characters, or phonemes, transcribed from speech in a very-low-resource setting. This primarily involves a segmentation procedure, which can also be paired with an alignment procedure when a translation is available. Using two realistic Bantu corpora for language documentation, one in Mboshi (Republic of the Congo) and the other in Myene (Gabon), we benchmark various monolingual and bilingual unsupervised word discovery methods. We then show that using expert knowledge in the Adaptor Grammar framework can vastly improve segmentation results, and we indicate ways to use this framework as a decision tool for the linguist. We also propose a tonal variant for a strong nonparametric Bayesian segmentation algorithm, making use of a modified backoff scheme designed to capture tonal structure. To leverage the weak supervision given by a translation, we finally propose and extend an attention-based neural segmentation method, significantly improving the segmentation performance of an existing bilingual method.
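A toy sketch of the core segmentation problem: score every segmentation of an unsegmented phoneme string under a smoothed unigram lexicon and return the best one. The lexicon and probabilities below are invented; the Bayesian systems discussed above learn the lexicon jointly with the segmentation:

```python
from functools import lru_cache
import math

lexicon = {"mwa": 0.2, "ndinga": 0.1, "o": 0.1, "le": 0.1}   # toy, invented

def word_logp(w, alpha=1e-4):
    # Smoothed unigram score; unknown strings pay a per-character penalty.
    return math.log(lexicon.get(w, alpha ** len(w)))

@lru_cache(maxsize=None)
def best_seg(s):
    """Viterbi-style search over all segmentations of s."""
    if not s:
        return 0.0, []
    best = (-math.inf, None)
    for i in range(1, len(s) + 1):
        rest_lp, rest = best_seg(s[i:])
        lp = word_logp(s[:i]) + rest_lp
        if lp > best[0]:
            best = (lp, [s[:i]] + rest)
    return best

print(best_seg("mwandinga"))   # -> (score, ['mwa', 'ndinga'])
```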
19

Das, Manirupa. "Neural Methods Towards Concept Discovery from Text via Knowledge Transfer." The Ohio State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=osu1572387318988274.

20

Guimarães, Francisco José Rosales Santana. "Ontologias com suporte em metadados para interoperabilidade entre arquitetura empresarial e business intelligence [Metadata-supported ontologies for interoperability between enterprise architecture and business intelligence]." Doctoral thesis, Universidade de Évora, 2018. http://hdl.handle.net/10174/23561.

Abstract:
A company is a form of organization that can be seen as a system and, as such, is capable of being represented through a model that captures the concepts defining it, considering both its structure (e.g. client, channel, product) and its semantics (e.g. the relationship between client and channel). Based on this model it becomes possible to know the organization in a holistic way, to delineate strategies, to implement information systems and to monitor its performance. The modelling of an organization occurs in cyclical processes, as is the case of strategic planning, process design or the design of information systems solutions. In these processes, several models are used, which together represent knowledge about the organization itself. Although there are several models, in the context of our research we consider that enterprise architecture (EA) provides an aggregating reference that allows a holistic view of the organization to be created; besides being supported by notation languages (e.g. UML, ArchiMate), it can rely on tools for the management of these models, which enables standardization and reuse of components. In addition to being used from a management perspective, the EA may also be the basis for detailing functional and technical requirements for implementation in IT applications in the field of information systems. These applications can be operational, oriented to making processes more efficient and effective, or informational, referred to in this thesis as business intelligence (BI) systems, focused on monitoring performance through events that have occurred (metrics), analysed from business perspectives (dimensions). Each model, regardless of its graphical view, is instantiated in the form of data, but with definitions of this data represented in structures known as metadata. As such, metadata is typically viewed as data about data. The importance of metadata is now recognized in the fields of EA, data governance and BI. However, one of the problems is the lack of interoperability between these systems and their metadata, since a transposition between models is necessary, from the definition of the organization to the support of its functioning at the level of its information system. This lack of interoperability leads to misalignment between the concepts used in EA and their implementation in information systems, with an impact on the effort required to implement and maintain these systems. The misalignment has a greater impact in the case of BI systems, which use business concepts in the form of dimensions and metrics that should be aligned with those used in defining the organization and its strategic objectives, as drawn in the EA. This thesis focuses its investigation on this problem of interoperability in the field of knowledge representation, considering a solution architecture in which metadata is seen as an organizational ontology, modelled in EA, enriched with other concepts (e.g. glossaries, definitions of databases), reused in BI systems and usable in user interfaces based on natural language processing. In this way, the effort and complexity of managing EA and BI are reduced, making possible a better alignment between the concepts used in performance measurement and the concepts defined in the modelling of the organization, as a contribution to the alignment between business and information systems. This solution hypothesis thus defines an architecture and approach to support organizational intelligence.
APA, Harvard, Vancouver, ISO, and other styles
21

Srinivasamurthy, Ajay. "A Data-driven bayesian approach to automatic rhythm analysis of indian art music." Doctoral thesis, Universitat Pompeu Fabra, 2016. http://hdl.handle.net/10803/398986.

Full text
Abstract:
Large and growing collections of a wide variety of music are now available on demand to music listeners, necessitating novel ways of automatically structuring these collections using different dimensions of music. Rhythm is one of the basic music dimensions and its automatic analysis, which aims to extract musically meaningful rhythm related information from music, is a core task in Music Information Research (MIR). Musical rhythm, similar to most musical dimensions, is culture-specific and hence its analysis requires culture-aware approaches. Indian art music is one of the major music traditions of the world and has complexities in rhythm that have not been addressed by the current state of the art in MIR, motivating us to choose it as the primary music tradition for study. Our intent is to address unexplored rhythm analysis problems in Indian art music to push the boundaries of the current MIR approaches by making them culture-aware and generalizable to other music traditions. The thesis aims to build data-driven signal processing and machine learning approaches for automatic analysis, description and discovery of rhythmic structures and patterns in audio music collections of Indian art music. After identifying challenges and opportunities, we present several relevant research tasks that open up the field of automatic rhythm analysis of Indian art music. Data-driven approaches require well curated data corpora for research, and efforts towards creating such corpora and datasets are documented in detail. We then focus on the topics of meter analysis and percussion pattern discovery in Indian art music. Meter analysis aims to align several hierarchical metrical events with an audio recording. Meter analysis tasks such as meter inference, meter tracking and informed meter tracking are formulated for Indian art music. Different Bayesian models that can explicitly incorporate higher level metrical structure information are evaluated for the tasks and novel extensions are proposed. The proposed methods overcome the limitations of existing approaches and their performance indicates the effectiveness of informed meter analysis. Percussion in Indian art music uses onomatopoeic oral mnemonic syllables for the transmission of repertoire and technique, providing a language for percussion. We use these percussion syllables to define, represent and discover percussion patterns in audio recordings of percussion solos. We approach the problem of percussion pattern discovery using hidden Markov model based automatic transcription followed by an approximate string search using a data derived percussion pattern library. Preliminary experiments on Beijing opera percussion patterns, and on both tabla and mridangam solo recordings in Indian art music, demonstrate the utility of percussion syllables, identifying further challenges to building practical discovery systems. The technologies resulting from the research in the thesis are a part of the complete set of tools being developed within the CompMusic project for a better understanding and organization of Indian art music, aimed at providing an enriched experience with listening and discovery of music. The data and tools should also be relevant for data-driven musicological studies and other MIR tasks that can benefit from automatic rhythm analysis.
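As an illustration of the transcription-then-search pipeline described in the abstract above, the sketch below runs an edit-distance-based approximate substring search (Sellers' algorithm) over a toy syllable sequence of the kind an HMM transcriber might output. The syllable data and the concrete search routine are illustrative assumptions, not the thesis's own implementation.

```python
# Illustrative sketch: approximate search of a percussion-syllable pattern
# (e.g. tabla bols) inside an automatically transcribed solo recording.
# The transcription and query below are invented examples.

def approx_find(pattern, sequence, max_cost):
    """Sellers' algorithm: edit-distance-based approximate substring search.
    Returns (end_index, cost) of the best match of `pattern` in `sequence`,
    or (None, None) if no match within `max_cost` exists."""
    prev = [0] * (len(sequence) + 1)         # a match may start anywhere
    for i in range(1, len(pattern) + 1):
        curr = [i] + [0] * len(sequence)
        for j in range(1, len(sequence) + 1):
            curr[j] = min(prev[j - 1] + (pattern[i - 1] != sequence[j - 1]),
                          prev[j] + 1,        # deletion from pattern
                          curr[j - 1] + 1)    # insertion into pattern
        prev = curr
    end, cost = min(enumerate(prev), key=lambda t: t[1])
    return (end, cost) if cost <= max_cost else (None, None)

transcription = "dha ge na ti na ka dhin na dha tit dha ge dhin na".split()
query = "dha ge dhin na".split()
print(approx_find(query, transcription, max_cost=1))   # (14, 0)
```

Allowing a nonzero cost budget is what lets the search tolerate transcription errors, which is precisely why an approximate rather than exact string search is used.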
APA, Harvard, Vancouver, ISO, and other styles
22

Ballout, Ali. "Apprentissage actif pour la découverte d'axiomes." Electronic Thesis or Diss., Université Côte d'Azur, 2024. http://www.theses.fr/2024COAZ4026.

Full text
Abstract:
This thesis addresses the challenge of evaluating candidate logical formulas, with a specific focus on axioms, by synergistically combining machine learning with symbolic reasoning. This innovative approach facilitates the automatic discovery of axioms, primarily in the evaluation phase of generated candidate axioms. The research aims to solve the issue of efficiently and accurately validating these candidates in the broader context of knowledge acquisition on the semantic Web. Recognizing the importance of existing generation heuristics for candidate axioms, this research focuses on advancing the evaluation phase of these candidates. Our approach involves utilizing these heuristic-based candidates and then evaluating their compatibility and consistency with existing knowledge bases. The evaluation process, which is typically computationally intensive, is revolutionized by developing a predictive model that effectively assesses the suitability of these axioms as a surrogate for traditional reasoning. This innovative model significantly reduces computational demands, employing reasoning as an occasional "oracle" to classify complex axioms where necessary. Active learning plays a pivotal role in this framework. It allows the machine learning algorithm to select specific data for learning, thereby improving its efficiency and accuracy with minimal labeled data. The thesis demonstrates this approach in the context of the semantic Web, where the reasoner acts as the "oracle" and the potential new axioms represent unlabeled data. This research contributes significantly to the fields of automated reasoning, natural language processing, and beyond, opening up new possibilities in areas like bioinformatics and automated theorem proving. By effectively marrying machine learning with symbolic reasoning, this work paves the way for more sophisticated and autonomous knowledge discovery processes, heralding a paradigm shift in how we approach and leverage the vast expanse of data on the semantic Web.
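The interplay between the learner and the "oracle" described in this abstract follows the standard pool-based active learning loop with uncertainty sampling. The sketch below is a generic illustration of that loop; the synthetic features, the logistic-regression learner and the stand-in oracle function are all assumptions, with the oracle playing the role the symbolic reasoner plays in the thesis.

```python
# Generic pool-based active learning with uncertainty sampling: the model
# asks an expensive "oracle" (here a cheap stand-in; in the thesis, a
# symbolic reasoner) to label only the candidates it is least sure about.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_pool = rng.normal(size=(200, 5))          # candidate-axiom feature vectors
X_pool[0], X_pool[1] = 1.0, -1.0            # guarantee both classes in seed

def oracle(X):                              # stand-in for the reasoner
    return (X[:, 0] + X[:, 1] > 0).astype(int)

labeled = list(range(10))                   # small labeled seed set
y = {i: int(oracle(X_pool[i:i + 1])[0]) for i in labeled}

for _ in range(20):                         # limited query budget
    clf = LogisticRegression().fit(X_pool[labeled], [y[i] for i in labeled])
    unlabeled = [i for i in range(len(X_pool)) if i not in y]
    proba = clf.predict_proba(X_pool[unlabeled])[:, 1]
    q = unlabeled[int(np.argmin(np.abs(proba - 0.5)))]  # most uncertain
    y[q] = int(oracle(X_pool[q:q + 1])[0])  # consult the oracle sparingly
    labeled.append(q)

print(clf.score(X_pool, oracle(X_pool)))    # rough sanity check
```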
APA, Harvard, Vancouver, ISO, and other styles
23

Cockroft, Nicholas T. "Applications of Cheminformatics for the Analysis of Proteolysis Targeting Chimeras and the Development of Natural Product Computational Target Fishing Models." The Ohio State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=osu156596730476322.

Full text
APA, Harvard, Vancouver, ISO, and other styles
24

Pessôa, Neto Agnaldo Cavalcante de Albuquerque. "Um modelo híbrido baseado em ontologias e RBC para a concepção de um ambiente de descoberta que proporcione a aprendizagem de conceitos na formação de teorias por intermédio da metáfora de contos infantis." Universidade Federal de Alagoas, 2006. http://repositorio.ufal.br/handle/riufal/809.

Full text
Abstract:
This work presents a discovery learning model for the realization of a discovery environment (PARAGUAÇU, 1997) that shows apprentice students in science how the concepts used in the formation of scientific theories are related. The subject is approached under the assumption that it is possible to formulate scientific theories in scientific models (FRIGG; HARTMANN, 2006; RUDNER, 1969), and that these models can be made available to support such learning. With these models available, however, instead of introducing scientific terms related to a particular scientific discipline, the intention is to use the metaphor of fairy tales, that is, a vocabulary of terms through which the apprentice can understand intuitively how a scientific theory is elaborated. To formalize this scientific model, the idea proposed by the MIDES architecture (PARAGUAÇU et al., 2003) was adopted, namely the realization of a scientific model with a representation in XML (W3SCHOOLS, 2005b) in four views of knowledge: hierarchical, relational, causal and questioning. This work therefore shows how this formalization in XML is carried out; to do so, it is necessary to review the following subjects: learning environments; ontologies; case-based teaching; and some general aspects of the formation of a scientific theory and of the formulation of a theory as an axiomatic system, as well as the ideas for the elaboration of discovery learning models. With this review done, we have the necessary grounding to propose an architecture able to integrate two applications through this XML model: the first application serves a community of teachers who formulate theories in scientific models using the metaphor of fairy tales, and the second serves students who wish to learn how a theory is formed, through the models made available by the teachers' community.
APA, Harvard, Vancouver, ISO, and other styles
25

Şentürk, Sertan. "Computational analysis of audio recordings and music scores for the description and discovery of Ottoman-Turkish Makam music." Doctoral thesis, Universitat Pompeu Fabra, 2017. http://hdl.handle.net/10803/402102.

Full text
Abstract:
This thesis addresses several shortcomings of the current state-of-the-art methodologies in music information retrieval (MIR). In particular, it proposes several computational approaches to automatically analyze and describe music scores and audio recordings of Ottoman-Turkish makam music (OTMM). The main contributions of the thesis are the music corpus that has been created to carry out the research and the audio-score alignment methodology developed for the analysis of the corpus. In addition, several novel computational analysis methodologies are presented in the context of common MIR tasks of relevance for OTMM. Some example tasks are predominant melody extraction, tonic identification, tempo estimation, makam recognition, tuning analysis, structural analysis and melodic progression analysis. These methodologies become a part of a complete system called Dunya-makam for the exploration of large corpora of OTMM. The thesis starts by presenting the created CompMusic Ottoman-Turkish makam music corpus. The corpus includes 2200 music scores, more than 6500 audio recordings, and accompanying metadata. The data has been collected, annotated and curated with the help of music experts. Using criteria such as completeness, coverage and quality, we validate the corpus and show its research potential. In fact, our corpus is the largest and most representative resource of OTMM that can be used for computational research. Several test datasets have also been created from the corpus to develop and evaluate the specific methodologies proposed for different computational tasks addressed in the thesis. The part focusing on the analysis of music scores is centered on phrase and section level structural analysis. Phrase boundaries are automatically identified using an existing state-of-the-art segmentation methodology. Section boundaries are extracted using heuristics specific to the formatting of the music scores. Subsequently, a novel method based on graph analysis is used to establish similarities across these structural elements in terms of melody and lyrics, and to label the relations semiotically. The audio analysis section of the thesis reviews the state of the art for analysing the melodic aspects of performances of OTMM. It proposes adaptations of existing predominant melody extraction methods tailored to OTMM. It also presents improvements over pitch-distribution-based tonic identification and makam recognition methodologies. The audio-score alignment methodology is the core of the thesis. It addresses the culture-specific challenges posed by the musical characteristics, music-theory-related representations and oral praxis of OTMM. Based on several techniques such as subsequence dynamic time warping, Hough transform and variable-length Markov models, the audio-score alignment methodology is designed to handle the structural differences between music scores and audio recordings. The method is robust to the presence of non-notated melodic expressions, tempo deviations within the music performances, and differences in tonic and tuning. The methodology utilizes the outputs of the score and audio analysis, and links the audio and the symbolic data. In addition, the alignment methodology is used to obtain score-informed descriptions of audio recordings. The score-informed audio analysis not only simplifies the audio feature extraction steps that would otherwise require sophisticated audio processing approaches, but also substantially improves the performance compared with results obtained from the state-of-the-art methods relying solely on audio data. The analysis methodologies presented in the thesis are applied to the CompMusic Ottoman-Turkish makam music corpus and integrated into a web application aimed at culture-aware music discovery. Some of the methodologies have already been applied to other music traditions such as Hindustani, Carnatic and Greek music. Following open research best practices, all the created data, software tools and analysis results are openly available. The methodologies, the tools and the corpus itself provide vast opportunities for future research in many fields such as music information retrieval, computational musicology and music education.
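Of the alignment techniques named in the abstract, subsequence dynamic time warping is compact enough to sketch. The version below lets a short query (a score fragment) match any region of a longer target (an audio feature sequence) by making the start and end positions in the target free; the pitch-like sequences are invented stand-ins, not data from the thesis.

```python
# Minimal subsequence DTW: align a short score fragment against a longer
# performance feature sequence. Sequences here are invented 1-D contours.
import numpy as np

def subsequence_dtw(query, target):
    """Return (cost, end) of the cheapest DTW match of `query` against a
    subsequence of `target`; start and end in `target` are unconstrained."""
    n, m = len(query), len(target)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, :] = 0.0                           # the match may start anywhere
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(query[i - 1] - target[j - 1])
            D[i, j] = cost + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])
    end = int(np.argmin(D[n, 1:])) + 1      # cheapest end position
    return float(D[n, end]), end

score_line = [60, 62, 64, 62, 60]           # notated melody (MIDI-like)
performance = [55, 57, 60, 62, 64.5, 62, 60, 58, 55]
print(subsequence_dtw(score_line, performance))   # (0.5, 7)
```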
APA, Harvard, Vancouver, ISO, and other styles
26

Chen, Tso-Lin, and 陳作琳. "A Fuzzy Knowledge Discovery Model Using Fuzzy Decision Tree and Fuzzy Adaptive Learning Control Network." Thesis, 2003. http://ndltd.ncl.edu.tw/handle/97631994096302009774.

Full text
Abstract:
碩士<br>中原大學<br>工業工程研究所<br>91<br>To explore business information and operation experience from relational databases is a challenge, because many cause-effect relationships and business rules are fuzzy. It is therefore difficult for a decision-maker to discover important factors. At first, we defined fuzzy sets of the membership functions by Dodgson’s function and quartile statistic. Next, we developed a data-mining model base on both the fuzzy decision tree and fuzzy adaptive learning control network—these two concepts help generate concrete rules. This research adopted a decision-tree based learning algorithm and back-propagation neuro network to develop a fuzzy decision tree and fuzzy adaptive learning control network. In order to refine rules, we took the advantage of the Chi-square test of homogeneity to reduce the connection of weight. This research also used the Prediction of Tardiness in Semi-conductor Testing and the Prediction of Grades by an Advanced General Knowledge Course as samples. The results showed that the two models generated a compact fuzzy rule-base that yielded high accuracy.
APA, Harvard, Vancouver, ISO, and other styles
27

Akhlaghi, Arash. "A Framework for Discovery and Diagnosis of Behavioral Transitions in Event-streams." 2013. http://scholarworks.gsu.edu/cs_diss/81.

Full text
Abstract:
Data stream mining techniques can be used in tracking user behaviors as they attempt to achieve their goals. Quality metrics over stream-mined models identify potential changes in user goal attainment. When the quality of some data-mined models, as defined by quality metrics, varies significantly from that of nearby models, the user's behavior is automatically flagged as a potentially significant behavioral change. Decision tree, sequence pattern and Hidden Markov modeling are used in this study. These three types of modeling can expose different aspects of a user's behavior. In the case of decision tree modeling, the specific changes in user behavior can be automatically characterized by differencing the data-mined decision-tree models. Sequence pattern modeling can shed light on how the user changes his or her sequence of actions, and Hidden Markov modeling can identify the learning transition points. This research describes how model-quality monitoring and these three types of modeling form a generic framework that can aid recognition and diagnosis of behavioral changes, in a case study of cognitive rehabilitation via emailing. The data stream mining techniques mentioned are used to monitor patient goals as part of a clinical plan to aid cognitive rehabilitation. In this context, real-time data mining aids clinicians in tracking user behaviors as they attempt to achieve their goals. This generic framework can be widely applicable to other real-time data-intensive analysis problems. To illustrate this, similar Hidden Markov modeling is used to analyze the transactional behavior of a telecommunication company for fraud detection. Fraud can similarly be considered a potentially significant change in transactional behavior.
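The core flagging rule, model quality varying significantly from nearby models, can be illustrated with a short sketch. The windowed deviation test and the scores below are assumptions made for illustration, not the dissertation's actual quality metrics.

```python
# Hypothetical sketch: flag a potential behavioral transition whenever the
# quality of the newest stream-mined model deviates sharply from the quality
# of the models mined just before it. Thresholds and scores are invented.
import numpy as np

def flag_transitions(quality_scores, window=10, k=2.0):
    """Return indices where quality deviates by more than k standard
    deviations from the mean quality of the preceding window of models."""
    flags = []
    for t in range(window, len(quality_scores)):
        recent = np.asarray(quality_scores[t - window:t])
        mu, sigma = recent.mean(), recent.std()
        if sigma > 0 and abs(quality_scores[t] - mu) > k * sigma:
            flags.append(t)     # candidate behavioral change point
    return flags

# e.g. per-session accuracy of decision trees mined from a user's events
scores = [0.82, 0.80, 0.81, 0.83, 0.79, 0.81, 0.80, 0.82, 0.81, 0.80,
          0.81, 0.55, 0.57]    # sudden drop: candidate transition
print(flag_transitions(scores))   # [11, 12]
```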
APA, Harvard, Vancouver, ISO, and other styles
28

Rangarajan, Sarathkumar. "QOS-aware Web service discovery, selection, composition and application." Thesis, 2020. https://vuir.vu.edu.au/42153/.

Full text
Abstract:
Since the beginning of the 21st century, service-oriented architecture (SOA) has emerged as an advancement of distributed computing. SOA is a framework where software modules are developed using straightforward interfaces, and each module serves a specific array of functions. It delivers enterprise applications individually or integrated into larger composite Web services. However, SOA implementation faces several challenges, hindering its broader adoption. This thesis aims to highlight three significant challenges in the implementation of SOA. The abundance of functionally similar Web services and the lack of integrity with non-functional features such as Quality of Service (QoS) lead to difficulties in the prediction of QoS. Thus, the first challenge to be addressed is to find an efficient scheme for the prediction of QoS. The use of software source code metrics is a widely accepted alternative solution. Source code metrics are measured at a micro level and aggregated at the macro level to represent the software adequately. However, the effect of aggregation schemes on QoS prediction using source code metrics remains unexplored. The inequality distribution model, the Theil index, is proposed in this research to aggregate micro-level source code metrics for three different datasets and compare the quality of QoS prediction. The experiment results show that the Theil index is a practical solution for effective QoS prediction. The second challenge is to search for and compose suitable Web services without the need for expertise in composition tools. Currently, the existing approaches need system engineers with extensive knowledge of SOA techniques. A keyword-based search is a common approach for information retrieval which does not require an understanding of a query language or the underlying data structure. The proposed framework uses a schema-based keyword search over a relational database for efficient Web service search and composition. Experiments are conducted with the WS-Dream dataset to evaluate the Web service search and composition framework using adequate performance parameters. The results of the quality-constraint experiments show that the schema-based keyword search can achieve a better success rate than existing approaches. Building an efficient data architecture for SOA applications is the third challenge, as real-world SOA applications are required to process a vast quantity of data to produce a valuable service on demand. Contemporary SOA data processing systems such as the Enterprise Data Warehouse (EDW) lack scalability. A data lake, a productive data environment, is proposed to improve data ingestion for SOA systems. The data lake architecture stores both structured and unstructured data using the Hadoop Distributed File System (HDFS). Experiment results compare the data ingestion times of the data lake and the EDW. In the evaluation, the data lake-based architecture is implemented for a personalized medication suggestion system. The data lake shows that it can generate patient clusters more concisely than the current EDW-based approaches. In summary, this research effectively addresses three significant challenges to the broader adoption of SOA. The Theil index-based data aggregation model helps QoS prediction without depending on the Web service registry. Service engineers with less knowledge of SOA techniques can exploit a schema-based keyword search for Web service search and composition.
The data lake shows its potential to act as a data architecture for SOA applications.
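The Theil index at the heart of the proposed aggregation scheme reduces to a one-line formula, T = (1/N) Σ (x_i/μ) ln(x_i/μ), sketched below on invented metric values; only strictly positive metric values are handled here.

```python
# Minimal sketch of Theil's T index for aggregating micro-level source code
# metrics into one macro-level value per component. Values are invented.
import numpy as np

def theil_index(x):
    """T = mean((x/mu) * ln(x/mu)); 0 means the metric is evenly spread,
    larger values mean a few methods concentrate most of the metric."""
    x = np.asarray(x, dtype=float)      # requires strictly positive values
    r = x / x.mean()
    return float(np.mean(r * np.log(r)))

# e.g. cyclomatic complexity of each method in one service implementation
complexities = [1, 1, 2, 2, 3, 15]      # one hot-spot method dominates
print(round(theil_index(complexities), 3))   # ~0.559
```

Unlike a plain mean, the index preserves information about how unevenly a metric is distributed across methods, which is the property the QoS-prediction experiments exploit.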
APA, Harvard, Vancouver, ISO, and other styles
29

Zhao, Kai-Wen, and 趙愷文. "Discover Monte Carlo Algorithm on Spin Ice Model Using Reinforcement Learning." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/w6p375.

Full text
Abstract:
碩士<br>國立臺灣大學<br>物理學研究所<br>106<br>Reinforcement learning is a fast-growing research field due to its outstanding exploration capability in dynamic environments. Inspired by psychological learning theories, the reinforcement learning framework contains a software agent with improvable policies that takes actions on the environment and attempts to achieve the goal according to given reward. A policy is a stochastic rule which governs the decision-making process of the agent and is updated based on the response of the environment. In this work, we apply reinforcement learning framework on the spin ice model. Spin ice is a frustrated magnetic system with strong topological constraint on the low-energy configurations. In the physics community, it is well-known that the loop Monte Carlo algorithm can update the system efficiently without breaking its local constraint. However, from a broader perspective, the global update schemes can be problem-dependent and require customized algorithm design. Therefore, we exploit a reinforcement learning method that parameterize transition operator with neural networks. By extending the Markov chain to Markov decision process, the algorithm can adaptively search for global update policy through its interactions with the physical model. It may serve as a general framework for the search of update patterns.
APA, Harvard, Vancouver, ISO, and other styles
30

Carvalho, Angélica Santos. "Recurrent Models for Drug Generation." Master's thesis, 2019. http://hdl.handle.net/10316/87303.

Full text
Abstract:
Master's dissertation in Informatics Engineering presented to the Faculty of Sciences and Technology. Drug discovery aims to identify potential new medicines through a multidisciplinary process, including several scientific areas, such as biology, chemistry and pharmacology. Nowadays, multiple strategies and methodologies have been developed to discover, test and optimise new drugs. However, there is a long process from target identification to an optimal marketable molecule. The main purpose of this dissertation is to develop computational models able to propose new drug compounds. In order to achieve this goal, artificial neural networks were explored and trained to generate new drugs in the form of the Simplified Molecular-Input Line-Entry System (SMILES). The explored neural network models were the Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU) and Bidirectional Long Short-Term Memory (BLSTM). A consistent dataset was chosen, and the SMILES generated by the model were syntactically and biochemically validated. In order to restrict the generation of SMILES, a technique denominated the Fragmentation Growing Procedure was used, which made it possible to choose a fragment and generate SMILES from it. To determine the best-fitting recurrent network and the respective parameters, several tests were performed; the network contained in the model that reached the best result, 98% valid SMILES and 93% unique SMILES, was an LSTM with two layers. The technique to restrict the generation was used with the best model and reached 99% valid SMILES and 79% unique SMILES.
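A minimal character-level sketch of the kind of two-layer LSTM reported as the best model might look as follows; the layer sizes, toy SMILES strings and training setup are illustrative assumptions, not the dissertation's actual configuration.

```python
# Sketch: a two-layer LSTM trained to predict the next SMILES character,
# which is the usual basis for sampling new strings character by character.
import tensorflow as tf
from tensorflow.keras import layers

smiles = ["CCO", "c1ccccc1", "CC(=O)O"]            # toy molecules only
chars = sorted({c for s in smiles for c in s} | {"^", "$"})  # with markers
idx = {c: i + 1 for i, c in enumerate(chars)}      # 0 reserved for padding
vocab = len(chars) + 1

seqs = [[idx[c] for c in "^" + s + "$"] for s in smiles]
maxlen = max(len(q) for q in seqs)
pad = tf.keras.preprocessing.sequence.pad_sequences(seqs, maxlen=maxlen,
                                                    padding="post")
x, y = pad[:, :-1], pad[:, 1:]                     # next-character targets

model = tf.keras.Sequential([
    layers.Embedding(vocab, 32),
    layers.LSTM(128, return_sequences=True),
    layers.LSTM(128, return_sequences=True),       # "an LSTM with two layers"
    layers.Dense(vocab, activation="softmax"),
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
model.fit(x, y, epochs=1, verbose=0)
# Generation then feeds "^" and repeatedly samples the next character until
# "$"; validity of sampled SMILES is checked afterwards (e.g. with RDKit).
```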
APA, Harvard, Vancouver, ISO, and other styles
31

Dlamini, Wisdom Mdumiseni Dabulizwe. "Spatial analysis of invasive alien plant distribution patterns and processes using Bayesian network-based data mining techniques." Thesis, 2016. http://hdl.handle.net/10500/20692.

Full text
Abstract:
Invasive alien plants have widespread ecological and socioeconomic impacts throughout many parts of the world, including Swaziland, where the government declared them a national disaster. Control of these species requires knowledge on the invasion ecology of each species, including how they interact with the invaded environment. Species distribution models are vital for providing solutions to such problems, including the prediction of their niche and distribution. Various modelling approaches are used for species distribution modelling, albeit with limitations resulting from statistical assumptions, implementation and interpretation of outputs. This study explores the usefulness of Bayesian networks (BNs) due to their ability to model stochastic, nonlinear inter-causal relationships and uncertainty. Data-driven BNs were used to explore patterns and processes influencing the spatial distribution of 16 priority invasive alien plants in Swaziland. Various BN structure learning algorithms were applied within the Weka software to build models from a set of 170 variables incorporating climatic, anthropogenic, topo-edaphic and landscape factors. While all the BN models produced accurate predictions of alien plant invasion, the globally scored networks, particularly the hill climbing algorithms, performed relatively well. However, when considering the probabilistic outputs, the constraint-based Inferred Causation algorithm, which attempts to generate a causal BN structure, performed relatively better. The learned BNs reveal that the main pathways of alien plants into new areas are ruderal areas such as road verges and riverbanks, whilst humans and human activity are key driving factors and the main dispersal mechanism. However, the distribution of most of the species is constrained by climate, particularly tolerance to very low temperatures and precipitation seasonality. Biotic interactions and/or associations among the species are also prevalent. The findings suggest that most of the species will proliferate by extending their range, resulting in the whole country being at risk of further invasion. The ability of BNs to express uncertain, rather complex conditional and probabilistic dependencies and to combine multisource data makes them an attractive technique for species distribution modelling, especially as joint invasive species distribution models (JiSDM). Suggestions for further research are provided, including the need for rigorous invasive species monitoring, data stewardship and testing more BN learning algorithms. Environmental Sciences. D. Phil. (Environmental Science)
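For readers who want to experiment with the score-based structure learning mentioned above (the hill-climbing family), a hedged sketch using the pgmpy library follows. pgmpy is not the tool used in the thesis (Weka was), its exact signatures vary across versions, and the tiny binary dataset is invented solely to make the call pattern concrete.

```python
# Hedged sketch: hill-climbing Bayesian network structure learning with a
# BIC score, on an invented, discretized stand-in for the study's variables.
import pandas as pd
from pgmpy.estimators import HillClimbSearch, BicScore

data = pd.DataFrame({
    "roads":    [1, 1, 0, 1, 0, 1, 0, 0],   # ruderal-pathway proxy
    "rivers":   [0, 1, 1, 1, 0, 0, 1, 0],
    "min_temp": [0, 0, 1, 0, 1, 1, 0, 1],   # discretized climate factor
    "invaded":  [1, 1, 1, 1, 0, 0, 0, 0],
})

est = HillClimbSearch(data)                  # greedy edge add/remove/flip
dag = est.estimate(scoring_method=BicScore(data))
print(sorted(dag.edges()))
```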
APA, Harvard, Vancouver, ISO, and other styles
32

Wei, Cheng-Kuan, and 魏誠寬. "Speaker Adaptation by Joint Learning the HMM states of Phoneme Models and Acoustic Tokens Discovered without Annotations." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/53834730378144175544.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

BALENA, Pasquale. "Local Knowledge and Social Sensors: Integrated Models of Text Analysis for Disaster Response." Doctoral thesis, 2017. http://hdl.handle.net/11589/100381.

Full text
Abstract:
The present doctoral research investigates the role of local knowledge in supporting disaster response, by applying cognitive, predictive and ontological models to the study of text message exchange in relevant Social Networks. The lack of studies on local knowledge in risk domains, along with the growing attention of scholars and decision makers to open governance processes, underpins the three main research questions that have been addressed. As for the relevance of local knowledge to disaster response, human communication in real or simulated emergency situations seems to be imbued with culturally mediated understandings of spatiality, relationality and actions. Innovative spatial data science tools are therefore needed for tacit and vernacular knowledge to be adequately modelled and operationalized. With respect to the use of text messages in disaster response, it appears that distributed systems (which combine crowdsourcing methods with collaborative hypertext editors) may effectively complement volunteered or public participation GIS, harnessing the potential of all-purpose social networks to reach out to the wider internet community. Finally, to foster the interpretation of text messages, three separate taxonomies (regarding Spatial Location, Needs and Actors), each linked to a terminal entity in the DOLCE foundational ontology, helped develop a shared conceptualization of risk. Future developments of the present work could concern further integration between machine learning and ontological models, and advances in text classification methods.
APA, Harvard, Vancouver, ISO, and other styles
34

Podhajská, Kristýna. "Badatelské vyučování matematice v tématu zlomek." Master's thesis, 2017. http://www.nusl.cz/ntk/nusl-346156.

Full text
Abstract:
The theoretical part deals with approaches to teaching, especially the constructivist, instructivist and transmissive approaches. The goal of this part is to compare the approaches and evaluate them considering the permanency of knowledge and the active engagement of students. Simultaneously, it aims at evaluating the approaches from the point of view of the mechanical knowledge they produce. The chapter dealing with inquiry-based mathematics education focuses on teaching related to the constructivist approach. The next chapter describes a tool called concept cartoons, which can be used as a means of research. The theoretical part also covers the topic of fractions, especially fraction interpretations and models, and fraction representation. The second part of the thesis describes a survey realized at a primary school. I narrowed the problem to fraction interpretation and representation according to a survey of students' work. The next part of the research is based on this analysis. Trying to accomplish the aims and find the answers to the questions of the survey, I chose a strategy inspired by action research, because the main goal of the research was to improve and enhance my teaching knowledge and skills. The survey was processed qualitatively. The making of the thesis helped me make it clear why the topic of fractions is so...
APA, Harvard, Vancouver, ISO, and other styles
35

Zenkl, David. "Kosinová a sinová věta na střední škole." Master's thesis, 2016. http://www.nusl.cz/ntk/nusl-344975.

Full text
Abstract:
This thesis is concerned with a constructivist approach to the introduction of the cosine and sine theorems at the secondary school. The aim was to develop recommendations for teaching based on the idea of motivating teaching of the cosine and sine theorems. This approach is based on the available literature and builds on experience from my own teaching of this topic. By motivating teaching, I mean an approach that is consistent with the principles of constructivism and emphasizes pupils' active learning. Current textbooks for secondary schools were analyzed from a mathematical and didactic point of view. The aim of this analysis is to describe how the topic is elaborated in publications available to teachers, and to get inspiration for my own approach. My own teaching approach was based on the theory of generic models and was implemented in two classes of a secondary grammar school. Data collected during the teaching of the cosine and sine theorems (video recordings of lessons, field notes from teaching and pupil artifacts) were analyzed in a qualitative way. The thesis describes the teaching in detail, with an emphasis on the key phases of the discovery of the two theorems. Pupils' involvement in this process is closely followed. Where teaching did not work as planned, possible reasons are found and...
APA, Harvard, Vancouver, ISO, and other styles
36

(6326255), Stefan M. Irby. "Evaluation of a Novel Biochemistry Course-Based Undergraduate Research Experience (CURE)." Thesis, 2019.

Find full text
Abstract:
<p>Course-based Undergraduate Research Experiences (CUREs) have been described in a range of educational contexts. Although various learning objectives, termed anticipated learning outcomes (ALOs) in this project, have been proposed, processes for identifying them may not be rigorous or well-documented, which can lead to inappropriate assessment and speculation about what students actually learn from CUREs. Additionally, evaluation of CUREs has primarily relied on student and instructor perception data rather than more reliable measures of learning.This dissertation investigated a novel biochemistry laboratory curriculum for a Course-based Undergraduate Research Experience (CURE) known as the Biochemistry Authentic Scientific Inquiry Lab (BASIL). Students participating in this CURE use a combination of computational and biochemical wet-lab techniques to elucidate the function of proteins of known structure but unknown function. The goal of the project was to evaluate the efficacy of the BASIL CURE curriculum for developing students’ research abilities across implementations. Towards achieving this goal, we addressed the following four research questions (RQs): <b>RQ1</b>) How can ALOs be rigorously identified for the BASIL CURE; <b>RQ2</b>) How can the identified ALOs be used to develop a matrix that characterizes the BASIL CURE; <b>RQ3</b>) What are students’ perceptions of their knowledge, confidence and competence regarding their abilities to perform the top-rated ALOs for this CURE; <b>RQ4</b>) What are appropriate assessments for student achievement of the identified ALOs and what is the nature of student learning, and related difficulties, developed by students during the BASIL CURE? To address these RQs, this project focused on the development and use of qualitative and quantitative methods guided by constructivism and situated cognition theoretical frameworks. Data was collected using a range of instruments including, content analysis, Qualtrics surveys, open-ended questions and interviews, in order to identify ALOs and to determine student learning for the BASIL CURE. Analysis of the qualitative data was through inductive coding guided by the concept-reasoning-mode (CRM) model and the assessment triangle, while analysis of quantitative data was done by using standard statistical techniques (e.g. conducting a parried t-test and effect size). The results led to the development of a novel method for identifying ALOs, namely a process for identifying course-based undergraduate research abilities (PICURA; RQ1; Irby, Pelaez, & Anderson 2018b). Application of PICURA to the BASIL CURE resulted in the identification and rating by instructors of a wide range of ALOs, termed course-based undergraduate research abilities (CURAs), which were formulated into a matrix (RQs 2; Irby, Pelaez, & Anderson, 2018a,). The matrix was, in turn, used to characterize the BASIL CURE and to inform the design of student assessments aimed at evaluating student development of the identified CURAs (RQs 4; Irby, Pelaez, & Anderson, 2018a). Preliminary findings from implementation of the open-ended assessments in a small case study of students, revealed a range of student competencies for selected top-rated CURAs as well as evidence for student difficulties (RQ4). In this way we were able to confirm that students are developing some of the ALOs as actual learning outcomes which we term VLOs or verified learning outcomes. 
In addition, a participant perception indicator (PPI) survey was used to gauge students' perceptions of their gains in knowledge, experience, and confidence during the BASIL CURE and, therefore, to inform which CURAs should be specifically targeted for assessment in particular BASIL implementations (RQ3). The results indicate that, across implementations of the CURE, students perceived significant gains, with large effect sizes, in their knowledge, experience, and confidence for items on the PPI survey (RQ3). In our view, the results of this dissertation make important contributions to the CURE literature, as well as to the biochemistry education and assessment literature in general. More specifically, they improve understanding of the nature of student learning from CUREs and of how to identify ALOs and design assessments that reveal what students actually learn from such CUREs, an area where available knowledge has been scarce. The outcomes of this dissertation could also help instructors and administrators align assessments with the actual features of a CURE (or of courses in general), use the identified CURAs to ensure the material fits departmental or university needs, and evaluate the benefits to students of participating in these innovative curricula. Future research will focus on expanding the development and validation of assessments so that practitioners can better evaluate the efficacy of their CUREs for developing the research competencies of their undergraduate students and can continue to improve their curricula.
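As an illustration of the quantitative analysis named in this abstract (a paired t-test with an effect size on pre/post PPI ratings), the following is a minimal Python sketch; the data, variable names, and rating scale below are hypothetical and are not taken from the dissertation:

import numpy as np
from scipy import stats

# Hypothetical pre- and post-course self-ratings (1-5 scale) for one PPI item.
pre  = np.array([2.0, 3.0, 2.5, 3.0, 2.5, 3.5, 3.0, 2.0])
post = np.array([3.5, 4.0, 3.0, 4.5, 3.5, 4.5, 4.0, 3.5])

# Paired t-test: is the mean pre/post difference reliably non-zero?
t_stat, p_value = stats.ttest_rel(post, pre)

# Cohen's d for paired samples: mean difference / SD of the differences.
diffs = post - pre
d = diffs.mean() / diffs.std(ddof=1)

print(f"t = {t_stat:.2f}, p = {p_value:.4f}, Cohen's d = {d:.2f}")

Here Cohen's d for paired samples is the mean of the pairwise differences divided by their standard deviation; values around 0.8 or above are conventionally read as large effects, the kind of gain the abstract reports across implementations.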
APA, Harvard, Vancouver, ISO, and other styles
37

Dilrajh, Kamla Moonsamy. "Fasilitering van leer in kommunikatiewe T²-Afrikaanstaalonderrig." Diss., 1998. http://hdl.handle.net/10500/18009.

Full text
Abstract:
Summaries in Afrikaans and English
In this study it is shown why the discovery model of language learning is the appropriate model for effective second language learning. The communicative teaching approach, classroom negotiation, and the importance of the process syllabus in second language acquisition are discussed. The language teacher's role as facilitator of learning in communicative L2-Afrikaans language teaching in the interactive classroom, with a learner-centred focus, is explained. It is further shown that the role of the teacher must undergo a paradigm shift, especially now that the principles of outcomes-based education, which form part of Curriculum 2005, have been introduced into all South African schools in 1998/1999. The teacher is now a facilitator of knowledge, not a transmitter thereof. Important aspects of learning that influence learners' second language learning are discussed, for example classroom communication, facilitation, suggestopedia, factors that influence the understanding of subject matter, teacher and learner behaviours, a positive learning atmosphere, the treatment of learner errors, learner perceptions, communicative strategies, and methods of evaluation. A variety of language lessons relating to L2-Afrikaans and the L2 classroom, integrating various teaching theories, outcomes-based education, and the six language skills, is illustrated in Chapter 5.
Afrikaans and Theory of Literature
M.A. (Afrikaans)
APA, Harvard, Vancouver, ISO, and other styles