
Dissertations / Theses on the topic 'Architectures and machine learning models'


Consult the top 50 dissertations / theses for your research on the topic 'Architectures and machine learning models.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.

1

Aihe, David. "A REINFORCEMENT LEARNING TECHNIQUE FOR ENHANCING HUMAN BEHAVIOR MODELS IN A CONTEXT-BASED ARCHITECTURE." Doctoral diss., University of Central Florida, 2008. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/2408.

Full text
Abstract:
A reinforcement-learning technique for enhancing human behavior models in a context-based learning architecture is presented. Prior to the introduction of this technique, human models built and developed in a Context-Based Reasoning framework lacked learning capabilities. As such, their performance and quality of behavior were always limited by what the subject matter expert whose knowledge was modeled could articulate or demonstrate. Results from experiments show that subject matter experts are prone to making errors and at times lack information on situations that the human models need in order to behave appropriately and optimally. The benefits of the technique presented are twofold: 1) it shows how human models built in a context-based framework can be modified to correctly reflect the knowledge learnt in a simulator; and 2) it presents a way for subject matter experts to verify and validate the knowledge they share. The results obtained from this research show that behavior models built in a context-based framework can be enhanced by learning and reflecting the constraints in the environment. After the models were enhanced, the agents performed better on the metrics evaluated; moreover, the agents were shown to recognize previously unknown situations and behave appropriately in them. The overall performance and quality of behavior of the agent improved significantly.
Ph.D.; School of Electrical Engineering and Computer Science; Engineering and Computer Science; Computer Engineering PhD
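The learning this abstract describes is reinforcement learning layered on a context-based behavior model. As an illustration only (the contexts, actions, and parameters below are invented, not taken from the dissertation), a tabular Q-learning update over recognised contexts might look like:

```python
import random

# Invented contexts and actions for a simulated agent (not from the thesis)
contexts = ["follow_road", "overtake", "avoid_obstacle"]
actions = ["accelerate", "brake", "steer_left", "steer_right"]

# One Q-value per (context, action) pair
q = {(c, a): 0.0 for c in contexts for a in actions}
alpha, gamma, epsilon = 0.1, 0.9, 0.2  # learning rate, discount, exploration

def choose_action(context):
    """Epsilon-greedy choice within the currently active context."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: q[(context, a)])

def update(context, action, reward, next_context):
    """Standard Q-learning update, applied per recognised context."""
    best_next = max(q[(next_context, a)] for a in actions)
    q[(context, action)] += alpha * (reward + gamma * best_next - q[(context, action)])
```

Rewards learned this way can then be folded back into the context-based model, which is the "reflecting the knowledge learnt in a simulator" step the abstract mentions.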
APA, Harvard, Vancouver, ISO, and other styles
2

Faccin, João Guilherme. "Preference and context-based BDI plan selection using machine learning : from models to code generation." reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 2016. http://hdl.handle.net/10183/138209.

Full text
Abstract:
Agent technology arises as a solution that provides flexibility and robustness to deal with dynamic and complex domains. Such flexibility can be achieved by adopting existing agent-based approaches, such as the BDI architecture, which provides agents with the mental attitudes of beliefs, desires and intentions. This architecture is highly customisable, leaving gaps to be filled in particular applications. One of these gaps is the plan selection algorithm, which is responsible for selecting a plan to be executed by an agent to achieve a goal and has an important influence on overall agent performance. Most existing approaches require considerable effort for customisation and adjustment to be used in particular applications. In this dissertation, we propose a plan selection approach that is able to learn which plans provide possibly the best outcomes, based on the current context and the agent's preferences. Our approach is composed of a meta-model, which must be instantiated to specify plan metadata, and a technique that uses such metadata to learn and predict plan outcomes. We evaluated our approach experimentally, and the results indicate it is effective. Additionally, we provide a tool to support the development process of software agents based on our work. This tool allows developers to model and generate source code for BDI agents with learning capabilities. A user study was performed to assess the improvements of a tool-supported BDI-agent-based development method, and evidence suggests that our tool can help developers who are not experts or are unfamiliar with agent technology.
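The plan-selection idea, learning which BDI plan tends to yield the best outcome in the current context, can be sketched roughly as follows. This is a minimal stand-in using running averages, not the author's actual meta-model or learning technique; the plan and context names are invented:

```python
from collections import defaultdict

class PlanSelector:
    """Pick the BDI plan with the best predicted outcome for the current
    context. Illustrative only: real approaches would generalise across
    contexts instead of averaging per exact (plan, context) pair."""

    def __init__(self, plans):
        self.plans = plans
        self.history = defaultdict(list)  # (plan, context) -> past outcomes

    def record(self, plan, context, outcome):
        """Store the observed outcome of executing `plan` in `context`."""
        self.history[(plan, context)].append(outcome)

    def predict(self, plan, context):
        """Predict the outcome as the mean of past observations."""
        outcomes = self.history[(plan, context)]
        return sum(outcomes) / len(outcomes) if outcomes else 0.0

    def select(self, context):
        """Choose the plan with the highest predicted outcome."""
        return max(self.plans, key=lambda p: self.predict(p, context))
```

A usage pattern would be to `record` outcomes after each plan execution and call `select` whenever the agent deliberates over applicable plans.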
3

Templeton, Julian. "Designing Robust Trust Establishment Models with a Generalized Architecture and a Cluster-Based Improvement Methodology." Thesis, Université d'Ottawa / University of Ottawa, 2021. http://hdl.handle.net/10393/42556.

Full text
Abstract:
In Multi-Agent Systems consisting of intelligent agents that interact with one another, where the agents are software entities which represent individuals or organizations, it is important for the agents to be equipped with trust evaluation models which allow the agents to evaluate the trustworthiness of other agents when dishonest agents may exist in an environment. Evaluating trust allows agents to find and select reliable interaction partners in an environment. Thus, the cost incurred by an agent for establishing trust in an environment can be compensated if this improved trustworthiness leads to an increased number of profitable transactions. Therefore, it is equally important to design effective trust establishment models which allow an agent to generate trust among other agents in an environment. This thesis focuses on providing improvements to the designs of existing and future trust establishment models. Robust trust establishment models, such as the Integrated Trust Establishment (ITE) model, may use dynamically updated variables to adjust the predicted importance of a task’s criteria for specific trustors. This thesis proposes a cluster-based approach to update these dynamic variables more accurately to achieve improved trust establishment performance. Rather than sharing these dynamic variables globally, a model can learn to adjust a trustee’s behaviours more accurately to trustor needs by storing the variables locally for each trustor and by updating groups of these variables together by using data from a corresponding group of similar trustors. This work also presents a generalized trust establishment model architecture to help models be easier to design and be more modular. This architecture introduces a new transaction-level preprocessing module to help improve a model’s performance and defines a trustor-level postprocessing module to encapsulate the designs of existing models. 
The preprocessing module allows a model to fine-tune the resources that an agent will provide during a transaction before it occurs. A trust establishment model, named the Generalized Trust Establishment Model (GTEM), is designed to showcase the benefits of using the preprocessing module. Simulated comparisons between a cluster-based version of ITE and ITE indicate that the cluster-based approach helps trustees better meet the expectations of trustors while minimizing the cost of doing so. Comparing GTEM against a version of itself without the preprocessing module, and against two existing models, in simulated tests shows that the preprocessing module improves a trustee's trustworthiness and meets trustor desires more quickly than operating without preprocessing.
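The cluster-based update idea, grouping similar trustors and updating their dynamic variables together from the whole group's data, could be sketched with a naive k-means over trustor preference vectors. This is an assumption-laden illustration, not ITE's actual procedure:

```python
import math

def cluster_trustors(preferences, k=2, iters=10):
    """Naive k-means over trustor preference vectors (illustrative).
    Trustors in the same cluster would share updates to their dynamic
    importance variables, giving each group more data to learn from."""
    centroids = preferences[:k]  # seed with the first k trustors
    groups = [[] for _ in range(k)]
    for _ in range(iters):
        # Assign each trustor to its nearest centroid
        groups = [[] for _ in range(k)]
        for p in preferences:
            i = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            groups[i].append(p)
        # Recompute centroids as per-dimension means (keep old if empty)
        centroids = [
            [sum(dim) / len(g) for dim in zip(*g)] if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids, groups
```

Each resulting group plays the role of "a corresponding group of similar trustors" from the abstract: variables stored locally per trustor are nudged using statistics pooled over the group.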
4

González, Marcos Tulio Amarís. "Performance prediction of application executed on GPUs using a simple analytical model and machine learning techniques." Universidade de São Paulo, 2018. http://www.teses.usp.br/teses/disponiveis/45/45134/tde-06092018-213258/.

Full text
Abstract:
The parallel and distributed platforms of High Performance Computing available today have become more and more heterogeneous (CPUs, GPUs, FPGAs, etc.). Graphics Processing Units (GPUs) are specialized co-processors that accelerate and improve the performance of parallel vector operations. GPUs have a high degree of parallelism, can execute thousands or millions of threads concurrently, and hide the latency of the scheduler. GPUs have a deep memory hierarchy with different types and configurations of memory. Performance prediction of applications executed on these devices is a great challenge and is essential for the efficient use of resources in machines with these co-processors. There are different approaches for these predictions, such as analytical modeling and machine learning techniques. In this thesis, we present an analysis and characterization of the performance of applications executed on GPUs. We propose a simple and intuitive BSP-based model for predicting CUDA application execution times on different GPUs. The model is based on the number of computations and memory accesses of the GPU, with additional information on cache usage obtained from profiling. We also compare three different Machine Learning (ML) approaches, Linear Regression, Support Vector Machines and Random Forests, with the BSP-based analytical model. This comparison is made in two contexts: first, the input data or features for the ML techniques were the same as for the analytical model; and second, a feature extraction process was applied, using correlation analysis and hierarchical clustering. We show that GPU applications that scale regularly can be predicted with simple analytical models and an adjusting parameter, and that this parameter can be used to predict these applications on other GPUs.
We also demonstrate that ML approaches provide reasonable predictions for different cases, and that the ML techniques required no detailed knowledge of application code, hardware characteristics or explicit modeling. Consequently, whenever a large data set with information about similar applications is available or can be created, ML techniques can be useful for deploying automated on-line performance prediction for scheduling applications on heterogeneous architectures with GPUs.
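A BSP-flavoured execution-time estimate of the kind described, computation plus memory communication scaled by an adjusting parameter, might be sketched as follows. The formula structure and parameter names are illustrative, not the thesis's exact model:

```python
def predicted_time(n_threads, flops_per_thread, gmem_accesses, cache_hit_rate,
                   clock_hz, cores, g_latency, l_latency, adjust):
    """BSP-flavoured execution-time estimate for a GPU kernel (a sketch,
    not the thesis's exact formula). `adjust` is the per-application
    tuning parameter the abstract mentions; cache information would come
    from profiling."""
    # Total computational work across all threads
    comp = n_threads * flops_per_thread
    # Memory cost: misses pay global latency, hits pay the cheaper latency
    comm = n_threads * gmem_accesses * (
        g_latency * (1 - cache_hit_rate) + l_latency * cache_hit_rate
    )
    # Work divided over the GPU's aggregate throughput, scaled by `adjust`
    return adjust * (comp + comm) / (cores * clock_hz)
```

Once `adjust` is fitted on one GPU, the same value can be tried on another GPU by swapping in that device's clock, core count and latencies, which is the portability claim being tested.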
5

Kundu, Sajib. "Improving Resource Management in Virtualized Data Centers using Application Performance Models." FIU Digital Commons, 2013. http://digitalcommons.fiu.edu/etd/874.

Full text
Abstract:
The rapid growth of virtualized data centers and cloud hosting services is making the management of physical resources such as CPU, memory, and I/O bandwidth in data center servers increasingly important. Server management now involves dealing with multiple dissimilar applications with varying Service-Level Agreements (SLAs) and multiple resource dimensions. The multiplicity and diversity of resources and applications are rendering administrative tasks more complex and challenging. This thesis aimed to develop a framework and techniques that would help substantially reduce data center management complexity. We specifically addressed two crucial data center operations. First, we precisely estimated the capacity requirements of client virtual machines (VMs) when renting server space in a cloud environment. Second, we proposed a systematic process to efficiently allocate physical resources to hosted VMs in a data center. To realize these dual objectives, accurately capturing the effects of resource allocations on application performance is vital. The benefits of accurate application performance modeling are manifold. Cloud users can size their VMs appropriately and pay only for the resources that they need; service providers can also offer a new charging model based on the VMs' performance instead of their configured sizes. As a result, clients will pay exactly for the performance they actually experience; administrators, on the other hand, will be able to maximize their total revenue by utilizing application performance models and SLAs. This thesis made the following contributions. First, we identified resource control parameters crucial for distributing physical resources and characterizing contention for virtualized applications in a shared hosting environment.
Second, we explored several modeling techniques and confirmed the suitability of two machine learning tools, Artificial Neural Network and Support Vector Machine, to accurately model the performance of virtualized applications. Moreover, we suggested and evaluated modeling optimizations necessary to improve prediction accuracy when using these modeling tools. Third, we presented an approach to optimal VM sizing by employing the performance models we created. Finally, we proposed a revenue-driven resource allocation algorithm which maximizes the SLA-generated revenue for a data center.
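The revenue-driven allocation idea, using per-VM models to decide where each unit of a resource earns the most SLA revenue, can be sketched as a greedy marginal-revenue loop. The revenue functions below are invented placeholders standing in for the learned performance models combined with SLAs; this is not the thesis's actual algorithm:

```python
def allocate(vms, capacity):
    """Greedy revenue-driven allocation: hand out each unit of a shared
    resource to the VM whose revenue function gains the most from it.
    `vms` maps a VM name to a function units -> SLA revenue (illustrative;
    the thesis derives these from ANN/SVM performance models and SLAs)."""
    alloc = {vm: 0 for vm in vms}
    for _ in range(capacity):
        # Marginal revenue of giving one more unit to VM v
        gain = lambda v: vms[v](alloc[v] + 1) - vms[v](alloc[v])
        best = max(vms, key=gain)
        alloc[best] += 1
    return alloc
```

With concave revenue curves this greedy loop allocates units where they matter most, which is the intuition behind maximizing total SLA-generated revenue.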
6

Evgeniou, Theodoros K. (Theodoros Kostantinos) 1974. "Learning with kernel machine architectures." Thesis, Massachusetts Institute of Technology, 2000. http://hdl.handle.net/1721.1/86442.

Full text
Abstract:
Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2000. Includes bibliographical references (p. 99-106). By Theodoros K. Evgeniou. Ph.D.
7

de la Rúa Martínez, Javier. "Scalable Architecture for Automating Machine Learning Model Monitoring." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-280345.

Full text
Abstract:
In recent years, due to the advent of more sophisticated tools for exploratory data analysis, data management, Machine Learning (ML) model training and model serving in production, the concept of MLOps has gained popularity. As an effort to bring DevOps processes to the ML lifecycle, MLOps aims at more automation in the execution of diverse and repetitive tasks along the cycle and at smoother interoperability between the teams and tools involved. In this context, the main cloud providers have built their own ML platforms [4, 34, 61], offered as services in their cloud solutions. Moreover, multiple frameworks have emerged to solve concrete problems such as data testing, data labelling, distributed training or prediction interpretability, and new monitoring approaches have been proposed [32, 33, 65]. Among all the stages in the ML lifecycle, one of the most commonly overlooked, although relevant, is model monitoring. Recently, cloud providers have presented their own tools to use within their platforms [4, 61], while work is ongoing to integrate existing frameworks [72] into open-source model serving solutions [38]. Most of these frameworks are either built as an extension of an existing platform (i.e. they lack portability), follow a scheduled batch processing approach at a minimum rate of hours, or present limitations for certain outlier and drift algorithms due to the design of the platform architecture in which they are integrated. In this work, a scalable automated cloud-native architecture is designed and evaluated for ML model monitoring in a streaming approach.
An experiment conducted on a 7-node cluster with 250,000 requests at different concurrency rates shows maximum latencies of 5.9, 29.92 and 30.86 seconds after request time for 75% of distance-based outlier detection, windowed statistics and distribution-based data drift detection, respectively, using windows of 15 seconds in length and a watermark delay of 6 seconds.
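Windowed statistics with a drift check of the sort evaluated here can be sketched in a few lines; the baseline-mean comparison below is a deliberately simple stand-in for the distribution-based detectors the thesis actually uses:

```python
from collections import deque
import statistics

class DriftMonitor:
    """Sliding-window drift check: flags drift when the window mean moves
    more than `threshold` standard deviations from the training baseline.
    (A sketch; production monitors use distribution-level tests and run
    on a streaming engine rather than in-process.)"""

    def __init__(self, baseline_mean, baseline_std, window=100, threshold=3.0):
        self.mu, self.sigma = baseline_mean, baseline_std
        self.window = deque(maxlen=window)  # bounded sliding window
        self.threshold = threshold

    def observe(self, value):
        """Add one serving-time observation; return True if drift is flagged."""
        self.window.append(value)
        mean = statistics.fmean(self.window)
        return abs(mean - self.mu) > self.threshold * self.sigma
```

In a streaming deployment, each window's statistics would also be emitted as metrics, which is what the windowed-statistics latency figures above measure.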
8

Fox, Sean. "Specialised Architectures and Arithmetic for Machine Learning." Thesis, The University of Sydney, 2021. https://hdl.handle.net/2123/26893.

Full text
Abstract:
Machine learning has risen to prominence in recent years thanks to advancements in computer technology, the abundance of data, and numerous breakthroughs in a broad range of applications. Unfortunately, as the demand for machine learning has grown, so too has the amount of computation required for training. Combine this trend with declines observed in performance scaling of standard computer architectures, and it has become increasingly difficult to support machine learning training at increased speed and scale, especially in embedded devices which are smaller and have stricter constraints. Research points towards the development of purpose-built hardware accelerators to overcome the computing challenge, and this thesis explains how specialised hardware architectures and specialised computer arithmetic can achieve performance not possible with standard technology, e.g. Graphics Processing Units (GPUs) and floating-point arithmetic. Based on the implementation of kernel methods and deep neural network (DNN) algorithms using Field Programmable Gate Arrays (FPGAs), this thesis shows how specialised arithmetic is crucial for accurately training large models with less memory, while specialised architectures are needed to increase computational parallelism and reduce off-chip memory transfers. These outcomes are an important step towards moving more machine intelligence into e.g. mobile phones, video cameras, radios, and satellites.
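The specialised-arithmetic argument rests on quantising values into narrow fixed-point formats that FPGAs handle cheaply. A minimal sketch of signed fixed-point conversion (a Q8.8-style 16-bit format, chosen here purely for illustration) shows the rounding and saturation involved:

```python
def to_fixed(x, frac_bits=8, total_bits=16):
    """Quantise a real number to signed fixed-point with saturation
    (illustrative of reduced-precision arithmetic, not the thesis's
    specific formats)."""
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))          # most negative representable code
    hi = (1 << (total_bits - 1)) - 1       # most positive representable code
    return max(lo, min(hi, round(x * scale)))

def from_fixed(q, frac_bits=8):
    """Convert a fixed-point code back to a float."""
    return q / (1 << frac_bits)
```

Values representable in the format round-trip exactly, while out-of-range values saturate rather than wrap, which is the usual choice when training with narrow datapaths.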
9

Moss, Duncan J. M. "FPGA Architectures for Low Precision Machine Learning." Thesis, The University of Sydney, 2017. http://hdl.handle.net/2123/18182.

Full text
Abstract:
Machine learning is fast becoming a cornerstone in many data analytics, image processing and scientific computing applications. Depending on the deployment scale, these tasks can be performed either on embedded devices or on larger cloud computing platforms. However, one key trend is an exponential increase in the required compute power as data is collected and processed at a previously unprecedented scale. In an effort to reduce the computational complexity, there has been significant work on reduced precision representations. Unlike Central Processing Units, Graphics Processing Units and Application-Specific Integrated Circuits, which have fixed datapaths, Field Programmable Gate Arrays (FPGAs) are flexible and uniquely positioned to take advantage of reduced precision representations. This thesis presents FPGA architectures for low precision machine learning algorithms, considering three distinct levels: the application, the framework and the operator. Firstly, a spectral anomaly detection application is presented, designed for low latency and real-time processing of radio signals. Two types of detector are explored: a neural network autoencoder and a least squares bitmap detector. Secondly, a generalised matrix multiplication framework for the Intel HARPv2 is outlined. The framework was designed specifically for machine learning applications, containing runtime-configurable optimisations for reduced precision deep learning. Finally, a new machine-learning-specific operator is presented: a bit-dependent multiplication algorithm designed to conditionally add only the relevant parts of the operands and arbitrarily skip over redundant computation. Demonstrating optimisations on all three levels (the application, the framework and the operator) illustrates that FPGAs can achieve state-of-the-art performance in important machine learning workloads where high performance is critical, while simultaneously reducing implementation complexity.
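The bit-dependent multiplication operator can be illustrated with a shift-and-add multiply that only accumulates for the set bits of one operand, skipping the redundant additions entirely. This is the textbook form of the idea, not Moss's exact hardware algorithm:

```python
def bit_serial_mul(a, b):
    """Shift-and-add multiply that accumulates only for the set bits of b,
    skipping work for zero bits (in the spirit of bit-dependent
    multiplication; assumes non-negative b)."""
    acc, shift = 0, 0
    while b:
        if b & 1:               # only this bit contributes to the product
            acc += a << shift
        b >>= 1
        shift += 1
    return acc
```

In low-precision models many operand bits are zero, so conditionally skipping them cuts the number of additions, which is exactly where the hardware saving comes from.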
10

Lounici, Sofiane. "Watermarking machine learning models." Electronic Thesis or Diss., Sorbonne université, 2022. https://accesdistant.sorbonne-universite.fr/login?url=https://theses-intra.sorbonne-universite.fr/2022SORUS282.pdf.

Full text
Abstract:
The protection of the intellectual property of machine learning models appears increasingly necessary, given the investments involved and their impact on society. In this thesis, we propose to study the watermarking of machine learning models. We provide a state of the art on current watermarking techniques, and then complement it by considering watermarking beyond image classification tasks. We then define forging attacks against watermarking for model hosting platforms and present a new fairness-based watermarking technique. In addition, we propose an implementation of the presented techniques.
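Trigger-set verification, one common family among the watermarking techniques such a survey covers, can be sketched as follows; the thesis's own fairness-based scheme works differently, so this is background illustration only:

```python
def verify_watermark(model, trigger_inputs, trigger_labels, min_match=0.9):
    """Trigger-set watermark check: the owner queries a suspect model with
    secret inputs and counts how many answers carry the planted labels.
    (Illustrative; `model` is any callable input -> label, and the 0.9
    match threshold is an arbitrary choice.)"""
    hits = sum(model(x) == y for x, y in zip(trigger_inputs, trigger_labels))
    return hits / len(trigger_inputs) >= min_match
```

A legitimate model trained on the owner's poisoned trigger set passes the check, while an independently trained model almost certainly does not; forging attacks of the kind the thesis defines try to break exactly this asymmetry.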
11

Markatopoulou, Foteini. "Machine learning architectures for video annotation and retrieval." Thesis, Queen Mary, University of London, 2018. http://qmro.qmul.ac.uk/xmlui/handle/123456789/44693.

Full text
Abstract:
In this thesis we are designing machine learning methodologies for solving the problem of video annotation and retrieval using either pre-defined semantic concepts or ad-hoc queries. Concept-based video annotation refers to the annotation of video fragments with one or more semantic concepts (e.g. hand, sky, running), chosen from a predefined concept list. Ad-hoc queries refer to textual descriptions that may contain objects, activities, locations etc., and combinations of the former. Our contributions are: i) A thorough analysis on extending and using different local descriptors towards improved concept-based video annotation and a stacking architecture that uses in the first layer, concept classifiers trained on local descriptors and improves their prediction accuracy by implicitly capturing concept relations, in the last layer of the stack. ii) A cascade architecture that orders and combines many classifiers, trained on different visual descriptors, for the same concept. iii) A deep learning architecture that exploits concept relations at two different levels. At the first level, we build on ideas from multi-task learning, and propose an approach to learn concept-specific representations that are sparse, linear combinations of representations of latent concepts. At a second level, we build on ideas from structured output learning, and propose the introduction, at training time, of a new cost term that explicitly models the correlations between the concepts. By doing so, we explicitly model the structure in the output space (i.e., the concept labels). iv) A fully-automatic ad-hoc video search architecture that combines concept-based video annotation and textual query analysis, and transforms concept-based keyframe and query representations into a common semantic embedding space. 
Our architectures have been extensively evaluated on the TRECVID SIN 2013, the TRECVID AVS 2016, and other large-scale datasets, demonstrating their effectiveness compared to similar approaches.
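The stacking idea in contribution (i), refining first-layer concept scores by exploiting concept relations in a second layer, reduces to something like the following when the second layer is a fixed linear combination. The weights here are invented for illustration; the thesis learns the second-layer model from data:

```python
def stack_predict(first_layer_scores, relation_weights):
    """Second-layer stacking: refine each concept's score as a weighted
    combination of all first-layer concept scores, implicitly capturing
    concept correlations (e.g. 'sky' co-occurring with 'sun').
    Illustrative weights; a real stack trains this layer."""
    concepts = list(first_layer_scores)
    return {
        c: sum(relation_weights[c][d] * first_layer_scores[d] for d in concepts)
        for c in concepts
    }
```

A concept whose detector is weak can thus borrow evidence from correlated concepts, which is what "implicitly capturing concept relations in the last layer of the stack" refers to.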
12

Frindt Faundez, Catharina, and Sivan Dawood. "Automatic Generation of Real-Time Machine Learning Architectures." Thesis, Högskolan i Halmstad, Akademin för informationsteknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-45541.

Full text
Abstract:
An era is rising where more embedded systems are being moved to the edge. Everything from automated vehicles to smartphones with complex machine learning architectures needs to be provided for, so the requirement for efficiency grows. This is required not only for offline applications; the demand is also rising for real-time applications. Therefore, software developers who are experts in real-time machine learning architectures may find themselves in a situation where an architecture needs to be implemented on an embedded system that can provide the efficiency being demanded. FPGAs can meet these demands. However, implementing a design in a hardware description language (HDL) is time-consuming for non-experts. Our project focuses on building a frontend tool that generates a dataflow programming language called CAL from a customized implementation of a specific machine learning model. The dataflow programming language CAL is used to accomplish an efficient generation of hardware circuits. In this project, our primary focus is latency. The execution time of a software implementation has been compared to a hardware implementation, where a Raspberry Pi 3b provided the software implementation. A design space exploration has been done in which different designs from the same model have been analyzed; in addition, the modules have been analyzed separately. In the analysis, latency is the factor explored. Results show a much faster execution time for the hardware implementation than for the software implementation. Final results demonstrate a lower overall delay for modules implemented in parallel than for modules implemented in serial; a parallel implementation reduced the overall delay by 242%.
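The frontend's job, generating CAL dataflow actors from a model description, might look schematically like this; the emitted text is a rough CAL-like skeleton (actor header, one action, closing `end`), not the tool's actual output format:

```python
def emit_cal_actor(name, inputs, outputs):
    """Emit a skeletal CAL-like actor for one model operator.
    Hypothetical template: a real generator would also emit the action
    body computing the layer's arithmetic, state variables, etc."""
    ins = ", ".join(f"int {p}" for p in inputs)
    outs = ", ".join(f"int {p}" for p in outputs)
    head = f"actor {name} () {ins} ==> {outs} :"
    # One firing rule consuming a token per input, producing per output
    action = ("  action "
              + ", ".join(f"{p}:[{p}_t]" for p in inputs)
              + " ==> "
              + ", ".join(f"{p}:[0]" for p in outputs)
              + " end")
    return "\n".join([head, action, "end"])
```

A frontend would walk the model graph, call a template like this per layer or module, and then hand the resulting actors to a CAL-to-HDL backend, which is where the serial-versus-parallel design space opens up.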
APA, Harvard, Vancouver, ISO, and other styles
13

Hardoon, David Roi. "Semantic models for machine learning." Thesis, University of Southampton, 2006. https://eprints.soton.ac.uk/262019/.

Full text
Abstract:
In this thesis we present approaches to the creation and usage of semantic models by the analysis of the data spread in the feature space. We aim to introduce the general notion of using feature selection techniques in machine learning applications. The applied approaches obtain new feature directions on data, such that machine learning applications show an increase in performance. We review three principal methods that are used throughout the thesis. The first is Canonical Correlation Analysis (CCA), a method of correlating linear relationships between two multidimensional variables. CCA can be seen as using complex labels as a way of guiding feature selection towards the underlying semantics; it makes use of two views of the same semantic object to extract a representation of the semantics. The second is Partial Least Squares (PLS), a method similar to CCA. It selects feature directions that are useful for the task at hand, though PLS uses only one view of an object together with the label as the corresponding pair; PLS can be thought of as a method that looks for directions that are good for distinguishing the different labels. The third method is the Fisher kernel, which aims to extract more information from a generative model than its output probabilities alone. The aim is to analyse how the Fisher score depends on the model and which aspects of the model are important in determining the Fisher score. We focus our theoretical investigation primarily on CCA and its kernel variant, providing a theoretical analysis of the method's stability using Rademacher complexity and hence deriving an error bound for new data. We conclude the thesis by applying the described approaches to problems in the various fields of image, text, music and medical analysis, describing several novel applications on relevant real-world data.
The aim of the thesis is to provide a theoretical understanding of semantic models, while also providing a good application foundation on how these models can be practically used.
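As a concrete illustration of the CCA component described in this abstract (an illustrative sketch, not the thesis's implementation), the canonical correlations between two views can be computed with plain NumPy by whitening each view and taking the SVD of the cross-covariance; the synthetic two-view data and the small ridge term `reg` are assumptions added here for numerical stability:

```python
import numpy as np

def canonical_correlations(X, Y, reg=1e-8):
    """Return the canonical correlations between two views X (n, p) and Y (n, q).

    `reg` is a small ridge term for numerical stability (an implementation
    choice, not part of classical CCA).
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / n

    def inv_sqrt(S):
        # Inverse matrix square root via the eigendecomposition of S.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    # Singular values of Sxx^{-1/2} Sxy Syy^{-1/2} are the canonical correlations.
    M = inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy)
    return np.linalg.svd(M, compute_uv=False)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
# Second view is a linear transform of the first plus a little noise,
# so the canonical correlations should be close to 1.
Y = X @ rng.normal(size=(3, 3)) + 0.01 * rng.normal(size=(500, 3))
corrs = canonical_correlations(X, Y)
```

In a two-view semantic setting (e.g. image and caption features), the leading pairs of canonical directions play the role of the learned semantic representation.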
APA, Harvard, Vancouver, ISO, and other styles
14

Stark, Randall J. "Connectionist variable binding architectures." Thesis, University of Sussex, 1993. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.260835.

Full text
APA, Harvard, Vancouver, ISO, and other styles
15

Losing, Viktor [Verfasser]. "Memory Models for Incremental Learning Architectures / Viktor Losing." Bielefeld : Universitätsbibliothek Bielefeld, 2019. http://d-nb.info/1191896420/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
16

Bone, Nicholas. "Models of programs and machine learning." Thesis, University of Oxford, 1998. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.244565.

Full text
APA, Harvard, Vancouver, ISO, and other styles
17

Zhu, Xiaodan. "On Cross-Series Machine Learning Models." W&M ScholarWorks, 2020. https://scholarworks.wm.edu/etd/1616444550.

Full text
Abstract:
Sparse high dimensional time series are common in industry, such as in supply chain demand and retail sales. Accurate and reliable forecasting of high dimensional time series is essential for supply chain planning and business management. In practical applications, sparse high dimensional time series prediction faces three challenges: (1) simple models cannot capture complex patterns, (2) insufficient data prevents us from pursuing more advanced models, and (3) time series in the same dataset may have widely different properties. These challenges prevent the currently prevalent models and theoretically successful advanced models (e.g., neural networks) from working in actual use. We focus our research on a pharmaceutical (pharma) demand forecasting problem. To overcome the challenges faced by sparse high dimensional time series, we develop a cross-series learning framework that trains a machine learning model on multiple related time series and uses cross-series information to improve forecasting accuracy. Cross-series learning is further optimized by dividing the global time series into subgroups based on three grouping schemes to balance the tradeoff between sample size and sample quality. Moreover, downstream inventory is introduced as an additional feature to support demand forecasting. Combining the cross-series learning framework with advanced machine learning models, we significantly improve the accuracy of pharma demand predictions. To verify the generalizability of cross-series learning, a generic forecasting framework containing the operations required for cross-series learning is developed and applied to retail sales forecasting. We further confirm the benefits of cross-series learning for advanced models, especially RNN. In addition to the grouping schemes based on product characteristics, we also explore two grouping schemes based on time series clustering, which do not require domain knowledge and can be applied to other fields. 
Using a retail sales dataset, our cross-series machine learning models are still superior to the baseline models. This dissertation develops a collection of cross-series learning techniques optimized for sparse high dimensional time series that can be applied to pharma manufacturers, retailers, and possibly other industries. Extensive experiments are carried out on real datasets to provide empirical value and insights for relevant theoretical studies. In practice, our work guides the actual use of cross-series learning.
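The cross-series idea above — training one global model on many related series instead of one model per series — can be sketched in a few lines. This is a toy illustration only: the pooled lag-1 linear model, the one-hot series identifiers, and the synthetic demand series are assumptions standing in for the dissertation's actual grouping schemes and machine learning models:

```python
import numpy as np

rng = np.random.default_rng(1)
K, T = 5, 60  # five related series, sixty time steps each
levels = rng.uniform(5, 10, size=K)
series = [lvl + 0.9 * np.sin(np.arange(T) / 4) + 0.1 * rng.normal(size=T)
          for lvl in levels]

# Pool every series into one design matrix: lag-1 value + one-hot series id.
rows, targets = [], []
for k, s in enumerate(series):
    for t in range(1, T):
        onehot = np.zeros(K)
        onehot[k] = 1.0
        rows.append(np.concatenate(([s[t - 1]], onehot)))
        targets.append(s[t])
X = np.array(rows)
y = np.array(targets)

# One global (cross-series) model learned by least squares, in place of
# K separate per-series models.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# One-step-ahead forecast for series 0 using the shared parameters.
x_new = np.concatenate(([series[0][-1]], np.eye(K)[0]))
forecast = x_new @ coef
```

The benefit for sparse series comes from the shared lag coefficient being estimated from all K * (T - 1) pooled samples rather than T - 1 per series.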
APA, Harvard, Vancouver, ISO, and other styles
18

Amerineni, Rajesh. "BRAIN-INSPIRED MACHINE LEARNING CLASSIFICATION MODELS." OpenSIUC, 2020. https://opensiuc.lib.siu.edu/dissertations/1806.

Full text
Abstract:
This dissertation focuses on the development of three classes of brain-inspired machine learning classification models. The models attempt to emulate (a) multi-sensory integration, (b) context-integration, and (c) visual information processing in the brain. The multi-sensory integration models are aimed at enhancing object classification through the integration of semantically congruent unimodal stimuli. Two multimodal classification models are introduced: the feature integrating (FI) model and the decision integrating (DI) model. The FI model, inspired by multisensory integration in the subcortical superior colliculus, combines unimodal features which are subsequently classified by a multimodal classifier. The DI model, inspired by integration in primary cortical areas, classifies unimodal stimuli independently using unimodal classifiers and classifies the combined decisions using a multimodal classifier. The multimodal classifier models are implemented using multilayer perceptrons and multivariate statistical classifiers. Experiments involving the classification of noisy and attenuated auditory and visual representations of ten digits are designed to demonstrate the properties of the multimodal classifiers and to compare the performances of multimodal and unimodal classifiers. The experimental results show that the multimodal classification systems exhibit an important aspect of the “inverse effectiveness principle” by yielding significantly higher classification accuracies when compared with those of the unimodal classifiers. Furthermore, the flexibility offered by the generalized models enables the simulations and evaluations of various combinations of multimodal stimuli and classifiers under varying uncertainty conditions. The context-integrating model emulates the brain’s ability to use contextual information to uniquely resolve the interpretation of ambiguous stimuli.
A deep learning neural network classification model that emulates this ability by integrating weighted bidirectional context into the classification process is introduced. The model, referred to as the CINET, is implemented using a convolutional neural network (CNN), which is shown to be ideal for combining target and context stimuli and for extracting coupled target-context features. The CINET parameters can be manipulated to simulate congruent and incongruent context environments and to manipulate target-context stimuli relationships. The formulation of the CINET is quite general; consequently, it is not restricted to stimuli in any particular sensory modality nor to the dimensionality of the stimuli. A broad range of experiments are designed to demonstrate the effectiveness of the CINET in resolving ambiguous visual stimuli and in improving the classification of non-ambiguous visual stimuli in various contextual environments. The fact that the performance improves through the inclusion of context can be exploited to design robust brain-inspired machine learning algorithms. It is interesting to note that the CINET is a classification model that is inspired by a combination of the brain’s ability to integrate contextual information and the CNN, which is inspired by the hierarchical processing of visual information in the visual cortex. A convolutional neural network (CNN) model, inspired by the hierarchical processing of visual information in the brain, is introduced to fuse information from an ensemble of multi-axial sensors in order to classify strikes such as boxing punches and taekwondo kicks in combat sports. Although CNNs are not an obvious choice for non-array data nor for signals with non-linear variations, it will be shown that CNN models can effectively classify multi-axial multi-sensor signals.
Experiments involving the classification of three-axis accelerometer and three-axis gyroscope signals measuring boxing punches and taekwondo kicks showed that the performance of the fusion classifiers was significantly superior to that of the uni-axial classifiers. Interestingly, the classification accuracies of the CNN fusion classifiers were significantly higher than those of the DTW fusion classifiers. Through training with representative signals and the local feature extraction property, the CNNs tend to be invariant to latency shifts and non-linear variations. Moreover, by increasing the number of network layers and the training set, the CNN classifiers offer the potential for even better performance as well as the ability to handle a larger number of classes. Finally, due to the generalized formulations, the classifier models can be easily adapted to classify multi-dimensional signals of multiple sensors in various other applications.
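The decision-integrating (DI) idea in this abstract — classify each modality independently, then classify the combined decisions — can be sketched with simple stand-in classifiers. Everything concrete here is an assumption for illustration: nearest-centroid classifiers replace the thesis's multilayer perceptrons, and the two "modalities" are synthetic Gaussian data rather than audio/visual digits:

```python
import numpy as np

rng = np.random.default_rng(2)

def make_modality(n_per_class, shift):
    """Synthetic two-class unimodal data; class means separated by `shift`."""
    a = rng.normal(0.0, 1.0, size=(n_per_class, 4))
    b = rng.normal(shift, 1.0, size=(n_per_class, 4))
    return np.vstack([a, b]), np.array([0] * n_per_class + [1] * n_per_class)

def centroid_scores(X_train, y_train, X):
    """Stand-in unimodal 'classifier': negative distance to each class centroid."""
    cents = np.array([X_train[y_train == c].mean(axis=0) for c in (0, 1)])
    return -np.linalg.norm(X[:, None, :] - cents[None, :, :], axis=2)

# Two semantically congruent modalities observed for the same objects.
Xa, y = make_modality(100, shift=1.5)
Xv, _ = make_modality(100, shift=1.5)

# Stage 1: independent unimodal decisions (score vectors per class).
Sa = centroid_scores(Xa, y, Xa)
Sv = centroid_scores(Xv, y, Xv)

# Stage 2: a multimodal classifier over the concatenated decision scores.
combined = np.hstack([Sa, Sv])
cents = np.array([combined[y == c].mean(axis=0) for c in (0, 1)])
pred = np.argmin(np.linalg.norm(combined[:, None, :] - cents[None], axis=2), axis=1)

acc_multi = (pred == y).mean()                  # DI (multimodal) accuracy
acc_uni = (np.argmax(Sa, axis=1) == y).mean()   # single-modality accuracy
```

With two informative modalities, the second-stage classifier typically beats either unimodal decision on its own, which is the effect the abstract's multimodal experiments quantify.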
APA, Harvard, Vancouver, ISO, and other styles
19

MARRAS, MIRKO. "Machine Learning Models for Educational Platforms." Doctoral thesis, Università degli Studi di Cagliari, 2020. http://hdl.handle.net/11584/285377.

Full text
Abstract:
Scaling up education online and onlife is presenting numerous key challenges, such as hardly manageable classes, overwhelming content alternatives, and academic dishonesty while interacting remotely. However, thanks to the wider availability of learning-related data and increasingly higher performance computing, Artificial Intelligence has the potential to turn such challenges into an unparalleled opportunity. One of its sub-fields, namely Machine Learning, is enabling machines to receive data and learn for themselves, without being programmed with rules. Bringing this intelligent support to education at large scale has a number of advantages, such as avoiding manual error-prone tasks and reducing the chance that learners do any misconduct. Planning, collecting, developing, and predicting become essential steps to make it concrete into real-world education. This thesis deals with the design, implementation, and evaluation of Machine Learning models in the context of online educational platforms deployed at large scale. Constructing and assessing the performance of intelligent models is a crucial step towards increasing reliability and convenience of such an educational medium. The contributions result in large data sets and high-performing models that capitalize on Natural Language Processing, Human Behavior Mining, and Machine Perception. The model decisions aim to support stakeholders over the instructional pipeline, specifically on content categorization, content recommendation, learners’ identity verification, and learners’ sentiment analysis. Past research in this field often relied on statistical processes hardly applicable at large scale. Through our studies, we explore opportunities and challenges introduced by Machine Learning for the above goals, a relevant and timely topic in literature. Supported by extensive experiments, our work reveals a clear opportunity in combining human and machine sensing for researchers interested in online education. 
Our findings illustrate the feasibility of designing and assessing Machine Learning models for categorization, recommendation, authentication, and sentiment prediction in this research area. Our results provide guidelines on model motivation, data collection, model design, and analysis techniques concerning the above applicative scenarios. Researchers can use our findings to improve data collection on educational platforms, to reduce bias in data and models, to increase model effectiveness, and to increase the reliability of their models, among others. We expect that this thesis can support the adoption of Machine Learning models in educational platforms even more, strengthening the role of data as a precious asset. The thesis outputs are publicly available at https://www.mirkomarras.com.
APA, Harvard, Vancouver, ISO, and other styles
20

Wåhlin, Lova. "Towards Machine Learning Enabled Automatic Design of IT-Network Architectures." Thesis, KTH, Matematisk statistik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-249213.

Full text
Abstract:
There are many machine learning techniques that cannot be performed on graph data. Techniques such as graph embedding, i.e. mapping a graph to a vector, can open up a variety of machine learning solutions. This thesis addresses to what extent static graph embedding techniques can capture important characteristics of an IT-architecture graph, with the purpose of embedding the graphs in a common Euclidean vector space that can serve as the state space in a reinforcement learning setup. The metric used for evaluating the performance of the embedding is the security of the graph, i.e. the time it would take for an unauthorized attacker to penetrate the IT-architecture graph. The algorithms evaluated in this work are the node embedding methods node2vec and gat2vec and the graph embedding method graph2vec. The predictive results of the embeddings are compared with two baseline methods. The results of each of the algorithms mostly display a significant predictive performance improvement compared to the baseline, where the F1 score in some cases is doubled. Indeed, the results indicate that static graph embedding methods can in fact capture some information about the security of an IT-architecture. However, no conclusion can be made as to whether a static graph embedding is actually the best contender for posing as the state space in a reinforcement learning framework. To make a certain conclusion, other options have to be researched, such as dynamic graph embedding methods.<br>There are many machine learning techniques that cannot be applied to data in the form of a graph. Techniques such as graph embedding, in other words mapping a graph to a vector space, can open up a greater variety of machine learning solutions. This thesis evaluates how well static graph embeddings can capture important security properties of an IT architecture modeled as a graph, with the aim of being used in a reinforcement learning algorithm. 
The graph property used to validate the embedding methods is how long it would take an unauthorized attacker to penetrate the IT architecture. The algorithms implemented are the node embedding methods node2vec and gat2vec, as well as the graph embedding method graph2vec. The predictive results are compared with two baseline methods. The results of all three methods show clear improvements relative to the baselines, with the F1 score doubling in some cases. It can thus be concluded that all three methods can capture security properties of an IT architecture. However, it cannot be said that static graph embeddings are the best solution for representing a graph in a reinforcement learning algorithm; there are other complications with static methods, for example that embeddings from these methods cannot be generalized to data that was not used for training. To draw a definite conclusion, more investigation is required, for example of dynamic graph embedding methods.
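To make "embedding a graph in a Euclidean vector space" concrete (this is an illustrative sketch only — Laplacian eigenmaps, a simpler spectral method than the node2vec/gat2vec/graph2vec algorithms the thesis evaluates), nodes can be mapped to coordinates using eigenvectors of the normalized graph Laplacian:

```python
import numpy as np

def laplacian_embedding(A, dim=1):
    """Embed the nodes of an undirected graph (adjacency matrix A) into R^dim
    using the eigenvectors of the normalized Laplacian, skipping the trivial
    smallest-eigenvalue direction."""
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    L = np.eye(len(A)) - D_inv_sqrt @ A @ D_inv_sqrt
    _, V = np.linalg.eigh(L)          # eigenvalues ascending
    return V[:, 1:dim + 1]

# Toy "IT architecture": two tightly connected 4-node clusters (nodes 0-3
# and 4-7) joined by a single bridge edge.
A = np.zeros((8, 8))
for grp in (range(0, 4), range(4, 8)):
    for i in grp:
        for j in grp:
            if i != j:
                A[i, j] = 1.0
A[3, 4] = A[4, 3] = 1.0

emb = laplacian_embedding(A, dim=1)   # the Fiedler direction separates clusters
intra = np.linalg.norm(emb[0] - emb[1])   # same cluster
inter = np.linalg.norm(emb[0] - emb[5])   # across the bridge
```

Nodes in the same community land close together while nodes on opposite sides of the cut land far apart, which is the kind of structural information a downstream reinforcement learning state space would rely on.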
APA, Harvard, Vancouver, ISO, and other styles
21

FONTANELLA, ALESSANDRO. "High Performance Architectures for Hyperspectral Image Processing and Machine Learning." Doctoral thesis, Università degli studi di Pavia, 2019. http://hdl.handle.net/11571/1244486.

Full text
APA, Harvard, Vancouver, ISO, and other styles
22

Kim, Been. "Interactive and interpretable machine learning models for human machine collaboration." Thesis, Massachusetts Institute of Technology, 2015. http://hdl.handle.net/1721.1/98680.

Full text
Abstract:
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Aeronautics and Astronautics, 2015.<br>Cataloged from PDF version of thesis.<br>Includes bibliographical references (pages 135-143).<br>I envision a system that enables successful collaborations between humans and machine learning models by harnessing the relative strength to accomplish what neither can do alone. Machine learning techniques and humans have skills that complement each other - machine learning techniques are good at computation on data at the lowest level of granularity, whereas people are better at abstracting knowledge from their experience, and transferring the knowledge across domains. The goal of this thesis is to develop a framework for human-in-the-loop machine learning that enables people to interact effectively with machine learning models to make better decisions, without requiring in-depth knowledge about machine learning techniques. Many of us interact with machine learning systems every day. Systems that mine data for product recommendations, for example, are ubiquitous. However, these systems compute their output without end-user involvement, and there are typically no life or death consequences in the case the machine learning result is not acceptable to the user. In contrast, domains where decisions can have serious consequences (e.g., emergency response planning, medical decision-making), require the incorporation of human experts' domain knowledge. These systems also must be transparent to earn experts' trust and be adopted in their workflow. The challenge addressed in this thesis is that traditional machine learning systems are not designed to extract domain experts' knowledge from natural workflow, or to provide pathways for the human domain expert to directly interact with the algorithm to interject their knowledge or to better understand the system output.
For machine learning systems to make a real-world impact in these important domains, these systems must be able to communicate with highly skilled human experts to leverage their judgment and expertise, and share useful information or patterns from the data. In this thesis, I bridge this gap by building human-in-the-loop machine learning models and systems that compute and communicate machine learning results in ways that are compatible with the human decision-making process, and that can readily incorporate human experts' domain knowledge. I start by building a machine learning model that infers human teams' planning decisions from the structured form of natural language of team meetings. I show that the model can infer a human team's final plan with 86% accuracy on average. I then design an interpretable machine learning model that "makes sense to humans" by exploring and communicating patterns and structure in data to support human decision-making. Through human subject experiments, I show that this interpretable machine learning model offers statistically significant quantitative improvements in interpretability while preserving clustering performance. Finally, I design a machine learning model that supports transparent interaction with humans without requiring that a user has expert knowledge of machine learning techniques. I build a human-in-the-loop machine learning system that incorporates human feedback and communicates its internal states to humans, using an intuitive medium for interaction with the machine learning model. I demonstrate the application of this model for an educational domain in which teachers cluster programming assignments to streamline the grading process.<br>by Been Kim.<br>Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
23

Shen, Chenyang. "Regularized models and algorithms for machine learning." HKBU Institutional Repository, 2015. https://repository.hkbu.edu.hk/etd_oa/195.

Full text
Abstract:
Multi-label learning (ML), multi-instance multi-label learning (MIML), large network learning and random under-sampling systems are four active research topics in machine learning which have been studied intensively recently. So far, there are still many open problems in these topics that attract worldwide attention from researchers. This thesis mainly focuses on several novel methods designed for these research tasks respectively. The main difference between ML learning and the traditional classification task is that in ML learning, one object can be characterized by several different labels (or classes). One important observation is that the labels received by similar objects in ML data are usually highly correlated with each other. In order to explore this correlation of labels between objects, which might be a key issue in ML learning, we require the resulting label indicator to be low rank. In the proposed model, the nuclear norm, a well-known convex relaxation of the intractable matrix rank, is applied to the label indicator in order to exploit the underlying correlation in the label domain. Motivated by the idea of spectral clustering, we also incorporate information from the feature domain by constructing a graph among objects based on their features. Then, with partial label information available, we integrate them together into a convex low-rank based model designed for ML learning. The proposed model can be solved efficiently by using the alternating direction method of multipliers (ADMM). We test the performance on several benchmark ML data sets and make comparisons with state-of-the-art algorithms. The classification results demonstrate the efficiency and effectiveness of the proposed low-rank based methods.
One step further, we consider the MIML learning problem, which is usually more complicated than ML learning: besides the possibility of having multiple labels, each object can be described by multiple instances simultaneously, which may significantly increase the size of the data. To handle the MIML learning problem we first propose and develop a novel sparsity-based MIML learning algorithm. Our idea is to formulate and construct a transductive objective function for the label indicator to be learned by using the method of random walk with restart, which exploits the relationships among instances and labels of objects, and computes the affinities among the objects. Then sparsity can be introduced in the label indicator of the objective function such that relevant and irrelevant objects with respect to a given class can be distinguished. The resulting sparsity-based MIML model can be given as a constrained convex optimization problem, and it can be solved very efficiently by using the augmented Lagrangian method (ALM). Experimental results on benchmark data have shown that the proposed sparse-MIML algorithm is computationally efficient, and effective in label prediction for MIML data. We demonstrate that the performance of the proposed method is better than that of the other tested MIML learning algorithms. Moreover, one big concern of an MIML learning algorithm is computational efficiency, especially when solving classification problems for large data sets. Most of the existing methods for solving MIML problems in the literature may take a long computational time and have a huge storage cost for large MIML data sets. In this thesis, our main aim is to propose and develop an efficient Markov chain based learning algorithm for MIML problems. Our idea is to perform label classification among objects and feature identification iteratively through two Markov chains constructed by using objects and features respectively.
The classification of objects can be obtained by using label propagation via training data in the iterative method. Because it is not necessary to compute and store a huge affinity matrix among objects/instances, both the storage and computational time can be reduced significantly. For instance, when we handle an MIML image data set of 10000 objects and 250000 instances, the proposed algorithm takes about 71 seconds. Also, experimental results on some benchmark data sets are reported to illustrate the effectiveness of the proposed method in one-error, ranking loss, coverage and average precision, and show that it is competitive with the other methods. In addition, we consider module identification from large biological networks. Nowadays, the interactions among different genes, proteins and other small molecules are becoming more and more significant and have been studied intensively. One general way that helps people understand these interactions is to analyze networks constructed from genes/proteins. In particular, module structure, as a common property of most biological networks, has drawn much attention of researchers from different fields. However, biological networks might be corrupted by noise in the data, which often leads to the misidentification of module structure. Besides, some edges in the network might be removed (or some nodes might be misconnected) when improper parameters are selected, which may also affect the modules identified significantly. In conclusion, the module identification results are sensitive to noise as well as to the parameter selection of the network. In this thesis, we consider employing multiple networks for consistent module detection in order to reduce the effect of noise and parameter settings. Instead of studying different networks separately, our idea is to combine multiple networks together by building them into tensor-structured data.
Then, given any node as prior label information, tensor-based Markov chains are constructed iteratively for identification of the modules shared by the multiple networks. In addition, the proposed tensor-based Markov chain algorithm is capable of simultaneously evaluating the contribution from each network. It would be useful to measure the consistency of modules in the multiple networks. In the experiments, we test our method on two groups of gene co-expression networks from human beings. We also validate the biological meaning of the modules identified by the proposed method. Finally, we introduce random under-sampling techniques with application to X-ray computed tomography (CT). Under-sampling techniques are recognized as powerful tools for reducing the scale of a problem, especially for large data analysis. However, information loss seems to be unavoidable, which inspires different under-sampling strategies for preserving more useful information. Here we focus on under-sampling for the real-world CT reconstruction problem. The main motivation is to reduce the total radiation dose delivered to the patient, which has raised significant clinical concern for CT imaging. We compare two popular regular CT under-sampling strategies with ray random under-sampling. The results support the conclusion that random under-sampling always outperforms regular ones, especially for the high down-sampling ratio cases. Moreover, based on the random ray under-sampling strategy, we propose a novel scatter removal method which further improves the performance of ray random under-sampling in CT reconstruction.
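The nuclear-norm regularization in the low-rank ML model above is typically handled inside ADMM through its proximal operator, singular value thresholding. A minimal sketch of that one building block (the full ADMM loop and graph term are omitted; the toy label-indicator matrix is an assumption):

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: the prox operator of tau * nuclear norm.
    Each singular value of M is shrunk by tau and clipped at zero, which
    promotes a low-rank solution."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    return U @ np.diag(s_shrunk) @ Vt

rng = np.random.default_rng(3)
# A rank-2 "label indicator"-like matrix corrupted by small noise.
M = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 10)) \
    + 0.05 * rng.normal(size=(30, 10))
Z = svt(M, tau=1.0)
rank_before = np.linalg.matrix_rank(M)
rank_after = np.linalg.matrix_rank(Z, tol=1e-6)
```

Inside an ADMM iteration this step recovers the low-rank structure while the other updates enforce data fidelity and the graph constraint.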
APA, Harvard, Vancouver, ISO, and other styles
24

Ahlin, Mikael, and Felix Ranby. "Predicting Marketing Churn Using Machine Learning Models." Thesis, Umeå universitet, Institutionen för matematik och matematisk statistik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-161408.

Full text
Abstract:
For any organisation that engages in marketing actions there is a need to understand how people react to the communication messages that are sent. Since the introduction of the General Data Protection Regulation, the requirements for personal data usage have increased and people are able to affect the way their personal information is used by companies. For instance, people have the possibility to unsubscribe from communication that is sent; this is called Opt-Out and can be viewed as churning from communication channels. When a customer Opts Out, the organisation loses the opportunity to send personalised marketing to that individual, which in turn results in lost revenue.  The aim of this thesis is to investigate the Opt-Out phenomenon and build a model that is able to predict the risk of losing a customer from the communication channels. The risk of losing a customer is measured as the estimated probability that a specific individual will Opt-Out in the near future. To predict future Opt-Outs the project uses machine learning algorithms on aggregated communication and customer data. Of the algorithms that were tested, the best and most stable performance was achieved by an Extreme Gradient Boosting algorithm that used simulated variables. The performance of the model is best described by an AUC score of 0.71 and a lift score of 2.21, with an adjusted threshold, on data two months into the future from when the model was trained. With a model that uses simulated variables the computational cost goes up. However, the increase in performance is significant, and it can be concluded that the choice to include information about specific communications is relevant for the outcome of the predictions. A boosted method such as the Extreme Gradient Boosting algorithm generates stable results, which allows a longer time between model retraining sessions.
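The two numbers this abstract reports, AUC and lift, can be computed directly from model scores. A small NumPy sketch of both metrics (the formulas are the standard definitions, not specific to the thesis; the synthetic Opt-Out labels and scores are assumptions for illustration):

```python
import numpy as np

def auc(y_true, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) formula,
    assuming no tied scores."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    pos = y_true == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def lift_at(y_true, scores, top_frac=0.1):
    """Opt-Out rate among the top-scored fraction, relative to the base rate."""
    k = max(1, int(len(scores) * top_frac))
    top = np.argsort(scores)[::-1][:k]
    return y_true[top].mean() / y_true.mean()

rng = np.random.default_rng(4)
y = (rng.random(1000) < 0.2).astype(int)        # ~20% Opt-Out base rate
scores = y * 0.5 + rng.random(1000) * 0.8       # noisy but informative model
model_auc = auc(y, scores)
model_lift = lift_at(y, scores, top_frac=0.1)
```

A lift above 1 means targeting the highest-risk customers catches disproportionately many future Opt-Outs, which is exactly what a retention campaign needs.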
APA, Harvard, Vancouver, ISO, and other styles
25

BALLANTE, ELENA. "Statistical and Machine Learning models for Neurosciences." Doctoral thesis, Università degli studi di Pavia, 2021. http://hdl.handle.net/11571/1447634.

Full text
Abstract:
This thesis addresses several problems encountered in the field of statistical and machine learning methods for data analysis in neurosciences. The thesis is divided into three parts. The first part of the thesis is related to classification tree models. In the research field of polarization measures, a new polarization measure is defined. The function is incorporated in the decision tree algorithm as a splitting function in order to tackle some weaknesses of classical impurity measures. The new algorithm is called the Polarized Classification Tree model. The model is tested on simulated and real data sets and compared with decision tree models where the classical impurity measures are deployed. In the second part of the thesis a new index for assessing and selecting the best model in a classification task when the target variable is ordinal is developed. The index proposed is compared to the traditional measures on simulated data sets and it is applied in a real case study related to Attenuated Psychosis Syndrome. The third part covers the topic of smoothing methods for quaternion time series data in the context of motion data classification. Different proper methods for smoothing time series in quaternion algebra are reviewed and a new method is proposed. The new method is compared with a method proposed in the literature in terms of classification performance on a real data set and five data sets obtained by introducing different degrees of noise. The results confirmed the hypothesis made on the basis of the theoretical information available from the two methods, i.e. the logarithm is smoother and generally provides better results than the existing method in terms of classification performance.
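For context on where a polarization-style splitting function plugs in (the thesis's own measure is not reproduced here), this sketch chooses a split threshold by minimizing the classical weighted Gini impurity — precisely the component the Polarized Classification Tree replaces:

```python
import numpy as np

def gini(y):
    """Gini impurity of an integer label vector."""
    if len(y) == 0:
        return 0.0
    p = np.bincount(y) / len(y)
    return 1.0 - np.sum(p ** 2)

def best_split(x, y):
    """Exhaustively search thresholds on feature x for the one minimizing
    the size-weighted child impurity. A custom polarization measure would
    be substituted for `gini` here."""
    best_t, best_score = None, np.inf
    for t in np.unique(x)[:-1]:
        left, right = y[x <= t], y[x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

# Perfectly separable toy data: the best threshold sits between 3 and 10.
x = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
y = np.array([0, 0, 0, 1, 1, 1])
threshold, impurity = best_split(x, y)
```

A full tree recurses this search over features and child nodes; only the scoring function changes between the classical and the polarized variants.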
APA, Harvard, Vancouver, ISO, and other styles
26

GUIDOTTI, DARIO. "Verification and Repair of Machine Learning Models." Doctoral thesis, Università degli studi di Genova, 2022. http://hdl.handle.net/11567/1082694.

Full text
Abstract:
In recent years, machine learning (ML) has gained incredible traction in the Artificial Intelligence community, and ML models have found successful applications in many domains across computer science. However, it is hard to provide any formal guarantee on the behavior of ML models, and therefore their reliability is still in doubt, especially concerning their deployment in safety- and security-critical applications. Verification and repair have emerged as promising solutions to address some of these problems. In this dissertation, we present our contributions to these two lines of research: in particular, we focus on verifying and repairing machine-learned controllers, leveraging learning techniques to enhance the verification and repair of neural networks, and developing novel tools and algorithms for verifying neural networks. Part of our research is made available in the library pyNeVer, which provides capabilities for training, verification, and management of neural networks.
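The abstract does not say which verification algorithm pyNeVer uses, so as an illustration of the general idea only, here is interval bound propagation, a common baseline for neural network verification: an input box is pushed through a dense layer and a ReLU, yielding sound bounds on the output.

```python
def ibp_layer(lo, hi, W, b):
    """Propagate an input box [lo, hi] through a dense layer x -> Wx + b.
    For each weight, pick the interval end that minimises (or maximises) the sum."""
    out_lo, out_hi = [], []
    for row, bias in zip(W, b):
        lo_acc = bias + sum(w * (l if w >= 0 else h) for w, l, h in zip(row, lo, hi))
        hi_acc = bias + sum(w * (h if w >= 0 else l) for w, l, h in zip(row, lo, hi))
        out_lo.append(lo_acc)
        out_hi.append(hi_acc)
    return out_lo, out_hi

def relu_box(lo, hi):
    """ReLU is monotone, so it maps a box to a box elementwise."""
    return [max(0.0, l) for l in lo], [max(0.0, h) for h in hi]
```

If the propagated output box never crosses a safety threshold, the property is verified for the whole input region; if it does, the bound is inconclusive and tighter methods are needed.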
APA, Harvard, Vancouver, ISO, and other styles
27

Mohanty, Siddharth. "Autotuning wavefront patterns for heterogeneous architectures." Thesis, University of Edinburgh, 2015. http://hdl.handle.net/1842/10557.

Full text
Abstract:
Manual tuning of applications for heterogeneous parallel systems is tedious and complex. Optimizations are often not portable, and the whole process must be repeated when moving to a new system, or sometimes even to a different problem size. Pattern-based parallel programming models were originally designed to provide programmers with an abstract layer, hiding tedious parallel boilerplate code and allowing a focus on application-specific issues only. However, the constrained algorithmic model associated with each pattern also enables the creation of pattern-specific optimization strategies, which can capture more complex variations than would be accessible by analysis of equivalent unstructured source code. These variations create complex optimization spaces, and machine learning offers well-established techniques for exploring such spaces. In this thesis we use machine learning to create autotuning strategies for heterogeneous parallel implementations of applications that follow the wavefront pattern. In a wavefront, computation starts from one corner of the problem grid and proceeds diagonally, like a wave, to the opposite corner in either two or three dimensions. Our framework partitions and optimizes the work created by these applications across systems comprising multicore CPUs and multiple GPU accelerators. The tuning opportunities for a wavefront include controlling the amount of computation to be offloaded onto GPU accelerators, choosing the number of CPU and GPU threads to process tasks, tiling for both CPU and GPU memory structures, and trading redundant halo computation against communication for multiple GPUs. Our exhaustive search of the problem space shows that these parameters are very sensitive to the combination of architecture, wavefront instance, and problem size. We design and investigate a family of autotuning strategies, targeting single and multiple CPU + GPU systems, and both two- and three-dimensional wavefront instances. These yield an average of 87% of the performance found by offline exhaustive search, with up to 99% in some cases.
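The anti-diagonal sweep that defines the two-dimensional wavefront pattern can be sketched as follows. `cell` is a hypothetical per-cell function taking the north and west neighbour values; in a real heterogeneous implementation, the inner loop over one diagonal is what gets partitioned across CPU and GPU threads, since those cells are mutually independent:

```python
def wavefront(n, m, cell):
    """Fill an n x m grid where each cell depends on its north and west
    neighbours, sweeping anti-diagonals from the top-left corner."""
    grid = [[0.0] * m for _ in range(n)]
    for d in range(n + m - 1):                      # anti-diagonal index
        # All (i, j) with i + j == d are independent: parallelise here.
        for i in range(max(0, d - m + 1), min(n, d + 1)):
            j = d - i
            north = grid[i - 1][j] if i > 0 else 0.0
            west = grid[i][j - 1] if j > 0 else 0.0
            grid[i][j] = cell(i, j, north, west)
    return grid
```

With `cell = lambda i, j, north, west: max(north, west) + 1`, each cell ends up holding its own diagonal number plus one, which makes the dependency order easy to check.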
APA, Harvard, Vancouver, ISO, and other styles
28

Lu, Zonghao. "A case study about different network architectures in Federated Machine Learning." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-425193.

Full text
Abstract:
Modern artificial intelligence (AI) technology has been developing rapidly in recent years, and data is an important factor driving this development. With the growth of the mobile Internet, more and more data is generated in different fields every day, along with concerns about data sensitivity. As a significant part of personal privacy, personal data must be respected and protected. Federated learning (FL) is a machine learning technique that can protect privacy because it keeps everyone's data local. Much research has already confirmed that the bottleneck of federated learning is the communication between clients and servers, and different communication methods have different characteristics, resulting in differences in efficiency. We present a benchmark of our FL system using the HTTP and gRPC communication protocols, respectively, and show that the gRPC framework is faster and scales better than HTTP, mainly because of the different architectures and message-compaction strategies of the two protocols. In addition, we found that the system may crash as the load increases. A registration mechanism is proposed to deal with the problem of insufficient computing resources, and a new model update strategy is applied to make the training process finish in a shorter time.
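The abstract focuses on the transport layer, but the payload being exchanged in such a system is model weights. The standard server-side aggregation step, federated averaging (FedAvg), is only a few lines; this sketch assumes flat weight vectors and is not necessarily the exact strategy of the benchmarked system:

```python
def fed_avg(client_weights, client_sizes):
    """Server-side FedAvg: average the clients' model weight vectors,
    weighting each client by the size of its local dataset."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[k] * n for w, n in zip(client_weights, client_sizes)) / total
        for k in range(dim)
    ]
```

Each round, clients train locally, upload weights over HTTP or gRPC, and download the aggregate; since this exchange happens every round, the serialization and framing overhead of the protocol dominates, which is what the benchmark measures.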
APA, Harvard, Vancouver, ISO, and other styles
29

PULIGHEDDU, CORRADO. "Machine Learning-Powered Management Architectures for Edge Services in 5G Networks." Doctoral thesis, Politecnico di Torino, 2022. https://hdl.handle.net/11583/2973797.

Full text
APA, Harvard, Vancouver, ISO, and other styles
30

Markou, Markos N. "Models of novelty detection based on machine learning." Thesis, University of Exeter, 2005. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.426165.

Full text
APA, Harvard, Vancouver, ISO, and other styles
31

Shepherd, T. "Dynamical models and machine learning for supervised segmentation." Thesis, University College London (University of London), 2009. http://discovery.ucl.ac.uk/18729/.

Full text
Abstract:
This thesis is concerned with the problem of how to outline regions of interest in medical images when the boundaries are weak or ambiguous and the region shapes are irregular. The focus on machine learning and interactivity leads to a common theme: the need to balance conflicting requirements. First, any machine learning method must strike a balance between how much it can learn and how well it generalises. Second, interactive methods must balance minimal user demand with maximal user control. To address the problem of weak boundaries, methods of supervised texture classification are investigated that do not use explicit texture features. These methods enable prior knowledge about the image to benefit any segmentation framework. A chosen dynamic contour model, based on probabilistic boundary tracking, combines these image priors with efficient modes of interaction. We show the benefits of the texture classifiers over intensity- and gradient-based image models, in both classification and boundary extraction. To address the problem of irregular region shape, we devise a new type of statistical shape model (SSM) that does not use explicit boundary features or assume high-level similarity between region shapes. First, the models are used for shape discrimination, to constrain any segmentation framework by way of regularisation. Second, the SSMs are used for shape generation, allowing probabilistic segmentation frameworks to draw shapes from a prior distribution. The generative models also include novel methods to constrain shape generation according to information from both the image and user interactions. The shape models are first evaluated in terms of discrimination capability and shown to outperform other shape descriptors. Experiments also show that the shape models can benefit a standard type of segmentation algorithm by providing shape regularisers. We finally show how to exploit the shape models in supervised segmentation frameworks, and evaluate their benefits in user trials.
APA, Harvard, Vancouver, ISO, and other styles
32

Liu, Xiaoyang. "Machine Learning Models in Fullerene/Metallofullerene Chromatography Studies." Thesis, Virginia Tech, 2019. http://hdl.handle.net/10919/93737.

Full text
Abstract:
Machine learning methods are now extensively applied in various scientific research areas to build models. Unlike regular models, machine learning based models use a data-driven approach: machine learning algorithms can learn, from available data, knowledge that is otherwise hard to recognise. The data-driven approach enhances the role of algorithms and computers and thereby accelerates computation through alternative views. In this thesis, we explore the possibility of applying machine learning models to the prediction of chromatographic retention behaviors. Chromatographic separation is a key technique for the discovery and analysis of fullerenes. In previous studies, differential equation models have achieved great success in predicting chromatographic retention. However, most differential equation models require experimental measurements or theoretical computations for many parameters, which are not easy to obtain. Fullerenes/metallofullerenes are rigid, spherical molecules containing only carbon atoms, which makes predicting their chromatographic retention behaviors, as well as other properties, much simpler than for flexible molecules with more conformational variation. In this thesis, I propose that the polarizability of a fullerene molecule can be estimated directly from its structure. Structural motifs are used to simplify the model, and the models with motifs provide satisfying predictions. The data set contains 31947 isomers and their polarizability data and is split into a training set with 90% of the data points and a complementary testing set. In addition, a second testing set of large fullerene isomers is prepared and used to test whether a model trained on small fullerenes gives good predictions on large fullerenes.

Machine learning models can be applied in a wide range of areas, including scientific research. In this thesis, machine learning models are applied to predict the chromatography behaviors of fullerenes based on their molecular structures. Chromatography is a common technique for separating mixtures; the separation arises from differences in the interactions between molecules and a stationary phase. In real experiments, a mixture usually contains a large family of different compounds, and it requires much work and many resources to identify the target compound. Therefore, models are extremely important for studies of chromatography. Traditional models are built from physical principles and involve several parameters, which are measured experimentally or computed theoretically; both approaches are time consuming and not easy to carry out. For fullerenes, my previous studies have shown that the chromatography model can be simplified so that only one parameter, polarizability, is required. A machine learning approach is introduced to enhance the model by predicting the molecular polarizabilities of fullerenes from their structures, with the structure of a fullerene represented by several local structures. Several types of machine learning models are built and tested on our data set, and the results show that the neural network gives the best predictions.
APA, Harvard, Vancouver, ISO, and other styles
33

Gosch, Aron. "Exploration of 5G Traffic Models using Machine Learning." Thesis, Linköpings universitet, Databas och informationsteknik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-168160.

Full text
Abstract:
The Internet is a major communication tool that handles massive information exchanges, sees rapidly increasing usage, and offers an increasingly wide variety of services. In addition, the services themselves have highly varying quality-of-service (QoS) requirements, and network providers must take into account the frequent releases of new network standards like 5G. This has resulted in a significant need for new theoretical models that can capture different network traffic characteristics. Such models are important both for understanding the existing traffic in networks and for generating better synthetic traffic workloads, which can be used to evaluate future generations of network solutions under realistic workload patterns and a broad range of assumptions, including how the popularity of existing and future application classes may change over time. To better meet these changes, new flexible methods are required. In this thesis, a new framework aimed at analyzing large quantities of traffic data is developed and used to discover key characteristics of application behavior for IP network traffic. Traffic models are created by breaking down IP log traffic data into different abstraction layers with descriptive values. The aggregated statistics are then clustered using the K-means algorithm, which results in groups with closely related behaviors. Lastly, the model is evaluated with cluster analysis and three different machine learning algorithms that classify the network behavior of traffic flows. From the analysis framework, a set of observed traffic models with distinct behaviors is derived that may be used as building blocks for traffic simulations in the future. Based on the framework, we have seen that machine learning achieves high performance on the classification of network traffic, with a Multilayer Perceptron obtaining the best results. Furthermore, the study has produced a set of ten traffic models that have been demonstrated to be able to reconstruct traffic for various network entities. (Due to COVID-19, the presentation was given over Zoom.)
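The clustering step described above can be sketched with a stdlib implementation of Lloyd's K-means. The feature vectors here are hypothetical stand-ins for the aggregated per-flow statistics (e.g. mean packet size, flow duration):

```python
import math
import random

def kmeans(points, k, iters=50, seed=0):
    """Lloyd's algorithm on lists of feature vectors (e.g. per-flow statistics)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)                 # initialise from the data
    for _ in range(iters):
        # Assignment step: each point joins its nearest center's cluster.
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[idx].append(p)
        # Update step: each center moves to its cluster's mean.
        centers = [
            [sum(xs) / len(c) for xs in zip(*c)] if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers
```

Each resulting center summarises one group of closely related traffic behaviors; the classifiers mentioned in the abstract can then be trained to map new flows to these groups.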
APA, Harvard, Vancouver, ISO, and other styles
34

Awaysheh, Abdullah Mamdouh. "Data Standardization and Machine Learning Models for Histopathology." Diss., Virginia Tech, 2017. http://hdl.handle.net/10919/85040.

Full text
Abstract:
Machine learning can provide insight and support for a variety of decisions. In some areas of medicine, decision-support models are capable of assisting healthcare practitioners in making accurate diagnoses. In this work we explored the application of these techniques to distinguish between two diseases in veterinary medicine: inflammatory bowel disease (IBD) and alimentary lymphoma (ALA). Both disorders are common gastrointestinal (GI) diseases in humans and animals that share very similar clinical and pathological outcomes. Because of these similarities, distinguishing between these two diseases can be challenging. In order to identify patterns that may help with this differentiation, we retrospectively mined medical records from dogs and cats with histopathologically diagnosed GI diseases. Since the pathology report is the key conveyer of this information in the medical records, our first study focused on its information structure. Other groups have had a similar interest: in 2008, to help ensure consistent reporting, the World Small Animal Veterinary Association (WSAVA) GI International Standardization Group proposed standards for recording histopathological findings (HF) from GI biopsy samples. In our work, we extend the WSAVA efforts and propose an information model (composed of an information structure and terminology mapped to the Systematized Nomenclature of Medicine - Clinical Terms) to be used when recording histopathological diagnoses (HDX; one or more HF from one or more tissues). Next, our aim was to identify free-text HF not currently expressed in the WSAVA format that may provide evidence for distinguishing between IBD and ALA in cats. As part of this work, we hypothesized that WSAVA-based structured reports would yield higher classification accuracy of GI disorders than the unstructured free-text format. We trained machine learning models on 60 structured and, independently, 60 unstructured reports. Results show that unstructured information-based models using two machine learning algorithms achieved higher accuracy in predicting the diagnosis than the structured information-based models, and some novel free-text features were identified for possible inclusion in the WSAVA reports. In our third study, we tested the use of machine learning algorithms to differentiate between IBD and ALA using complete blood count and serum chemistry data. Three models (using naïve Bayes, neural networks, and C4.5 decision trees) were trained and tested on laboratory results for 40 Normal, 40 IBD, and 40 ALA cats. Diagnostic models achieved classification sensitivity between 63% and 71%, with naïve Bayes and neural networks being superior. These models can provide another non-invasive diagnostic tool to assist with differentiating between IBD and ALA, and between diseased and non-diseased cats. We believe that relying on our information model for histopathological reporting can lead to a more complete, consistent, and computable knowledge base in which machine learning algorithms can more efficiently identify these and other disease patterns.
Ph.D.
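The third study's naïve Bayes classifier over continuous laboratory values corresponds to Gaussian naïve Bayes. A minimal sketch follows; the class labels match the abstract, but the single-feature values are made up for illustration and do not come from the thesis:

```python
import math

def gaussian_nb_fit(X, y):
    """Per-class feature means and variances, plus class priors."""
    model = {}
    for c in set(y):
        rows = [x for x, yi in zip(X, y) if yi == c]
        means = [sum(col) / len(rows) for col in zip(*rows)]
        varis = [max(sum((v - m) ** 2 for v in col) / len(rows), 1e-9)
                 for col, m in zip(zip(*rows), means)]
        model[c] = (means, varis, len(rows) / len(X))
    return model

def gaussian_nb_predict(model, x):
    """Pick the class maximising log prior + sum of Gaussian log-likelihoods."""
    def log_lik(c):
        means, varis, prior = model[c]
        ll = math.log(prior)
        for v, m, s2 in zip(x, means, varis):
            ll += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return ll
    return max(model, key=log_lik)
```

In the real study the features would be the complete blood count and serum chemistry values, and the "naïve" conditional-independence assumption across those analytes is what keeps the model tractable on only 120 cats.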
APA, Harvard, Vancouver, ISO, and other styles
35

Aryasomayajula, Naga Srinivasa Baradwaj. "Machine Learning Models for Categorizing Privacy Policy Text." University of Cincinnati / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1535633397362514.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

IGUIDER, WALID. "Machine Learning Models for Sports Remote Coaching Platforms." Doctoral thesis, Università degli Studi di Cagliari, 2022. http://hdl.handle.net/11584/326530.

Full text
Abstract:
Offering timely support to users in eCoaching systems is a crucial factor in keeping them engaged. However, coaches usually follow many users, so it is hard to prioritize those they should interact with first. Timeliness is especially needed when a lack of support might have health implications. Thanks to the data provided by U4FIT (an eCoaching platform for runners described in Chapter 1) and the rise of high-performance computing, Artificial Intelligence can turn such challenges into unparalleled opportunities. One of its sub-fields, Machine Learning, enables machines to receive data and learn for themselves without being programmed with rules. Bringing this intelligent support to the coaching domain has many advantages, such as reducing coaches' workload and encouraging sportspeople to keep their exercise routine. The main focus of this thesis is the design, implementation, and evaluation of Machine Learning models in the context of online coaching platforms. On the one hand, our goal is to provide coaches with dashboards that summarize the training behavior of the sportspeople they follow, together with a list of sportspeople ranked by the support they need, so that coaches can interact with them in a timely manner. On the other hand, we want to guarantee fair exposure in the ranking, to ensure that sportspeople of different genders have equal opportunities to get support. Past research in this field often relied on statistical processes that are hardly applicable at a large scale. Our studies explore the opportunities and challenges introduced by Machine Learning for the above goals, a relevant and timely topic in the literature. Extensive experiments support our work, revealing a clear opportunity to combine human and machine sensing for researchers interested in online coaching. Our findings illustrate the feasibility of designing, assessing, and deploying Machine Learning models for workout quality prediction and sportspeople dropout prevention, in addition to the design and implementation of dashboards providing trainers with actionable knowledge about the sportspeople they follow. Our results provide guidelines on model motivation, model design, data collection, and analysis techniques for the applicable scenarios above. Researchers can use our findings to improve data collection on eCoaching platforms, reduce bias in rankings, and increase model effectiveness and reliability, among other things.
APA, Harvard, Vancouver, ISO, and other styles
37

Urbanczyk, Martin. "Webový simulátor fotbalových lig a turnajů." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2019. http://www.nusl.cz/ntk/nusl-403171.

Full text
Abstract:
This thesis concerns the creation of a simulator of football leagues and championships. I studied the domain of football competitions and their systems, as well as the basics of machine learning. I also analysed similar existing solutions and took inspiration from them for my own design. I then designed the overall structure of the simulator and all of its key parts, after which the simulator was implemented and tested. The application allows simulating the top five competitions in the UEFA club coefficient ranking.
APA, Harvard, Vancouver, ISO, and other styles
38

DiTomaso, Dominic F. "Reactive and Proactive Fault-Tolerant Network-on-Chip Architectures using Machine Learning." Ohio University / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1439478822.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

Dhakal, Parashar. "Novel Architectures for Human Voice and Environmental Sound Recognitionusing Machine Learning Algorithms." University of Toledo / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=toledo1531349806743278.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Rado, Omesaad A. M. "Contributions to evaluation of machine learning models. Applicability domain of classification models." Thesis, University of Bradford, 2019. http://hdl.handle.net/10454/18447.

Full text
Abstract:
Artificial intelligence (AI) and machine learning (ML) present some application opportunities and challenges that can be framed as learning problems. The performance of machine learning models depends on algorithms and the data. Moreover, learning algorithms create a model of reality through learning and testing with data processes, and their performance shows an agreement degree of their assumed model with reality. ML algorithms have been successfully used in numerous classification problems. With the developing popularity of using ML models for many purposes in different domains, the validation of such predictive models is currently required more formally. Traditionally, there are many studies related to model evaluation, robustness, reliability, and the quality of the data and the data-driven models. However, those studies do not consider the concept of the applicability domain (AD) yet. The issue is that the AD is not often well defined, or it is not defined at all in many fields. This work investigates the robustness of ML classification models from the applicability domain perspective. A standard definition of applicability domain regards the spaces in which the model provides results with specific reliability. The main aim of this study is to investigate the connection between the applicability domain approach and the classification model performance. We are examining the usefulness of assessing the AD for the classification model, i.e. reliability, reuse, robustness of classifiers. The work is implemented using three approaches, and these approaches are conducted in three various attempts: firstly, assessing the applicability domain for the classification model; secondly, investigating the robustness of the classification model based on the applicability domain approach; thirdly, selecting an optimal model using Pareto optimality. 
The experiments in this work are illustrated by considering different machine learning algorithms for binary and multi-class classification on healthcare datasets from public benchmark data repositories. In the first approach, the decision tree (DT) algorithm is used for classification, with a feature selection method applied to choose features. The obtained classifiers are reused in the third approach for model selection using Pareto optimality. The second approach is implemented in three steps: building the classification model, generating synthetic data, and evaluating the obtained results. The results provide an understanding of how the proposed approach can help define a model's robustness and applicability domain, so as to provide reliable outputs. These approaches open opportunities for classification data and model management. The proposed algorithms are evaluated through a set of experiments on the classification accuracy of instances that fall within the domain of the model. For the first approach, considering all features, the highest accuracy obtained is 0.98, with a thresholds average of 0.34, on the Breast Cancer dataset; after applying the recursive feature elimination (RFE) method, the accuracy is 0.96 with a thresholds average of 0.27. For the robustness of the classification model based on the applicability domain approach, the minimum accuracy is 0.62 on the Indian Liver Patient dataset at r=0.10, and the maximum accuracy is 0.99 on the Thyroid dataset at r=0.10. For the selection of an optimal model using Pareto optimality, the optimally selected classifier gives an accuracy of 0.94 with a thresholds average of 0.35. This research investigates critical aspects of the applicability domain as related to the robustness of ML classification algorithms. The performance of machine learning techniques depends on the degree to which the model's predictions are reliable. In the literature, the robustness of an ML model can be defined as its ability to keep the testing error close to the training error, and related properties describe the stability of the model's performance when tested on new datasets. In conclusion, this thesis introduced the concept of the applicability domain for classifiers and tested its use in case studies on health-related public benchmark datasets.
Ministry of Higher Education in Libya
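The abstract does not spell out the thesis's exact applicability-domain definition. One simple distance-based variant, consistent with the idea of a radius r around the training data, looks like this (an illustrative sketch, not the thesis's criterion):

```python
import math

def in_domain(x, X_train, r):
    """A point is inside the applicability domain if its distance to the
    nearest training point is at most r (one common AD definition)."""
    return min(math.dist(x, t) for t in X_train) <= r

def domain_accuracy(pairs, X_train, r):
    """Accuracy counted only over test points that fall inside the domain.
    `pairs` is a list of (features, prediction_was_correct) tuples."""
    inside = [ok for x, ok in pairs if in_domain(x, X_train, r)]
    return sum(inside) / len(inside) if inside else None
```

Reporting accuracy only on in-domain instances, as a function of r, is the kind of analysis behind figures such as "accuracy 0.62 at r=0.10": shrinking r restricts predictions to regions where the model is better supported by training data.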
APA, Harvard, Vancouver, ISO, and other styles
41

Tuovinen, L. (Lauri). "From machine learning to learning with machines:remodeling the knowledge discovery process." Doctoral thesis, Oulun yliopisto, 2014. http://urn.fi/urn:isbn:9789526205243.

Full text
Abstract:
Abstract Knowledge discovery (KD) technology is used to extract knowledge from large quantities of digital data in an automated fashion. The established process model represents the KD process in a linear and technology-centered manner, as a sequence of transformations that refine raw data into more and more abstract and distilled representations. Any actual KD process, however, has aspects that are not adequately covered by this model. In particular, some of the most important actors in the process are not technological but human, and the operations associated with these actors are interactive rather than sequential in nature. This thesis proposes an augmentation of the established model that addresses this neglected dimension of the KD process. The proposed process model is composed of three sub-models: a data model, a workflow model, and an architectural model. Each sub-model views the KD process from a different angle: the data model examines the process from the perspective of different states of data and transformations that convert data from one state to another, the workflow model describes the actors of the process and the interactions between them, and the architectural model guides the design of software for the execution of the process. For each of the sub-models, the thesis first defines a set of requirements, then presents the solution designed to satisfy the requirements, and finally, re-examines the requirements to show how they are accounted for by the solution. The principal contribution of the thesis is a broader perspective on the KD process than what is currently the mainstream view. The augmented KD process model proposed by the thesis makes use of the established model, but expands it by gathering data management and knowledge representation, KD workflow and software architecture under a single unified model. 
Furthermore, the proposed model considers issues that are usually either overlooked or treated as separate from the KD process, such as the philosophical aspect of KD. The thesis also discusses a number of technical solutions to individual sub-problems of the KD process, including two software frameworks and four case-study applications that serve as concrete implementations and illustrations of several key features of the proposed process model.
APA, Harvard, Vancouver, ISO, and other styles
42

Vantzelfde, Nathan Hans. "Prognostic models for mesothelioma : variable selection and machine learning." Thesis, Massachusetts Institute of Technology, 2005. http://hdl.handle.net/1721.1/33370.

Full text
Abstract:
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2005.<br>Includes bibliographical references (leaves 103-107).<br>Malignant pleural mesothelioma is a rare and lethal form of cancer affecting the external lining of the lungs. Extrapleural pneumonectomy (EPP), which involves the removal of the affected lung, is one of the few treatments that has been shown to have some effectiveness in treating the disease [39], but the procedure carries with it a high risk of mortality and morbidity [8]. This paper is concerned with building models that use gene expression levels to predict patient survival following EPP; these models could potentially be used to guide patient treatment. A study by Gordon et al. built a predictor based on ratios of gene expression levels that was 88% accurate on a set of 29 independent test samples, in terms of classifying whether patients survived for a shorter or longer time than the median survival [15]. These results were recreated both on the original data set used by Gordon et al. and on a newer data set that contained the same samples but was generated using newer software. The predictors were evaluated using N-fold cross validation. In addition, other methods of variable selection and machine learning were investigated to build different types of predictive models. These analyses used a random training set drawn from the newer data set. The models were evaluated using N-fold cross validation, and the best of each of the four main types of models - decision trees, logistic regression, artificial neural networks, and support vector machines - were tested using a small set of samples excluded from the training set. Of these four models, the neural network with eight hidden neurons and weight decay regularization performed the best, achieving a zero cross validation error rate and, on the test set, 71% accuracy, an ROC area of .67 and a logrank p value of .219.
The support vector machine model with a linear kernel also had zero cross validation error and, on the test set, 71% accuracy and an ROC area of .67, but had a higher logrank p value of .515. Both had a lower cross validation error than the ratio-based predictors of Gordon et al., which had an N-fold cross validation error rate of 35%; however, these results may not be comparable because the neural network and support vector machine used a different training set than the Gordon et al. study. Regression analysis was also performed; the best neural network model was off by an average of 4.6 months on the six test samples. The method of variable selection based on the signal-to-noise ratio of genes, originally used by Golub et al., proved more effective on the randomly generated training set than the method involving Student's t tests and fold change used by Gordon et al. Ultimately, however, these models will need to be evaluated using a large independent test set.<br>by Nathan Hans Vantzelfde.<br>M.Eng.
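The Golub-style signal-to-noise variable selection mentioned in this abstract can be sketched as follows; the ratio per gene is (mean₁ − mean₀) / (std₁ + std₀) between the two outcome classes. This is an illustrative sketch on a synthetic expression matrix, not the thesis's data or code:

```python
import numpy as np

def signal_to_noise(X, y):
    """Golub-style signal-to-noise ratio per gene:
    (mean of class 1 - mean of class 0) / (std of class 1 + std of class 0)."""
    X0, X1 = X[y == 0], X[y == 1]
    return (X1.mean(axis=0) - X0.mean(axis=0)) / (X1.std(axis=0) + X0.std(axis=0))

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 500))     # 30 samples, 500 hypothetical genes
y = rng.integers(0, 2, size=30)    # two survival classes
X[y == 1, :5] += 2.0               # make the first 5 genes informative

s2n = signal_to_noise(X, y)
# Keep the 10 genes most discriminative in either direction
top_genes = np.argsort(np.abs(s2n))[::-1][:10]
```

The selected `top_genes` would then feed a downstream classifier, which is itself scored with N-fold cross validation as described above.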
APA, Harvard, Vancouver, ISO, and other styles
43

Ebbesson, Markus. "Mail Volume Forecasting: an Evaluation of Machine Learning Models." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-301333.

Full text
Abstract:
This study applies machine learning models to mail volumes with the goal of making forecasts accurate enough to minimise the problem of under- and overstaffing at a mail operating company. The most suitable model for this context is identified by evaluating input features and three different models: Auto Regression (AR), Random Forest (RF) and a Neural Network (NN) (Multilayer Perceptron (MLP)). The results show greatly improved forecasting accuracy compared to the model currently in use. The RF model is recommended as the most practically applicable model for the company, although the RF and NN models provide similar accuracy. This study serves as a first step, since the field lacks previous research on producing mail volume forecasts with machine learning. The findings are expected to be applicable to mail operators throughout Sweden and the world.
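The forecasting setup described above can be sketched with a Random Forest trained on lagged volumes. Everything here is an assumption for illustration: a synthetic weekly-seasonal series, seven lag features, and a chronological train/test split stand in for the company's data and feature engineering:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
# Synthetic daily mail volumes: weekly cycle plus noise
t = np.arange(400)
volume = 1000 + 200 * np.sin(2 * np.pi * t / 7) + rng.normal(0, 30, size=t.size)

# Predict today's volume from the previous 7 days
lags = 7
X = np.array([volume[i - lags:i] for i in range(lags, len(volume))])
y = volume[lags:]

# Chronological split: train on the past, evaluate on the future
split = 300
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X[:split], y[:split])
pred = model.predict(X[split:])
mae = np.mean(np.abs(pred - y[split:]))
```

An AR baseline would use the same lag matrix with a linear model, which makes the three-model comparison straightforward to set up.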
APA, Harvard, Vancouver, ISO, and other styles
44

Wissel, Benjamin D. "Generalizability of Electronic Health Record-Based Machine Learning Models." University of Cincinnati / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1627659161796896.

Full text
APA, Harvard, Vancouver, ISO, and other styles
45

Pirgul, Khalid, and Jonathan Svensson. "Verification of Powertrain Simulation Models Using Machine Learning Methods." Thesis, Linköpings universitet, Fordonssystem, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-166290.

Full text
Abstract:
This thesis provides insight into the verification of a quasi-static simulation model based on the estimation of fuel consumption using machine learning methods. Real test data for traditional verification is not always available. Therefore, a methodology was developed consisting of verification analysis based on estimation methods, together with a process for improving the quasi-static simulation model. The modelling work mainly consists of designing and implementing a gear selection strategy, together with the gearbox itself, for a dual clutch transmission dedicated to hybrid applications. The purpose of the simulation model is to replicate the fuel consumption behaviour seen in vehicle data from performed tests. To verify the simulation results, a so-called ranking model is developed. The ranking model estimates a fuel consumption reference for each time step of the WLTC homologation drive cycle using multiple linear regression. The results of the simulation model are verified, and a scoring system is used to indicate the performance of the simulation model, based on the correlation between estimated and simulated fuel consumption data. The results show that multiple linear regression can be an appropriate approach for verifying simulation models. The normalised cross-correlation power is also examined and turns out to be a useful measure of correlation between signals that include a lag. The developed ranking model is a fast first step in evaluating a new vehicle configuration concept.
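The normalised cross-correlation measure for comparing two signals that may include a lag can be sketched generically as follows; this is a standard NumPy formulation, not the thesis's implementation, and the sine-wave demo signals are placeholders:

```python
import numpy as np

def normalized_cross_correlation(a, b):
    """Return the peak of the normalised cross-correlation between two
    equal-length signals and the lag (in samples) at which it occurs."""
    a = (a - a.mean()) / (a.std() * len(a))   # so the zero-lag value is Pearson r
    b = (b - b.mean()) / b.std()
    corr = np.correlate(a, b, mode="full")
    lag = int(np.argmax(corr)) - (len(b) - 1)  # positive: a lags b
    return corr.max(), lag

# Demo: a is b delayed by five samples
x = np.sin(0.1 * np.arange(300))
a, b = x[:200], x[5:205]
peak, lag = normalized_cross_correlation(a, b)
```

A peak near 1 at a nonzero lag indicates that the estimated and simulated signals agree well but are shifted in time, which a plain pointwise correlation would miss.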
APA, Harvard, Vancouver, ISO, and other styles
46

Elf, Sebastian, and Christopher Öqvist. "Comparison of supervised machine learning models for predicting TV-ratings." Thesis, KTH, Hälsoinformatik och logistik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-278054.

Full text
Abstract:
Manual prediction of TV-ratings for programme and advertisement placement is time-consuming, and errors can be costly. This thesis evaluates different supervised machine learning models to see whether the process of predicting TV-ratings can be automated with better accuracy than the manual process. The results show that of the two tested supervised machine learning models, Random Forest and Support Vector Regression, Random Forest was the better model. Random Forest performed better on both measures used to compare the models: mean absolute error and root mean squared error. The conclusion is that Random Forest, evaluated with the dataset and methods used, is not accurate enough to replace the manual process. Even so, it could still potentially be used as part of the manual process to ease the workload of the employees. Keywords: Machine learning, supervised learning, TV-rating, Support Vector Regression, Random Forest.
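A comparison of the two model families on MAE and RMSE, as described above, can be sketched with scikit-learn. The feature matrix here is synthetic and the default hyperparameters are assumptions, since the thesis's dataset and tuning are not reproduced:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVR
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 6))   # hypothetical programme/schedule features
y = X[:, 0] * 3 + np.sin(X[:, 1]) + rng.normal(0, 0.3, size=500)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

results = {}
for name, model in [("Random Forest", RandomForestRegressor(random_state=0)),
                    ("SVR", SVR())]:
    pred = model.fit(X_tr, y_tr).predict(X_te)
    results[name] = (mean_absolute_error(y_te, pred),
                     mean_squared_error(y_te, pred) ** 0.5)  # (MAE, RMSE)
```

Comparing on both metrics is useful because RMSE penalises large misses more heavily than MAE, which matters when a single badly mispredicted programme drives staffing or ad-placement decisions.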
APA, Harvard, Vancouver, ISO, and other styles
47

Lanka, Venkata Raghava Ravi Teja Lanka. "VEHICLE RESPONSE PREDICTION USING PHYSICAL AND MACHINE LEARNING MODELS." The Ohio State University, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=osu1511891682062084.

Full text
APA, Harvard, Vancouver, ISO, and other styles
48

Hugo, Linsey Sledge. "A Comparison of Machine Learning Models Predicting Student Employment." Ohio University / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1544127100472053.

Full text
APA, Harvard, Vancouver, ISO, and other styles
49

Snelson, Edward Lloyd. "Flexible and efficient Gaussian process models for machine learning." Thesis, University College London (University of London), 2007. http://discovery.ucl.ac.uk/1445855/.

Full text
Abstract:
Gaussian process (GP) models are widely used to perform Bayesian nonlinear regression and classification tasks that are central to many machine learning problems. A GP is nonparametric, meaning that the complexity of the model grows as more data points are received. Another attractive feature is the behaviour of the error bars: they naturally grow in regions away from training data where we have high uncertainty about the interpolating function. In their standard form, GPs have several limitations, which can be divided into two broad categories: computational difficulties for large data sets, and restrictive modelling assumptions for complex data sets. This thesis addresses various aspects of both of these problems. The training cost for a GP is O(N³), where N is the number of training data points, due to the inversion of the N × N covariance matrix. In this thesis we develop several new techniques to reduce this complexity to O(NM²), where M is a user-chosen number much smaller than N. The sparse approximation we use is based on a set of M 'pseudo-inputs', which are optimised together with hyperparameters at training time. We develop a further approximation based on clustering inputs that can be seen as a mixture of local and global approximations. Standard GPs assume a uniform noise variance. We use our sparse approximation described above as a way of relaxing this assumption. By making a modification of the sparse covariance function, we can model input-dependent noise. To handle high-dimensional data sets we use supervised linear dimensionality reduction. As another extension of the standard GP, we relax the Gaussianity assumption of the process by learning a nonlinear transformation of the output space. All these techniques further increase the applicability of GPs to real complex data sets. We present empirical comparisons of our algorithms with various competing techniques, and suggest problem-dependent strategies to follow in practice.
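The pseudo-input idea behind the O(NM²) reduction can be illustrated with a small subset-of-regressors sketch. This simplifies the thesis's method in two labelled ways: the M inducing points are a fixed grid rather than optimised, and only the predictive mean is computed:

```python
import numpy as np

def rbf(A, B, ls=0.5):
    """Squared-exponential kernel between two point sets."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

# Toy 1-D regression problem with N = 500 training points
rng = np.random.default_rng(3)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(2 * X[:, 0]) + rng.normal(0, 0.1, size=500)

# M pseudo-inputs: a fixed grid here; optimised with the
# hyperparameters in the thesis
M, noise = 20, 0.1
Z = np.linspace(-3, 3, M)[:, None]

Kmm = rbf(Z, Z) + 1e-6 * np.eye(M)   # M x M, cheap
Kmn = rbf(Z, X)                      # M x N cross-covariance
# Subset-of-regressors posterior mean: the only N-sized work is
# forming Kmn @ Kmn.T and Kmn @ y, hence O(N M^2) training cost
A = noise**2 * Kmm + Kmn @ Kmn.T
w = np.linalg.solve(A, Kmn @ y)

X_star = np.linspace(-3, 3, 100)[:, None]
mean = rbf(X_star, Z) @ w            # approximate GP predictive mean
```

The linear solve involves only an M × M system, so moving from N = 500 to N = 500,000 changes the cost roughly linearly rather than cubically.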
APA, Harvard, Vancouver, ISO, and other styles
50

Zeng, Haoyang Ph D. Massachusetts Institute of Technology. "Machine learning models for functional genomics and therapeutic design." Thesis, Massachusetts Institute of Technology, 2019. https://hdl.handle.net/1721.1/122689.

Full text
Abstract:
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.<br>Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019<br>Cataloged from student-submitted PDF version of thesis.<br>Includes bibliographical references (pages 213-230).<br>Due to the limited size of training data available, machine learning models for biology have remained rudimentary and inaccurate despite the significant advances in machine learning research. With the recent advent of high-throughput sequencing technology, an exponentially growing number of genomic and proteomic datasets have been generated. These large-scale datasets admit the training of high-capacity machine learning models to characterize sophisticated features and produce accurate predictions on unseen examples. In this thesis, we attempt to develop advanced machine learning models for functional genomics and therapeutic design, two areas with ample data deposited in public databases and tremendous clinical implications. The shared theme of these models is to learn how the composition of a biological sequence encodes a functional phenotype and then leverage such knowledge to provide insight for target discovery and therapeutic design.<br>First, we design three machine learning models that predict transcription factor binding and DNA methylation, two fundamental epigenetic phenotypes closely tied to gene regulation, from DNA sequence alone. We show that these epigenetic phenotypes can be well predicted from the sequence context. Moreover, the predicted change in phenotype between the reference and alternate allele of a genetic variant accurately reflects its functional impact and improves the identification of regulatory variants causal for complex diseases.
Second, we devise two machine learning models that improve the prediction of peptides displayed by the major histocompatibility complex (MHC) on the cell surface. Computational modeling of peptide display by MHC is central to the design of peptide-based therapeutics.<br>Our first machine learning model introduces the capacity to quantify uncertainty in the computational prediction and proposes a new metric for peptide prioritization that reduces false positives in high-affinity peptide design. The second model improves the state-of-the-art performance in MHC-ligand prediction by employing a deep language model to learn the sequence determinants for auxiliary processes in MHC-ligand selection, such as proteasome cleavage, that are omitted by existing methods due to the lack of labeled data. Third, we develop machine learning frameworks to model the enrichment of an antibody sequence in phage-panning experiments against a target antigen. We show that antibodies with low specificity can be reduced by a computational procedure using machine learning models trained for multiple targets. Moreover, machine learning can help to design novel antibody sequences with improved affinity.<br>by Haoyang Zeng<br>Ph. D.<br>Ph.D. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science
APA, Harvard, Vancouver, ISO, and other styles