To see other types of publications on this topic, follow the link: Machine Learning (ML) model.

Dissertations / Theses on the topic 'Machine Learning (ML) model'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the top 50 dissertations / theses for your research on the topic 'Machine Learning (ML) model.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse dissertations / theses in a wide variety of disciplines and organise your bibliography correctly.

1

John, Meenu Mary. "Design Methods and Processes for ML/DL models." Licentiate thesis, Malmö universitet, Institutionen för datavetenskap och medieteknik (DVMT), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-45026.

Full text
Abstract:
Context: With the advent of Machine Learning (ML) and especially Deep Learning (DL) technology, companies are increasingly using Artificial Intelligence (AI) in systems, along with electronics and software. Nevertheless, the end-to-end process of developing, deploying and evolving ML and DL models in companies brings challenges related to the design and scaling of these models. For example, access to and availability of data is often challenging, and activities such as collecting, cleaning, preprocessing, and storing data, as well as training, deploying and monitoring the model(s), are complex. Regardless of the level of expertise and/or access to data scientists, companies in the embedded systems domain struggle to build high-performing models due to a lack of established and systematic design methods and processes. Objective: The overall objective is to establish systematic and structured design methods and processes for the end-to-end process of developing, deploying and successfully evolving ML/DL models. Method: To achieve the objective, we conducted our research in close collaboration with companies in the embedded systems domain using different empirical research methods such as case study, action research and literature review. Results and Conclusions: This research provides six main results. First, it identifies the activities that companies undertake in parallel to develop, deploy and evolve ML/DL models, and the challenges associated with them. Second, it presents a conceptual framework for the continuous delivery of ML/DL models to accelerate AI-driven business in companies. Third, it presents a framework based on current literature to accelerate the end-to-end deployment process and advance knowledge on how to integrate, deploy and operationalize ML/DL models. Fourth, it develops a generic framework with five architectural alternatives for deploying ML/DL models at the edge. These architectural alternatives range from a centralized architecture that prioritizes (re)training in the cloud to a decentralized architecture that prioritizes (re)training at the edge. Fifth, it identifies key factors to help companies decide which architecture to choose for deploying ML/DL models. Finally, it explores how MLOps, as a practice that brings together data science teams and operations, ensures the continuous delivery and evolution of models.
2

Garg, Anushka. "Comparing Machine Learning Algorithms and Feature Selection Techniques to Predict Undesired Behavior in Business Processes and Study of Auto ML Frameworks." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-285559.

Full text
Abstract:
In recent years, the scope of Machine Learning algorithms and techniques has been expanding in every industry (for example, recommendation systems, user behavior analytics, financial applications and many more). In practice, they play an important role in harnessing the power of the vast data we currently generate on a daily basis in our digital world. In this study, we present a comprehensive comparison of different supervised Machine Learning algorithms and feature selection techniques to build the best predictive model as output. This predictive model helps companies predict unwanted behavior in their business processes. In addition, we have researched the automation of all the steps involved (from understanding data to implementing models) in the complete Machine Learning pipeline, also known as AutoML, and provide a comprehensive survey of the various frameworks introduced in this domain. These frameworks were introduced to solve the problem of CASH (Combined Algorithm Selection and Hyperparameter optimization), which is essentially the automation of the various pipeline stages involved in building a Machine Learning predictive model.
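As a loose illustration of the CASH problem mentioned above (an illustration only, not code from the thesis), scikit-learn's GridSearchCV can jointly search over candidate algorithms and their hyperparameters; the pipeline step name "clf" and the parameter grids are assumptions:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import Pipeline

    X, y = make_classification(n_samples=500, random_state=0)

    # One grid entry per candidate algorithm: searching over the "clf" step
    # approximates combined algorithm selection and hyperparameter optimization.
    pipe = Pipeline([("clf", LogisticRegression())])
    param_grid = [
        {"clf": [LogisticRegression(max_iter=1000)], "clf__C": [0.1, 1.0, 10.0]},
        {"clf": [RandomForestClassifier(random_state=0)], "clf__n_estimators": [50, 200]},
    ]
    search = GridSearchCV(pipe, param_grid, cv=5)
    search.fit(X, y)
    print(search.best_params_)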
3

Appelstål, Michael. "Multimodal Model for Construction Site Aversion Classification." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-421011.

Full text
Abstract:
Aversion on construction sites can be everything from missing material, fire hazards, or insufficient cleaning. These aversions appear very often on construction sites, and the construction company needs to report and take care of them in order for the site to run correctly. The reports consist of an image of the aversion and a text describing the aversion. Report categorization is currently done manually, which is both time- and cost-ineffective. The task for this thesis was to implement and evaluate an automatic multimodal machine learning classifier for the reported aversions that utilized both the image and text data from the reports. The model presented is a late-fusion model consisting of a Swedish BERT text classifier and a VGG16 for image classification. The results showed that an automated classifier is feasible for this task and could be used in real life to make the classification task more time- and cost-efficient. The model scored a 66.2% accuracy and 89.7% top-5 accuracy on the task, and the experiments revealed some areas of improvement on the data and model that could be further explored to potentially improve the performance.
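A minimal late-fusion sketch in the spirit of the model described; the 768-d BERT sentence vector and 4096-d VGG16 feature sizes are standard, but the head dimensions and class count are assumptions, and this is not the thesis code:

    import torch
    import torch.nn as nn

    class LateFusionClassifier(nn.Module):
        """Fuses text and image features extracted by pretrained backbones."""
        def __init__(self, text_dim=768, image_dim=4096, n_classes=10):
            super().__init__()
            self.text_head = nn.Linear(text_dim, 256)    # e.g. Swedish BERT [CLS] vector
            self.image_head = nn.Linear(image_dim, 256)  # e.g. VGG16 penultimate layer
            self.classifier = nn.Linear(512, n_classes)

        def forward(self, text_feat, image_feat):
            fused = torch.cat([torch.relu(self.text_head(text_feat)),
                               torch.relu(self.image_head(image_feat))], dim=1)
            return self.classifier(fused)

    model = LateFusionClassifier()
    logits = model(torch.randn(2, 768), torch.randn(2, 4096))  # dummy features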
4

Hellberg, Johan, and Kasper Johansson. "Building Models for Prediction and Forecasting of Service Quality." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-295617.

Full text
Abstract:
In networked systems engineering, operational data gathered from sensors or logs can be used to build data-driven functions for performance prediction, anomaly detection, and other operational tasks [1]. Future telecom services will share a common communication and processing infrastructure in order to achieve cost-efficient and robust operation. A critical issue will be to ensure service quality, whereby different services have very different requirements. Thanks to recent advances in computing and networking technologies, we are able to collect and process measurements from networking and computing devices in order to predict and forecast certain service qualities, such as video streaming or data stores. In this paper we examine these techniques, which are based on statistical learning methods. In particular, we analyze traces from testbed measurements and build predictive models. A detailed description of the testbed, which is located at KTH, is given in Section II, as well as in [2]. (Bachelor's thesis in electrical engineering, 2020, KTH, Stockholm.)
5

Hallberg, Jesper. "Searching for the charged Higgs boson in the tau nu analysis using Boosted Decision Trees." Thesis, Uppsala universitet, Högenergifysik, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-301351.

Full text
Abstract:
This thesis implements a multivariate analysis in the current cut-based search for charged Higgs bosons, which are new scalar particles predicted by several extensions of the Standard Model. Heavy charged Higgs bosons (mH± > mtop) produced in association with a top quark and decaying via H± → τν are considered. The final state contains a hadronic τ decay, missing transverse energy and a hadronically decaying top quark. This study is based on Monte Carlo samples simulated at a center-of-mass energy of √s = 13 TeV for signal and backgrounds. The figure of merit used to measure the improvement of the new method with respect to the old analysis is the separation between the signal and background distributions. Four mass points (mH± = 200, 400, 600, 1000 GeV) are considered, and an increase in separation ranging from 2.6% (1000 GeV) to 29.2% (200 GeV) compared to the current cut-based analysis is found.
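For readers unfamiliar with the separation figure of merit, a small sketch with a scikit-learn BDT on toy data; the binned separation formula is the standard TMVA-style one, but everything else here (data, settings) is an assumption:

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier

    # Toy stand-in for simulated signal/background events (not ATLAS data).
    X, y = make_classification(n_samples=4000, n_informative=6, random_state=1)
    bdt = GradientBoostingClassifier(random_state=1).fit(X, y)
    scores = bdt.predict_proba(X)[:, 1]

    # Binned separation <S^2> = 1/2 * sum_bins (s - b)^2 / (s + b),
    # with density-normalized signal (s) and background (b) histograms.
    bins = np.linspace(0, 1, 41)
    s, _ = np.histogram(scores[y == 1], bins=bins, density=True)
    b, _ = np.histogram(scores[y == 0], bins=bins, density=True)
    width = bins[1] - bins[0]
    separation = 0.5 * np.sum((s - b) ** 2 / np.clip(s + b, 1e-12, None)) * width
    print(f"separation = {separation:.3f}")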
6

Berggren, Mathias, and Daniel Sonesson. "Design Optimization in Gas Turbines using Machine Learning : A study performed for Siemens Energy AB." Thesis, Linköpings universitet, Programvara och system, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-173920.

Full text
Abstract:
In this thesis, the authors investigate how machine learning can be used to speed up the design optimization process for gas turbines. The Finite Element Analysis (FEA) steps of the design process are examined to determine whether they can be replaced with machine learning algorithms. The study is done using a component with given constraints provided by Siemens Energy AB. With this component, two approaches to using machine learning are tested. One utilizes design parameters, i.e. raw floating-point numbers such as the height and width. The other technique uses a high-dimensional mesh as input. It is concluded that using design parameters with surrogate models is a viable way of performing design optimization, while mesh input currently is not. Results from using different amounts of data samples are presented and evaluated.
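A compact sketch of the surrogate-model idea the thesis builds on: fit a cheap regressor on a handful of expensive FEA evaluations, then optimize the surrogate instead. The toy objective and all settings are assumptions, not the Siemens component:

    import numpy as np
    from scipy.optimize import minimize
    from sklearn.gaussian_process import GaussianProcessRegressor

    def fea_objective(x):
        # Hypothetical stand-in for an expensive FEA run on design
        # parameters (height, width).
        return (x[0] - 1.5) ** 2 + (x[1] - 0.5) ** 2 + 0.1 * np.sin(5 * x[0])

    rng = np.random.default_rng(0)
    X_train = rng.uniform(0, 2, size=(40, 2))            # sampled designs
    y_train = np.array([fea_objective(x) for x in X_train])

    surrogate = GaussianProcessRegressor().fit(X_train, y_train)

    # Optimize the cheap surrogate instead of calling FEA in the inner loop.
    res = minimize(lambda x: float(surrogate.predict(x.reshape(1, -1))[0]),
                   x0=np.array([1.0, 1.0]), bounds=[(0, 2), (0, 2)])
    print(res.x)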
7

Keisala, Simon. "Using a Character-Based Language Model for Caption Generation." Thesis, Linköpings universitet, Interaktiva och kognitiva system, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-163001.

Full text
Abstract:
Using AI to automatically describe images is a challenging task. The aim of this study has been to compare the use of character-based language models with one of the current state-of-the-art token-based language models, im2txt, to generate image captions, with a focus on morphological correctness. Previous work has shown that character-based language models are able to outperform token-based language models in morphologically rich languages. Other studies show that simple multi-layered LSTM blocks are able to learn to replicate the syntax of their training data. To study the usability of character-based language models, an alternative model based on TensorFlow im2txt has been created. The model changes the token-generation architecture to handle character-sized tokens instead of word-sized tokens. The results suggest that a character-based language model could outperform the current token-based language models, although due to time and computing power constraints this study fails to draw a clear conclusion. A problem with one of the methods, subsampling, is discussed: when the original method is used on character-sized tokens, it removes characters (including special characters) instead of full words. To solve this issue, a two-phase approach is suggested, where the training data is first separated into word-sized tokens on which subsampling is performed, and the remaining tokens are then separated into character-sized tokens. Future work performing the modified subsampling and fine-tuning of the hyperparameters is suggested, to reach a clearer conclusion on the performance of character-based language models.
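A rough sketch of the suggested two-phase subsampling (the word2vec-style discard heuristic and the threshold t are assumptions; the thesis may define the phases differently):

    import random
    from collections import Counter

    def two_phase_char_tokens(text, t=0.1, seed=0):
        """Subsample frequent words first, then split survivors into characters."""
        rng = random.Random(seed)
        words = text.split()
        freq = Counter(words)
        total = len(words)
        kept = []
        for w in words:
            f = freq[w] / total
            # word2vec-style discard probability; t is corpus-dependent
            p_discard = max(0.0, 1.0 - (t / f) ** 0.5)
            if rng.random() >= p_discard:
                kept.append(w)
        # Phase two: character-sized tokens, so no characters are lost mid-word.
        return [ch for w in kept for ch in w + " "]

    print(two_phase_char_tokens("the cat sat on the mat the end"))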
8

Giuliani, Luca. "Extending the Moving Targets Method for Injecting Constraints in Machine Learning." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/23885/.

Full text
Abstract:
Informed Machine Learning is an umbrella term for a set of methodologies in which domain knowledge is injected into a data-driven system in order to improve its accuracy, satisfy some external constraint, and in general serve the purposes of explainability and reliability. This topic has been widely explored in the literature by means of many different techniques. Moving Targets is one such technique, particularly focused on constraint satisfaction: it is based on decomposition and bi-level optimization, and proceeds by iteratively refining the target labels through a master step that is in charge of enforcing the constraints, while the training phase is delegated to a learner. In this work, we extend the algorithm to deal with semi-supervised learning and soft constraints. In particular, we focus our empirical evaluation on both regression and classification tasks involving monotonicity shape constraints. We demonstrate that our method is robust with respect to its hyperparameters, and that it generalizes very well while reducing the number of violations of the enforced constraints. Additionally, the method can even outperform, both in terms of accuracy and constraint satisfaction, other state-of-the-art techniques such as Lattice Models and Semantic-based Regularization with a Lagrangian Dual approach for automatic hyperparameter tuning.
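A loose sketch of the alternating master/learner decomposition described above, using isotonic regression as the master step for a one-dimensional monotonicity constraint; this illustrates the structure of the approach, not the authors' exact formulation:

    import numpy as np
    from sklearn.isotonic import IsotonicRegression
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.default_rng(0)
    x = np.sort(rng.uniform(0, 1, 200))
    y = x + 0.2 * rng.normal(size=200)            # noisy, roughly monotone target

    z = y.copy()                                  # adjusted ("moving") targets
    learner = DecisionTreeRegressor(max_depth=4)
    for _ in range(5):
        learner.fit(x.reshape(-1, 1), z)          # learner step: fit current targets
        pred = learner.predict(x.reshape(-1, 1))
        # Master step: project a blend of ground truth and predictions onto
        # the feasible (monotone) set; isotonic regression is that projection.
        z = IsotonicRegression().fit_transform(x, 0.5 * y + 0.5 * pred)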
9

Lundström, Robin. "Machine Learning for Air Flow Characterization : An application of Theory-Guided Data Science for Air Flow characterization in an Industrial Foundry." Thesis, Karlstads universitet, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kau:diva-72782.

Full text
Abstract:
In industrial environments, operators are exposed to polluted air, which after constant exposure can cause irreversible lethal diseases such as lung cancer. Current air monitoring techniques are carried out sparsely, either on a single day annually or at a few measurement positions for a few days. In this thesis a theory-guided data science (TGDS) model is presented. This hybrid model combines a steady-state Computational Fluid Dynamics (CFD) model with a machine learning model. Both the CFD model and the machine learning algorithm were developed in Matlab. The CFD model serves as a basis for the airflow, whereas the machine learning model addresses dynamical features in the foundry. Measurements had previously been made at a foundry where five stationary sensors and one mobile robot were used for data acquisition. An Echo State Network was used as a supervised learning technique for airflow predictions at each robot measurement position, and Gaussian Processes (GP) were used as a regression technique to form an Echo State Map (ESM). The stationary sensor data were used as input for the Echo State Network, and the difference between the CFD and the robot measurements was used as the teacher signal, forming a dynamic correction map that was added to the steady-state CFD. The proposed model utilizes the high spatio-temporal resolution of the Echo State Map while making use of the physical consistency of the CFD. The initial applications of the novel hybrid model show that the best qualities of these two models can come together in symbiosis to give enhanced characterizations. The proposed model could play an important role in future characterization of airflow, and more research on this and similar topics is encouraged to make sure we properly understand the potential of this novel model.
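The hybrid amounts to learning a dynamic correction on top of a physics baseline. A minimal sketch of that residual pattern (the thesis used an Echo State Network plus Gaussian Processes in Matlab; this sketch substitutes a generic scikit-learn GP and synthetic data):

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor

    def cfd_at(p):
        # Stand-in for the steady-state CFD airflow estimate at position p.
        return 1.0 + 0.5 * np.sin(p)

    positions = np.linspace(0, 10, 30).reshape(-1, 1)
    robot = 1.0 + 0.5 * np.sin(positions[:, 0]) + 0.2 * np.cos(3 * positions[:, 0])

    # Teacher signal = measurement minus CFD: the model learns only the residual.
    correction = GaussianProcessRegressor().fit(positions,
                                                robot - cfd_at(positions[:, 0]))

    # Hybrid prediction at new positions: physics baseline + learned correction.
    p_new = np.array([[2.5], [7.5]])
    print(cfd_at(p_new[:, 0]) + correction.predict(p_new))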
10

Sibelius Parmbäck, Sebastian. "HMMs and LSTMs for On-line Gesture Recognition on the Stylaero Board : Evaluating and Comparing Two Methods." Thesis, Linköpings universitet, Artificiell intelligens och integrerade datorsystem, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-162237.

Full text
Abstract:
In this thesis, methods of implementing an online gesture recognition system for the novel Stylaero Board device are investigated. Two methods are evaluated - one based on LSTMs and one based on HMMs - on three kinds of gestures: Tap, circle, and flick motions. A method’s performance was measured in its accuracy in determining both whether any of the above listed gestures were performed and, if so, which gesture, in an online single-pass scenario. Insight was acquired regarding the technical challenges and possible solutions to the online aspect of the problem. Poor performance was, however, observed in both methods, with a likely culprit identified as low quality of training data, due to an arduous and complex gesture performance capturing process. Further research improving on the process of gathering data is suggested.
11

Holmberg, Lars. "Human In Command Machine Learning." Licentiate thesis, Malmö universitet, Malmö högskola, Institutionen för datavetenskap och medieteknik (DVMT), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-42576.

Full text
Abstract:
Machine Learning (ML) and Artificial Intelligence (AI) impact many aspects of human life, from recommending a significant other to assisting the search for extraterrestrial life. The area develops rapidly, and exciting unexplored design spaces are constantly laid bare. The focus in this work is on one of these areas: ML systems where decisions concerning ML model training, usage and selection of target domain lie in the hands of domain experts. This work thus concerns ML systems that function as tools that augment and/or enhance human capabilities. The approach presented is denoted Human In Command ML (HIC-ML) systems. To enquire into this research domain, design experiments of varying fidelity were used. Two of these experiments focus on augmenting human capabilities and target the domains of commuting and sorting batteries. One experiment focuses on enhancing human capabilities by identifying similar hand-painted plates. The experiments are used as illustrative examples to explore settings where domain experts potentially can independently train an ML model and, in an iterative fashion, interact with it and interpret and understand its decisions. HIC-ML should be seen as a governance principle that focuses on adding value and meaning for users. In this work, concrete application areas are presented and discussed. To open up for designing ML-based products in the area, an abstract model for HIC-ML is constructed and design guidelines are proposed. In addition, terminology and abstractions useful when designing for explicability are presented, by imposing structure and rigidity derived from scientific explanations. Together, this opens up for a contextual shift in ML and makes new application areas probable, areas that naturally couple the usage of AI technology to human virtues and can potentially, as a consequence, result in a democratisation of the usage of and knowledge concerning this powerful technology.
12

Nangalia, V. "ML-EWS - Machine Learning Early Warning System : the application of machine learning to predict in-hospital patient deterioration." Thesis, University College London (University of London), 2017. http://discovery.ucl.ac.uk/1565193/.

Full text
Abstract:
Preventing hospitalised patients from suffering adverse events (AEs) (unexpected cardiac arrest, intensive care unit admission, surgery or death) is a priority in healthcare. Almost 50% of these AEs, caused by mistakes or poor standards of care, are thought to be preventable. The identification and referral of a patient at risk of an AE to a dedicated rapid response team is a key mechanism for their reduction. Focussing on variables that are routinely collected and electronically stored (blood test data, and administrative data: demographics, date and method of admission, and co-morbidities), along with their trends, I have collected data on ~8 million admissions. I have explained how to navigate the complex ethical and legal landscape of performing such an ambitious data linkage and collection project. Analysing data on ~2 million hospital admissions with an in-hospital blood test result, I have: 1. described how these variables (particularly urea and creatinine blood tests, method of admission, and date of admission) influence in-hospital mortality rate in different groups of patients; 2. created four machine learning (ML) models that have the highest accuracy yet described for identifying a patient at risk of a serious AE, while at the same time capturing the majority of patients likely to die (high sensitivity). These models, ML-Dehydration, ML-AKI, ML-Admission, and ML-Two-Tests, can be applied to admissions with limited data, to specific syndromes, or to all patients in hospital at different time points in their hospital trajectory, respectively. Their areas under the receiver operating curve are 79.6%, 85.9%, 93% and 90.6%, respectively; 3. built and deployed a technology platform, Patient Rescue, that allows the automated application of any model in any hospital, as well as the communication of rich patient-level reports to clinicians, all in real time. The ML models and the Patient Rescue platform together form the ML Early Warning System.
13

Mattsson, Fredrik, and Anton Gustafsson. "Optimize Ranking System With Machine Learning." Thesis, Högskolan i Halmstad, Akademin för informationsteknologi, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-37431.

Full text
Abstract:
This thesis investigates how recommendation systems have been used and can be used with the help of different machine learning algorithms. The algorithms used and presented are decision tree, random forest and singular-value decomposition (SVD). Together with Tingstad, we have tried to implement the SVD function in their recommendation engine in order to enhance the recommendations given. A brief presentation of how the algorithms work is provided, together with general information about machine learning and how we tried to implement it with Tingstad's data. Implementations with Netflix's and MovieLens' open-source datasets were done, evaluated with RMSE and MAE.
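As an illustration of this kind of SVD experiment (not the thesis code), the Surprise library trains SVD on the built-in MovieLens 100k dataset and reports the same RMSE and MAE metrics:

    from surprise import SVD, Dataset
    from surprise.model_selection import cross_validate

    data = Dataset.load_builtin("ml-100k")   # downloads MovieLens 100k on first use
    algo = SVD(random_state=0)

    # 5-fold cross-validation scored with RMSE and MAE.
    results = cross_validate(algo, data, measures=["RMSE", "MAE"], cv=5)
    print(results["test_rmse"].mean(), results["test_mae"].mean())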
14

Tabell Johnsson, Marco, and Ala Jafar. "Efficiency Comparison Between Curriculum Reinforcement Learning & Reinforcement Learning Using ML-Agents." Thesis, Blekinge Tekniska Högskola, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-20218.

Full text
15

Kakadost, Naser, and Charif Ramadan. "Empirisk undersökning av ML strategier vid prediktion av cykelflöden baserad på cykeldata och veckodagar." Thesis, Malmö universitet, Fakulteten för teknik och samhälle (TS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-20168.

Full text
Abstract:
This work focuses on the prediction of bicycle traffic over a month on a given street in Malmö by means of machine learning. The algorithm used is the Python implementation of the Support Vector Machine from scikit-learn. The data used is the number of cyclists per day during 2006-2013 from a bicycle barometer placed on Kaptensgatan in Malmö. The function of the barometer is to count the number of bicycles that pass and register the time. In our study we investigate how the precision of the prediction of the number of cyclists each day during four weeks in October 2013, measured with the RMSE and MAPE methods, depends on the choice of input data (cycle data and the weekday indication). A number of experiments with different combinations of input data and representations of weekdays were conducted. The results show that the test with the largest amount of input data and weekdays encoded as 1-7 gave the best prediction.
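A small sketch of the experimental setup described, with scikit-learn's SVR and synthetic stand-in data (the feature layout, kernel, and C are assumptions):

    import numpy as np
    from sklearn.metrics import mean_absolute_percentage_error, mean_squared_error
    from sklearn.svm import SVR

    rng = np.random.default_rng(0)
    # Hypothetical stand-in for the barometer data: [previous day's count, weekday 1-7].
    weekday = rng.integers(1, 8, size=200)
    prev_count = 800 + 150 * (weekday <= 5) + rng.normal(0, 50, size=200)
    X = np.column_stack([prev_count, weekday])
    y = prev_count + rng.normal(0, 40, size=200)      # next day's cyclist count

    model = SVR(kernel="rbf", C=100.0).fit(X[:160], y[:160])
    pred = model.predict(X[160:])

    rmse = np.sqrt(mean_squared_error(y[160:], pred))
    mape = mean_absolute_percentage_error(y[160:], pred)
    print(f"RMSE={rmse:.1f}  MAPE={mape:.3f}")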
16

Gustafsson, Sebastian. "Interpretable serious event forecasting using machine learning and SHAP." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-444363.

Full text
Abstract:
Accurate forecasts are vital in multiple areas of economic, scientific, commercial, and industrial activity. There are few previous studies on using forecasting methods for predicting serious events. This thesis set out to investigate two things: firstly, whether machine learning models could be applied to the objective of forecasting serious events; secondly, whether the models could be made interpretable. Given these objectives, the approach was to formulate two forecasting tasks for the models and then use the Python framework SHAP to make them interpretable. The first task was to predict whether a serious event will happen in the coming eight hours. The second task was to forecast how many serious events will happen in the coming six hours. GBDT and LSTM models were implemented, evaluated, and compared on both tasks. Given the complexity of the forecasting problem, the results match those of previous related research. On the classification task, the best performing model achieved an accuracy of 71.6%, and on the regression task it was off by less than one serious event on average.
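For readers unfamiliar with SHAP, the core usage on a GBDT is brief; the toy data and model below are assumptions, not the thesis setup:

    import shap
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier

    # Toy stand-in for the event-history features.
    X, y = make_classification(n_samples=300, n_features=8, random_state=0)
    model = GradientBoostingClassifier(random_state=0).fit(X, y)

    # TreeExplainer computes exact SHAP values for tree ensembles, attributing
    # each individual forecast to the input features.
    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X)
    print(shap_values.shape)   # one attribution per sample and feature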
17

Sammaritani, Gloria. "Google BigQuery ML. Analisi comparativa di un nuovo framework per il Machine Learning." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2020.

Find full text
Abstract:
The exponential growth in the digitization of processes, the development of digital channels and the consequent growth in the volume and variety of operational data, together with improvements in technology and computational capacity, have enabled remarkable progress in Machine Learning. The techniques developed within this field are now able to analyze and learn from enormous quantities of real-world examples. The number of Machine Learning algorithms is large and growing, as are their implementations through frameworks and libraries. Application development is equally frenetic, with a large number of open-source packages coming from universities, industry, start-ups and research communities. This thesis aims to illustrate the potential, the possible uses and the limits of a new SQL-based machine learning framework developed by Google, BigQuery ML. In support of the thesis, a comparative analysis on the same tasks is reported against technologies already established and used in the field, namely the SciKitLearn machine learning library and the TensorFlow 2 platform. The work is structured so as to provide a first introduction to the technologies employed and a subsequent description of the comparison methodologies, before reporting the results and the related final considerations.
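For context, BigQuery ML expresses training as SQL. A minimal sketch using the official Python client; the project, dataset, table, and label names are placeholders:

    from google.cloud import bigquery

    client = bigquery.Client()   # assumes application default credentials

    # Training is a CREATE MODEL statement; table and label are hypothetical.
    client.query("""
        CREATE OR REPLACE MODEL `my_dataset.churn_model`
        OPTIONS(model_type='logistic_reg', input_label_cols=['churned']) AS
        SELECT * FROM `my_dataset.customers`
    """).result()                # blocks until the training job finishes

    # Prediction is also SQL, via ML.PREDICT.
    rows = client.query(
        "SELECT * FROM ML.PREDICT(MODEL `my_dataset.churn_model`, "
        "(SELECT * FROM `my_dataset.customers` LIMIT 10))").result()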
18

Schoenfeld, Brandon J. "Metalearning by Exploiting Granular Machine Learning Pipeline Metadata." BYU ScholarsArchive, 2020. https://scholarsarchive.byu.edu/etd/8730.

Full text
Abstract:
Automatic machine learning (AutoML) systems have been shown to perform better when they use metamodels trained offline. Existing offline metalearning approaches treat ML models as black boxes. However, modern ML models often compose multiple ML algorithms into ML pipelines. We expand previous metalearning work on estimating the performance and ranking of ML models by exploiting the metadata about which ML algorithms are used in a given pipeline. We propose a dynamically assembled neural network with the potential to model arbitrary DAG structures. We compare our proposed metamodel against reasonable baselines that exploit varying amounts of pipeline metadata, including metamodels used in existing AutoML systems. We observe that metamodels that fully exploit pipeline metadata are better estimators of pipeline performance. We also find that ranking pipelines based on dataset metafeature similarity outperforms ranking based on performance estimates.
19

Haussamer, Nicolai. "Model Calibration with Machine Learning." Master's thesis, University of Cape Town, 2018. http://hdl.handle.net/11427/29451.

Full text
Abstract:
This dissertation focuses on the application of neural networks to financial model calibration. It provides an introduction to the mathematics of basic neural networks and training algorithms. Two simplified experiments based on the Black-Scholes and constant elasticity of variance models are used to demonstrate the potential usefulness of neural networks in calibration. In addition, the main experiment features the calibration of the Heston model using model-generated data. In the experiment, we show that the calibrated model parameters reprice a set of options to a mean relative implied volatility error of less than one per cent. The limitations and shortcomings of neural networks in model calibration are also investigated and discussed.
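A toy version of the calibration-by-regression idea, for Black-Scholes only (the dissertation's Heston experiment is analogous but higher-dimensional); every setting below is an illustrative assumption:

    import numpy as np
    from scipy.stats import norm
    from sklearn.neural_network import MLPRegressor

    def bs_call(S, K, T, r, sigma):
        d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
        d2 = d1 - sigma * np.sqrt(T)
        return S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

    rng = np.random.default_rng(0)
    S, r, T = 100.0, 0.01, 1.0
    sigma = rng.uniform(0.05, 0.6, 5000)      # parameter to recover
    K = rng.uniform(80, 120, 5000)

    # Train the network to invert the pricing map: (price, strike) -> volatility.
    X = np.column_stack([bs_call(S, K, T, r, sigma), K])
    net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000,
                       random_state=0).fit(X, sigma)

    test = np.array([[bs_call(S, 100.0, T, r, 0.2), 100.0]])
    print(net.predict(test))                  # should be close to 0.2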
20

Hellborg, Per. "Optimering av datamängder med Machine learning : En studie om Machine learning och Internet of Things." Thesis, Högskolan i Skövde, Institutionen för informationsteknologi, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-13747.

Full text
Abstract:
This report describes how an Internet of Things (IoT) route optimization can be done with machine learning (ML). The IoT devices in this report are sensors in waste containers that measure how full the containers are. The report builds on a case from Sogeti, where a client can use this optimization to plan better routes for their garbage trucks: the trucks only visit full containers and can skip empty or nearly empty ones. This results in lower fuel costs and a smaller environmental footprint, and the solution can be used by any industry that needs route optimization. The report first establishes what IoT is and what is possible to do with it, and then covers the necessary background on ML. It describes how the Design Science (DS) method is used to produce the solution and gives some information about the method; the project also works agilely, with iterations during the implementation stage of DS. On the ML side, two candidate algorithms are compared: hill climbing and k-means clustering. K-means clustering is chosen for this solution. It is an unsupervised algorithm that needs no training data; it groups very similar data points into clusters. Here it clusters full containers with similar coordinates, so that the full containers in each cluster are close to each other. The clusters are then exported to a database, and the report briefly describes how a map can be produced that plots a route between the containers in each cluster.
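A minimal sketch of the clustering step described above (the fill threshold, cluster count, and synthetic coordinates are assumptions):

    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    coords = rng.uniform(0, 10, size=(100, 2))      # (x, y) of all containers
    fill_level = rng.uniform(0, 1, size=100)        # sensor reading, 0..1

    # Cluster only the containers worth visiting (here: more than 80% full).
    full = coords[fill_level > 0.8]
    labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(full)

    # Each cluster becomes one truck route over nearby full containers.
    for k in range(3):
        print(f"route {k}: {(labels == k).sum()} containers")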
21

Nämerforslund, Tim. "Machine Learning Adversaries in Video Games : Using reinforcement learning in the Unity Engine to create compelling enemy characters." Thesis, Mittuniversitetet, Institutionen för informationssystem och –teknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-42746.

Full text
Abstract:
As video games become more complex and more immersive, not just graphically or as an art form but also technically, it can be expected that games behave on a deeper level to challenge and immerse the player further. Today's gamers have become used to pattern-based enemies, moving between preprogrammed states with predictable patterns, which lends itself to a certain kind of gameplay where the goal is to figure out how to beat said pattern. But what if there could be more in terms of challenging the player on an interactive level? What if the enemies could learn and adapt, trying to outsmart the player just as much as the player tries to outsmart the enemies? This is where the field of machine learning enters the stage and opens up an entirely new type of non-player character in video games: an enemy who uses a trained machine learning model to play against the player, and who can adapt and become better as more people play the game. This study looks at early steps to implement machine learning in video games, in this case in the Unity engine, and at the players' perception of such enemies compared to normal state-driven enemies. Voluntary test players play against the two kinds of enemies, data is gathered to compare the average performance of the players, and the players then answer a questionnaire. These answers are analysed to give an indication of preference in type of enemy. Overall, the small scale of the game and the simplicity of the enemies give clear answers but also limit the potential complexity of the enemies and thus the players' enjoyment. This also enables us to discern a perceived difference in the players' experience, where a preference for machine-learning-controlled enemies is noticeable, as they behave less predictably, with more varied behaviour.
22

Bergkvist, Alexander, Nils Hedberg, Sebastian Rollino, and Markus Sagen. "Surmize: An Online NLP System for Close-Domain Question-Answering and Summarization." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-412247.

Full text
Abstract:
The amount of data available to and consumed by people globally is growing. To reduce mental fatigue and increase the general ability to gain insight into complex texts or documents, we have developed an application to aid in this task. The application allows users to upload documents and ask domain-specific questions about them using our web application. A summarized version of each document is presented to the user, which can further facilitate their understanding of the document and guide them towards the types of questions that could be relevant to ask. Our application allows users flexibility in the types of documents that can be processed, is publicly available, stores no user data, and uses state-of-the-art models for its summaries and answers. The result is an application that yields near human-level intuition for answering questions in certain isolated cases, such as Wikipedia and news articles, as well as some scientific texts. The application's reliability and predictions decline as the complexity of the subject, the number of words in the document, and the grammatical inconsistency of the questions increase. These are all aspects that can be improved further if the application is used in production.
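The two NLP components can be sketched with Hugging Face pipelines; the authors do not name their exact checkpoints, so the default models here are assumptions:

    from transformers import pipeline

    summarizer = pipeline("summarization")
    qa = pipeline("question-answering")

    document = ("Uppsala University is a research university in Uppsala, Sweden. "
                "Founded in 1477, it is the oldest university in the Nordic "
                "countries and hosts a broad range of research disciplines.")

    print(summarizer(document, max_length=30, min_length=5)[0]["summary_text"])
    print(qa(question="When was Uppsala University founded?",
             context=document)["answer"])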
23

Mahfouz, Tarek Said. "Construction legal support for differing site conditions (DSC) through statistical modeling and machine learning (ML)." [Ames, Iowa : Iowa State University], 2009.

Find full text
24

Zhao, Yajing. "Chaotic Model Prediction with Machine Learning." BYU ScholarsArchive, 2020. https://scholarsarchive.byu.edu/etd/8419.

Full text
Abstract:
Chaos theory is a branch of modern mathematics concerning non-linear dynamic systems that are highly sensitive to their initial states. It has extensive real-world applications, such as weather forecasting and stock market prediction. The Lorenz system, defined by three ordinary differential equations (ODEs), is one of the simplest and most popular chaotic models. Historically, research has focused on understanding the Lorenz system's mathematical characteristics and dynamical evolution, including the inherent chaotic features it possesses. In this thesis, we take a data-driven approach and propose the task of predicting future states of the chaotic system from limited observations. We explore two directions, answering two distinct fundamental questions of the system based on how informed we are about the underlying model. When we know the data is generated by the Lorenz system with unknown parameters, our task becomes parameter estimation (a white-box problem), or the "inverse" problem. When we know nothing about the underlying model (a black-box problem), our task becomes sequence prediction. We propose two algorithms for the white-box problem: Markov Chain Monte Carlo (MCMC) and a Multi-Layer Perceptron (MLP). Specifically, we propose to use the Metropolis-Hastings (MH) algorithm with an additional random walk to avoid the sampler being trapped in local energy wells. The MH algorithm achieves moderate success in predicting the $\rho$ value from the data, but fails at the other two parameters. Our simple MLP model is able to attain high accuracy in terms of the $l_2$ distance between the prediction and ground truth for $\rho$ as well, but also fails to converge satisfactorily for the remaining parameters. We use a Recurrent Neural Network (RNN) to tackle the black-box problem. We implement and experiment with several RNN architectures, including Elman RNN, LSTM, and GRU, and demonstrate the relative strengths and weaknesses of each of these methods. Our results demonstrate the promising role of machine learning and modern statistical data science methods in the study of chaotic dynamic systems. The code for all of our experiments can be found at https://github.com/Yajing-Zhao/
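A compact sketch of random-walk Metropolis-Hastings applied to recovering the Lorenz parameter rho from noisy observations (the noise level, proposal scale, and horizon are illustrative assumptions, not the thesis settings):

    import numpy as np
    from scipy.integrate import solve_ivp

    def lorenz(t, s, sigma, rho, beta):
        x, y, z = s
        return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]

    def trajectory(rho, t_eval):
        sol = solve_ivp(lorenz, (0, t_eval[-1]), [1.0, 1.0, 1.0],
                        t_eval=t_eval, args=(10.0, rho, 8.0 / 3.0))
        return sol.y

    t = np.linspace(0, 2, 200)
    observed = trajectory(28.0, t) + np.random.default_rng(0).normal(0, 0.5, (3, 200))

    def log_likelihood(rho):
        return -0.5 * np.sum((trajectory(rho, t) - observed) ** 2)

    rng = np.random.default_rng(1)
    rho, ll = 20.0, log_likelihood(20.0)
    samples = []
    for _ in range(500):
        prop = rho + rng.normal(0, 0.5)            # random-walk proposal
        ll_prop = log_likelihood(prop)
        if np.log(rng.random()) < ll_prop - ll:    # Metropolis acceptance rule
            rho, ll = prop, ll_prop
        samples.append(rho)
    print(np.mean(samples[250:]))                  # posterior mean estimate of rho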
25

Rudraraju, Nitesh Varma, and Varun Boyanapally. "Data Quality Model for Machine Learning." Thesis, Blekinge Tekniska Högskola, Institutionen för programvaruteknik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-18498.

Full text
Abstract:
Context: Machine learning is a part of artificial intelligence, an area that is now growing continuously. Most internet-related services, such as social media, email spam filtering, e-commerce sites and search engines, now use machine learning. The quality of machine learning output relies on the input data, so the input data is crucial for machine learning, and good-quality input data can give a better outcome from the machine learning system. In order to achieve quality data, a data scientist can use a data quality model on the data for machine learning. A data quality model can help data scientists to monitor and control the input data of machine learning. However, little research has been done on data quality attributes and data quality models for machine learning. Objectives: The primary objectives of this paper are to find and understand the state of the art and state of practice on data quality attributes for machine learning, and to develop a data quality model for machine learning in collaboration with data scientists. Methods: This paper mainly consists of two studies: 1) a literature review across different databases to identify literature on data quality attributes and data quality models for machine learning; 2) an in-depth interview study conducted to better understand and verify the data quality attributes identified in the literature review, carried out in collaboration with data scientists from multiple locations. A total of 15 interviews were performed, and based on the results we proposed a data quality model reflecting the interviewees' perspectives. Result: We identified 16 data quality attributes as important, based on the perspectives of the experienced data scientists interviewed in this study. With these selected data quality attributes, we proposed a data quality model with which the quality of data for machine learning can be monitored and improved by data scientists; the effects of these data quality attributes on machine learning have also been stated. Conclusion: This study signifies the importance of data quality, for which we proposed a data quality model for machine learning based on the industrial experience of data scientists. Addressing this research gap benefits all machine learning practitioners and data scientists who intend to identify quality data for machine learning. In order to prove that the data quality attributes in the data quality model are important, a further experiment can be conducted, which is proposed as future work.
26

Menke, Joshua Ephraim. "Improving Machine Learning Through Oracle Learning." BYU ScholarsArchive, 2007. https://scholarsarchive.byu.edu/etd/843.

Full text
Abstract:
The following dissertation presents a new paradigm for improving the training of machine learning algorithms, oracle learning. The main idea in oracle learning is that instead of training directly on a set of data, a learning model is trained to approximate a given oracle's behavior on a set of data. This can be beneficial in situations where it is easier to obtain an oracle than it is to use it at application time. It is shown that oracle learning can be applied to more effectively reduce the size of artificial neural networks, to more efficiently take advantage of domain experts by approximating them, and to adapt a problem more effectively to a machine learning algorithm.
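A minimal sketch of the oracle-learning pattern: a small model is trained to approximate an oracle's outputs rather than the raw labels (the models and data here are illustrative; the dissertation applies the idea to neural networks):

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.neural_network import MLPClassifier

    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    X_extra = np.random.default_rng(0).normal(size=(5000, 20))

    # The "oracle": any model (or expert) that is expensive at application time.
    oracle = RandomForestClassifier(n_estimators=300, random_state=0).fit(X, y)

    # Oracle learning: a much smaller model is trained on inputs labeled by
    # the oracle instead of on the original data.
    student = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500,
                            random_state=0).fit(X_extra, oracle.predict(X_extra))
    print(student.score(X, oracle.predict(X)))     # fidelity to the oracle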
27

Wang, Gang. "Solution path algorithms : an efficient model selection approach /." View abstract or full-text, 2007. http://library.ust.hk/cgi/db/thesis.pl?CSED%202007%20WANGG.

Full text
28

Wang, Jiahao. "Vehicular Traffic Flow Prediction Model Using Machine Learning-Based Model." Thesis, Université d'Ottawa / University of Ottawa, 2021. http://hdl.handle.net/10393/42288.

Full text
Abstract:
Intelligent Transportation Systems (ITS) have attracted an increasing amount of attention in recent years. Thanks to the fast development of vehicular computing hardware, vehicular sensors and citywide infrastructure, many impressive applications have been proposed under the umbrella of ITS, such as the Vehicular Cloud (VC), intelligent traffic control, etc. These applications can bring us a safer, more efficient, and also more enjoyable transportation environment. However, an accurate and efficient traffic flow prediction system is needed to achieve these applications, as it creates an opportunity for ITS applications to deal with the upcoming road situation in advance. To achieve better traffic flow prediction performance, many prediction methods have been proposed, such as mathematical modeling methods, parametric methods, and non-parametric methods. How to implement an efficient, robust and accurate vehicular traffic prediction system remains a hot topic. With the help of Machine Learning-based (ML) methods, especially Deep Learning-based (DL) methods, the accuracy of prediction models has increased. However, we also notice that there are still many open challenges in the real-world deployment of ML-based vehicular traffic prediction models. Firstly, the time consumption for DL model training is relatively large compared to parametric models, such as ARIMA, SARIMA, etc. Secondly, it is still an open question how to capture the spatial relationship between road detectors, which is affected by geographic correlation as well as change over time. Last but not least, it is important to implement the prediction system in the real world; meanwhile, we should find a way to make use of the advanced technology applied in ITS to improve the prediction system itself. In our work, we focus on improving the features of the prediction model, which can be helpful for implementing the model in the real world. Firstly, we introduce an optimization strategy for the training process of ML-based models, in order to reduce its time cost. Secondly, we provide a new hybrid deep learning model using a GCN and the deep aggregation structure (i.e., the sequence-to-sequence structure) of the GRU. Meanwhile, in order to solve the real-world prediction problem, i.e., the online prediction task, we provide a new online prediction strategy using refinement learning. In order to further improve the model's accuracy and efficiency when applied to ITS, we provide a parallel training strategy that uses the benefits of the vehicular cloud structure.
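A much-simplified sketch of the GCN-plus-GRU combination described (one graph convolution per time step feeding a GRU, one-step-ahead output); the dimensions and adjacency are assumptions, and the thesis model is a full sequence-to-sequence variant:

    import torch
    import torch.nn as nn

    class GCNGRUForecaster(nn.Module):
        def __init__(self, n_nodes, in_dim=1, hidden=32):
            super().__init__()
            self.theta = nn.Linear(in_dim, hidden)       # GCN weight
            self.gru = nn.GRU(n_nodes * hidden, 64, batch_first=True)
            self.head = nn.Linear(64, n_nodes)           # next-step flow per detector

        def forward(self, x, a_hat):
            # x: (batch, time, nodes, features); a_hat: normalized adjacency.
            b, t, n, f = x.shape
            h = torch.relu(a_hat @ self.theta(x))        # graph convolution per step
            out, _ = self.gru(h.reshape(b, t, -1))
            return self.head(out[:, -1])

    n = 5
    a_hat = torch.eye(n)                                 # stand-in adjacency
    model = GCNGRUForecaster(n_nodes=n)
    pred = model(torch.randn(2, 12, n, 1), a_hat)        # 12 past steps in
    print(pred.shape)                                    # (2, 5): one step ahead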
29

Ucci, Graziano. "The Interstellar Medium of Galaxies: a Machine Learning Approach." Doctoral thesis, Scuola Normale Superiore, 2019. http://hdl.handle.net/11384/85928.

Full text
Abstract:
Understanding the structure and physical properties of the Interstellar Medium (ISM) in galaxies, especially at high redshift, is one of the major drivers of galaxy formation studies. Measurements of key properties such as gas density, column density, metallicity, ionization parameter, and Habing flux rely on galaxy spectra obtained through the most advanced telescopes (both ground-based and space-borne) and, in particular, on their emission lines. However, finding diagnostics that are free of significant systematic uncertainties remains an unsolved problem. Several attempts have been made to recover ISM physical properties by means of diagnostics based on small, pre-selected subsets of emission line ratios. Most of these previous works focused on ionized nebulae, and have obtained diagnostics for the physical properties of galaxies based only on the strongest nebular emission lines coming from extragalactic HII regions and star-forming galaxies. The main purpose of this work is to reconstruct key ISM physical properties of galaxies from their spectra. The aim is to maximize the information that can be extracted from such data by using not only a few specific, pre-selected emission lines, but the full information encoded in the spectra. This is now possible thanks to the combination of powerful supervised Machine Learning (ML) algorithms and large synthetic spectra libraries. In order to achieve this goal, I have developed a code called GAME (GAlaxy Machine learning for Emission lines), a new fast method to reconstruct the ISM physical properties by using all the information carried by the emission line intensities present in the available spectrum. The library included in this code covers a very large range of plausible ISM physical properties to accurately describe the physics both of ionized regions and of other phases (i.e. neutral, molecular) of the ISM. The strength of the method relies on the fact that the ML algorithm can learn from all the lines present in a spectrum, including the weakest ones, such as those coming, for example, from neutral ISM components. I verified that with ML it is possible to set strong constraints on the properties of the different phases from observed spectra. GAME has been extensively tested and shown to deliver excellent predictive performance when applied to synthetic spectra. An ML approach will become fundamental with upcoming high-quality spectra, including faint lines of high-redshift galaxies, from new facilities such as the James Webb Space Telescope (JWST) and the Extremely Large Telescope (ELT). The astrophysical community will therefore enter an era where ML algorithms and Big Data analytics become extremely useful tools in the data-mining process. This is already the case for local observations, where Integral Field Units (IFUs) are able to provide observations containing tens of thousands of spaxels. A notable case study is the ISM of local Blue Compact Galaxies (BCGs), a subclass of dwarf galaxies. Since BCGs are low-metallicity, compact, star-forming systems, they are thought to represent local analogues of early galaxies that will soon become observable in greater detail (e.g. with JWST). Thus, ISM studies of local BCGs can be used as benchmarks for understanding the structure, formation, and evolution of high-redshift galaxies. In addition to a general description of the ML algorithm and the GAME code, I will show the first GAME results concerning the interpretation of high-quality IFU spectra of BCGs.
APA, Harvard, Vancouver, ISO, and other styles
30

REPETTO, MARCO. "Black-box supervised learning and empirical assessment: new perspectives in credit risk modeling." Doctoral thesis, Università degli Studi di Milano-Bicocca, 2023. https://hdl.handle.net/10281/402366.

Full text
Abstract:
Recent highly performant Machine Learning algorithms are compelling but opaque, so it is often hard to understand how they arrive at their predictions, giving rise to interpretability issues. Such issues are particularly relevant in supervised learning, where such black-box models are not easily understandable by the stakeholders involved. A growing body of work focuses on making Machine Learning models, particularly Deep Learning models, more interpretable. The currently proposed approaches rely on post-hoc interpretation, using methods such as saliency mapping and partial dependencies. Despite the advances that have been made, interpretability is still an active area of research, and there is no silver-bullet solution.
Moreover, in high-stakes decision-making, post-hoc interpretability may be sub-optimal. An example is the field of enterprise credit risk modeling. In such fields, classification models discriminate between good and bad borrowers. As a result, lenders can use these models to deny loan requests. Loan denial can be especially harmful when the borrower cannot appeal or have the decision explained and grounded in fundamentals. Therefore, in such cases, it is crucial to understand why these models produce a given output and to steer the learning process toward predictions based on fundamentals. This dissertation focuses on the concept of Interpretable Machine Learning, with particular attention to the context of credit risk modeling. In particular, the dissertation revolves around three topics: model-agnostic interpretability, post-hoc interpretation in credit risk, and interpretability-driven learning. More specifically, the first chapter is a guided introduction to the model-agnostic techniques shaping today's landscape of Machine Learning and their implementations. The second chapter focuses on an empirical analysis of the credit risk of Italian Small and Medium Enterprises. It proposes an analytical pipeline in which post-hoc interpretability plays a crucial role in finding the relevant underpinnings that drive a firm into bankruptcy. The third and last paper proposes a novel multicriteria knowledge injection methodology. The methodology is based on double backpropagation and can improve model performance, especially in the case of scarce data. The essential advantage of this methodology is that it allows the decision maker to impose prior knowledge at the beginning of the learning process, yielding predictions that align with the fundamentals.
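A minimal sketch of knowledge injection via double backpropagation follows. The network, data, and the specific prior (monotonicity of predicted risk in feature 0) are illustrative assumptions for exposition, not the dissertation's actual multicriteria formulation.

```python
# Sketch: double backpropagation penalizes gradients of the output w.r.t. the
# inputs that violate a prior, here "feature 0 should not increase default risk".
import torch

net = torch.nn.Sequential(torch.nn.Linear(5, 16), torch.nn.ReLU(), torch.nn.Linear(16, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
x = torch.randn(256, 5)                     # stand-in borrower features
y = torch.randint(0, 2, (256, 1)).float()   # stand-in default labels

for _ in range(100):
    x.requires_grad_(True)
    logits = net(x)
    task_loss = torch.nn.functional.binary_cross_entropy_with_logits(logits, y)
    # First backward pass w.r.t. inputs, kept in the graph (the "double" part):
    grads = torch.autograd.grad(logits.sum(), x, create_graph=True)[0]
    # Penalize positive sensitivity on feature 0, i.e. violations of the prior.
    knowledge_loss = torch.relu(grads[:, 0]).mean()
    loss = task_loss + 0.1 * knowledge_loss
    opt.zero_grad(); loss.backward(); opt.step()
    x = x.detach()
```

The weight on the knowledge term (0.1 here) is the knob a decision maker would tune to trade predictive fit against alignment with fundamentals.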
APA, Harvard, Vancouver, ISO, and other styles
31

Björkberg, David. "Comparison of cumulative reward withone, two and three layered artificialneural network in a simple environmentwhen using ml-agents." Thesis, Blekinge Tekniska Högskola, Institutionen för datavetenskap, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-21188.

Full text
Abstract:
Background. In machine learning you let the computer play a scenario, often millions of times. When the computer plays, it receives feedback based on preset guidelines and adjusts its behaviour based on that feedback. The computer stores this feedback in its artificial neural network (ANN). The ANN consists of an input layer, a set number of hidden layers and an output layer. The ANN calculates actions using weights between the nodes in each layer and modifies those weights when it receives feedback. ml-agents is Unity Technologies' implementation of machine learning. Objectives. ml-agents is a complex system with many different configurations, so users need guidance on which configuration gives the best results. Our thesis aimed to answer the question of how many hidden layers yield the best results. We did this by attempting to answer our research question: "How many layers are required to make the network capable of capturing the complexities of the environment?". Methods. We used a prebuilt environment provided by Unity, in which the agent aims to keep a ball on its head for as long as possible. The training data was collected by TensorFlow, which provided graphs for each training session. We used these graphs to evaluate the training sessions, and we ran each training session several times to get more consistent results. To evaluate the training sessions we looked primarily at the peak of their cumulative reward graph and secondarily at how fast they reached this peak. Results. We found that with just one layer, the agent could only get roughly a fifth of the way to capturing the complexity of the environment. With two and three layers, however, the agent was capable of capturing the complexity of the environment. The three-layered training sessions reached their cumulative reward peak 22 percent faster than the two-layered ones. Conclusions. We answered our research question: the minimum number of hidden layers required to capture the complexity of the environment is two. With an additional layer, however, the agent reached the same result faster, which is worth taking into consideration.
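The quantity varied in this experiment is the depth of the policy network; in ml-agents this is set through trainer configuration settings such as num_layers and hidden_units. The PyTorch sketch below only illustrates what that knob changes; the sizes and widths are illustrative assumptions, not the thesis' exact setup.

```python
# Sketch: the same policy head built with 1, 2 or 3 hidden layers, mirroring
# the depth comparison in the thesis. Observation/action sizes are stand-ins.
import torch.nn as nn

def make_policy(obs_size: int, n_actions: int, hidden_layers: int, width: int = 128):
    layers, in_dim = [], obs_size
    for _ in range(hidden_layers):            # 1, 2 or 3 in the experiment
        layers += [nn.Linear(in_dim, width), nn.ReLU()]
        in_dim = width
    layers.append(nn.Linear(in_dim, n_actions))
    return nn.Sequential(*layers)

one_layer, two_layers, three_layers = (make_policy(8, 2, n) for n in (1, 2, 3))
```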
APA, Harvard, Vancouver, ISO, and other styles
32

Ferreira, E. (Eija). "Model selection in time series machine learning applications." Doctoral thesis, Oulun yliopisto, 2015. http://urn.fi/urn:isbn:9789526209012.

Full text
Abstract:
Model selection is a necessary step for any practical modeling task. Since the true model behind a real-world process cannot be known, the goal of model selection is to find the best approximation among a set of candidate models. In this thesis, we discuss model selection in the context of time series machine learning applications. We cover four steps of the commonly followed machine learning process: data preparation, algorithm choice, feature selection and validation. We consider how the characteristics and the amount of data available should guide the selection of algorithms to be used, and how the data set at hand should be divided for model training, selection and validation to optimize the generalizability and future performance of the model. We also consider the special restrictions and requirements that need to be taken into account when applying regular machine learning algorithms to time series data. We especially aim to bring forth problems related to model over-fitting and over-selection that might occur due to careless or uninformed application of model selection methods. We present our results in three different time series machine learning application areas: resistance spot welding, exercise energy expenditure estimation and cognitive load modeling. Based on our findings in these studies, we draw general guidelines on which points to consider when starting to solve a new machine learning problem, from the point of view of data characteristics, amount of data, computational resources and the possible time series nature of the problem. We also discuss how the practical aspects and requirements set by the environment where the final model will be implemented affect the choice of algorithms to use.
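One concrete consequence of the time series restrictions discussed above is that the usual shuffled k-fold split leaks future information into training. A minimal sketch of the forward-chaining alternative, with stand-in data and model:

```python
# Sketch: time-ordered train/validation splits, so validation data always lies
# strictly after the training data in time (no look-ahead leakage).
import numpy as np
from sklearn.model_selection import TimeSeriesSplit
from sklearn.linear_model import Ridge

X = np.arange(100, dtype=float).reshape(-1, 1)   # stand-in ordered samples
y = np.sin(X.ravel() / 5.0)

for train_idx, val_idx in TimeSeriesSplit(n_splits=4).split(X):
    model = Ridge().fit(X[train_idx], y[train_idx])
    print("train ends at", train_idx[-1], "| validation starts at", val_idx[0],
          "| score:", round(model.score(X[val_idx], y[val_idx]), 3))
```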
APA, Harvard, Vancouver, ISO, and other styles
33

Uziela, Karolis. "Protein Model Quality Assessment : A Machine Learning Approach." Doctoral thesis, Stockholms universitet, Institutionen för biokemi och biofysik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-137695.

Full text
Abstract:
Many protein structure prediction programs exist and they can efficiently generate a number of protein models of varying quality. One of the problems is that it is difficult to know which model is the best one for a given target sequence. Selecting the best model is one of the major tasks of Model Quality Assessment Programs (MQAPs). These programs are able to predict model accuracy before the native structure is determined. The accuracy estimation can be divided into two parts: global (the whole-model accuracy) and local (the accuracy of each residue). ProQ2 is one of the most successful MQAPs for prediction of both local and global model accuracy and is based on a Machine Learning approach. In this thesis, I present my own contribution to Model Quality Assessment (MQA) and the newest developments of the ProQ program series. Firstly, I describe a new ProQ2 implementation in the protein modelling software package Rosetta. This new implementation allows the use of ProQ2 as a scoring function for conformational sampling inside Rosetta, which was not possible before. Moreover, I present two new methods, ProQ3 and ProQ3D, that both outperform their predecessor. ProQ3 introduces new training features that are calculated from Rosetta energy functions, and ProQ3D introduces a new machine learning approach based on deep learning. The ProQ3 program participated in the 12th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP12) and was one of the best methods in the MQA category. Finally, an important issue in model quality assessment is how to select a target function that the predictor is trying to learn. In the fourth manuscript, I show that MQA results can be improved by selecting a contact-based target function instead of more conventional superposition-based functions. (At the time of the doctoral defense, the following paper was unpublished and had a status as follows: Paper 3: Manuscript.)
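To make the contact-based versus superposition-based distinction concrete, the sketch below scores a model by the fraction of native residue-residue contacts it reproduces, with no global superposition required. The coordinates and the 8 Å cutoff are illustrative assumptions, not the thesis' exact target function.

```python
# Sketch: a contact-based quality target. A contact exists between two residues
# if their C-alpha atoms are within a cutoff; the score is the fraction of
# native contacts that the model preserves.
import numpy as np

def contact_map(ca_coords: np.ndarray, cutoff: float = 8.0) -> np.ndarray:
    d = np.linalg.norm(ca_coords[:, None, :] - ca_coords[None, :, :], axis=-1)
    return (d < cutoff) & ~np.eye(len(ca_coords), dtype=bool)

def contact_score(native: np.ndarray, model: np.ndarray) -> float:
    ref, hyp = contact_map(native), contact_map(model)
    return (ref & hyp).sum() / max(ref.sum(), 1)   # native contacts preserved

native = np.random.rand(50, 3) * 30                     # stand-in C-alpha coordinates
model = native + np.random.normal(0, 1.0, native.shape) # perturbed "predicted" model
print("contact-based quality:", round(contact_score(native, model), 3))
```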
APA, Harvard, Vancouver, ISO, and other styles
34

de, la Rúa Martínez Javier. "Scalable Architecture for Automating Machine Learning Model Monitoring." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-280345.

Full text
Abstract:
In recent years, due to the advent of more sophisticated tools for exploratory data analysis, data management, Machine Learning (ML) model training and model serving into production, the concept of MLOps has gained popularity. As an effort to bring DevOps processes to the ML lifecycle, MLOps aims at more automation in the execution of diverse and repetitive tasks along the cycle and at smoother interoperability between the teams and tools involved. In this context, the main cloud providers have built their own ML platforms [4, 34, 61], offered as services in their cloud solutions. Moreover, multiple frameworks have emerged to solve concrete problems such as data testing, data labelling, distributed training or prediction interpretability, and new monitoring approaches have been proposed [32, 33, 65]. Among all the stages in the ML lifecycle, one of the most commonly overlooked, although relevant, is model monitoring. Recently, cloud providers have presented their own tools to use within their platforms [4, 61], while work is ongoing to integrate existing frameworks [72] into open-source model serving solutions [38]. Most of these frameworks are either built as an extension of an existing platform (i.e. they lack portability), follow a scheduled batch processing approach at a minimum rate of hours, or present limitations for certain outlier and drift algorithms due to the design of the platform architecture in which they are integrated. In this work, a scalable automated cloud-native architecture is designed and evaluated for ML model monitoring in a streaming approach. An experiment conducted on a 7-node cluster with 250,000 requests at different concurrency rates shows maximum latencies of 5.9, 29.92 and 30.86 seconds after request time for 75% of requests in distance-based outlier detection, windowed statistics and distribution-based data drift detection, respectively, using windows of 15 seconds length and 6 seconds of watermark delay.
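A minimal sketch of the streaming, windowed-statistics idea evaluated above: the thesis does not name a specific engine here, so PySpark Structured Streaming is used purely as an illustrative stand-in, with the 15-second windows and 6-second watermark mentioned in the abstract. The built-in rate source replaces the real inference-log stream.

```python
# Sketch: per-window statistics over a stream of inference requests, with a
# watermark bounding how long late events are waited for.
from pyspark.sql import SparkSession
from pyspark.sql.functions import window, avg, stddev, col

spark = SparkSession.builder.appName("model-monitoring-sketch").getOrCreate()

# The 'rate' source emits (timestamp, value); 'value' stands in for a monitored
# feature or prediction from the model-serving layer.
requests = spark.readStream.format("rate").option("rowsPerSecond", 100).load()

stats = (requests
         .withWatermark("timestamp", "6 seconds")           # late-data bound
         .groupBy(window(col("timestamp"), "15 seconds"))   # tumbling windows
         .agg(avg("value").alias("mean"), stddev("value").alias("std")))

query = stats.writeStream.outputMode("append").format("console").start()
query.awaitTermination(30)   # run the sketch for half a minute
```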
APA, Harvard, Vancouver, ISO, and other styles
35

Kothawade, Rohan Dilip. "Wine quality prediction model using machine learning techniques." Thesis, Högskolan i Skövde, Institutionen för informationsteknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-20009.

Full text
Abstract:
The quality of a wine is important for consumers as well as for the wine industry. The traditional (expert) way of measuring wine quality is time-consuming. Nowadays, machine learning models are important tools for replacing such human tasks. There are several features available for predicting wine quality, but not all of them are relevant for better prediction, so this thesis focuses on which wine features are important for obtaining promising results. For the classification model and the evaluation of the relevant features, we used three algorithms: support vector machine (SVM), naïve Bayes (NB), and artificial neural network (ANN). In this study, we used two wine quality datasets, red wine and white wine. To evaluate feature importance we used the Pearson correlation coefficient, and performance measures such as accuracy, recall, precision, and F1 score for comparing the machine learning algorithms. A grid search algorithm was applied to improve model accuracy. We found that the artificial neural network (ANN) algorithm achieved better prediction results than the support vector machine (SVM) and naïve Bayes (NB) algorithms for both the red wine and white wine datasets.
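A minimal sketch of the pipeline described above, using one of the three classifiers: Pearson correlations to inspect feature relevance, then a grid search. The file path and the quality threshold are illustrative assumptions; the UCI wine-quality CSVs use the semicolon separator shown.

```python
# Sketch: feature relevance via Pearson correlation, then grid-searched SVM.
import pandas as pd
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

df = pd.read_csv("winequality-red.csv", sep=";")            # assumed local copy
print(df.corr(method="pearson")["quality"].sort_values())   # feature relevance

X = df.drop(columns="quality")
y = (df["quality"] >= 6).astype(int)                        # assumed good/bad split
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

grid = GridSearchCV(SVC(), {"C": [0.1, 1, 10], "gamma": ["scale", 0.1]}, cv=5)
grid.fit(X_tr, y_tr)
print("best params:", grid.best_params_, "| test accuracy:", grid.score(X_te, y_te))
```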
APA, Harvard, Vancouver, ISO, and other styles
36

Gilmore, Eugene M. "Learning Interpretable Decision Tree Classifiers with Human in the Loop Learning and Parallel Coordinates." Thesis, Griffith University, 2022. http://hdl.handle.net/10072/418633.

Full text
Abstract:
The Machine Learning (ML) community has recently started to recognise the importance of model interpretability when using ML techniques. In this work, I review the literature on Explainable Artificial Intelligence (XAI) and interpretability in ML and discuss several reasons why interpretability is critical for many ML applications. Although there is now increased interest in XAI, there are significant issues with the approaches taken in a large portion of the research in XAI. In particular, the popularity of techniques that try to explain black-box models often leads to misleading explanations that are not faithful to the model being explained. The popularity of black-box models is, in large part, due to the immense size and complexity of many datasets available today. The high dimensionality of many datasets has encouraged research in ML, particularly in techniques such as Artificial Neural Networks (ANNs). However, I argue in this work that the high dimensionality of a dataset should not, in itself, be a reason to settle for black-box models that humans cannot understand. Instead, I argue for the need to learn inherently interpretable models, rather than black-box models with post-hoc explanations of their results. One of the most well-known ML models for supervised learning tasks that remains interpretable to humans is the Decision Tree Classifier (DTC). The DTC's interpretability is due to its simple tree structure, where a human can individually inspect the splits at each node in the tree. Although a DTC's fundamental structure is interpretable to humans, even a DTC can effectively become a black-box model. This may be because the size of a DTC is too large for a human to comprehend, or because the DTC uses uninterpretable oblique splits at its nodes. These oblique splits most often construct a split from a hyperplane through the entire attribute space of a dataset, which is impossible for a human to interpret past three dimensions. In this work, I propose techniques for learning and visualising DTCs and datasets to produce interpretable classifiers that do not sacrifice predictive power. Moreover, I combine such visualisation with an interactive DTC building strategy to enable productive and effective Human-In-the-Loop-Learning (HILL). Not only do classifiers learnt with human involvement have the natural requirement of being humanly interpretable, there are also several additional advantages to be gained by involving human expertise. These advantages include the ability for a domain expert to contribute their domain knowledge to a model. We can also exploit the highly sophisticated visual pattern recognition capabilities of the human to learn models that generalise more effectively to unseen data. Despite the limitations of current HILL systems, a user study conducted as part of this work provides promising results for involving the human in the construction of DTCs. However, to effectively employ this learning style, we need powerful visualisation techniques for both high-dimensional datasets and DTCs. Remarkably, despite being ideally suited to high-dimensional datasets, the use of Parallel Coordinates (||-coords) by the ML community is minimal. First proposed by Alfred Inselberg, ||-coords is a revolutionary visualisation technique that uses parallel axes to display a dataset of an arbitrary number of dimensions. Using ||-coords, I propose a HILL system for the construction of DTCs. This work also exploits the ||-coords visualisation system to facilitate human input to the splits of internal nodes in a DTC. In addition, I propose a new form of oblique split for DTCs that uses the properties of the ||-coords plane. Unlike other oblique rules, this oblique rule can be easily visualised using ||-coords. While there has recently been renewed interest in XAI and HILL, research that evaluates systems facilitating XAI and HILL is limited. I report on an online survey that gathered data from 104 participants. This survey examines participants' use of visualisation systems which I argue are ideally suited for HILL and XAI. The results support my hypothesis and the proposals for HILL. I further argue that for a HILL system to succeed, comprehensive algorithmic support is critical. As such, I propose two new DTC induction algorithms, designed to be used in conjunction with the HILL system developed in this work to provide algorithmic assistance in the form of suggested splits for a DTC node. The first proposed induction algorithm uses the newly proposed form of oblique split with ||-coords to learn interpretable splits that can capture correlations between attributes. The second induction algorithm advances the nested cavities algorithm originally proposed by Inselberg for classification tasks using ||-coords. Using these induction algorithms enables learning of DTCs with oblique splits that remain interpretable to a human without sacrificing predictive performance.
Thesis (PhD Doctorate). Doctor of Philosophy (PhD). School of Info & Comm Tech. Science, Environment, Engineering and Technology. Full Text.
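For readers unfamiliar with ||-coords, the sketch below draws a labelled dataset in parallel coordinates, the view in which the thesis has users inspect data and choose splits. pandas ships a basic static version of this plot; the interactive HILL system described in the thesis goes well beyond it, and the Iris dataset is only an illustrative stand-in.

```python
# Sketch: a labelled dataset rendered in parallel coordinates, one vertical
# axis per attribute and one polyline per sample, coloured by class.
import matplotlib.pyplot as plt
from pandas.plotting import parallel_coordinates
from sklearn.datasets import load_iris

iris = load_iris(as_frame=True)
df = iris.frame.rename(columns={"target": "class"})
df["class"] = df["class"].map(dict(enumerate(iris.target_names)))

parallel_coordinates(df, "class", colormap="viridis", alpha=0.4)
plt.title("Iris in parallel coordinates")
plt.show()
```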
APA, Harvard, Vancouver, ISO, and other styles
37

Lagerkvist, Love. "Neural Novelty — How Machine Learning Does Interactive Generative Literature." Thesis, Malmö universitet, Fakulteten för kultur och samhälle (KS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-21222.

Full text
Abstract:
Every day, machine learning (ML) and artificial intelligence (AI) embed themselves further into domestic and industrial technologies. Interaction designers have historically struggled to engage directly with the subject, facing a shortage of appropriate methods and abstractions. There is a need to find ways through which interaction design practitioners might integrate ML into their work, in order to democratize and diversify the field. This thesis proposes a mode of inquiry that considers the interactive qualities of what machine learning does, as opposed to the technical specifications of what machine learning is. A shift in focus from the technicality of ML to the artifacts it creates allows interaction designers to situate their existing skill set, affording them to engage with machine learning as a design material. A Research-through-Design process explores different methodological adaptations, evaluated through user feedback and an in-depth case analysis. An elaborated design experiment, Multiverse, examines the novel, non-anthropomorphic aesthetic qualities of generative literature. It prototypes interactions with bidirectional literature and studies how these transform the reader into a cybertextual "user-reader". The thesis ends with a discussion on the implications of machine-written literature and proposes a number of future investigations into the research space unfolded through the prototype.
APA, Harvard, Vancouver, ISO, and other styles
38

Bhogi, Keerthana. "Two New Applications of Tensors to Machine Learning for Wireless Communications." Thesis, Virginia Tech, 2021. http://hdl.handle.net/10919/104970.

Full text
Abstract:
With the increasing number of wireless devices and the phenomenal amount of data being generated by them, there is a growing interest in the wireless communications community to complement the traditional model-driven design approaches with data-driven machine learning (ML)-based solutions. However, managing the large-scale multi-dimensional data to maintain the efficiency and scalability of the ML algorithms has obviously been a challenge. Tensors provide a useful framework to represent multi-dimensional data in an integrated manner by preserving relationships in data across different dimensions. This thesis studies two new applications of tensors to ML for wireless communications where the tensor structure of the concerned data is exploited in novel ways. The first contribution of this thesis is a tensor learning-based low-complexity precoder codebook design technique for a full-dimension multiple-input multiple-output (FD-MIMO) system with a uniform planar antenna (UPA) array at the transmitter (Tx) whose channel distribution is available through a dataset. Represented as a tensor, the FD-MIMO channel is further decomposed using a tensor decomposition technique to obtain an optimal precoder which is a function of the Kronecker Product (KP) of two low-dimensional precoders, each corresponding to the horizontal and vertical dimensions of the FD-MIMO channel. From the design perspective, we have made contributions in deriving a criterion for optimal product precoder codebooks using the obtained low-dimensional precoders. We show that this product codebook design problem is an unsupervised clustering problem on a Cartesian Product Grassmann Manifold (CPM), where the optimal cluster centroids form the desired codebook. We further simplify this clustering problem to a K-means algorithm on the low-dimensional factor Grassmann manifolds (GMs) of the CPM, which correspond to the horizontal and vertical dimensions of the UPA, thus significantly reducing the complexity of precoder codebook construction when compared to existing codebook learning techniques. The second contribution of this thesis is a tensor-based bandwidth-efficient gradient communication technique for federated learning (FL) with convolutional neural networks (CNNs). Concisely, FL is a decentralized ML approach that allows an ML model to be trained jointly at the server using the data generated by distributed users coordinated by the server, by sharing only the local gradients and not the raw data. Here, we focus on efficient compression and reconstruction of convolutional gradients at the users and the server, respectively. To reduce the gradient communication overhead, we compress the sparse gradients at the users to obtain their low-dimensional estimates using a compressive sensing (CS)-based technique and transmit them to the server for joint training of the CNN. We exploit the natural tensor structure offered by the convolutional gradients to demonstrate the correlation of a gradient element with its neighbors. We propose a novel prior for the convolutional gradients that captures the described spatial consistency along with its sparse nature in an appropriate way. We further propose a novel Bayesian reconstruction algorithm based on the Generalized Approximate Message Passing (GAMP) framework that exploits this prior information about the gradients. Through numerical simulations, we demonstrate that the developed gradient reconstruction method improves the convergence of the CNN model.
Master of Science
The increase in the number of wireless and mobile devices has led to the generation of massive amounts of multi-modal data at the users in various real-world applications, including wireless communications. This has led to an increasing interest in machine learning (ML)-based data-driven techniques for communication system design. The native setting of ML is centralized, where all the data is available on a single device. However, the distributed nature of the users and their data has also motivated the development of distributed ML techniques. Since the success of ML techniques is grounded in their data-based nature, there is a need to maintain the efficiency and scalability of the algorithms to manage the large-scale data. Tensors are multi-dimensional arrays that provide an integrated way of representing multi-modal data. Tensor algebra and tensor decompositions have enabled the extension of several classical ML techniques to tensor-based ML techniques in various application domains such as computer vision, data mining, image processing, and wireless communications. Tensor-based ML techniques have been shown to improve the performance of ML models because of their ability to leverage the underlying structural information in the data. In this thesis, we present two new applications of tensors to ML for wireless applications and show how the tensor structure of the concerned data can be exploited and incorporated in different ways. The first contribution is a tensor learning-based precoder codebook design technique for full-dimension multiple-input multiple-output (FD-MIMO) systems, where we develop a scheme for designing low-complexity product precoder codebooks by identifying and leveraging a tensor representation of the FD-MIMO channel. The second contribution is a tensor-based gradient communication scheme for a decentralized ML technique known as federated learning (FL) with convolutional neural networks (CNNs), where we design a novel bandwidth-efficient gradient compression-reconstruction algorithm that leverages a tensor structure of the convolutional gradients. The numerical simulations in both applications demonstrate that exploiting the underlying tensor structure in the data provides significant gains in their respective performance criteria.
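The Kronecker-product structure exploited in the first contribution is easy to see in code: a full-array codeword is the KP of one vertical and one horizontal codeword, so a codebook over the full array reduces to two small per-dimension codebooks. The DFT codebooks and array sizes below are illustrative assumptions.

```python
# Sketch: a UPA precoder codeword built as the Kronecker product of a vertical
# and a horizontal low-dimensional codeword.
import numpy as np

def dft_codebook(n_antennas: int, size: int) -> np.ndarray:
    angles = np.arange(size) / size
    return np.exp(2j * np.pi * np.outer(np.arange(n_antennas), angles)) / np.sqrt(n_antennas)

W_h = dft_codebook(4, 8)   # horizontal dimension: 4 antennas, 8 codewords
W_v = dft_codebook(2, 4)   # vertical dimension:   2 antennas, 4 codewords

# One full 8-antenna codeword = kron of one vertical and one horizontal word;
# the full codebook has only 8 + 4 small codewords to store instead of 32 large ones.
w = np.kron(W_v[:, 1], W_h[:, 3])
print(w.shape, "norm =", round(float(np.linalg.norm(w)), 3))   # (8,) unit-norm
```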
APA, Harvard, Vancouver, ISO, and other styles
39

Badayos, Noah Garcia. "Machine Learning-Based Parameter Validation." Diss., Virginia Tech, 2014. http://hdl.handle.net/10919/47675.

Full text
Abstract:
As power system grids continue to grow to support increasing energy demand, the system's behavior evolves accordingly, continuing to challenge designs for maintaining security. It has become apparent in the past few years that accurate simulations are as critical as discovering vulnerabilities in the power network. This study explores a classification method for validating simulation models using disturbance measurements from phasor measurement units (PMUs). The technique employs the Random Forest learning algorithm to find a correlation between specific model parameter changes and variations in the dynamic response. The measurements used for building and evaluating the classifiers were characterized using Prony decomposition. The generator model, consisting of an exciter, a governor, and its standard parameters, has been validated using short circuit faults. Single-error classifiers were tested first, comparing the accuracies of classifiers built using positive, negative, and zero sequence measurements. The negative sequence measurements consistently produced the best classifiers, with the majority of the parameter classes attaining F-measure accuracies greater than 90%. A multiple-parameter error technique for validation has also been developed and tested on standard generator parameters. Only a few target parameter classes had good accuracies in the presence of multiple parameter errors, but the results were enough to permit a sequential process of validation, in which eliminating a highly detectable error improves the accuracy of suspect errors that depend on its removal, continuing the procedure until all corrections are covered.
Ph. D.
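A minimal sketch of the classification step: modal features of the Prony-decomposed disturbance response are fed to a Random Forest that flags which model parameter class is in error. The synthetic features, the shift that encodes a parameter error, and the four-class setup are illustrative stand-ins.

```python
# Sketch: Random Forest classifying which parameter class was perturbed, from
# Prony-style modal features (e.g. damping, frequency, amplitude per mode).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n_events = 2000
X = rng.normal(size=(n_events, 6))          # stand-in modal features per event
y = rng.integers(0, 4, size=n_events)       # stand-in parameter-error class
X[np.arange(n_events), y] += 1.5            # a parameter error shifts one mode

clf = RandomForestClassifier(n_estimators=300, random_state=0)
print("cross-validated accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```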
APA, Harvard, Vancouver, ISO, and other styles
40

Chida, Anjum A. "Protein Tertiary Model Assessment Using Granular Machine Learning Techniques." Digital Archive @ GSU, 2012. http://digitalarchive.gsu.edu/cs_diss/65.

Full text
Abstract:
The automatic prediction of protein three-dimensional structures from amino acid sequences has become one of the most important and researched fields in bioinformatics. Since models are predictions rather than experimental structures determined with known accuracy, it is vital to estimate model quality. We attempt to solve this problem using machine learning techniques and information from both the sequence and the structure of the protein. The goal is to build a machine that understands structures from the Protein Data Bank (PDB) and, when given a new model, predicts whether it belongs to the same class as the PDB structures (correct or incorrect protein models). Different subsets of the PDB are considered for evaluating the prediction potential of the machine learning methods. Here we show two such machines, one using support vector machines (SVMs) and another using fuzzy decision trees (FDTs). Using a preliminary encoding style, the SVM reached around 70% accuracy in protein model quality assessment, and an improved fuzzy decision tree (IFDT) reached above 80% accuracy. To reduce computational overhead, a multiprocessor environment and a basic feature selection method were used in the SVM-based machine learning algorithm. Next, an enhanced scheme is introduced using a new encoding style. In the new style, information such as the amino acid substitution matrix, polarity, secondary structure and the relative distance between alpha carbon atoms is collected by spatially traversing the 3D structure to form training vectors. This guarantees that the properties of alpha carbon atoms that are close together in 3D space, and thus interacting, are used in vector formation. With the use of the fuzzy decision tree, we obtained a training accuracy of around 90%. Compared to the previous encoding technique, there is significant improvement in both prediction accuracy and execution time. This outcome motivates continued exploration of effective machine learning algorithms for accurate protein model quality assessment. Finally, these machines are tested using CASP8 and CASP9 templates and compared with other CASP competitors, with promising results. We further discuss the importance of model quality assessment and other protein information that could be considered for the same purpose.
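The spatial-encoding idea, collecting properties of alpha carbons that are near each other in 3D space rather than in sequence, can be sketched as follows. The features, the compact/spread stand-in structures, and the classifier settings are illustrative assumptions, not the dissertation's actual encoding.

```python
# Sketch: per-structure features from spatial C-alpha neighbourhoods, then an
# SVM separating "correct" (compact) from "incorrect" (spread-out) stand-ins.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)

def structure_features(ca: np.ndarray) -> np.ndarray:
    d = np.linalg.norm(ca[:, None] - ca[None, :], axis=-1)
    near = np.sort(d, axis=1)[:, 1:5]        # 4 nearest C-alphas in 3D space
    return np.array([near.mean(), near.std(), d.mean()])

correct = [rng.normal(0, 3, (60, 3)) for _ in range(100)]    # compact stand-ins
incorrect = [rng.normal(0, 6, (60, 3)) for _ in range(100)]  # too spread out
X = np.array([structure_features(s) for s in correct + incorrect])
y = np.array([1] * 100 + [0] * 100)
print("cross-validated accuracy:", cross_val_score(SVC(), X, y, cv=5).mean())
```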
APA, Harvard, Vancouver, ISO, and other styles
41

Lee, Wei-En. "Visualizations for model tracking and predictions in machine learning." Thesis, Massachusetts Institute of Technology, 2017. http://hdl.handle.net/1721.1/113133.

Full text
Abstract:
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2017. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (pages 82-84).
Building machine learning models is often an exploratory and iterative process. A data scientist frequently builds and trains hundreds of models with different parameters and feature sets in order to find one that meets the desired criteria. However, it can be difficult to keep track of all the parameters and metadata associated with these models. ModelDB, an end-to-end system for managing machine learning models, is a tool that solves this problem of model management. In this thesis, we present a graphical user interface for ModelDB, along with an extension for visualizing model predictions. The core user interface for model management augments the ModelDB system, which previously consisted only of native client libraries and a backend. The interface provides new ways of exploring, visualizing, and analyzing model data through a web application. The prediction visualizations extend the core user interface by providing a novel prediction matrix that displays classifier outputs in order to convey model performance at the example level. We present the design and implementation of both the core user interface and the prediction visualizations, discussing at each step the motivations behind key features. We evaluate the prediction visualizations through a pilot user study, which produces preliminary feedback on the practicality and utility of the interface. The overall goal of this research is to provide a powerful, user-friendly interface that leverages the data stored in ModelDB to generate effective visualizations for analyzing and improving models.
by Wei-En Lee. M. Eng.
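To make the model-management problem concrete, the sketch below records each training run's parameters and metrics in one queryable store, the kind of data a management UI like the one described would visualize. This is a generic illustration only, not the ModelDB client API.

```python
# Sketch: a minimal run-tracking store. Each row is one training run with its
# hyperparameters and resulting metrics, queryable after hundreds of runs.
import sqlite3, json

con = sqlite3.connect("runs.db")
con.execute("""CREATE TABLE IF NOT EXISTS runs
               (id INTEGER PRIMARY KEY, params TEXT, metrics TEXT)""")

def log_run(params: dict, metrics: dict) -> None:
    con.execute("INSERT INTO runs (params, metrics) VALUES (?, ?)",
                (json.dumps(params), json.dumps(metrics)))
    con.commit()

log_run({"model": "logreg", "C": 1.0}, {"accuracy": 0.87})
log_run({"model": "rf", "n_estimators": 200}, {"accuracy": 0.91})
for row in con.execute("SELECT params, metrics FROM runs"):
    print(row)   # the kind of record a model-management UI renders and compares
```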
APA, Harvard, Vancouver, ISO, and other styles
42

Bagheri, Rajeoni Alireza. "ANALOG CIRCUIT SIZING USING MACHINE LEARNING BASED TRANSISTOR CIRCUIT MODEL." University of Akron / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=akron1609428170125214.

Full text
APA, Harvard, Vancouver, ISO, and other styles
43

Sharma, Sagar. "Towards Data and Model Confidentiality in Outsourced Machine Learning." Wright State University / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=wright1567529092809275.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Lanctot, J. Kevin (Joseph Kevin) Carleton University Dissertation Mathematics. "Discrete estimator algorithms: a mathematical model of machine learning." Ottawa, 1989.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
45

Kokkonen, H. (Henna). "Effects of data cleaning on machine learning model performance." Bachelor's thesis, University of Oulu, 2019. http://jultika.oulu.fi/Record/nbnfioulu-201911133081.

Full text
Abstract:
This thesis focuses on the preprocessing and challenges of a university student data set and on how different levels of data preprocessing affect the performance of a prediction model, both in general and in selected groups of interest. The data set comprises the students at the University of Oulu who were admitted to the Faculty of Information Technology and Electrical Engineering during the years 2006–2015. This data set was cleaned at three different levels, which resulted in three differently processed data sets: the first is the original data set with only basic cleaning, the second has been cleaned of the most obvious anomalies, and the third has been systematically cleaned of possible anomalies. Each of these data sets was used to build a Gradient Boosting Machine model that predicted the cumulative number of ECTS credits the students would achieve by the end of their second-year studies, based on their first-year studies and their Matriculation Examination results. The effects of the cleaning on model performance were examined by comparing prediction accuracy and the information the models gave about factors that might indicate slow ECTS accumulation. The results showed that prediction accuracy improved after each cleaning stage and that the influences of the features changed significantly, becoming more reasonable.
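The experimental design, one model trained on three differently cleaned versions of the same data, can be sketched as below. The synthetic data, injected anomalies, and the two cleaning rules (a crude standard-deviation cut and a stricter median-absolute-deviation cut) are illustrative assumptions.

```python
# Sketch: compare the same GBM across three cleaning levels of one dataset.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
X = rng.normal(size=(800, 5))
y = X @ np.array([2.0, -1.0, 0.5, 0.0, 0.0]) + rng.normal(0, 0.5, 800)
y[rng.choice(800, 60, replace=False)] += rng.normal(0, 15, 60)   # injected anomalies

mad = np.median(np.abs(y - np.median(y)))
versions = {
    "basic":      np.ones(800, dtype=bool),                 # keep everything
    "obvious":    np.abs(y - y.mean()) < 4 * y.std(),       # crude outlier cut
    "systematic": np.abs(y - np.median(y)) < 5 * mad,       # stricter, robust cut
}
for name, mask in versions.items():
    score = cross_val_score(GradientBoostingRegressor(), X[mask], y[mask], cv=5).mean()
    print(f"{name:>10}: R^2 = {score:.3f}")
```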
APA, Harvard, Vancouver, ISO, and other styles
46

Sridhar, Sabarish. "SELECTION OF FEATURES FOR ML BASED COMMANDING OF AUTONOMOUS VEHICLES." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-287450.

Full text
Abstract:
Traffic coordination is an essential challenge in vehicle automation. The challenge is not only about maximizing the revenue/productivity of a fleet of vehicles, but also about avoiding non-feasible states such as collisions and low energy levels, which could make the fleet inoperable. The challenge is hard due to the complex nature of real-time traffic and the large state space involved. Reinforcement learning and simulation-based search techniques have been successful in handling complex problems with large state spaces [1] and are potential candidates for traffic coordination. In this degree project, a variant of these techniques known as Dyna-2 [2] is investigated for traffic coordination. A long-term memory of past experiences is approximated by a neural network and is used to guide a Temporal Difference (TD) search. Various features are proposed and evaluated, and finally a feature representation is chosen to build the neural network model. The Dyna-2 Traffic Coordinator (TC) is investigated for its ability to provide supervision for handling vehicle bunching and charging. Two variants of traffic coordinators, one based on simple rules and another based on TD search, are the existing baselines for the performance evaluation. The results indicate that by incorporating learning via a long-term memory, the Dyna-2 TC is robust to vehicle bunching and ensures a good balance in charge levels over time. The performance of the Dyna-2 TC depends on the choice of features used to build the function approximator; a bad feature choice does not generalize well and hence results in bad performance. In contrast, the previous approaches based on rule-based planning and TD search made poor decisions, resulting in collisions and low-energy states. The search-based approach is comparatively better than the rule-based approach, but it is not able to find an optimal solution due to depth limitations. With guidance from a long-term memory, the search was able to generate a higher return and ensure a good balance in charge levels.
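At the heart of Dyna-2 are temporal-difference updates to a value function used to guide action selection. The sketch below shows plain TD(0) with a linear function approximator standing in for the long-term memory; the traffic simulator, state features, and reward shaping are all illustrative assumptions, not the thesis' actual environment.

```python
# Sketch: TD(0) with linear function approximation over state features.
import numpy as np

rng = np.random.default_rng(4)
n_features, alpha, gamma = 8, 0.05, 0.95
w = np.zeros(n_features)                       # "long-term memory" weights

def features(state: int) -> np.ndarray:        # deterministic stand-in featurisation
    return np.random.default_rng(state).normal(size=n_features)

state = 0
for step in range(10_000):
    next_state = (state + rng.integers(1, 4)) % 100
    reward = -1.0 if next_state % 10 == 0 else 0.1   # e.g. penalise bunching states
    td_error = reward + gamma * features(next_state) @ w - features(state) @ w
    w += alpha * td_error * features(state)          # TD(0) weight update
    state = next_state

print("learned value of state 0:", features(0) @ w)
```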
APA, Harvard, Vancouver, ISO, and other styles
47

Abdurahiman, Vakulathil. "Towards inducing a simulation model description." Thesis, Brunel University, 1994. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.239138.

Full text
APA, Harvard, Vancouver, ISO, and other styles
48

Geras, Krzysztof Jerzy. "Exploiting diversity for efficient machine learning." Thesis, University of Edinburgh, 2018. http://hdl.handle.net/1842/28839.

Full text
Abstract:
A common practice for solving machine learning problems is currently to consider each problem in isolation, starting from scratch every time a new learning problem is encountered or a new model is proposed. This is a perfectly feasible solution when the problems are sufficiently easy or, if the problem is hard, when a large amount of resources, both in terms of training data and computation, is available. Although this naive approach has been the main focus of research in machine learning for a few decades and has had a lot of success, it becomes infeasible if the problem is too hard in proportion to the available resources. When using a complex model in this naive approach, it is necessary to collect large data sets (if possible at all) to avoid overfitting, and hence also necessary to use large computational resources, first during training to process a large data set and then at test time to execute a complex model. An alternative to this strategy of treating each learning problem independently is to leverage related data sets and the computation encapsulated in previously trained models. By doing so we can decrease the amount of data necessary to reach a satisfactory level of performance and, consequently, improve the achievable accuracy and decrease training time. Our attack on this problem is to exploit diversity - in the structure of the data set, in the features learnt and in the inductive biases of different neural network architectures. In the setting of learning from multiple sources we introduce multiple-source cross-validation, which gives an unbiased estimator of the test error when the data set is composed of data coming from multiple sources and the data at test time come from a new unseen source. We also propose new estimators of the variance of standard k-fold cross-validation and multiple-source cross-validation, which have lower bias than previously known ones. To improve unsupervised learning we introduce scheduled denoising autoencoders, which learn a more diverse set of features than the standard denoising autoencoder. This is thanks to their training procedure, which starts with a high level of noise, when the network is learning coarse features, and then lowers the noise gradually, which allows the network to learn more local features. A connection between this training procedure and curriculum learning is also drawn. We develop the idea of learning a diverse representation further by explicitly incorporating the goal of obtaining a diverse representation into the training objective. The proposed model, the composite denoising autoencoder, learns multiple subsets of features focused on modelling variations in the data set at different levels of granularity. Finally, we introduce the idea of model blending, a variant of model compression, in which the two models, the teacher and the student, are both strong models but different in their inductive biases. As an example, we train convolutional networks using the guidance of bidirectional long short-term memory (LSTM) networks. This allows training the convolutional neural network to be more accurate than the LSTM network at no extra cost at test time.
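The scheduled-denoising idea is mechanically simple: corrupt heavily at first, then anneal the corruption level. A minimal sketch follows; the architecture, masking noise, and linear schedule are illustrative assumptions rather than the thesis' exact configuration.

```python
# Sketch: a denoising autoencoder trained with a decreasing corruption level,
# so early epochs learn coarse features and later epochs refine local ones.
import torch

x = torch.rand(1024, 64)                              # stand-in data
enc, dec = torch.nn.Linear(64, 32), torch.nn.Linear(32, 64)
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)

for epoch in range(30):
    noise_p = max(0.5 - 0.015 * epoch, 0.05)          # scheduled noise level
    mask = (torch.rand_like(x) > noise_p).float()     # masking corruption
    recon = dec(torch.relu(enc(x * mask)))
    loss = torch.nn.functional.mse_loss(recon, x)     # reconstruct the clean input
    opt.zero_grad(); loss.backward(); opt.step()
    if epoch % 10 == 0:
        print(f"epoch {epoch}: noise={noise_p:.2f} loss={loss.item():.4f}")
```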
APA, Harvard, Vancouver, ISO, and other styles
49

Stroulia, Eleni. "Failure-driven learning as model-based self-redesign." Diss., Georgia Institute of Technology, 1994. http://hdl.handle.net/1853/8291.

Full text
APA, Harvard, Vancouver, ISO, and other styles
50

Caceres, Carlos Antonio. "Machine Learning Techniques for Gesture Recognition." Thesis, Virginia Tech, 2014. http://hdl.handle.net/10919/52556.

Full text
Abstract:
Classification of human movement is a large field of interest to Human-Machine Interface researchers. The reason for this lies in the large emphasis humans place on gestures while communicating with each other and while interacting with machines. Such gestures can be digitized in a number of ways, including both passive methods, such as cameras, and active methods, such as wearable sensors. While passive methods might be the ideal, they are not always feasible, especially when dealing with unstructured environments. Instead, wearable sensors have gained interest as a method of gesture classification, especially for the upper limbs. Lower-arm movements are made up of a combination of multiple electrical signals known as Motor Unit Action Potentials (MUAPs). These signals can be recorded from surface electrodes placed on the skin and used for prosthetic control, sign language recognition, human-machine interfaces, and a myriad of other applications. In order to move a step closer to these goal applications, this thesis compares three different machine learning tools, namely Hidden Markov Models (HMMs), Support Vector Machines (SVMs), and Dynamic Time Warping (DTW), for recognizing a number of different gesture classes. It further contrasts the applicability of these tools to noisy data in the form of the Ninapro dataset, a benchmarking tool put forth by a consortium of universities. Using this dataset as a basis, this work paves a path for the analysis required to optimize each of the three classifiers. Ultimately, the three classifiers are compared for their utility against noisy data, and a comparison is made against classification results put forth by other researchers in the field. The outcome of this work is 90+% recognition of individual gestures from the Ninapro dataset using two of the three distinct classifiers. Comparison against previous works by other researchers shows these results to outperform all others thus far. Through further work with these tools, an end user might control a robotic or prosthetic arm, translate sign language, or simply interact with a computer.
Master of Science
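Of the three classifiers compared, DTW is the most self-contained to illustrate: it scores a gesture by the cost of elastically aligning its signal to a class template. The sketch below uses sine waves as stand-ins for EMG channels.

```python
# Sketch: dynamic time warping distance between two 1-D signals; a
# nearest-template classifier assigns the class whose template has lowest cost.
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

t = np.linspace(0, 2 * np.pi, 80)
template = np.sin(t)                                          # stand-in class template
sample = np.sin(np.linspace(0, 2 * np.pi, 95) + 0.2)          # time-warped gesture
print("DTW cost (lower = more similar):", round(dtw_distance(template, sample), 3))
```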
APA, Harvard, Vancouver, ISO, and other styles
