
Dissertations / Theses on the topic 'Bayesian Optimization'



Consult the top 50 dissertations / theses for your research on the topic 'Bayesian Optimization.'


You can also download the full text of each publication as a PDF and read its abstract online, whenever these are available in the metadata.

Browse dissertations and theses from a wide variety of disciplines and organise your bibliography correctly.

1

Klein, Aaron [author], and Frank [academic supervisor] Hutter. "Efficient Bayesian hyperparameter optimization." Freiburg: Universität, 2020. http://d-nb.info/1214592961/34.

2

Mahendran, Nimalan. "Bayesian optimization for adaptive MCMC." Thesis, University of British Columbia, 2011. http://hdl.handle.net/2429/30636.

Abstract:
A new randomized strategy for adaptive Markov chain Monte Carlo (MCMC) using Bayesian optimization, called Bayesian-optimized MCMC, is proposed. This approach can handle non-differentiable objective functions and trades off exploration and exploitation to reduce the number of function evaluations. Bayesian-optimized MCMC is applied to the complex setting of sampling from constrained, discrete and densely connected probabilistic graphical models where, for each variation of the problem, one needs to adjust the parameters of the proposal mechanism automatically to ensure efficient mixing of the Markov chains. It is found that Bayesian-optimized MCMC is able to match or surpass manual tuning of the proposal mechanism by a domain expert.
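To make the idea concrete, here is a minimal, hypothetical sketch (not the thesis' implementation) of using a Gaussian-process surrogate and expected improvement to tune the proposal scale of a random-walk Metropolis sampler; the toy target, the mixing score (expected squared jump distance), and all names below are illustrative assumptions.

```python
# Sketch: Bayesian optimization of an MCMC proposal scale (illustrative only).
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)
log_target = lambda x: -0.5 * x**2                # toy target: standard normal

def mixing_score(scale, n_steps=2000):
    """Expected squared jump distance of a random-walk Metropolis chain (noisy)."""
    x, jumps = 0.0, []
    for _ in range(n_steps):
        prop = x + scale * rng.normal()
        if np.log(rng.uniform()) < log_target(prop) - log_target(x):
            jumps.append((prop - x) ** 2)
            x = prop
        else:
            jumps.append(0.0)
    return np.mean(jumps)

# Bayesian optimization loop over the proposal scale (1-D, noisy objective).
candidates = np.linspace(0.05, 10.0, 200).reshape(-1, 1)
X = rng.uniform(0.05, 10.0, size=(3, 1))          # initial design
y = np.array([mixing_score(s[0]) for s in X])
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-2, normalize_y=True)
for _ in range(15):
    gp.fit(X, y)
    mu, sd = gp.predict(candidates, return_std=True)
    imp = mu - y.max()                            # improvement over incumbent (maximization)
    z = imp / np.maximum(sd, 1e-9)
    ei = imp * norm.cdf(z) + sd * norm.pdf(z)     # expected improvement
    x_next = candidates[np.argmax(ei)]
    X = np.vstack([X, x_next])
    y = np.append(y, mixing_score(x_next[0]))
print("selected proposal scale:", X[np.argmax(y), 0])
```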
3

Gelbart, Michael Adam. "Constrained Bayesian Optimization and Applications." Thesis, Harvard University, 2015. http://nrs.harvard.edu/urn-3:HUL.InstRepos:17467236.

Abstract:
Bayesian optimization is an approach for globally optimizing black-box functions that are expensive to evaluate, non-convex, and possibly noisy. Recently, Bayesian optimization has been used with great effectiveness for applications like tuning the hyperparameters of machine learning algorithms and automatic A/B testing for websites. This thesis considers Bayesian optimization in the presence of black-box constraints. Prior work on constrained Bayesian optimization consists of a variety of methods that can be used with some efficacy in specific contexts. Here, by forming a connection with multi-task Bayesian optimization, we formulate a more general class of constrained Bayesian optimization problems that we call Bayesian optimization with decoupled constraints. In this general framework, the objective and constraint functions are divided into tasks that can be evaluated independently of each other, and resources with which these tasks can be performed. We then present two methods for solving problems in this general class. The first method, an extension to a constrained variant of expected improvement, is fast and straightforward to implement but performs poorly in some circumstances and is not sufficiently flexible to address all varieties of decoupled problems. The second method, Predictive Entropy Search with Constraints (PESC), is highly effective and sufficiently flexible to address all problems in the general class of decoupled problems without any ad hoc modifications. The two weaknesses of PESC are its implementation difficulty and slow execution time. We address these issues by, respectively, providing a publicly available implementation within the popular Bayesian optimization software Spearmint, and developing an extension to PESC that achieves greater speed without significant performance losses. We demonstrate the effectiveness of these methods on real-world machine learning meta-optimization problems.
Biophysics
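As an illustration of the constrained expected-improvement variant mentioned in the abstract above (a hedged sketch, not Gelbart's code nor the Spearmint/PESC implementation), the acquisition can be written as expected improvement weighted by the probability of feasibility. The names `obj_gp`, `con_gp`, and the convention that a constraint value below zero means "feasible" are assumptions.

```python
# Sketch: expected improvement with constraints, EI(x) * Pr[c(x) <= 0].
import numpy as np
from scipy.stats import norm

def constrained_ei(X, obj_gp, con_gp, best_feasible):
    """obj_gp, con_gp: fitted GP regressors (e.g. scikit-learn) for objective/constraint."""
    mu_f, sd_f = obj_gp.predict(X, return_std=True)
    mu_c, sd_c = con_gp.predict(X, return_std=True)
    sd_f = np.maximum(sd_f, 1e-12)
    imp = best_feasible - mu_f                        # minimization: improvement over incumbent
    z = imp / sd_f
    ei = imp * norm.cdf(z) + sd_f * norm.pdf(z)       # standard expected improvement
    p_feas = norm.cdf((0.0 - mu_c) / np.maximum(sd_c, 1e-12))  # Pr[c(x) <= 0]
    return ei * p_feas                                # next point: argmax of this product
```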
4

Gaudrie, David. "High-Dimensional Bayesian Multi-Objective Optimization." Thesis, Lyon, 2019. https://tel.archives-ouvertes.fr/tel-02356349.

Abstract:
This thesis focuses on the simultaneous optimization of expensive-to-evaluate functions that depend on a large number of parameters. This situation is frequently encountered in fields such as design engineering through numerical simulation. Bayesian optimization relying on surrogate models (Gaussian processes) is particularly suited to this context. The first part of this thesis is devoted to the development of new surrogate-assisted multi-objective optimization methods. To improve the attainment of Pareto-optimal solutions, an infill criterion is tailored to direct the search towards a user-desired region of the objective space or, in its absence, towards the Pareto front center introduced in our work. Besides targeting a well-chosen part of the Pareto front, the method also takes the optimization budget into account in order to provide as wide a range of optimal solutions as possible within the available resources. Next, inspired by shape optimization problems, an optimization method with dimension reduction is proposed to tackle the curse of dimensionality. The approach hinges on the construction of hierarchized, problem-related auxiliary variables that can describe all candidates globally, through a principal component analysis of potential solutions. Few of these variables suffice to approach any solution, and the most influential ones are selected and prioritized inside an additive Gaussian process. This variable categorization is then further exploited in the Bayesian optimization algorithm, which operates in reduced dimension.
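A hedged sketch of the dimension-reduction step described above: principal component analysis of candidate designs yields a few hierarchized auxiliary variables in which a surrogate can be fitted and an acquisition optimized. The data, the number of retained components, and the helper names are illustrative assumptions, not the thesis' algorithm.

```python
# Sketch: PCA-based auxiliary variables for reduced-dimension Bayesian optimization.
import numpy as np

rng = np.random.default_rng(0)
designs = rng.random((200, 60))                    # candidate designs, 60 raw parameters
mean = designs.mean(axis=0)
U, S, Vt = np.linalg.svd(designs - mean, full_matrices=False)
k = 5                                              # a few components suffice to approach candidates
basis = Vt[:k]                                     # principal directions, shape (k, 60)

to_reduced = lambda x: (x - mean) @ basis.T        # full space -> reduced coordinates
to_full    = lambda a: mean + a @ basis            # reduced coordinates -> full design

z0 = to_reduced(designs[0])
print("reconstruction error:", np.linalg.norm(designs[0] - to_full(z0)))
# A surrogate (e.g. an additive Gaussian process) would be fitted on to_reduced(X),
# and the acquisition criterion optimized over the k reduced variables only.
```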
5

Scotto Di Perrotolo, Alexandre. "A Theoretical Framework for Bayesian Optimization Convergence." Thesis, KTH, Optimeringslära och systemteori, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-225129.

Abstract:
Bayesian optimization is a well-known class of derivative-free optimization algorithms mainly used for expensive black-box objective functions. Despite their efficiency, these algorithms lack a rigorous convergence criterion, which makes them more prone to be used as modeling tools rather than as optimization tools. This master's thesis proposes, analyzes, and tests a globally convergent framework (that is, one that converges to a stationary point regardless of the initial sample) for Bayesian optimization algorithms. The framework is designed to preserve the global search characteristics for the minimum while being rigorously monitored to converge.
6

Wang, Ziyu. "Practical and theoretical advances in Bayesian optimization." Thesis, University of Oxford, 2016. https://ora.ox.ac.uk/objects/uuid:9612d870-015e-4236-8c8d-0419670172fb.

Abstract:
Bayesian optimization forms a set of powerful tools that allows efficient black-box optimization and has general applications in a large variety of fields. In this work we seek to advance Bayesian optimization on both the theoretical and practical fronts, as well as apply Bayesian optimization to novel and difficult problems in order to advance the state of the art. Chapter 1 gives a broad overview of Bayesian optimization. We start by covering the published applications of Bayesian optimization. The chapter then proceeds to introduce the essential ingredients of Bayesian optimization in depth. After going through some practical considerations and the theory and history of Bayesian optimization, we end the chapter with a discussion of the latest extensions and open problems. In Chapters 2-4, we solve three outstanding problems in the Bayesian optimization literature. Traditional Bayesian optimization approaches need to solve an auxiliary non-convex global optimization problem in the inner loop. The difficulties in solving this auxiliary optimization problem not only break the assumptions of most theoretical works in this area but also lead to computationally inefficient solutions. In Chapter 2, we propose the first algorithm in Bayesian optimization that does not need to solve auxiliary optimization problems and prove its convergence. In Bayesian optimization, it is often important to tune the hyper-parameters of the underlying Gaussian processes. No previous theoretical results allowed both noisy observations and varying hyper-parameters; Chapter 3 proves the first such result. Bayesian optimization is very effective when the dimensionality of the problem is low. Scaling Bayesian optimization to high dimensionality, however, has been a long-standing open problem of the field. In Chapter 4, we develop an algorithm that extends Bayesian optimization to very high dimensionalities where the underlying problems have low intrinsic dimensionality. We also prove theoretical guarantees for the proposed algorithm. In Chapter 5, we turn our attention to improving an essential component of Bayesian optimization: acquisition functions. Acquisition functions form a critical component of Bayesian optimization, and yet there does not exist an optimal acquisition function that is easily computable. Instead of relying on one acquisition function, we develop a new information-theoretic portfolio of acquisition functions. We show empirically that our approach is more effective than any single acquisition function in the portfolio. Last but not least, in Chapter 6 we adapt Bayesian optimization to derive an adaptive Hamiltonian Monte Carlo sampler. Hamiltonian Monte Carlo is one of the most effective MCMC algorithms. It is, however, notoriously difficult to tune. In this chapter, we follow the approach of adapting Markov chains in order to improve their convergence, where our adaptive strategy is based on Bayesian optimization. We provide theoretical analysis as well as a comprehensive set of experiments demonstrating the effectiveness of our proposed algorithm.
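The low-intrinsic-dimensionality idea mentioned for Chapter 4 can be illustrated with a random-embedding construction (in the spirit of random embeddings for high-dimensional Bayesian optimization); the toy objective, dimensions, and box bounds below are assumptions for illustration only, not the thesis' algorithm or guarantees.

```python
# Sketch: optimize a high-dimensional function through a random low-dimensional embedding.
import numpy as np

D, d = 1000, 4                                     # ambient and assumed effective dimensionality
rng = np.random.default_rng(1)
A = rng.normal(size=(D, d))                        # random embedding matrix

def f_high(x):                                     # toy objective: depends on two coordinates only
    return (x[0] - 0.3) ** 2 + (x[1] + 0.1) ** 2

def f_low(z):                                      # objective seen by the low-dimensional optimizer
    x = np.clip(A @ z, -1.0, 1.0)                  # map back and clip to the box [-1, 1]^D
    return f_high(x)

print(f_low(rng.normal(size=d)))                   # any low-dimensional BO routine can now run on f_low
```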
7

Zinberg, Ben (Ben I. ). "Bayesian optimization as a probabilistic meta-program." Thesis, Massachusetts Institute of Technology, 2015. http://hdl.handle.net/1721.1/106374.

Abstract:
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2015.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (page 50).
This thesis answers two questions: 1. How should probabilistic programming languages incorporate Gaussian processes? and 2. Is it possible to write a probabilistic meta-program for Bayesian optimization, a probabilistic meta-algorithm that can combine regression frameworks such as Gaussian processes with a broad class of parameter estimation and optimization techniques? We answer both questions affirmatively, presenting both an implementation and informal semantics for Gaussian process models in probabilistic programming systems, and a probabilistic meta-program for Bayesian optimization. The meta-program exposes modularity common to a wide range of Bayesian optimization methods in a way that is not apparent from their usual treatment in statistics.
by Ben Zinberg.
M. Eng.
8

Wang, Zheng S. M. Massachusetts Institute of Technology. "An optimization based algorithm for Bayesian inference." Thesis, Massachusetts Institute of Technology, 2015. http://hdl.handle.net/1721.1/98815.

Abstract:
Thesis: S.M., Massachusetts Institute of Technology, Department of Aeronautics and Astronautics, 2015.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 75-76).
In the Bayesian statistical paradigm, uncertainty in the parameters of a physical system is characterized by a probability distribution. Information from observations is incorporated by updating this distribution from prior to posterior. Quantities of interest, such as credible regions, event probabilities, and other expectations can then be obtained from the posterior distribution. One major task in Bayesian inference is then to characterize the posterior distribution, for example, through sampling. Markov chain Monte Carlo (MCMC) algorithms are often used to sample from posterior distributions using only unnormalized evaluations of the posterior density. However, high-dimensional Bayesian inference problems are challenging for MCMC-type sampling algorithms, because accurate proposal distributions are needed in order for the sampling to be efficient. One method to obtain efficient proposal samples is an optimization-based algorithm titled 'Randomize-then-Optimize' (RTO). We build upon RTO by developing a new geometric interpretation that describes the samples as projections of Gaussian-distributed points, in the joint data and parameter space, onto a nonlinear manifold defined by the forward model. This interpretation reveals generalizations of RTO that can be used. We use this interpretation to draw connections between RTO and two other sampling techniques, transport-map-based MCMC and implicit sampling. In addition, we motivate and propose an adaptive version of RTO designed to be more robust and efficient. Finally, we introduce a variable transformation to apply RTO to problems with non-Gaussian priors, such as Bayesian inverse problems with L1-type priors. We demonstrate several orders of magnitude in computational savings from this strategy on a high-dimensional inverse problem.
by Zheng Wang.
S.M.
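A minimal sketch of the randomize-then-optimize idea described in the abstract above, shown for a linear-Gaussian toy problem where perturbing the data and prior and then optimizing yields exact posterior samples; for nonlinear forward models the optimized points would require an additional weighting or Metropolis correction. The matrix, noise level, and helper names are assumptions.

```python
# Sketch: randomize-then-optimize (RTO) sampling for a linear-Gaussian toy problem.
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(2)
G = np.array([[1.0, 0.5], [0.2, 1.3], [0.7, 0.1]])   # forward model (linear here)
x_true = np.array([0.4, -0.6])
sigma = 0.1
y = G @ x_true + sigma * rng.normal(size=3)          # noisy data, standard-normal prior on x

def rto_sample():
    y_pert = y + sigma * rng.normal(size=3)          # randomize the data
    x0_pert = rng.normal(size=2)                     # randomize the prior mean
    resid = lambda x: np.concatenate([(G @ x - y_pert) / sigma, x - x0_pert])
    return least_squares(resid, x0=np.zeros(2)).x    # then optimize

samples = np.array([rto_sample() for _ in range(500)])
print("posterior mean estimate:", samples.mean(axis=0))
```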
9

Carstens, Herman. "A Bayesian approach to energy monitoring optimization." Thesis, University of Pretoria, 2017. http://hdl.handle.net/2263/63791.

Abstract:
This thesis develops methods for reducing energy Measurement and Verification (M&V) costs through the use of Bayesian statistics. M&V quantifies the savings of energy efficiency and demand side projects by comparing the energy use in a given period to what that use would have been, had no interventions taken place. The case of a large-scale lighting retrofit study, where incandescent lamps are replaced by Compact Fluorescent Lamps (CFLs), is considered. These projects often need to be monitored over a number of years with a predetermined level of statistical rigour, making M&V very expensive. M&V lighting retrofit projects have two interrelated uncertainty components that need to be addressed, and which form the basis of this thesis. The first is the uncertainty in the annual energy use of the average lamp, and the second the persistence of the savings over multiple years, determined by the number of lamps that are still functioning in a given year. For longitudinal projects, the results from these two aspects need to be obtained for multiple years. This thesis addresses these problems by using the Bayesian statistical paradigm. Bayesian statistics is still relatively unknown in M&V, and presents an opportunity for increasing the efficiency of statistical analyses, especially for such projects. After a thorough literature review, especially of measurement uncertainty in M&V, and an introduction to Bayesian statistics for M&V, three methods are developed. These methods address the three types of uncertainty in M&V: measurement, sampling, and modelling. The first method is a low-cost energy meter calibration technique. The second method is a Dynamic Linear Model (DLM) with Bayesian Forecasting for determining the size of the metering sample that needs to be taken in a given year. The third method is a Dynamic Generalised Linear Model (DGLM) for determining the size of the population survival survey sample. It is often required by law that M&V energy meters be calibrated periodically by accredited laboratories. This can be expensive and inconvenient, especially if the facility needs to be shut down for meter installation or removal. Some jurisdictions also require meters to be calibrated in-situ; in their operating environments. However, it is shown that metering uncertainty makes a relatively small impact to overall M&V uncertainty in the presence of sampling, and therefore the costs of such laboratory calibration may outweigh the benefits. The proposed technique uses another commercial-grade meter (which also measures with error) to achieve this calibration in-situ. This is done by accounting for the mismeasurement effect through a mathematical technique called Simulation Extrapolation (SIMEX). The SIMEX result is refined using Bayesian statistics, and achieves acceptably low error rates and accurate parameter estimates. The second technique uses a DLM with Bayesian forecasting to quantify the uncertainty in metering only a sample of the total population of lighting circuits. A Genetic Algorithm (GA) is then applied to determine an efficient sampling plan. Bayesian statistics is especially useful in this case because it allows the results from previous years to inform the planning of future samples. It also allows for exact uncertainty quantification, where current confidence interval techniques do not always do so. Results show a cost reduction of up to 66%, but this depends on the costing scheme used. 
The study then explores the robustness of the efficient sampling plans to forecast error, and finds a 50% chance of undersampling for such plans, due to the standard M&V sampling formula which lacks statistical power. The third technique uses a DGLM in the same way as the DLM, except for population survival survey samples and persistence studies, not metering samples. Convolving the binomial survey result distributions inside a GA is problematic, and instead of Monte Carlo simulation, a relatively new technique called Mellin Transform Moment Calculation is applied to the problem. The technique is then expanded to model stratified sampling designs for heterogeneous populations. Results show a cost reduction of 17-40%, although this depends on the costing scheme used. Finally the DLM and DGLM are combined into an efficient overall M&V plan where metering and survey costs are traded off over multiple years, while still adhering to statistical precision constraints. This is done for simple random sampling and stratified designs. Monitoring costs are reduced by 26-40% for the costing scheme assumed. The results demonstrate the power and flexibility of Bayesian statistics for M&V applications, both in terms of exact uncertainty quantification, and by increasing the efficiency of the study and reducing monitoring costs.
Thesis (PhD)--University of Pretoria, 2017.
National Research Foundation
Department of Science and Technology
National Hub for the Postgraduate Programme in Energy Efficiency and Demand Side Management
Electrical, Electronic and Computer Engineering
PhD
Unrestricted
10

Taheri, Sona. "Learning Bayesian networks based on optimization approaches." Thesis, University of Ballarat, 2012. http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/36051.

Abstract:
Learning accurate classifiers from preclassified data is a very active research topic in machine learning and artificial intelligence. There are numerous classifier paradigms, among which Bayesian Networks are very effective and well known in domains with uncertainty. Bayesian Networks are widely used representation frameworks for reasoning with probabilistic information. These models use graphs to capture dependence and independence relationships between feature variables, allowing a concise representation of the knowledge as well as efficient graph-based query processing algorithms. This representation is defined by two components: structure learning and parameter learning. The structure of this model is a directed acyclic graph. The nodes in the graph correspond to the feature variables in the domain, and the arcs (edges) show the causal relationships between feature variables. A directed edge relates the variables so that the variable corresponding to the terminal node (child) is conditioned on the variable corresponding to the initial node (parent). Parameter learning represents probabilities and conditional probabilities based on prior information or past experience. The set of probabilities is represented in the conditional probability table. Once the network structure is constructed, probabilistic inferences are readily calculated, and can be performed to predict the outcome of some variables based on the observations of others. However, the problem of structure learning is complex, since the number of candidate structures grows exponentially as the number of feature variables increases. This thesis is devoted to the development of methods for learning structures and parameters in Bayesian Networks. Different models based on optimization techniques are introduced to construct an optimal structure of a Bayesian Network. These models also consider the improvement of the Naive Bayes structure by developing new algorithms to alleviate the independence assumptions. We present various models to learn parameters of Bayesian Networks; in particular we propose optimization models for the Naive Bayes and the Tree Augmented Naive Bayes by considering different objective functions. To solve the corresponding optimization problems in Bayesian Networks, we develop new optimization algorithms. Local optimization methods are introduced based on the combination of the gradient and Newton methods. It is proved that the proposed methods are globally convergent and have superlinear convergence rates. As a global search we use the global optimization method AGOP, implemented in the open software library GANSO. We apply the proposed local methods in combination with AGOP. Therefore, the main contributions of this thesis include (a) new algorithms for learning an optimal structure of a Bayesian Network; (b) new models for learning the parameters of Bayesian Networks with given structures; and finally (c) new optimization algorithms for optimizing the proposed models in (a) and (b). To validate the proposed methods, we conduct experiments across a number of real-world problems. Print version is available at: http://library.federation.edu.au/record=b1804607~S4
Doctor of Philosophy
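To make the structure-plus-CPT description in the abstract above concrete, here is a tiny, hypothetical two-node example (Rain -> WetGrass) showing how a query is answered once the structure and conditional probability table are fixed; it is illustrative only and unrelated to the thesis' optimization models.

```python
# Sketch: exact inference in a toy two-node Bayesian network via Bayes' rule.
p_rain = 0.2
p_wet_given_rain = {True: 0.9, False: 0.15}        # CPT for WetGrass given Rain

# P(Rain = True | WetGrass = True) by enumeration
joint_true = p_rain * p_wet_given_rain[True]        # P(Rain, WetGrass)
joint_false = (1 - p_rain) * p_wet_given_rain[False]
posterior = joint_true / (joint_true + joint_false)
print(f"P(Rain | WetGrass) = {posterior:.3f}")      # = 0.18 / 0.30 = 0.600
```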
11

Perego, Riccardo. "Automated Deep Learning through Constrained Bayesian Optimization." Doctoral thesis, Università degli Studi di Milano-Bicocca, 2021. http://hdl.handle.net/10281/314922.

Abstract:
In an increasingly technological and interconnected world, the amount of data is continuously growing, and as a consequence decision-making algorithms are continually evolving to adapt to it. One of the major sources of this vast amount of data is the Internet of Things, in which billions of sensors exchange information over the network to perform various types of activities such as industrial and medical monitoring. In recent years, technological development has made it possible to define new high-performance hardware architectures for sensors, called Microcontrollers, which enabled the creation of a new kind of decentralized computing named Edge Computing. This new computing paradigm allows sensors to run decision-making algorithms at the edge in order to take immediate, local decisions instead of transferring the data to a central server for processing. To support Edge Computing, the research community started developing new advanced techniques to efficiently manage the limited resources on these devices when applying the most advanced Machine Learning models, especially Deep Neural Networks. Automated Machine Learning is a branch of the Machine Learning field aimed at disclosing the power of Machine Learning to non-experts as well as efficiently supporting data scientists in designing their own data analysis pipelines. The adoption of Automated Machine Learning has made it possible to develop increasingly high-performance models almost automatically. However, with the advent of Edge Computing, a specialization of Machine Learning has emerged, termed Tiny Machine Learning (Tiny ML): the application of Machine Learning algorithms on devices having limited hardware resources. This thesis mainly addresses the applicability of Automated Machine Learning to generate accurate models which must also be deployable on tiny devices, specifically Microcontroller Units. More specifically, the proposed approach aims at maximizing the performance of Deep Neural Networks while satisfying the constraints associated with the limited hardware resources, including batteries, of Microcontrollers. Thanks to a close collaboration with STMicroelectronics, a leading company in the design, production and sale of microcontrollers, it was possible to develop a novel Automated Machine Learning framework that deals with the black-box constraints related to the deployability of a Deep Neural Network on these tiny devices, widely adopted in IoT applications. The application to two real-life use cases provided by STMicroelectronics (i.e., Human Activity Recognition and Image Recognition) proved that the novel proposed approach can efficiently find configurations for accurate and deployable Deep Neural Networks, increasing their accuracy against baseline models while drastically reducing the hardware resources required to run them on a microcontroller (i.e., a reduction of more than 90%). The approach was also compared against one of the state-of-the-art AutoML solutions in order to evaluate its capability to overcome the issues which currently limit the wide application of AutoML in the Tiny ML field. Finally, this PhD thesis suggests interesting and challenging research directions to further increase the applicability of the proposed approach by integrating recent and innovative research results (e.g., weakly defined search spaces, Meta-Learning, Multi-objective and Multi-Information Source optimization).
12

Fowkes, Jaroslav Mrazek. "Bayesian numerical analysis : global optimization and other applications." Thesis, University of Oxford, 2011. http://ora.ox.ac.uk/objects/uuid:ab268fe7-f757-459e-b1fe-a4a9083c1cba.

Abstract:
We present a unifying framework for the global optimization of functions which are expensive to evaluate. The framework is based on a Bayesian interpretation of radial basis function interpolation which incorporates existing methods such as Kriging, Gaussian process regression and neural networks. This viewpoint enables the application of Bayesian decision theory to derive a sequential global optimization algorithm which can be extended to include existing algorithms of this type in the literature. By posing the optimization problem as a sequence of sampling decisions, we optimize a general cost function at each stage of the algorithm. An extension to multi-stage decision processes is also discussed. The key idea of the framework is to replace the underlying expensive function by a cheap surrogate approximation. This enables the use of existing branch and bound techniques to globally optimize the cost function. We present a rigorous analysis of the canonical branch and bound algorithm in this setting as well as newly developed algorithms for other domains including convex sets. In particular, by making use of Lipschitz continuity of the surrogate approximation, we develop an entirely new algorithm based on overlapping balls. An application of the framework to the integration of expensive functions over rectangular domains and spherical surfaces in low dimensions is also considered. To assess performance of the framework, we apply it to canonical examples from the literature as well as an industrial model problem from oil reservoir simulation.
13

Lorenz, Romy. "Neuroadaptive Bayesian optimization : implications for the cognitive sciences." Thesis, Imperial College London, 2017. http://hdl.handle.net/10044/1/51419.

Abstract:
Cognitive neuroscientists are often interested in broad research questions, yet use overly narrow experimental designs by considering only a small subset of possible experimental conditions. This limits the generalizability and reproducibility of many research findings. In this thesis, I propose, validate and apply an alternative approach that resolves these problems by building upon neuroadaptive experimental paradigms, and combines real-time analysis of functional neuroimaging (fMRI) data with a branch of machine learning, Bayesian optimization. Neuroadaptive Bayesian optimization is a powerful strategy to efficiently explore more experimental conditions than is currently possible with standard methodology. In the first study (Chapter 3), I demonstrate the validity of the approach in a proof-of-principle study involving audio-visual stimuli with varying perceptual complexity. In a subsequent study (Chapter 4), I test the generalizability of the framework to paradigms with lower effect sizes and investigate how automatic stopping criteria could further boost the efficiency of the approach. This is followed by three studies in which I apply neuroadaptive Bayesian optimization to tackle different research questions within the cognitive neurosciences. In the first application study (Chapter 5), I employ the approach to identify the exact cognitive task conditions that optimally dissociate between two frontoparietal brain networks. For the second application (Chapter 6), I use neuroadaptive Bayesian optimization in a study involving non-invasive brain stimulation in order to find the stimulation parameters that elicit optimal network coupling in a frontoparietal network. In the third application study (Chapter 7), I show how adaptive Bayesian optimization can be used beyond the field of cognitive neuroimaging; I investigate the phenomenon of phosphene perception caused by non-invasive brain stimulation by optimizing based on preference ratings given by the participants. As a whole, this thesis provides evidence that neuroadaptive Bayesian optimization can be used to formulate new and exciting research questions within cognitive neuroscience. I argue that the approach could broaden the hypotheses considered in cognitive neuroscience, thereby improving the generalizability of findings. In addition, Bayesian optimization can be combined with preregistration to cover exploration, mitigating researcher bias more broadly and improving reproducibility.
14

Fu, Stefan Xueyan. "Finding Optimal Jetting Waveform Parameters with Bayesian Optimization." Thesis, KTH, Optimeringslära och systemteori, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-231374.

Abstract:
Jet printing is a method in surface mount technology (SMT) in which small volumes of solder paste or other electronic materials are applied to printed circuit boards (PCBs). The solder paste is shot onto the boards by a piston powered by a piezoelectric stack. The characteristics of jetted results can be controlled by a number of factors, one of which is the waveform of the piezo actuation voltage signal. While in theory any waveform is possible, in practice the signal is defined by seven parameters for the specific technology studied here. The optimization problem of finding the optimal parameter combination cannot be solved by standard derivative-based methods, as the objective is a black-box function which can only be sampled through noisy and time-consuming evaluations. The current method for optimizing the parameters is an expert-guided grid search over the two most important parameters, while the remaining five are kept constant at default values. Bayesian optimization is a heuristic, model-based search method for efficient optimization of possibly noisy functions with unavailable derivatives. An implementation of the Bayesian optimization algorithm was adapted for the optimization of the waveform parameters, and used to optimize various combinations of the parameters. Results from different trials produced similar values for the two known parameters, with differences within the uncertainty caused by noise. For the remaining five parameters, results were more ambiguous. However, a closer examination of the model hyperparameters showed that these five parameters had almost no impact on the objective function. Thus, the best found parameter values were affected more by random noise than by the objective. It is concluded that Bayesian optimization may be a suitable and effective method for waveform parameter optimization, and some directions for further development are suggested based on the results of this project.
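The hyperparameter inspection described above, concluding that five of the seven parameters barely affect the objective, can be sketched with an anisotropic (ARD) Matérn kernel, where large fitted length-scales flag low-influence inputs. The data and dimensions below are made up for illustration and are not the thesis' measurements.

```python
# Sketch: using fitted ARD length-scales as a parameter-relevance check.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(3)
X = rng.uniform(size=(60, 7))                            # 7 "waveform parameters" (toy data)
y = np.sin(6 * X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.01 * rng.normal(size=60)  # only 2 matter

kernel = Matern(length_scale=np.ones(7), nu=2.5)         # one length-scale per input (ARD)
gp = GaussianProcessRegressor(kernel=kernel, alpha=1e-3, normalize_y=True).fit(X, y)
print(gp.kernel_.length_scale)                           # very large values ~ low-influence inputs
```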
15

Lévesque, Julien-Charles. "Bayesian hyperparameter optimization : overfitting, ensembles and conditional spaces." Doctoral thesis, Université Laval, 2018. http://hdl.handle.net/20.500.11794/28364.

Abstract:
In this thesis, we consider the analysis and extension of Bayesian hyperparameter optimization methodology for various problems related to supervised machine learning. The contributions of the thesis concern 1) the overestimation of the generalization accuracy of hyperparameters and models resulting from Bayesian optimization, 2) an application of Bayesian optimization to ensemble learning, and 3) the optimization of spaces with a conditional structure such as found in automatic machine learning (AutoML) problems. Generally, machine learning algorithms have some free parameters, called hyperparameters, which allow regulating or modifying these algorithms' behaviour. For a long time, hyperparameters were tuned by hand or with exhaustive search algorithms. Recent work highlighted the conceptual advantages of optimizing hyperparameters with more rational methods, such as Bayesian optimization. Bayesian optimization is a very versatile framework for the optimization of unknown and non-derivable functions, grounded strongly in probabilistic modelling and uncertainty estimation, and we adopt it for the work in this thesis. We first briefly introduce Bayesian optimization with Gaussian processes (GP) and describe its application to hyperparameter optimization. Next, original contributions are presented on the dangers of overfitting during hyperparameter optimization, where the optimization ends up learning the validation folds. We show that there is indeed overfitting during the optimization of hyperparameters, even with cross-validation strategies, and that it can be reduced by methods such as reshuffling the training and validation splits at every iteration of the optimization. Another promising method is demonstrated in the use of a GP's posterior mean for the selection of final hyperparameters, rather than directly returning the model with the minimal cross-validation error. Both suggested approaches are demonstrated to deliver significant improvements in the generalization accuracy of the final selected model on a benchmark of 118 datasets. The next contributions are provided by an application of Bayesian hyperparameter optimization to ensemble learning. Stacking methods have been exploited for some time to combine multiple classifiers in a meta-classifier system. These can be applied to the end result of a Bayesian hyperparameter optimization pipeline by keeping the best classifiers and combining them at the end. Our Bayesian ensemble optimization method consists of a modification of the Bayesian optimization pipeline to search for the best hyperparameters to use for an ensemble, which is different from optimizing hyperparameters for the performance of a single model. The approach has the advantage of not requiring the training of more models than a regular Bayesian hyperparameter optimization. Experiments show the potential of the suggested approach on three different search spaces and many datasets. The last contributions are related to the optimization of more complex hyperparameter spaces, namely spaces that contain a structure of conditionality. Conditions arise naturally in hyperparameter optimization when one defines a model with multiple components: certain hyperparameters then only need to be specified if their parent component is activated. One example of such a space is combined algorithm selection and hyperparameter optimization, now better known as AutoML, where the objective is to choose the base model and optimize its hyperparameters. We thus highlight techniques and propose new kernels for GPs that handle structure in such spaces in a principled way. Contributions are also supported by experimental evaluation on many datasets. Overall, the thesis brings together several works directly related to Bayesian hyperparameter optimization. It showcases novel ways to apply Bayesian optimization to ensemble learning, as well as methodologies to reduce overfitting and to optimize more complex spaces.
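A small sketch (with assumed names, not the thesis' code) of the final-selection rule discussed above: pick the configuration minimizing the GP posterior mean over the evaluated configurations, rather than the raw minimum cross-validation error, which can be an artefact of validation noise.

```python
# Sketch: posterior-mean-based selection of the final hyperparameter configuration.
import numpy as np

def select_final_config(gp, X_evaluated, cv_errors):
    """gp: surrogate fitted on (X_evaluated, cv_errors); returns both selection rules."""
    naive = X_evaluated[np.argmin(cv_errors)]        # raw minimiser: prone to overfitting noise
    posterior_mean = gp.predict(X_evaluated)         # smoothed estimate of generalization error
    robust = X_evaluated[np.argmin(posterior_mean)]  # selection by GP posterior mean
    return naive, robust
```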
APA, Harvard, Vancouver, ISO, and other styles
16

Kawaguchi, Kenji Ph D. Massachusetts Institute of Technology. "Towards practical theory : Bayesian optimization and optimal exploration." Thesis, Massachusetts Institute of Technology, 2016. http://hdl.handle.net/1721.1/103670.

Full text
Abstract:
Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 83-87).
This thesis presents novel principles to improve the theoretical analyses of a class of methods, aiming to provide theoretically driven yet practically useful methods. The thesis focuses on a class of methods, called bound-based search, which includes several planning algorithms (e.g., the A* algorithm and the UCT algorithm), several optimization methods (e.g., Bayesian optimization and Lipschitz optimization), and some learning algorithms (e.g., PAC-MDP algorithms). For Bayesian optimization, this work solves an open problem and achieves an exponential convergence rate. For learning algorithms, this thesis proposes a new analysis framework, called PAC-RMDP, and improves the previous theoretical bounds. The PAC-RMDP framework also provides a unifying view of some previous near-Bayes optimal and PAC-MDP algorithms. All proposed algorithms derived on the basis of the new principles produced competitive results in our numerical experiments with standard benchmark tests.
by Kenji Kawaguchi.
S.M.
APA, Harvard, Vancouver, ISO, and other styles
17

Petit, Sébastien. "Improved Gaussian process modeling : Application to Bayesian optimization." Electronic Thesis or Diss., université Paris-Saclay, 2022. http://www.theses.fr/2022UPASG063.

Full text
Abstract:
This manuscript focuses on Bayesian modeling of unknown functions with Gaussian processes. This task arises notably in industrial design, with numerical simulators whose computation time can reach several hours. Our work focuses on the problem of model selection and validation and goes in two directions. The first part studies empirically the current practices for stationary Gaussian process modeling. Several issues in Gaussian process parameter selection are tackled. A study of parameter selection criteria is the core of this part. It concludes that the choice of a family of models is more important than that of the selection criterion. More specifically, the study shows that the regularity parameter of the Matérn covariance function is more important than the choice of a likelihood or cross-validation criterion. Moreover, the analysis of the numerical results shows that this parameter can be selected satisfactorily by the criteria, which leads to a practical recommendation. Then, particular attention is given to the numerical optimization of the likelihood criterion. Observing, as Erickson et al. (2018) did, important inconsistencies between the different libraries available for Gaussian process modeling, we propose elementary numerical recipes making it possible to obtain significant gains both in terms of likelihood and model accuracy. Finally, the analytical formulas for computing cross-validation criteria are revisited from a new angle and enriched with similar formulas for the gradients. This last contribution aligns the computational cost of a class of cross-validation criteria with that of the likelihood. The second part presents a goal-oriented methodology. It is designed to improve the accuracy of the model in an (output) range of interest. This approach consists in relaxing the interpolation constraints on a relaxation range disjoint from the range of interest. We also propose an approach for automatically selecting the relaxation range. This new method can implicitly manage potentially complex regions of interest in the input space with few parameters. Outside these regions, it learns non-parametrically a transformation that improves the predictions on the range of interest. Numerical simulations show the benefits of the approach for Bayesian optimization, where one is interested in low values in the minimization framework. Moreover, the theoretical convergence of the method is established under some assumptions.
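As an aside for readers, the role of the Matérn regularity parameter discussed in this abstract can be illustrated with a small sketch; the scikit-learn calls and the toy one-dimensional function below are our own illustrative choices, not material from the thesis.

# Illustrative sketch: comparing Matern smoothness values by marginal likelihood.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(30, 1))
y = np.sin(6.0 * X[:, 0]) + 0.1 * rng.standard_normal(30)  # toy noisy data

for nu in (0.5, 1.5, 2.5):  # candidate regularity (smoothness) values
    gp = GaussianProcessRegressor(kernel=Matern(length_scale=0.2, nu=nu),
                                  alpha=1e-2, normalize_y=True)
    gp.fit(X, y)  # optimizes the length scale; nu stays fixed
    print(f"nu={nu}: log marginal likelihood = {gp.log_marginal_likelihood_value_:.2f}")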
APA, Harvard, Vancouver, ISO, and other styles
18

Gaul, Nicholas John. "Modified Bayesian Kriging for noisy response problems and Bayesian confidence-based reliability-based design optimization." Diss., University of Iowa, 2014. https://ir.uiowa.edu/etd/1322.

Full text
Abstract:
The objective of this study is to develop a new modified Bayesian Kriging (MBKG) surrogate modeling method that can be used to carry out confidence-based reliability-based design optimization (RBDO) for problems in which simulation analyses are inherently noisy and standard Kriging approaches fail. The formulation of the MBKG surrogate modeling method is presented, and the full conditional distributions of the unknown MBKG parameters are derived and coded into a Gibbs sampling algorithm. Using the coded Gibbs sampling algorithm, Markov chain Monte Carlo is used to fit the MBKG surrogate model. A sequential sampling method that uses the posterior credible sets for inserting new design of experiment (DoE) sample points is proposed. The sequential sampling method is developed in such a way that the new DoE sample points added will provide the maximum amount of information possible to the MBKG surrogate model, making it an efficient and effective way to reduce the number of DoE sample points needed. Therefore, it improves the posterior distribution of the probability of failure efficiently. Finally, a confidence-based RBDO method using the posterior distribution of the probability of failure is developed. The confidence-based RBDO method is developed so that the uncertainty of the MBKG surrogate model is included in the optimization process. A 2-D mathematical example was used to demonstrate fitting the MBKG surrogate model and the developed sequential sampling method that uses the posterior credible sets for inserting new DoE. A detailed study on how the posterior distribution of the probability of failure changes as new DoE are added using the developed sequential sampling method is presented. Confidence-based RBDO is carried out using the same 2-D mathematical example. Three different noise levels are used for the example to compare how the MBKG surrogate modeling method, the sequential sampling method, and the confidence-based RBDO method behave for different amounts of noise in the response. A comparison of the optimization results for the three different noise levels for the same 2-D mathematical example is presented. A 3-D multibody dynamics (MBD) engineering block-car example is presented. The example is used to demonstrate using the developed methods to carry out confidence-based RBDO for an engineering problem that contains noise in the response. The MBD simulations for this example were done using the commercially available MBD software package RecurDyn. Deterministic design optimization (DDO) was first done using the MBKG surrogate model to obtain the mean response values, which then were used with standard Kriging methods to obtain the sensitivity of the responses. Confidence-based RBDO was then carried out using the DDO solution as the initial design point.
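For readers unfamiliar with the Gibbs sampling step mentioned above, a minimal generic sketch follows; it samples a toy bivariate normal rather than the MBKG posterior, and the correlation value is an arbitrary assumption.

# Generic Gibbs sampler: alternately draw each coordinate of a bivariate
# normal from its full conditional distribution.
import numpy as np

def gibbs_bivariate_normal(n_samples, rho=0.8, seed=0):
    rng = np.random.default_rng(seed)
    x, y = 0.0, 0.0
    samples = np.empty((n_samples, 2))
    cond_std = np.sqrt(1.0 - rho ** 2)   # std of each full conditional
    for i in range(n_samples):
        x = rng.normal(rho * y, cond_std)  # draw x | y
        y = rng.normal(rho * x, cond_std)  # draw y | x
        samples[i] = (x, y)
    return samples

draws = gibbs_bivariate_normal(5000)
print(np.corrcoef(draws[1000:].T))  # close to rho after burn-in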
APA, Harvard, Vancouver, ISO, and other styles
19

Lee, Chung Hyun. "Bayesian collaborative sampling: adaptive learning for multidisciplinary design." Diss., Georgia Institute of Technology, 2011. http://hdl.handle.net/1853/42894.

Full text
Abstract:
A Bayesian adaptive sampling method is developed for highly coupled multidisciplinary design problems. The method addresses a major challenge in aerospace design: exploration of a design space with computationally expensive analysis tools such as computational fluid dynamics (CFD) or finite element analysis. With a limited analysis budget, it is often impossible to optimize directly or to explore a design space with off-line design of experiments (DoE) and surrogate models. This difficulty is magnified in multidisciplinary problems with feedbacks between disciplines because each design point may require iterative analyses to converge on a compatible solution between different disciplines. Bayesian Collaborative Sampling (BCS) is a bi-level architecture for adaptive sampling that simultaneously (1) concentrates disciplinary analyses in regions of a design space that are favorable to a system-level objective, and (2) guides analyses to regions where interdisciplinary coupling variables are likely to be compatible. BCS uses Bayesian models and sequential sampling techniques along with elements of the collaborative optimization (CO) architecture for multidisciplinary optimization. The method is tested with the aero-structural design of a glider wing and the aero-propulsion design of a turbojet engine nacelle.
APA, Harvard, Vancouver, ISO, and other styles
20

Conjeevaram, Krishnakumar Naveen Kartik. "A Bayesian approach to feed reconstruction." Thesis, Massachusetts Institute of Technology, 2013. http://hdl.handle.net/1721.1/82414.

Full text
Abstract:
Thesis (S.M.)--Massachusetts Institute of Technology, Computation for Design and Optimization Program, 2013.
Cataloged from PDF version of thesis.
Includes bibliographical references (p. 83-86).
In this thesis, we developed a Bayesian approach to estimate the detailed composition of an unknown feedstock in a chemical plant by combining information from a few bulk measurements of the feedstock in the plant along with some detailed composition information of a similar feedstock that was measured in a laboratory. The complexity of the Bayesian model combined with the simplex-type constraints on the weight fractions makes it difficult to sample from the resulting high-dimensional posterior distribution. We reviewed and implemented different algorithms to generate samples from this posterior that satisfy the given constraints. We tested our approach on a data set from a plant.
by Naveen Kartik Conjeevaram Krishnakumar.
S.M.
APA, Harvard, Vancouver, ISO, and other styles
21

Horiguchi, Akira. "Bayesian Additive Regression Trees: Sensitivity Analysis and Multiobjective Optimization." The Ohio State University, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=osu1606841319315633.

Full text
APA, Harvard, Vancouver, ISO, and other styles
22

Matosevic, Antonio. "On Bayesian optimization and its application to hyperparameter tuning." Thesis, Linnéuniversitetet, Institutionen för matematik (MA), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-74962.

Full text
Abstract:
This thesis introduces the concept of Bayesian optimization, primarily used in optimizing costly black-box functions. Besides a theoretical treatment of the topic, the focus of the thesis is on two numerical experiments. First, different types of acquisition functions, which are the key components responsible for performance, are tested and compared. Special emphasis is on the analysis of the so-called exploration-exploitation trade-off. Second, one of the most recent applications of Bayesian optimization concerns hyperparameter tuning in machine learning algorithms, where the objective function is expensive to evaluate and not given analytically. However, some results indicate that much simpler methods can give similar results. Our contribution is therefore a statistical comparison of simple random search and Bayesian optimization in the context of finding the optimal set of hyperparameters in support vector regression. It has been found that there is no significant difference in the performance of these two methods.
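The kind of comparison described in this abstract can be sketched on a toy problem; the code below is our own illustration (a one-dimensional objective, a plain GP surrogate with expected improvement, and an equal-budget random search), not the thesis's SVR experiment.

# Toy comparison: GP-based Bayesian optimization with expected improvement
# versus uniform random search, both with 18 function evaluations.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def f(x):                                   # toy objective to minimize
    return np.sin(3.0 * x) + 0.3 * x ** 2

def expected_improvement(mu, sigma, best):  # EI for minimization
    z = (best - mu) / np.maximum(sigma, 1e-12)
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

rng = np.random.default_rng(1)
grid = np.linspace(-3, 3, 400).reshape(-1, 1)

X = rng.uniform(-3, 3, size=(3, 1)); y = f(X[:, 0])   # 3 initial points
for _ in range(15):                                   # 15 BO iterations
    gp = GaussianProcessRegressor(Matern(nu=2.5), normalize_y=True).fit(X, y)
    mu, sigma = gp.predict(grid, return_std=True)
    x_next = grid[np.argmax(expected_improvement(mu, sigma, y.min()))]
    X = np.vstack([X, x_next]); y = np.append(y, f(x_next[0]))

x_rand = rng.uniform(-3, 3, size=18)                  # same budget for random search
print("BO best:", y.min(), "random-search best:", f(x_rand).min())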
APA, Harvard, Vancouver, ISO, and other styles
23

Krishnaswami, Sreedhar Bharathwaj. "Bayesian Optimization for Neural Architecture Search using Graph Kernels." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-291219.

Full text
Abstract:
Neural architecture search is a popular method for automating architecture design. Bayesian optimization is a widely used approach for hyperparameter optimization and can estimate a function with limited samples. However, Bayesian optimization methods are not preferred for architecture search, as they expect vector inputs while graphs are high-dimensional data. This thesis presents a Bayesian approach with Gaussian priors that uses graph kernels specifically targeted to work in the higher-dimensional graph space. We implement three different graph kernels and show that, on the NAS-Bench-101 dataset, an untrained graph convolutional network kernel outperforms previous methods significantly in terms of the best network found and the number of samples required to find it. We follow the AutoML guidelines to make this work reproducible.
APA, Harvard, Vancouver, ISO, and other styles
24

Pelamatti, Julien. "Mixed-variable Bayesian optimization : application to aerospace system design." Thesis, Lille 1, 2020. http://www.theses.fr/2020LIL1I003.

Full text
Abstract:
Within the framework of complex system design, such as aircraft and launch vehicles, the presence of computationally intensive objective and/or constraint functions (e.g., finite element models and multidisciplinary analyses) coupled with the dependence on discrete and unordered technological design choices results in challenging optimization problems. Furthermore, part of these technological choices is associated to a number of specific continuous and discrete design variables which must be taken into consideration only if specific technological and/or architectural choices are made. As a result, the optimization problem which must be solved in order to determine the optimal system design presents a dynamically varying search space and feasibility domain. The few existing algorithms which allow solving this particular type of problems tend to require a large amount of function evaluations in order to converge to the feasible optimum, and are therefore inadequate when dealing with the computationally intensive problems which can often be encountered within the design of complex systems. For this reason, this thesis explores the possibility of performing constrained mixed-variable and variable-size design space optimization by relying on surrogate model-based design optimization performed with the help of Gaussian processes, also known as Bayesian optimization. More specifically, three main axes are discussed. First, the Gaussian process surrogate modeling of mixed continuous/discrete functions and the associated challenges are extensively discussed. A unifying formalism is proposed in order to facilitate the description and comparison between the existing kernels allowing to adapt Gaussian processes to the presence of discrete unordered variables. Furthermore, the actual modeling performances of these various kernels are tested and compared on a set of analytical and design-related benchmarks with different characteristics and parameterizations. In the second part of the thesis, the possibility of extending the mixed continuous/discrete surrogate modeling to a context of Bayesian optimization is discussed. The theoretical feasibility of said extension in terms of objective/constraint function modeling as well as acquisition function definition and optimization is shown. Different possible alternatives are considered and described. Finally, the performance of the proposed optimization algorithm, with various kernel parameterizations and different initializations, is tested on a number of analytical and design-related test cases and compared to reference algorithms. In the last part of this manuscript, two alternative ways of adapting the previously discussed mixed continuous/discrete Bayesian optimization algorithms in order to solve variable-size design space problems (i.e., problems characterized by a dynamically varying design space) are proposed. The first adaptation is based on the parallel optimization of several sub-problems coupled with a computational budget allocation based on the information provided by the surrogate models. The second adaptation, instead, is based on the definition of a kernel allowing to compute the covariance between samples belonging to partially different search spaces based on the hierarchical grouping of design variables. Finally, the two alternatives are tested and compared on a set of analytical and design-related benchmarks. Overall, it is shown that the proposed optimization methods allow converging to the various constrained problem optimum neighborhoods considerably faster when compared to the reference methods, thus representing a promising tool for the design of complex systems.
APA, Harvard, Vancouver, ISO, and other styles
25

Fischer, Christopher Corey. "Bayesian Inspired Multi-Fidelity Optimization with Aerodynamic Design Application." Wright State University / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=wright1621948051637597.

Full text
APA, Harvard, Vancouver, ISO, and other styles
26

Bade, Alexander. "Bayesian portfolio optimization from a static and dynamic perspective /." Münster : Verl.-Haus Monsenstein und Vannerdat, 2009. http://d-nb.info/996985085/04.

Full text
APA, Harvard, Vancouver, ISO, and other styles
27

Shahriari, Bobak. "Practical Bayesian optimization with application to tuning machine learning algorithms." Thesis, University of British Columbia, 2016. http://hdl.handle.net/2429/59104.

Full text
Abstract:
Bayesian optimization has recently emerged in the machine learning community as a very effective automatic alternative to the tedious task of hand-tuning algorithm hyperparameters. Although it is a relatively new aspect of machine learning, it has roots in Bayesian experimental design (Lindley, 1956; Chaloner and Verdinelli, 1995), the design and analysis of computer experiments (DACE; Sacks et al., 1989), Kriging (Krige, 1951), and multi-armed bandits (Gittins, 1979). In this thesis, we motivate and introduce the model-based optimization framework and provide some historical context to the technique, which dates back as far as 1933 with application to clinical drug trials (Thompson, 1933). Contributions of this work include a Bayesian gap-based exploration policy, inspired by Gabillon et al. (2012); a principled information-theoretic portfolio strategy, outperforming the portfolio of Hoffman et al. (2011); and a general practical technique circumventing the need for an initial bounding box. These various works each address existing practical challenges that stand in the way of more widespread adoption of probabilistic model-based optimization techniques. Finally, we conclude this thesis with important directions for future research, emphasizing scalability and computational feasibility of the approach as a general-purpose optimizer.
Science, Faculty of
Computer Science, Department of
Graduate
APA, Harvard, Vancouver, ISO, and other styles
28

Brochu, Eric. "Interactive Bayesian optimization : learning user preferences for graphics and animation." Thesis, University of British Columbia, 2010. http://hdl.handle.net/2429/30519.

Full text
Abstract:
Bayesian optimization with Gaussian processes has become increasingly popular in the machine learning community. It is efficient and can be used when very little is known about the objective function, making it useful for optimizing expensive black box functions. We examine the case of using Bayesian optimization when the objective function requires feedback from a human. We call this class of problems interactive Bayesian optimization. Here, we assume a parameterized model, and a user whose task is to find an acceptable set of parameters according to some perceptual value function that cannot easily be articulated. This requires special attention to the qualities that make this a unique problem, and so, we introduce three novel extensions: the application of Bayesian optimization to "preference galleries", where human feedback is in the form of preferences over a set of instances; a particle-filter method for learning the distribution of model hyperparameters over heterogeneous users and tasks; and a bandit-based method of using a portfolio of utility functions to select sample points. Using a variety of test functions, we validate our extensions empirically on both low- and high-dimensional objective functions. We also present graphics and animation applications that use interactive Bayesian optimization techniques to help artists find parameters on difficult problems. We show that even with minimal domain knowledge, an interface using interactive Bayesian optimization is much more efficient and effective than traditional "parameter twiddling" techniques on the same problem.
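The bandit-based acquisition portfolio mentioned above is in the spirit of GP-Hedge (Hoffman et al., 2011); the following sketch shows only the selection rule, with placeholder rewards standing in for the GP posterior means that a real Bayesian optimization loop would supply.

# Hedge-style portfolio: pick an acquisition function with probability
# proportional to exp(eta * cumulative gain).
import numpy as np

def hedge_select(gains, eta=1.0, rng=None):
    rng = rng or np.random.default_rng()
    p = np.exp(eta * (gains - gains.max()))   # subtract max for numerical stability
    return int(rng.choice(len(gains), p=p / p.sum()))

rng = np.random.default_rng(0)
gains = np.zeros(3)                           # cumulative reward per acquisition
for t in range(20):
    k = hedge_select(gains, rng=rng)          # acquisition used at this iteration
    # In a real loop, the point proposed by acquisition k is evaluated, and every
    # acquisition is rewarded by the surrogate's posterior mean at its own proposal.
    rewards = rng.normal(loc=[0.0, 0.2, -0.1], size=3)   # placeholder rewards
    gains += rewards
print("cumulative gains:", gains)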
APA, Harvard, Vancouver, ISO, and other styles
29

Chen, Zhaozhong. "Visual-Inertial SLAM Extrinsic Parameter Calibration Based on Bayesian Optimization." Thesis, University of Colorado at Boulder, 2019. http://pqdtopen.proquest.com/#viewpdf?dispub=10789260.

Full text
Abstract:

VI-SLAM (Visual-Inertial Simultaneous Localization and Mapping) is a popular approach for robot navigation and tracking. With the help of sensor fusion from the IMU and camera, VI-SLAM can give a more accurate solution for navigation. One important problem that needs to be solved in VI-SLAM is knowing the accurate relative pose between the camera and the IMU, called the extrinsic parameter. However, our measurement of the rotation and translation between the IMU and camera is noisy. If the measurement is slightly off, the result of the SLAM system will drift far from the ground truth over a long run. Optimization is therefore necessary. This work uses a global optimization method, Bayesian optimization, to optimize the relative pose between the IMU and camera based on the sliding-window residual output from VI-SLAM. The advantage of using Bayesian optimization is that we can get an accurate pose estimate between the IMU and camera over a large search range. What is more, thanks to the Gaussian process or T process underlying Bayesian optimization, we can obtain a result with a known uncertainty, which cannot be done by many optimization approaches.

APA, Harvard, Vancouver, ISO, and other styles
30

Shende, Sourabh. "Bayesian Topology Optimization for Efficient Design of Origami Folding Structures." University of Cincinnati / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1592170569337763.

Full text
APA, Harvard, Vancouver, ISO, and other styles
31

Parno, Matthew David. "A multiscale framework for Bayesian inference in elliptic problems." Thesis, Massachusetts Institute of Technology, 2011. http://hdl.handle.net/1721.1/65322.

Full text
Abstract:
Thesis (S.M.)--Massachusetts Institute of Technology, Computation for Design and Optimization Program, 2011.
Page 118 blank. Cataloged from PDF version of thesis.
Includes bibliographical references (p. 112-117).
The Bayesian approach to inference problems provides a systematic way of updating prior knowledge with data. A likelihood function involving a forward model of the problem is used to incorporate data into a posterior distribution. The standard method of sampling this distribution is Markov chain Monte Carlo, which can become inefficient in high dimensions, wasting many evaluations of the likelihood function. In many applications the likelihood function involves the solution of a partial differential equation, so the large number of evaluations required by Markov chain Monte Carlo can quickly become computationally intractable. This work aims to reduce the computational cost of sampling the posterior by introducing a multiscale framework for inference problems involving elliptic forward problems. Through the construction of a low-dimensional prior on a coarse scale and the use of an iterative conditioning technique, the scales are decoupled and efficient inference can proceed. This work considers nonlinear mappings from a fine scale to a coarse scale based on the Multiscale Finite Element Method. Permeability characterization is the primary focus but a discussion of other applications is also provided. After some theoretical justification, several test problems are shown that demonstrate the efficiency of the multiscale framework.
by Matthew David Parno.
S.M.
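The Markov chain Monte Carlo machinery referred to in this abstract is, at its simplest, a random-walk Metropolis sampler; the sketch below targets a toy Gaussian density rather than the elliptic inverse problem studied in the thesis.

# Random-walk Metropolis: propose a symmetric perturbation, accept with
# probability min(1, target(prop) / target(current)).
import numpy as np

def log_target(theta):
    return -0.5 * np.sum(theta ** 2)        # standard normal, up to a constant

def metropolis(n_steps, dim=2, step=0.5, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.zeros(dim)
    logp = log_target(theta)
    chain = np.empty((n_steps, dim))
    for i in range(n_steps):
        prop = theta + step * rng.standard_normal(dim)   # symmetric proposal
        logp_prop = log_target(prop)
        if np.log(rng.uniform()) < logp_prop - logp:     # accept/reject
            theta, logp = prop, logp_prop
        chain[i] = theta
    return chain

chain = metropolis(10000)
print(chain.mean(axis=0), chain.std(axis=0))   # close to 0 and 1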
APA, Harvard, Vancouver, ISO, and other styles
32

Pelikan, Martin. "Hierarchical Bayesian optimization algorithm : toward a new generation of evolutionary algorithms /." Berlin [u.a.] : Springer, 2005. http://www.loc.gov/catdir/toc/fy053/2004116659.html.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

Jeggle, Kai. "Scalable Hyperparameter Opimization: Combining Asynchronous Bayesian Optimization With Efficient Budget Allocation." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-280340.

Full text
Abstract:
Automated hyperparameter tuning has become an integral part of optimizing machine learning (ML) pipelines. Sequential model-based optimization algorithms, such as Bayesian optimization (BO), have been proven to be sample efficient with strong final performance. However, the increasing complexity and training times of ML models require a shift from sequential to asynchronous, distributed hyperparameter tuning. The literature has come up with different strategies to modify BO to work in an asynchronous setting. By combining asynchronous BO with budget allocation strategies, poorly performing trials are stopped early to free up expensive resources for other trials, improving resource efficiency and hence scalability further. Maggy is an open-source asynchronous hyperparameter optimization framework built on Spark that transparently schedules and manages hyperparameter trials. In this thesis, we present new support for a plug-and-play API to arbitrarily combine asynchronous Bayesian optimization algorithms with budget allocation strategies, such as Hyperband or Median Early Stopping. This combines the best of both worlds and provides high scalability through efficient use of resources and strong final performance. We experimentally evaluate different combinations of asynchronous Bayesian optimization with budget allocation algorithms and demonstrate their competitive performance and ability to scale.
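As an illustration of the median early stopping rule named above, a generic sketch follows; it is not Maggy's actual API, and the loss curves are made-up values.

# A running trial is stopped if its current metric is worse than the median of
# completed trials at the same step.
def should_stop(step, current_value, completed_histories, lower_is_better=True):
    """completed_histories: list of per-step metric lists from finished trials."""
    values_at_step = [h[step] for h in completed_histories if len(h) > step]
    if not values_at_step:
        return False                       # nothing to compare against yet
    values_at_step.sort()
    median = values_at_step[len(values_at_step) // 2]
    return current_value > median if lower_is_better else current_value < median

finished = [[0.9, 0.7, 0.5], [0.8, 0.6, 0.45], [1.0, 0.9, 0.8]]  # loss curves
print(should_stop(step=1, current_value=0.95, completed_histories=finished))  # True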
APA, Harvard, Vancouver, ISO, and other styles
34

Feng, Chi S. M. Massachusetts Institute of Technology. "Optimal Bayesian experimental design in the presence of model error." Thesis, Massachusetts Institute of Technology, 2015. http://hdl.handle.net/1721.1/97790.

Full text
Abstract:
Thesis: S.M., Massachusetts Institute of Technology, Computation for Design and Optimization Program, 2015.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 87-90).
The optimal selection of experimental conditions is essential to maximizing the value of data for inference and prediction. We propose an information theoretic framework and algorithms for robust optimal experimental design with simulation-based models, with the goal of maximizing information gain in targeted subsets of model parameters, particularly in situations where experiments are costly. Our framework employs a Bayesian statistical setting, which naturally incorporates heterogeneous sources of information. An objective function reflects expected information gain from proposed experimental designs. Monte Carlo sampling is used to evaluate the expected information gain, and stochastic approximation algorithms make optimization feasible for computationally intensive and high-dimensional problems. A key aspect of our framework is the introduction of model calibration discrepancy terms that are used to "relax" the model so that proposed optimal experiments are more robust to model error or inadequacy. We illustrate the approach via several model problems and misspecification scenarios. In particular, we show how optimal designs are modified by allowing for model error, and we evaluate the performance of various designs by simulating "real-world" data from models not considered explicitly in the optimization objective.
by Chi Feng.
S.M.
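The expected-information-gain objective described above is commonly estimated with a nested Monte Carlo scheme; the sketch below applies that standard estimator to a toy linear-Gaussian model of our own choosing, not to one of the thesis's problems.

# Nested Monte Carlo estimate of expected information gain for a design d:
# EIG(d) ~ mean_i [ log p(y_i | theta_i, d) - log mean_j p(y_i | theta_j, d) ].
import numpy as np
from scipy.special import logsumexp

def expected_information_gain(design, n_outer=500, n_inner=500, sigma=0.1, seed=0):
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=n_outer)                         # draws from the prior
    y = design * theta + sigma * rng.normal(size=n_outer)    # simulated observations
    theta_inner = rng.normal(size=n_inner)                   # fresh prior draws for the evidence

    def log_lik(y_obs, th):
        return -0.5 * ((y_obs - design * th) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))

    log_like = log_lik(y, theta)                             # log p(y_i | theta_i, d)
    log_evidence = np.array([logsumexp(log_lik(yi, theta_inner)) - np.log(n_inner) for yi in y])
    return np.mean(log_like - log_evidence)

for d in (0.1, 0.5, 1.0, 2.0):
    print(f"design={d}: EIG ~ {expected_information_gain(d):.3f}")  # larger |d| is more informative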
APA, Harvard, Vancouver, ISO, and other styles
35

Lieberman, Chad Eric. "Parameter and state model reduction for Bayesian statistical inverse problems." Thesis, Massachusetts Institute of Technology, 2009. http://hdl.handle.net/1721.1/54213.

Full text
Abstract:
Thesis (S.M.)--Massachusetts Institute of Technology, Computation for Design and Optimization Program, 2009.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student submitted PDF version of thesis.
Includes bibliographical references (p. 113-118).
Decisions based on single-point estimates of uncertain parameters neglect regions of significant probability. We consider a paradigm based on decision-making under uncertainty including three steps: identification of parametric probability by solution of the statistical inverse problem, propagation of that uncertainty through complex models, and solution of the resulting stochastic or robust mathematical programs. In this thesis we consider the first of these steps, solution of the statistical inverse problem, for partial differential equations (PDEs) parameterized by field quantities. When these field variables and forward models are discretized, the resulting system is high-dimensional in both parameter and state space. The system is therefore expensive to solve. The statistical inverse problem is one of Bayesian inference. With assumption on prior belief about the form of the parameter and an assignment of normal error in sensor measurements, we derive the solution to the statistical inverse problem analytically, up to a constant of proportionality. The parametric probability density, or posterior, depends implicitly on the parameter through the forward model. In order to understand the distribution in parameter space, we must sample. Markov chain Monte Carlo (MCMC) sampling provides a method by which a random walk is constructed through parameter space. By following a few simple rules, the random walk converges to the posterior distribution and the resulting samples represent draws from that distribution. This set of samples from the posterior can be used to approximate its moments.
In the multi-query setting, it is computationally intractable to utilize the full-order forward model to perform the posterior evaluations required in the MCMC sampling process. Instead, we implement a novel reduced-order model which reduces in both parameter and state. The reduced bases are generated by greedy sampling. We iteratively sample the field in parameter space which maximizes the error between the full-order and current reduced-order model outputs. The parameter is added to its basis, and then a high-fidelity forward model is solved for the state, which is then added to the state basis. The reduction in state accelerates posterior evaluation while the reduction in parameter allows the MCMC sampling to be conducted with a simpler, non-adaptive Metropolis-Hastings algorithm. In contrast, the full-order parameter space is high-dimensional and requires more expensive adaptive methods. We demonstrate for the groundwater inverse problem in 1-D and 2-D that the reduced-order implementation produces accurate results with a factor of three speed-up even for model problems of dimension N ≈ 500. Our complexity analysis demonstrates that the same approach applied to the large-scale models of interest (e.g. N > 10⁴) results in a speed-up of three orders of magnitude.
by Chad Eric Lieberman.
S.M.
APA, Harvard, Vancouver, ISO, and other styles
36

Ramadan, Saleem Z. "Bayesian Multi-objective Design of Reliability Testing." Ohio University / OhioLINK, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1298474937.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

Bahg, Giwon. "Adaptive Design Optimization in Functional MRI Experiments." The Ohio State University, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=osu1531836392551605.

Full text
APA, Harvard, Vancouver, ISO, and other styles
38

Lam, Remi Roger Alain Paul. "Scaling Bayesian optimization for engineering design : lookahead approaches and multifidelity dimension reduction." Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/119289.

Full text
Abstract:
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Aeronautics and Astronautics, 2018.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 105-111).
The objective functions and constraints that arise in engineering design problems are often non-convex, multi-modal and do not have closed-form expressions. Evaluation of these functions can be expensive, requiring a time-consuming computation (e.g., solving a set of partial differential equations) or a costly experiment (e.g., conducting wind-tunnel measurements). Accordingly, whether the task is formal optimization or just design space exploration, there is often a finite budget specifying the maximum number of evaluations of the objectives and constraints allowed. Bayesian optimization (BO) has become a popular global optimization technique for solving problems governed by such expensive functions. BO iteratively updates a statistical model and uses it to quantify the expected benefits of evaluating a given design under consideration. The next design to evaluate can be selected in order to maximize such benefits. Most existing BO algorithms are greedy strategies, making decisions to maximize the immediate benefits, without planning over several steps. This is typically a suboptimal approach. In the first part of this thesis, we develop a novel BO algorithm with planning capabilities. This algorithm selects the next design to evaluate in order to maximize the long-term expected benefit obtained at the end of the optimization. This lookahead approach requires tools to quantify the effects a decision has over several steps in the future. To do so, we use Gaussian processes as generative models and combine them with dynamic programming to formulate the optimal planning strategy. We first illustrate the proposed algorithm on unconstrained optimization problems. In the second part, we demonstrate how the proposed lookahead BO algorithm can be extended to handle non-linear expensive inequality constraints, a ubiquitous situation in engineering design. We illustrate the proposed lookahead constrained BO algorithm on a reacting flow optimization problem. In the last part of this thesis, we develop techniques to scale BO to high dimension by exploiting a special structure arising when the objective function varies only in a low-dimensional subspace. Such a subspace can be detected using the (randomized) method of Active Subspaces. We propose a multifidelity active subspace algorithm that reduces the computational cost by leveraging a cheap-to-evaluate approximation of the objective function. We analyze the number of evaluations sufficient to control the error incurred, both in expectation and with high probability. We illustrate the proposed algorithm on an ONERA M6 wing shape-optimization problem.
by Remi Roger Alain Paul Lam.
Ph. D.
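The active-subspace detection mentioned above can be sketched with the standard gradient-based recipe: eigenvectors of the averaged outer product of gradients reveal the directions in which the function varies. The toy function below is our own stand-in, not the ONERA M6 case.

# Active subspace of f(x) = sin(w·x): the gradient outer-product matrix has one
# dominant eigenvector, aligned with the hidden direction w.
import numpy as np

rng = np.random.default_rng(0)
dim = 10
w = rng.normal(size=dim)
w /= np.linalg.norm(w)                        # hidden 1-D direction of variation

def grad_f(x):                                # gradient of sin(w·x) is cos(w·x) * w
    return np.cos(w @ x) * w

X = rng.uniform(-1, 1, size=(200, dim))       # Monte Carlo samples of the input
C = np.mean([np.outer(grad_f(x), grad_f(x)) for x in X], axis=0)
eigvals, eigvecs = np.linalg.eigh(C)          # ascending eigenvalues

print("largest eigenvalues:", eigvals[-3:])   # one dominant eigenvalue expected
print("alignment with w:", abs(eigvecs[:, -1] @ w))   # close to 1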
APA, Harvard, Vancouver, ISO, and other styles
39

Liu, Jeffrey Ph D. Massachusetts Institute of Technology. "On the effect and value of information in Bayesian routing games." Thesis, Massachusetts Institute of Technology, 2015. http://hdl.handle.net/1721.1/106962.

Full text
Abstract:
Thesis: S.M., Massachusetts Institute of Technology, School of Engineering, Center for Computational Engineering, Computation for Design and Optimization Program, 2015.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 52-54).
We consider the problem of estimating the individual and social value of information in routing games. We propose a Bayesian congestion game that accounts for the heterogeneity in commuters' access to information about traffic incidents. The model divides the population of commuters into two sub-populations, or types, based on their information about incidents. Types H and L have high and low information about incidents, respectively. Each population routes its demand on an incident-prone, parallel-route network. The cost function for each route is affine in its usage level, and its slope increases with the route's incident state. Both populations (player types) know the demand of each type, the route cost functions, and the incident probability. In addition, in our model, the commuters in the type-H population receive private information on the true realization of the incident state. We analyze both the individual cost for each population and the aggregate (social) cost as the type-H population size increases. We observe that, in equilibrium, both these costs are non-monotonic and non-linear as the fraction of the total demand that is type-H increases. Our main results are as follows: First, information improves individual welfare (i.e., when a commuter shifts from the type-L population to the type-H population), but the value of information is zero after a certain threshold fraction. Second, there exists another threshold (lower than the first) after which increasing the relative fraction of type-H commuters does not reduce the aggregate social cost.
by Jeffrey Liu.
S.M.
APA, Harvard, Vancouver, ISO, and other styles
40

Wogrin, Sonja. "Model reduction for dynamic sensor steering : a Bayesian approach to inverse problems." Thesis, Massachusetts Institute of Technology, 2008. http://hdl.handle.net/1721.1/43739.

Full text
Abstract:
Thesis (S.M.)--Massachusetts Institute of Technology, Computation for Design and Optimization Program, 2008.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Includes bibliographical references (p. 97-101).
In many settings, distributed sensors provide dynamic measurements over a specified time horizon that can be used to reconstruct information such as parameters, states or initial conditions. This estimation task can be posed formally as an inverse problem: given a model and a set of measurements, estimate the parameters of interest. We consider the specific problem of computing in real-time the prediction of a contamination event, based on measurements obtained by mobile sensors. The spread of the contamination is modeled by the convection-diffusion equation. A Bayesian approach to the inverse problem yields an estimate of the probability density function of the initial contaminant concentration, which can then be propagated through the forward model to determine the predicted contaminant field at some future time and its associated uncertainty distribution. Sensor steering is effected by formulating and solving an optimization problem that seeks the sensor locations that minimize the uncertainty in this prediction. An important aspect of this Dynamic Sensor Steering Algorithm is the ability to execute in real-time. We achieve this through reduced-order modeling, which (for our two-dimensional examples) yields models that can be solved two orders of magnitude faster than the original system, but only incur average relative errors of magnitude O(10⁻³). The methodology is demonstrated on the contaminant transport problem, but is applicable to a broad class of problems where we wish to observe certain phenomena whose location or features are not known a priori.
by Sonja Wogrin.
S.M.
APA, Harvard, Vancouver, ISO, and other styles
41

Tohme, Tony. "The Bayesian validation metric : a framework for probabilistic model calibration and validation." Thesis, Massachusetts Institute of Technology, 2020. https://hdl.handle.net/1721.1/126919.

Full text
Abstract:
Thesis: S.M., Massachusetts Institute of Technology, Computation for Design and Optimization Program, May, 2020
Cataloged from the official PDF of thesis.
Includes bibliographical references (pages 109-114).
In model development, model calibration and validation play complementary roles toward learning reliable models. In this thesis, we propose and develop the "Bayesian Validation Metric" (BVM) as a general model validation and testing tool. We show that the BVM can represent all the standard validation metrics - square error, reliability, probability of agreement, frequentist, area, probability density comparison, statistical hypothesis testing, and Bayesian model testing - as special cases while improving, generalizing and further quantifying their uncertainties. In addition, the BVM assists users and analysts in designing and selecting their models by allowing them to specify their own validation conditions and requirements. Further, we expand the BVM framework to a general calibration and validation framework by inverting the validation mathematics into a method for generalized Bayesian regression and model learning. We perform Bayesian regression based on a user's definition of model-data agreement. This allows for model selection on any type of data distribution, unlike Bayesian and standard regression techniques, that "fail" in some cases. We show that our tool is capable of representing and combining Bayesian regression, standard regression, and likelihood-based calibration techniques in a single framework while being able to generalize aspects of these methods. This tool also offers new insights into the interpretation of the predictive envelopes in Bayesian regression, standard regression, and likelihood-based methods while giving the analyst more control over these envelopes.
by Tony Tohme.
S.M.
S.M. Massachusetts Institute of Technology, Computation for Design and Optimization Program
APA, Harvard, Vancouver, ISO, and other styles
42

Monson, Christopher Kenneth. "No Free Lunch, Bayesian Inference, and Utility: A Decision-Theoretic Approach to Optimization." Diss., CLICK HERE for online access, 2006. http://contentdm.lib.byu.edu/ETD/image/etd1292.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
43

Xue, Liang. "Multi-Model Bayesian Analysis of Data Worth and Optimization of Sampling Scheme Design." Diss., The University of Arizona, 2011. http://hdl.handle.net/10150/203432.

Full text
Abstract:
Groundwater is a major source of water supply, and aquifers form major storage reservoirs as well as water conveyance systems, worldwide. The viability of groundwater as a source of water to the world's population is threatened by overexploitation and contamination. The rational management of water resource systems requires an understanding of their response to existing and planned schemes of exploitation, pollution prevention and/or remediation. Such understanding requires the collection of data to help characterize the system and monitor its response to existing and future stresses. It also requires incorporating such data in models of system makeup, water flow and contaminant transport. As the collection of subsurface characterization and monitoring data is costly, it is imperative that the design of corresponding data collection schemes is cost-effective. A major benefit of new data is its potential to help improve one's understanding of the system, in large part through a reduction in model predictive uncertainty and corresponding risk of failure. Traditionally, value-of-information or data-worth analyses have relied on a single conceptual-mathematical model of site hydrology with prescribed parameters. Yet there is a growing recognition that ignoring model and parameter uncertainties renders model predictions prone to statistical bias and underestimation of uncertainty. This has led to a recent emphasis on conducting hydrologic analyses and rendering corresponding predictions by means of multiple models. We develop a theoretical framework of data worth analysis considering model uncertainty, parameter uncertainty and potential sample value uncertainty. The framework entails Bayesian Model Averaging (BMA) with emphasis on its Maximum Likelihood version (MLBMA). An efficient stochastic optimization method, called the Differential Evolution Method (DEM), is explored to aid in the design of optimal sampling schemes aiming at maximizing data worth. A synthetic case entailing generated log hydraulic conductivity random fields is used to illustrate the procedure. The proposed data worth analysis framework is applied to field pneumatic permeability data collected from unsaturated fractured tuff at the Apache Leap Research Site (ALRS) near Superior, Arizona.
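The Differential Evolution Method cited above belongs to a family for which SciPy provides a standard implementation; the sketch below uses it on a made-up stand-in objective, since the dissertation's actual data-worth objective is not reproduced here.

# Differential evolution over candidate sampling locations; the toy objective
# simply rewards an even spread of points in [0, 1].
import numpy as np
from scipy.optimize import differential_evolution

def objective(locations):
    """Toy stand-in for (negative) expected data worth."""
    pts = np.sort(locations)
    gaps = np.diff(np.concatenate(([0.0], pts, [1.0])))
    return np.var(gaps)                 # minimizing gap variance spreads the points

bounds = [(0.0, 1.0)] * 5               # five 1-D sampling locations
result = differential_evolution(objective, bounds, seed=0, maxiter=200)
print(result.x, result.fun)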
APA, Harvard, Vancouver, ISO, and other styles
44

Francis, Gilad. "Autonomous Exploration over Continuous Domains." Thesis, The University of Sydney, 2018. http://hdl.handle.net/2123/20443.

Full text
Abstract:
Motion planning is an essential aspect of robot autonomy, and as such it has been studied for decades, producing a wide range of planning methodologies. Path planners are generally categorised as either trajectory optimisers or sampling-based planners. The latter is the predominant planning paradigm as it can resolve a path efficiently while explicitly reasoning about path safety. Yet, with a limited budget, the resulting paths are far from optimal. In contrast, state-of-the-art trajectory optimisers explicitly trade off between path safety and efficiency to produce locally optimal paths. However, these planners cannot incorporate updates from a partially observed model such as an occupancy map and fail in planning around information gaps caused by incomplete sensor coverage. Autonomous exploration adds another twist to path planning. The objective of exploration is to safely and efficiently traverse through an unknown environment in order to map it. The desired output of such a process is a sequence of paths that efficiently and safely minimise the uncertainty of the map. However, optimising over the entire space of trajectories is computationally intractable. Therefore, most exploration algorithms relax the general formulation by optimising a simpler one, for example finding the single next best view, resulting in suboptimal performance. This thesis investigates methodologies for optimal and safe exploration over continuous paths. Contrary to existing exploration algorithms that break exploration into independent sub-problems of finding goal points and planning safe paths to these points, our holistic approach simultaneously optimises the coupled problems of where and how to explore, thus offering a shift in paradigm from next best view to next best path. With exploration defined as an optimisation problem over continuous paths, this thesis explores two different optimisation paradigms: Bayesian and functional.
APA, Harvard, Vancouver, ISO, and other styles
45

Song, Mingzhou. "Integrated surface model optimization from images and prior shape knowledge /." Thesis, Connect to this title online; UW restricted, 2002. http://hdl.handle.net/1773/6115.

Full text
APA, Harvard, Vancouver, ISO, and other styles
46

Carroll, James Lamond. "A Bayesian Decision Theoretical Approach to Supervised Learning, Selective Sampling, and Empirical Function Optimization." Diss., CLICK HERE for online access, 2010. http://contentdm.lib.byu.edu/ETD/image/etd3413.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
47

Mashamba, Able. "Bayesian optimization and uncertainty analysis of complex environmental models, with applications in watershed management." Diss., Montana State University, 2010. http://etd.lib.montana.edu/etd/2010/mashamba/MashambaA1210.pdf.

Full text
Abstract:
This dissertation presents results of research in the development, testing and application of an automated calibration and uncertainty analysis framework for distributed environmental models based on Bayesian Markov chain Monte Carlo (MCMC) sampling and response surface methodology (RSM) surrogate models that use a novel random local fitting algorithm. Typical automated search methods for optimization and uncertainty assessment such as evolutionary and Nelder-Mead Simplex algorithms are inefficient and/or infeasible when applied to distributed environmental models, as exemplified by the watershed management scenario analysis case study presented as part of this dissertation. This is because the larger numbers of non-linearly interacting parameters and the more complex structures of distributed environmental models make automated calibration and uncertainty analysis more computationally demanding compared to traditional basin-averaged models. To improve efficiency and feasibility of automated calibration and uncertainty assessment of distributed models, recent research has been focusing on using the response surface methodology (RSM) to approximate objective functions such as sum of squared residuals and Bayesian inference likelihoods. This dissertation presents (i) results on a novel study of factors that affect the performance of RSM approximation during Bayesian calibration and uncertainty analysis, (ii) a new 'random local fitting' (RLF) algorithm that improves RSM approximation for large sampling domains and (iii) application of a developed automated uncertainty analysis framework that uses MCMC sampling and a spline-based radial basis approximation function enhanced by the RLF algorithm to a fully-distributed hydrologic model case study. Using the MCMC sampling and response surface approximation framework for automated parameter and predictive uncertainty assessment of a distributed environmental model is novel. While extended testing of the developed MCMC uncertainty analysis paradigm is necessary, the results presented show that the new framework is robust and efficient for the case studied and similar distributed environmental models. As distributed environmental models continue to find use in climate change studies, flood forecasting, water resource management and land use studies, results of this study will have increasing importance to automated model assessment. Potential future research from this dissertation is the investigation of how model parameter sensitivities and inter-dependencies affect the performance of response surface approximation. 'Co-authored by Lucy Marshall.'
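To make the surrogate-assisted sampling idea above concrete, the sketch below runs a random-walk Metropolis chain on a radial-basis-function approximation of a log-likelihood surface fitted from a modest number of "expensive" model runs. It is a generic illustration under assumed names and a toy likelihood, not the dissertation's RLF-enhanced framework.

import numpy as np
from scipy.interpolate import RBFInterpolator

def expensive_log_likelihood(theta):
    # Placeholder for a costly distributed-model likelihood evaluation.
    return -0.5 * np.sum((theta - 1.0) ** 2)

rng = np.random.default_rng(2)
dim = 2

# Fit an RBF response surface from a modest design of expensive runs.
design = rng.uniform(-3, 5, size=(60, dim))
values = np.array([expensive_log_likelihood(t) for t in design])
surrogate = RBFInterpolator(design, values, kernel="thin_plate_spline")

# Random-walk Metropolis on the cheap surrogate instead of the full model.
theta = np.zeros(dim)
logp = surrogate(theta[None, :])[0]
samples = []
for _ in range(5000):
    prop = theta + rng.normal(scale=0.5, size=dim)
    logp_prop = surrogate(prop[None, :])[0]
    if np.log(rng.random()) < logp_prop - logp:   # Metropolis accept/reject
        theta, logp = prop, logp_prop
    samples.append(theta.copy())

samples = np.array(samples)
print("posterior mean estimate:", samples[1000:].mean(axis=0))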
APA, Harvard, Vancouver, ISO, and other styles
48

Basak, Subhasish. "Multipathogen quantitative risk assessment in raw milk soft cheese : monotone integration and Bayesian optimization." Electronic Thesis or Diss., université Paris-Saclay, 2024. http://www.theses.fr/2024UPASG021.

Full text
Abstract:
This manuscript focuses on Bayesian optimization of a quantitative microbiological risk assessment (QMRA) model, in the context of the European project ArtiSaneFood, supported by the PRIMA program. The primary goal is to establish efficient bio-intervention strategies for raw milk cheese producers in France. The work is divided into three broad directions: 1) development and implementation of a multipathogen QMRA model for raw milk soft cheese, 2) studying monotone integration methods for estimating outputs of the QMRA model, and 3) designing a Bayesian optimization algorithm tailored to a stochastic and computationally expensive simulator. In the first part we propose a multipathogen QMRA model, built upon existing studies in the literature (see, e.g., Bonifait et al., 2021, Perrin et al., 2014, Sanaa et al., 2004, Strickland et al., 2023). This model estimates the impact on public health of foodborne illnesses caused by pathogenic STEC, Salmonella and Listeria monocytogenes, which can potentially be present in raw milk soft cheese. This farm-to-fork model also implements the intervention strategies related to milk and cheese testing, which allows the cost of intervention to be estimated. An implementation of the QMRA model for STEC is provided in R and in the FSKX framework (Basak et al., under review). The second part of this manuscript investigates the potential application of sequential integration methods, leveraging the monotonicity and boundedness properties of the simulator outputs. We conduct a comprehensive literature review of existing integration methods (see, e.g., Kiefer, 1957, Novak, 1992) and examine theoretical results on their convergence. Our contributions include proposed enhancements to these methods and a discussion of the challenges associated with their application in the QMRA domain. In the final part of this manuscript, we propose a Bayesian multiobjective optimization algorithm for estimating the Pareto-optimal inputs of a stochastic and computationally expensive simulator. The proposed approach is motivated by the principle of Stepwise Uncertainty Reduction (SUR) (see, e.g., Vazquez and Bect, 2009, Vazquez and Martinez, 2006, Villemonteix et al., 2007), with a weighted integrated mean squared error (w-IMSE) sampling criterion focused on estimation of the Pareto front. A numerical benchmark is presented, comparing the proposed algorithm with PALS (Pareto Active Learning for Stochastic simulators) (Barracosa et al., 2021) over a set of bi-objective test problems. We also propose an extension (Basak et al., 2022a) of the PALS algorithm, tailored to the QMRA application case.
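As a rough sketch of the bi-objective setting described above, the code below fits independent Gaussian-process surrogates to two noisy objectives and extracts the non-dominated set of evaluated designs. The toy simulator and all names are assumptions; it implements neither PALS nor the w-IMSE SUR criterion, only the basic surrogate-plus-Pareto-screening ingredients.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(3)

def noisy_simulator(x):
    # Toy stand-in for a stochastic simulator with two outputs
    # (e.g. expected illness burden vs. intervention cost).
    f1 = (x - 0.2) ** 2 + rng.normal(scale=0.02)
    f2 = (x - 0.8) ** 2 + rng.normal(scale=0.02)
    return f1, f2

X = rng.random((40, 1))
Y = np.array([noisy_simulator(x[0]) for x in X])        # shape (40, 2)

# Independent GP surrogates for the two objectives (minimisation).
kernel = RBF(length_scale=0.2) + WhiteKernel(noise_level=1e-3)
gps = [GaussianProcessRegressor(kernel=kernel).fit(X, Y[:, j]) for j in range(2)]
means = np.column_stack([gp.predict(X) for gp in gps])

def non_dominated(points):
    """Indices of points not dominated by any other point (minimisation)."""
    keep = []
    for i, p in enumerate(points):
        dominated = np.any(np.all(points <= p, axis=1) & np.any(points < p, axis=1))
        if not dominated:
            keep.append(i)
    return keep

pareto_idx = non_dominated(means)
print("estimated Pareto-optimal inputs:", X[pareto_idx].ravel())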
APA, Harvard, Vancouver, ISO, and other styles
49

Shayegh, Soheil. "Learning in integrated optimization models of climate change and economy." Diss., Georgia Institute of Technology, 2014. http://hdl.handle.net/1853/54012.

Full text
Abstract:
Integrated assessment models are powerful tools for providing insight into the interaction between the economy and climate change over a long time horizon. However, knowledge of climate parameters and their behavior under extreme circumstances of global warming is still an active area of research. In this thesis we incorporated the uncertainty in one of the key parameters of climate change, climate sensitivity, into an integrated assessment model and showed how this affects the choice of optimal policies and actions. We constructed a new, multi-step-ahead approximate dynamic programming (ADP) algorithm to study the effects of the stochastic nature of climate parameters. We considered the effect of stochastic extreme events in climate change (tipping points) with large economic loss. The risk of an extreme event drives tougher GHG reduction actions in the near term. On the other hand, the optimal policies in post-tipping-point stages are similar to or below the deterministic optimal policies. Once the tipping point occurs, the ensuing optimal actions tend toward more moderate policies. Previous studies have shown the impacts of economic and climate shocks on optimal abatement policies but did not address the correlation among uncertain parameters. With uncertain climate sensitivity, the risk of extreme events is linked to the variations in the climate sensitivity distribution. We developed a novel Bayesian framework to endogenously interrelate the two stochastic parameters. The results in this case are clustered around the pre-tipping-point optimal policies of the deterministic climate sensitivity model. Tougher actions are more frequent as there is more uncertainty in the likelihood of extreme events in the near future. This affects the optimal policies in post-tipping-point states as well, as they tend to utilize more conservative actions. As we proceed in time toward the future, the (binary) status of the climate will be observed and the prior distribution of the climate sensitivity parameter will be updated. The cost and climate tradeoffs of new technologies are key to decisions in climate policy. Here we focus on the electricity generation industry and contrast the extremes in electricity generation choices: making choices on new generation facilities based on cost only, in the absence of any climate policy, versus making choices based on climate impacts only, regardless of generation costs. When the expected drop in cost as experience grows is taken into account in selecting the generation portfolio, renewable technologies displace coal and natural gas within two decades on a pure cost-minimization basis, even when climate damage is not considered in the choice of technologies. This is the natural-gas-as-a-bridge-fuel scenario, and technology advancement to bring down the cost of renewables requires some commitment to renewables generation in the near term. Adopting the objective of minimizing climate damage, essentially moving immediately to low greenhouse gas generation technologies, results in faster cost reduction of new technologies and may result in different technologies becoming dominant in global electricity generation. Thus today's choices for new electricity generation by individual countries and utilities have implications not only for their direct costs and the global climate, but also for the future costs and availability of emerging electricity generation options.
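The Bayesian step described above, updating a prior over climate sensitivity once the binary tipping-point status is observed, can be illustrated with a minimal discretised Bayes-rule update. The prior, the tipping-point likelihood, and all numbers below are illustrative assumptions rather than values from the thesis.

import numpy as np

# Discretised prior over climate sensitivity (deg C per CO2 doubling).
s_grid = np.linspace(1.5, 6.0, 200)
prior = np.exp(-0.5 * ((s_grid - 3.0) / 0.8) ** 2)
prior /= prior.sum()

def tipping_prob(s):
    # Assumed link: higher sensitivity -> higher chance the tipping point
    # has occurred by the observation time (logistic form, made up here).
    return 1.0 / (1.0 + np.exp(-2.0 * (s - 3.5)))

# Observe the binary climate status: tipping point did NOT occur.
tipped = False
likelihood = tipping_prob(s_grid) if tipped else 1.0 - tipping_prob(s_grid)

posterior = prior * likelihood
posterior /= posterior.sum()

print("prior mean sensitivity:     %.2f" % np.sum(s_grid * prior))
print("posterior mean sensitivity: %.2f" % np.sum(s_grid * posterior))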
APA, Harvard, Vancouver, ISO, and other styles
50

Zhu, Zhanxing. "Integrating local information for inference and optimization in machine learning." Thesis, University of Edinburgh, 2016. http://hdl.handle.net/1842/20980.

Full text
Abstract:
In practice, machine learners often care about two key issues: one is how to obtain a more accurate answer with limited data, and the other is how to handle large-scale data (often referred to as “Big Data” in industry) for efficient inference and optimization. One solution to the first issue might be aggregating learned predictions from diverse local models. For the second issue, integrating the information from subsets of the large-scale data is a proven way of achieving computation reduction. In this thesis, we have developed novel frameworks and schemes to handle several scenarios arising in each of these two salient issues. For aggregating diverse models – in particular, aggregating probabilistic predictions from different models – we introduce a spectrum of compositional methods, Rényi divergence aggregators, which are maximum entropy distributions subject to biases from individual models, with the Rényi divergence parameter dependent on the bias. Experiments are implemented on various simulated and real-world datasets to verify the findings. We also show the theoretical connections between Rényi divergence aggregators and machine learning markets with isoelastic utilities. The second issue involves inference and optimization with large-scale data. We consider two important scenarios: one is optimizing a large-scale Convex-Concave Saddle Point problem with a Separable structure, referred to as Sep-CCSP; and the other is large-scale Bayesian posterior sampling. Two different settings of the Sep-CCSP problem are considered: Sep-CCSP with strongly convex functions and with non-strongly convex functions. We develop efficient stochastic coordinate descent methods for both cases, which allow fast parallel processing of large-scale data. Both theoretically and empirically, it is demonstrated that the developed methods perform comparably to, and more often better than, state-of-the-art methods. To handle the scalability issue in Bayesian posterior sampling, the stochastic approximation technique is employed, i.e., touching only a small mini-batch of data items to approximate the full likelihood or its gradient. In order to deal with the subsampling error introduced by stochastic approximation, we propose a covariance-controlled adaptive Langevin thermostat that can effectively dissipate parameter-dependent noise while maintaining a desired target distribution. This method achieves a substantial speedup over popular alternative schemes for large-scale machine learning applications.
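To illustrate the stochastic-approximation idea in the last part of this abstract, touching only a mini-batch of data to approximate the full likelihood gradient during posterior sampling, here is a bare-bones stochastic gradient Langevin dynamics sketch for Bayesian linear regression. It is a generic baseline under assumed data and step sizes, not the covariance-controlled adaptive thermostat developed in the thesis.

import numpy as np

rng = np.random.default_rng(4)

# Synthetic Bayesian linear regression problem.
N, dim = 10_000, 3
X = rng.normal(size=(N, dim))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + rng.normal(scale=0.5, size=N)

def grad_log_post(w, idx, prior_prec=1.0, noise_var=0.25):
    """Stochastic gradient of the log posterior from a mini-batch idx."""
    Xb, yb = X[idx], y[idx]
    grad_lik = (N / len(idx)) * Xb.T @ (yb - Xb @ w) / noise_var
    grad_prior = -prior_prec * w
    return grad_lik + grad_prior

w = np.zeros(dim)
eps, batch = 1e-5, 100
samples = []
for t in range(5000):
    idx = rng.choice(N, size=batch, replace=False)
    noise = rng.normal(scale=np.sqrt(eps), size=dim)   # injected Langevin noise
    w = w + 0.5 * eps * grad_log_post(w, idx) + noise
    samples.append(w.copy())

print("posterior mean estimate:", np.mean(samples[1000:], axis=0))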
APA, Harvard, Vancouver, ISO, and other styles