Dissertations / Theses on the topic 'Approximate Decision'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the top 25 dissertations / theses for your research on the topic 'Approximate Decision.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Press it, and we will automatically generate the bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.

1

Xie, Chen. "DYNAMIC DECISION APPROXIMATE EMPIRICAL REWARD (DDAER) PROCESSES." The Ohio State University, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=osu1398991609.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Pratikakis, Nikolaos. "Multistage decisions and risk in Markov decision processes towards effective approximate dynamic programming architectures /." Diss., Atlanta, Ga. : Georgia Institute of Technology, 2008. http://hdl.handle.net/1853/31654.

Full text
Abstract:
Thesis (Ph.D)--Chemical Engineering, Georgia Institute of Technology, 2009.
Committee Chair: Jay H. Lee; Committee Member: Martha Grover; Committee Member: Matthew J. Realff; Committee Member: Shabbir Ahmed; Committee Member: Stylianos Kavadias. Part of the SMARTech Electronic Thesis and Dissertation Collection.
APA, Harvard, Vancouver, ISO, and other styles
3

Bailey, David Thomas. "Development of an optimal spatial decision-making system using approximate reasoning." Queensland University of Technology, 2005. http://eprints.qut.edu.au/16202/.

Full text
Abstract:
There is a recognised need for the continued improvement of both the techniques and technology for spatial decision support in infrastructure site selection. Many authors have noted that current methodologies are inadequate for real-world site selection decisions carried out by heterogeneous groups of decision-makers under uncertainty. Nevertheless, despite numerous limitations inherent in current spatial problem-solving methods, spatial decision support systems have been proven to increase decision-maker effectiveness when used. However, due to the real or perceived difficulty of using these systems, few applications are actually in use to support decision-makers in siting decisions. The most common difficulties encountered involve standardising criterion ratings and communicating results. This research has focused on the use of Approximate Reasoning to improve the techniques and technology of spatial decision support, and make them easier to use and understand. The algorithm developed in this research (ARAISS) is based on the use of natural language to describe problem variables such as suitability, certainty, risk and consensus. The algorithm uses a method based on type II fuzzy sets to represent problem variables. ARAISS was subsequently incorporated into a new Spatial Decision Support System (InfraPlanner) and validated by use in a real-world site selection problem at Australia's Brisbane Airport. Results indicate that Approximate Reasoning is a promising method for spatial infrastructure planning decisions. Natural language inputs and outputs, combined with an easily understandable multiple decision-maker framework, created an environment conducive to information sharing and consensus building among parties. Future research should focus on the use of Genetic Algorithms and other Artificial Intelligence techniques to broaden the scope of existing work.
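For readers unfamiliar with approximate reasoning, the sketch below shows how linguistic ratings such as 'fair' or 'good' can be represented and aggregated with ordinary (type I) fuzzy sets. It is only a simplified stand-in for ARAISS, which uses type II fuzzy sets; every linguistic term, membership shape and rating here is hypothetical.

```python
# Simplified type-I fuzzy sketch of linguistic suitability ratings.
# The real ARAISS algorithm uses type-II fuzzy sets; terms and shapes are illustrative only.
import numpy as np

def triangular(x, a, b, c):
    """Triangular membership function with peak at b and support [a, c]."""
    return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

x = np.linspace(0.0, 10.0, 1001)          # suitability scale
terms = {                                  # hypothetical linguistic terms
    "poor":      (0.0, 1.5, 3.5),
    "fair":      (2.0, 5.0, 8.0),
    "good":      (5.0, 7.5, 9.5),
    "excellent": (8.0, 9.5, 10.0),
}

# Linguistic ratings of one candidate site by several decision-makers.
ratings = ["good", "fair", "good"]

# Aggregate the fuzzy ratings (simple average of memberships) and defuzzify by centroid.
agg = np.mean([triangular(x, *terms[r]) for r in ratings], axis=0)
crisp = float((agg * x).sum() / agg.sum())
print(f"aggregated suitability ~ {crisp:.2f} / 10")
```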
APA, Harvard, Vancouver, ISO, and other styles
4

Yu, Huizhen Ph D. Massachusetts Institute of Technology. "Approximate solution methods for partially observable Markov and semi-Markov decision processes." Thesis, Massachusetts Institute of Technology, 2006. http://hdl.handle.net/1721.1/35299.

Full text
Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Includes bibliographical references (p. 165-169).
We consider approximation methods for discrete-time infinite-horizon partially observable Markov and semi-Markov decision processes (POMDP and POSMDP). One of the main contributions of this thesis is a lower cost approximation method for finite-space POMDPs with the average cost criterion, and its extensions to semi-Markov partially observable problems and constrained POMDP problems, as well as to problems with the undiscounted total cost criterion. Our method is an extension of several lower cost approximation schemes, proposed individually by various authors, for discounted POMDP problems. We introduce a unified framework for viewing all of these schemes together with some new ones. In particular, we establish that due to the special structure of hidden states in a POMDP, there is a class of approximating processes, which are either POMDPs or belief MDPs, that provide lower bounds to the optimal cost function of the original POMDP problem. Theoretically, POMDPs with the long-run average cost criterion are still not fully understood. The major difficulties relate to the structure of the optimal solutions, such as conditions for a constant optimal cost function, the existence of solutions to the optimality equations, and the existence of optimal policies that are stationary and deterministic. Thus, our lower bound result is useful not only in providing a computational method, but also in characterizing the optimal solution. We show that regardless of these theoretical difficulties, lower bounds of the optimal liminf average cost function can be computed efficiently by solving modified problems using multichain MDP algorithms, and the approximating cost functions can also be used to obtain suboptimal stationary control policies. We prove the asymptotic convergence of the lower bounds under certain assumptions. For semi-Markov problems and total cost problems, we show that the same method can be applied for computing lower bounds of the optimal cost function. For constrained average cost POMDPs, we show that lower bounds of the constrained optimal cost function can be computed by solving finite-dimensional LPs. We also consider reinforcement learning methods for POMDPs and MDPs. We propose an actor-critic type policy gradient algorithm that uses a structured policy known as a finite-state controller. We thus provide an alternative to the earlier actor-only algorithm GPOMDP. Our work also clarifies the relationship between the reinforcement learning methods for POMDPs and those for MDPs. For average cost MDPs, we provide a convergence and convergence rate analysis for a least squares temporal difference (TD) algorithm, called LSPE, which was previously proposed for discounted problems. We use this algorithm in the critic portion of the policy gradient algorithm for POMDPs with finite-state controllers. Finally, we investigate the properties of the limsup and liminf average cost functions of various types of policies. We show various convexity and concavity properties of these cost functions, and we give a new necessary condition for the optimal liminf average cost to be constant. Based on this condition, we prove the near-optimality of the class of finite-state controllers under the assumption of a constant optimal liminf average cost. This result provides a theoretical guarantee for the finite-state controller approach.
by Huizhen Yu.
Ph.D.
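All of the approximation schemes discussed in this abstract operate on belief states. As background, here is a minimal discrete Bayes-filter belief update for a POMDP; the two-state model and its numbers are made up for illustration and are not taken from the thesis.

```python
import numpy as np

# Hypothetical 2-state, 2-action, 2-observation POMDP (numbers are illustrative only).
T = np.array([[[0.9, 0.1],   # T[a, s, s'] = P(s' | s, a)
               [0.2, 0.8]],
              [[0.6, 0.4],
               [0.5, 0.5]]])
O = np.array([[[0.8, 0.2],   # O[a, s', o] = P(o | s', a)
               [0.3, 0.7]],
              [[0.9, 0.1],
               [0.4, 0.6]]])

def belief_update(b, a, o):
    """Exact Bayes-filter update of belief b after taking action a and observing o."""
    predicted = b @ T[a]              # predict: sum_s b(s) P(s' | s, a)
    unnorm = predicted * O[a][:, o]   # correct: weight by observation likelihood
    return unnorm / unnorm.sum()

b = np.array([0.5, 0.5])
for a, o in [(0, 1), (1, 0), (0, 0)]:
    b = belief_update(b, a, o)
    print(f"a={a}, o={o} -> belief {np.round(b, 3)}")
```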
APA, Harvard, Vancouver, ISO, and other styles
5

Löhndorf, Nils, David Wozabal, and Stefan Minner. "Optimizing Trading Decisions for Hydro Storage Systems using Approximate Dual Dynamic Programming." INFORMS, 2013. http://dx.doi.org/10.1287/opre.2013.1182.

Full text
Abstract:
We propose a new approach to optimize operations of hydro storage systems with multiple connected reservoirs whose operators participate in wholesale electricity markets. Our formulation integrates short-term intraday with long-term interday decisions. The intraday problem considers bidding decisions as well as storage operation during the day and is formulated as a stochastic program. The interday problem is modeled as a Markov decision process of managing storage operation over time, for which we propose integrating stochastic dual dynamic programming with approximate dynamic programming. We show that the approximate solution converges towards an upper bound of the optimal solution. To demonstrate the efficiency of the solution approach, we fit an econometric model to actual price and inflow data and apply the approach to a case study of an existing hydro storage system. Our results indicate that the approach is tractable for a real-world application and that the gap between the theoretical upper bound and a simulated lower bound decreases sufficiently fast. (authors' abstract)
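Dual dynamic programming approximates the (concave) value of stored water by an outer envelope of affine cuts. The sketch below shows only that bookkeeping step, with an invented value function and cut points; it is not the authors' integrated SDDP/ADP scheme.

```python
# Sketch: outer (upper-bound) approximation of a concave value-of-water function
# by affine cuts, as used in (stochastic) dual dynamic programming.
# The "true" value function and the cut points below are illustrative only.
import numpy as np

def true_value(storage):
    """Stand-in for the unknown concave value of stored water."""
    return 100.0 * np.sqrt(storage)

cuts = []  # each cut is (intercept, slope): value <= intercept + slope * storage

def add_cut(x0):
    """Add a supporting cut at storage level x0 (slope = derivative of the true function)."""
    slope = 50.0 / np.sqrt(x0)           # d/dx of 100*sqrt(x)
    intercept = true_value(x0) - slope * x0
    cuts.append((intercept, slope))

def approx_value(storage):
    """Upper-bounding approximation: minimum over all cuts."""
    return min(a + b * storage for a, b in cuts)

for x0 in [10.0, 40.0, 90.0]:            # cut points collected over iterations
    add_cut(x0)

for x in [5.0, 25.0, 60.0, 100.0]:
    print(f"storage={x:5.1f}  approx={approx_value(x):7.2f}  true={true_value(x):7.2f}")
```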
APA, Harvard, Vancouver, ISO, and other styles
6

Astaraky, Davood. "A Simulation Based Approximate Dynamic Programming Approach to Multi-class, Multi-resource Surgical Scheduling." Thèse, Université d'Ottawa / University of Ottawa, 2013. http://hdl.handle.net/10393/23622.

Full text
Abstract:
The thesis focuses on a model that addresses the patient scheduling step of the surgical scheduling process, determining the number of surgeries to perform on a given day. Specifically, given a master schedule that provides a cyclic breakdown of total OR availability into specific daily allocations to each surgical specialty, we provide a scheduling policy for all surgeries that minimizes a combination of the lead time between a patient's request and the surgery date, overtime in the ORs, and congestion in the wards. We cast the problem of generating optimal control strategies in the framework of a Markov Decision Process (MDP). An Approximate Dynamic Programming (ADP) approach is employed to solve the model, which would otherwise be intractable due to the size of the state space. We assess the performance and quality of the resulting policy through simulation and provide policy insights and conclusions.
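A simulation-based assessment of a scheduling policy of the kind described can be sketched as follows. The threshold policy, capacities and cost weights below are invented purely for illustration and are not taken from the thesis.

```python
import random

# Hypothetical daily parameters (not from the thesis).
OR_CAPACITY = 8        # surgeries that fit in the allocated blocks without overtime
WARD_CAPACITY = 20     # downstream beds
LOS = 3                # post-operative length of stay, in days
COSTS = {"wait": 1.0, "overtime": 10.0, "congestion": 5.0}

def simulate(policy, days=365, seed=0):
    """Estimate the average daily cost of a policy mapping queue length -> surgeries today."""
    rng = random.Random(seed)
    queue, ward, total = 0, [0] * LOS, 0.0
    for _ in range(days):
        queue += rng.randint(4, 12)                       # new surgery requests
        scheduled = min(queue, policy(queue))
        queue -= scheduled
        ward = [scheduled] + ward[:-1]                    # patients occupy beds for LOS days
        overtime = max(0, scheduled - OR_CAPACITY)
        congestion = max(0, sum(ward) - WARD_CAPACITY)
        total += (COSTS["wait"] * queue + COSTS["overtime"] * overtime
                  + COSTS["congestion"] * congestion)
    return total / days

# Compare a few threshold policies: never perform more than `cap` surgeries per day.
for cap in (6, 8, 10, 12):
    print(f"cap={cap:2d}  avg daily cost ~ {simulate(lambda q, c=cap: c):.1f}")
```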
APA, Harvard, Vancouver, ISO, and other styles
7

Chen, Xiaoting. "Optimal Control of Non-Conventional Queueing Networks: A Simulation-Based Approximate Dynamic Programming Approach." University of Cincinnati / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1427799942.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Lipsky, Ari Moshe. "Bayesian decision-theoretic trial design operating characteristics and ethics, an approximate method, and time-trend bias /." Diss., Restricted to subscribing institutions, 2009. http://proquest.umi.com/pqdweb?did=1970030561&sid=4&Fmt=2&clientId=1564&RQT=309&VName=PQD.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Sosnowski, Scott T. "Approximate Action Selection For Large, Coordinating, Multiagent Systems." Case Western Reserve University School of Graduate Studies / OhioLINK, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=case1459468867.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Ramirez, Jose A. "Optimal and Simulation-Based Approximate Dynamic Programming Approaches for the Control of Re-Entrant Line Manufacturing Models." University of Cincinnati / OhioLINK, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1282329260.

Full text
APA, Harvard, Vancouver, ISO, and other styles
11

McInerney, Robert E. "Decision making under uncertainty." Thesis, University of Oxford, 2014. http://ora.ox.ac.uk/objects/uuid:a34e87ad-8330-42df-8ba6-d55f10529331.

Full text
Abstract:
Operating and interacting in an environment requires the ability to manage uncertainty and to choose definite courses of action. In this thesis we look to Bayesian probability theory as the means to achieve the former, and find that through rigorous application of the rules it prescribes we can, in theory, solve problems of decision making under uncertainty. Unfortunately such methodology is intractable in real-world problems, and thus approximation of one form or another is inevitable. Many techniques make use of heuristic procedures for managing uncertainty. We note that such methods suffer from unreliable performance and rely on the specification of ad-hoc variables. Performance is often judged according to long-term asymptotic performance measures, which, we believe, ignore the most complex and relevant parts of the problem domain. We therefore look to develop principled approximate methods that preserve the meaning of Bayesian theory but operate with the scalability of heuristics. We start by looking at function approximation in continuous state and action spaces using Gaussian Processes. We develop a novel family of covariance functions which allow tractable inference methods to accommodate some of the uncertainty lost by not following full Bayesian inference. We also investigate the exploration versus exploitation tradeoff in the context of the Multi-Armed Bandit, and demonstrate that principled approximations behave close to optimally and perform significantly better than heuristics on a range of experimental test beds.
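As one concrete instance of a principled Bayesian approximation to the exploration-exploitation tradeoff, the sketch below runs Thompson sampling on a Bernoulli multi-armed bandit and compares it with a simple greedy heuristic. The arm probabilities and horizon are made up; this is not code from the thesis.

```python
import random

def thompson(probs, horizon=5000, seed=1):
    """Thompson sampling with Beta(1, 1) priors on Bernoulli arm means."""
    rng = random.Random(seed)
    wins = [1] * len(probs)     # Beta alpha parameters (successes + 1)
    losses = [1] * len(probs)   # Beta beta parameters (failures + 1)
    reward = 0
    for _ in range(horizon):
        samples = [rng.betavariate(w, l) for w, l in zip(wins, losses)]
        arm = samples.index(max(samples))          # act greedily on the posterior sample
        r = 1 if rng.random() < probs[arm] else 0
        wins[arm] += r
        losses[arm] += 1 - r
        reward += r
    return reward

def greedy(probs, horizon=5000, seed=1):
    """Heuristic baseline: empirical means after a short forced exploration phase."""
    rng = random.Random(seed)
    pulls, succ, reward = [1] * len(probs), [1] * len(probs), 0
    for t in range(horizon):
        arm = (t % len(probs) if t < 30
               else max(range(len(probs)), key=lambda a: succ[a] / pulls[a]))
        r = 1 if rng.random() < probs[arm] else 0
        pulls[arm] += 1
        succ[arm] += r
        reward += r
    return reward

arms = [0.35, 0.45, 0.5, 0.52]   # hypothetical success probabilities
print("Thompson:", thompson(arms), " greedy:", greedy(arms), " oracle ~", int(5000 * max(arms)))
```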
APA, Harvard, Vancouver, ISO, and other styles
12

Holguin, Mijail Gamarra. "Planejamento probabilístico usando programação dinâmica assíncrona e fatorada." Universidade de São Paulo, 2013. http://www.teses.usp.br/teses/disponiveis/45/45134/tde-14042013-131306/.

Full text
Abstract:
Processos de Decisão Markovianos (Markov Decision Process - MDP) modelam problemas de tomada de decisão sequencial em que as possíveis ações de um agente possuem efeitos probabilísticos sobre os estados sucessores (que podem ser definidas por matrizes de transição de estados). Programação dinâmica em tempo real (Real-time dynamic programming - RTDP), é uma técnica usada para resolver MDPs quando existe informação sobre o estado inicial. Abordagens tradicionais apresentam melhor desempenho em problemas com matrizes esparsas de transição de estados porque podem alcançar eficientemente a convergência para a política ótima, sem ter que visitar todos os estados. Porém essa vantagem pode ser perdida em problemas com matrizes densas de transição, nos quais muitos estados podem ser alcançados em um passo (por exemplo, problemas de controle com eventos exógenos). Uma abordagem para superar essa limitação é explorar regularidades existentes na dinâmica do domínio através de uma representação fatorada, isto é, uma representação baseada em variáveis de estado. Nesse trabalho de mestrado, propomos um novo algoritmo chamado de FactRTDP (RTDP Fatorado), e sua versão aproximada aFactRTDP (RTDP Fatorado e Aproximado), que é a primeira versão eficiente fatorada do algoritmo clássico RTDP. Também propomos outras 2 extensões desses algoritmos, o FactLRTDP e aFactLRTDP, que rotulam estados cuja função valor convergiu para o ótimo. Os resultados experimentais mostram que estes novos algoritmos convergem mais rapidamente quando executados em domínios com matrizes de transição densa e tem bom comportamento online em domínios com matrizes de transição densa com pouca dependência entre as variáveis de estado.
Markov Decision Processes (MDPs) model problems of sequential decision making in which the possible actions have probabilistic effects on the successor states (defined by state transition matrices). Real-time dynamic programming (RTDP) is a technique for solving MDPs when information about the initial state is available. Traditional approaches show better performance in problems with sparse state transition matrices because they can converge to the optimal policy efficiently without visiting all states. However, this advantage can be lost in problems with dense state transition matrices, in which several states can be reached in one step (for example, control problems with exogenous events). An approach to overcome this limitation is to exploit regularities in the domain dynamics through a factored representation, i.e., a representation based on state variables. In this master's thesis, we propose a new algorithm called FactRTDP (Factored RTDP), and its approximate version aFactRTDP (Approximate and Factored RTDP), which are the first efficient factored versions of the classical RTDP algorithm. We also propose two other extensions, FactLRTDP and aFactLRTDP, which label states for which the value function has converged to the optimum. The experimental results show that these new algorithms converge faster in domains with dense transition matrices and have good online performance in domains with dense transition matrices and few dependencies among state variables.
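For reference, the flat (non-factored) RTDP baseline that FactRTDP builds on can be written compactly as follows. The toy chain MDP is invented, and a zero heuristic is used as the admissible initial value function.

```python
import random

# Toy shortest-path MDP (invented): states 0..4, goal = 4, two actions per state.
# P[s][a] = list of (probability, next_state); every action costs 1.
P = {s: {"safe": [(1.0, min(s + 1, 4))],
         "risky": [(0.6, min(s + 2, 4)), (0.4, s)]} for s in range(4)}
GOAL = 4
V = {s: 0.0 for s in range(5)}               # admissible (zero) initial heuristic

def q(s, a):
    """One-step lookahead cost of action a in state s under the current values."""
    return 1.0 + sum(p * V[t] for p, t in P[s][a])

def rtdp(trials=200, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        s = 0
        while s != GOAL:
            a = min(P[s], key=lambda act: q(s, act))   # greedy action on current values
            V[s] = q(s, a)                             # Bellman backup on the visited state only
            r, acc = rng.random(), 0.0
            for p, t in P[s][a]:                       # sample the next state
                acc += p
                if r <= acc:
                    s = t
                    break
    return V

print(rtdp())   # approximate cost-to-go for states reachable from the initial state
```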
APA, Harvard, Vancouver, ISO, and other styles
13

Rösch, Philipp. "Design von Stichproben in analytischen Datenbanken." Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2009. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-22916.

Full text
Abstract:
Aktuelle Studien belegen ein rasantes, mehrdimensionales Wachstum in analytischen Datenbanken: Das Datenvolumen verzehnfachte sich in den letzten vier Jahren, die Anzahl der Nutzer wuchs um durchschnittlich 25% pro Jahr und die Anzahl der Anfragen verdoppelte sich seit 2004 jährlich. Bei den Anfragen handelt es sich zunehmend um komplexe Verbundanfragen mit Aggregationen; sie sind häufig explorativer Natur und werden interaktiv an das System gestellt. Eine Möglichkeit, der Forderung nach Interaktivität bei diesem starken, mehrdimensionalen Wachstum nachzukommen, stellen Stichproben und eine darauf aufsetzende näherungsweise Anfrageverarbeitung dar. Diese Lösung bietet signifikant kürzere Antwortzeiten sowie Schätzungen mit probabilistischen Fehlergrenzen. Mit den Operationen Verbund, Gruppierung und Aggregation als Hauptbestandteile analytischer Anfragen ergeben sich folgende Anforderungen an das Design von Stichproben in analytischen Datenbanken: Zwischen den Stichproben fremdschlüsselverbundener Relationen ist die referenzielle Integrität zu gewährleisten, sämtliche Gruppen sind angemessen zu repräsentieren und Aggregationsattribute sind auf extreme Werte zu untersuchen. In dieser Dissertation wird für jedes dieser Teilprobleme ein Stichprobenverfahren vorgestellt, das sich durch speicherplatzbeschränkte Stichproben und geringe Schätzfehler auszeichnet. Im ersten der vorgestellten Verfahren wird durch eine korrelierte Stichprobenerhebung die referenzielle Integrität bei minimalem zusätzlichen Speicherplatz gewährleistet. Das zweite vorgestellte Stichprobenverfahren hat durch eine Berücksichtigung der Streuung der Daten eine angemessene Repräsentation sämtlicher Gruppen zur Folge und unterstützt damit beliebige Gruppierungen, und im dritten Verfahren ermöglicht eine mehrdimensionale Ausreißerbehandlung geringe Schätzfehler für beliebig viele Aggregationsattribute. Für jedes dieser Verfahren wird die Qualität der resultierenden Stichprobe diskutiert und bei der Berechnung speicherplatzbeschränkter Stichproben berücksichtigt. Um den Berechnungsaufwand und damit die Systembelastung gering zu halten, werden für jeden Algorithmus Heuristiken vorgestellt, deren Kennzeichen hohe Effizienz und eine geringe Beeinflussung der Stichprobenqualität sind. Weiterhin werden alle möglichen Kombinationen der vorgestellten Stichprobenverfahren betrachtet; diese Kombinationen ermöglichen eine zusätzliche Verringerung der Schätzfehler und vergrößern gleichzeitig das Anwendungsspektrum der resultierenden Stichproben. Mit der Kombination aller drei Techniken wird ein Stichprobenverfahren vorgestellt, das alle Anforderungen an das Design von Stichproben in analytischen Datenbanken erfüllt und die Vorteile der Einzellösungen vereint. Damit ist es möglich, ein breites Spektrum an Anfragen mit hoher Genauigkeit näherungsweise zu beantworten
Recent studies have shown the fast and multi-dimensional growth in analytical databases: Over the last four years, the data volume has risen by a factor of 10; the number of users has increased by an average of 25% per year; and the number of queries has been doubling every year since 2004. These queries have increasingly become complex join queries with aggregations; they are often of an explorative nature and interactively submitted to the system. One option to address the need for interactivity in the context of this strong, multi-dimensional growth is the use of samples and an approximate query processing approach based on those samples. Such a solution offers significantly shorter response times as well as estimates with probabilistic error bounds. Given that joins, groupings and aggregations are the main components of analytical queries, the following requirements for the design of samples in analytical databases arise: 1) The foreign-key integrity between the samples of foreign-key related tables has to be preserved. 2) Any existing groups have to be represented appropriately. 3) Aggregation attributes have to be checked for extreme values. For each of these sub-problems, this dissertation presents sampling techniques that are characterized by memory-bounded samples and low estimation errors. In the first approach, a correlated sampling process guarantees the referential integrity while using only a minimum of additional memory. The second sampling technique considers the data distribution, and as a result, any arbitrary grouping is supported; all groups are appropriately represented. In the third approach, the multi-column outlier handling leads to low estimation errors for any number of aggregation attributes. For all three approaches, the quality of the resulting samples is discussed and taken into account when computing memory-bounded samples. In order to keep the computational effort - and thus the system load - at a low level, heuristics are provided for each algorithm; these are marked by high efficiency and minimal effects on the sampling quality. Furthermore, the dissertation examines all possible combinations of the presented sampling techniques; such combinations further reduce estimation errors while at the same time increasing the range of applicability of the resulting samples. With the combination of all three techniques, a sampling technique is introduced that meets all requirements for the design of samples in analytical databases and merges the advantages of the individual techniques. This makes it possible to answer a wide range of queries approximately yet very precisely.
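The second requirement, appropriate representation of every group, is what stratified (group-aware) sampling addresses. The sketch below draws a memory-bounded sample with a per-group quota and compares approximate and exact AVG aggregates on synthetic data; it is a simplified illustration, not the dissertation's algorithm.

```python
import random
from collections import defaultdict

random.seed(0)
# Hypothetical fact table: (group, aggregation attribute), with heavily skewed group sizes.
table = ([("A", random.gauss(100, 10)) for _ in range(9000)]
         + [("B", random.gauss(500, 50)) for _ in range(900)]
         + [("C", random.gauss(900, 90)) for _ in range(100)])

def stratified_sample(rows, budget, min_per_group=30):
    """Memory-bounded sample that guarantees every group a minimum number of rows."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row)
    spare = budget - min_per_group * len(groups)     # budget left after the per-group quotas
    sample = []
    for members in groups.values():
        random.shuffle(members)
        take = min(len(members),
                   min_per_group + max(0, int(spare * len(members) / len(rows))))
        sample.extend(members[:take])
    return sample

sample = stratified_sample(table, budget=500)
exact, approx = defaultdict(list), defaultdict(list)
for g, v in table:
    exact[g].append(v)
for g, v in sample:
    approx[g].append(v)
for g in sorted(exact):
    e = sum(exact[g]) / len(exact[g])
    a = sum(approx[g]) / len(approx[g])
    print(f"group {g}: exact AVG={e:7.1f}  approx AVG={a:7.1f}  ({len(approx[g])} sampled rows)")
```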
APA, Harvard, Vancouver, ISO, and other styles
14

Regatti, Jayanth Reddy. "Dynamic Routing for Fuel Optimization in Autonomous Vehicles." The Ohio State University, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=osu1524145002064074.

Full text
APA, Harvard, Vancouver, ISO, and other styles
15

Yin, Biao. "Contrôle adaptatif des feux de signalisation dans les carrefours : modélisation du système de trafic dynamique et approches de résolution." Thesis, Belfort-Montbéliard, 2015. http://www.theses.fr/2015BELF0279/document.

Full text
Abstract:
La régulation adaptative des feux de signalisation est un problème très important. Beaucoup de chercheurs travaillent continuellement afin de résoudre les problémes liés à l’embouteillage dans les intersections urbaines. Il devient par conséquent très utile d’employer des algorithmes intelligents afin d’améliorer les performances de régulation et la qualité du service. Dans cette thèse, nous essayons d'étudier ce problème d’une part à travers une modèlisation microscopique et dynamique en temps discret, et d’autre part en explorant plusieurs approches de résoltion pour une intersection isolée ainsi que pour un réseau distribué d'intersections.La première partie se concentre sur la modélisation dynamique des problèmes des feux de signalisation ainsi que de la charge du réseau d’intersections. Le mode de la “séquence de phase adaptative” (APS) dans un plan de feux est d'abord considéré. Quant à la modélisation du contrôle des feux aux intersections, elle est formulée grâce à un processus décisionnel de markov (MDP). En particulier, la notion de “l'état du système accordable” est alors proposée pour la coordination du réseau de trafic. En outre, un nouveau modèle de “véhicule-suiveur” est proposé pour l'environnement de trafic. En se basant sur la modélisation proposée, les méthodes de contrôle des feux dans cette thèse comportent des algorithmes optimaux et quasi-optimaux. Deux algorithmes exacts de résolution basées sur la programmation dynamique (DP) sont alors étudiés et les résultats montrent certaines limites de cette solution DP surtout dans quelques cas complexes où l'espace d'états est assez important. En raison de l’importance du temps d’execution de l'algorithme DP et du manque d'information du modèle (notamment l’information exacte relative à l’arrivée des véhicules à l’intersection), nous avons opté pour un algorithme de programmation dynamique approximative (ADP). Enfin, un algorithme quasi-optimal utilisant l'ADP combinée à la méthode d’amélioration RLS-TD (λ) est choisi. Dans les simulations, en particulier avec l'intégration du mode de phase APS, l'algorithme proposé montre de bons résultats notamment en terme de performance et d'efficacité de calcul
Adaptive traffic signal control is a decision-making optimization problem that is constantly being addressed in order to relieve traffic congestion at urban intersections. Intelligent algorithms are widely used to improve control performance measures such as traffic delay. In this thesis, we study the problem comprehensively with a microscopic, dynamic, discrete-time model and investigate the related algorithms both for an isolated intersection and for distributed network control. We first focus on dynamic modelling of adaptive traffic signal control and network loading problems. The proposed adaptive phase sequence (APS) mode is highlighted as one of the signal phase control mechanisms. The signal control problem at intersections is formulated as a Markov decision process (MDP), and the concept of a tunable system state is proposed for traffic network coordination. Moreover, a new vehicle-following model supports the network loading environment. Based on this model, signal control methods are studied using optimal and near-optimal algorithms in turn. Two exact DP algorithms are investigated, and the results show some limitations of the DP solution when a large state space arises in complex cases. Because of the computational burden and the unknown model information in dynamic programming (DP), approximate dynamic programming (ADP) is suggested instead. Finally, an online near-optimal algorithm using ADP with RLS-TD(λ) is adopted. In simulation experiments, especially with the integration of APS, the proposed algorithm shows a clear advantage in performance measures and computational efficiency.
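The critic inside such an ADP controller estimates a value function from simulated experience. As a simplified stand-in for RLS-TD(λ), the sketch below runs ordinary TD(λ) with a linear function approximator on an invented single-queue intersection model; all dynamics, features and parameters are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
GAMMA, LAMBDA, ALPHA = 0.95, 0.8, 0.01

def features(queue):
    """Simple polynomial features of the queue length (an illustrative choice)."""
    q = queue / 40.0
    return np.array([1.0, q, q * q])

def step(queue):
    """Toy single-approach intersection under a fixed signal policy."""
    arrivals = rng.poisson(3)                        # vehicles arriving this interval
    departures = 4 if queue > 0 else 0               # vehicles served during green
    nxt = max(0, min(40, queue + arrivals - departures))
    return nxt, -float(nxt)                          # reward penalizes the waiting queue

w = np.zeros(3)          # weights of the linear value-function approximation
z = np.zeros(3)          # eligibility trace
queue = 0
for _ in range(50_000):
    nxt, r = step(queue)
    delta = r + GAMMA * features(nxt) @ w - features(queue) @ w   # TD error
    z = GAMMA * LAMBDA * z + features(queue)                      # accumulating trace
    w += ALPHA * delta * z
    queue = nxt

for q in (0, 5, 10, 20):
    print(f"estimated value with queue length {q:2d}: {features(q) @ w:8.2f}")
```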
APA, Harvard, Vancouver, ISO, and other styles
16

Zhang, Jian. "Advance Surgery Scheduling with Consideration of Downstream Capacity Constraints and Multiple Sources of Uncertainty." Thesis, Bourgogne Franche-Comté, 2019. http://www.theses.fr/2019UBFCA023.

Full text
Abstract:
Les travaux de ce mémoire portent sur une gestion optimisée des blocs opératoires dans un service chirurgical. Les arrivées des patients chaque semaine, la durée des opérations et les temps de séjour des patients sont considérés comme des paramètres assujettis à des incertitudes. Chaque semaine, le gestionnaire hospitalier doit déterminer les blocs chirurgicaux à mettre en service et leur affecter certaines opérations figurant sur la liste d'attente. L'objectif est la minimisation d'une part des coûts liés à la réalisation et au report des opérations, et d'autre part des coûts hospitaliers liés aux ressources chirurgicaux. Lorsque nous considérons que les modèles de programmations mathématiques couramment utilisés dans la littérature n'optimisent pas la performance à long terme des programmes chirurgicaux, nous proposons un nouveau modèle d'optimisation à deux phases combinant le processus de décision Markovien (MDP) et la programmation stochastique. Le MDP de la première phase détermine les opérations à effectuer chaque semaine et minimise les coûts totaux sur un horizon infini. La programmation stochastique de la deuxième phase optimise les affectations des opérations sélectionnées dans les blocs chirurgicaux. Afin de résoudre la complexité des problèmes de grande taille, nous développons un algorithme de programmation dynamique approximatif basé sur l'apprentissage par renforcement et plusieurs autres heuristiques basés sur la génération de colonnes. Nous développons des applications numériques afin d'évaluer le modèle et les algorithmes proposés. Les résultats expérimentaux indiquent que ces algorithmes sont considérablement plus efficaces que les algorithmes traditionnels. Les programmes chirurgicaux du modèle d’optimisation à deux phases sont plus performants de manière significative que ceux d’un modèle de programmation stochastique classique en termes de temps d’attente des patients et de coûts totaux sur le long terme
This thesis deals with the advance scheduling of elective surgeries in an operating theatre that is composed of operating rooms and downstream recovery units. The arrivals of new patients in each week, the duration of each surgery, and the length-of-stay of each patient in the downstream recovery unit are subject to uncertainty. In each week, the surgery planner should determine the surgical blocks to open and assign some of the surgeries in the waiting list to the open surgical blocks. The objective is to minimize the patient-related costs incurred by performing and postponing surgeries as well as the hospital-related costs caused by the utilization of surgical resources. Considering that the pure mathematical programming models commonly used in the literature do not optimize the long-term performance of the surgery schedules, we propose a novel two-phase optimization model that combines Markov decision process (MDP) and stochastic programming to overcome this drawback. The MDP model in the first phase determines the surgeries to be performed in each week and minimizes the expected total costs over an infinite horizon, then the stochastic programming model in the second phase optimizes the assignments of the selected surgeries to surgical blocks. In order to cope with the huge complexity of realistically sized problems, we develop a reinforcement-learning-based approximate dynamic programming algorithm and several column-generation-based heuristic algorithms as the solution approaches. We conduct numerical experiments to evaluate the model and algorithms proposed in this thesis. The experimental results indicate that the proposed algorithms are considerably more efficient than the traditional ones, and that the resulting schedules of the two-phase optimization model significantly outperform those of a conventional stochastic programming model in terms of the patients' waiting times and the total costs in the long run.
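The second-phase assignment of selected surgeries to open surgical blocks can be illustrated, very roughly, by a greedy longest-processing-time packing that keeps overtime low. This heuristic and all durations below are hypothetical stand-ins for the stochastic programming model actually used in the thesis.

```python
# Greedy LPT packing of selected surgeries into open blocks (illustrative stand-in only).
import heapq

block_length = 480                      # minutes of regular time per open block
open_blocks = 4
surgeries = [120, 90, 200, 150, 60, 180, 75, 240, 45]   # expected durations in minutes

loads = [(0, b) for b in range(open_blocks)]
heapq.heapify(loads)
assignment = {b: [] for b in range(open_blocks)}
for d in sorted(surgeries, reverse=True):
    load, b = heapq.heappop(loads)      # always place the next-longest surgery
    assignment[b].append(d)             # into the currently least-loaded block
    heapq.heappush(loads, (load + d, b))

for b, ops in assignment.items():
    used = sum(ops)
    print(f"block {b}: {ops}  load={used} min  overtime={max(0, used - block_length)} min")
```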
APA, Harvard, Vancouver, ISO, and other styles
17

Višnja, Ognjenović. "Aproksimativna diskretizacija tabelarno organizovanih podataka." Phd thesis, Univerzitet u Novom Sadu, Tehnički fakultet Mihajlo Pupin u Zrenjaninu, 2016. https://www.cris.uns.ac.rs/record.jsf?recordId=101259&source=NDLTD&language=en.

Full text
Abstract:
Disertacija se bavi analizom uticaja raspodela podataka na rezultate algoritama diskretizacije u okviru procesa mašinskog učenja. Na osnovu izabranih baza i algoritama diskretizacije teorije grubih skupova i stabala odlučivanja, istražen je uticaj odnosa raspodela podataka i tačaka reza određene diskretizacije.Praćena je promena konzistentnosti diskretizovane tabele u zavisnosti od položaja redukovane tačke reza na histogramu. Definisane su fiksne tačke reza u zavisnosti od segmentacije multimodal raspodele, na osnovu kojih je moguće raditi redukciju preostalih tačaka reza. Za određivanje fiksnih tačaka konstruisan je algoritam FixedPoints koji ih određuje u skladu sa grubom segmentacijom multimodal raspodele.Konstruisan je algoritam aproksimativne diskretizacije APPROX MD za redukciju tačaka reza, koji koristi tačke reza dobijene algoritmom maksimalne razberivosti i parametre vezane za procenat nepreciznih pravila, ukupni procenat klasifikacije i broj tačaka redukcije. Algoritam je kompariran u odnosu na algoritam maksimalne razberivosti i u odnosu na algoritam maksimalne razberivosti sa aproksimativnim rešenjima za α=0,95.
This dissertation analyses the influence of data distributions on the results of discretization algorithms within the machine learning process. Based on the chosen databases and the discretization algorithms of rough set theory and decision trees, the relation between the data distribution and the cut points of a given discretization has been investigated. Changes in the consistency of a discretized table, depending on the position of the reduced cut on the histogram, have been monitored. Fixed cuts have been defined depending on the segmentation of the multimodal distribution, on the basis of which the remaining cuts can be reduced. To determine the fixed cuts, an algorithm called FixedPoints has been constructed; it determines these points in accordance with a rough segmentation of the multimodal distribution. An algorithm for approximate discretization, APPROX MD, has been constructed for cut reduction, using cuts obtained through the maximum discernibility (MD-Heuristic) algorithm and parameters related to the percentage of imprecise rules, the total classification percentage and the number of reduced cuts. The algorithm has been compared to the MD algorithm and to the MD algorithm with approximate solutions for α=0.95.
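The idea of fixing cut points according to a rough segmentation of a multimodal distribution can be illustrated by placing one cut in each histogram valley. The sketch below is a hypothetical simplification, not the FixedPoints or APPROX MD algorithm, and its data and threshold are invented.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical attribute with a clearly multimodal (three-mode) distribution.
values = np.concatenate([rng.normal(2, 0.4, 500),
                         rng.normal(6, 0.5, 700),
                         rng.normal(11, 0.6, 300)])
counts, edges = np.histogram(values, bins=40)

def fixed_cuts(counts, edges, depth=0.5):
    """Place one cut per histogram valley that lies clearly below the peaks on both sides."""
    cuts, in_valley = [], False
    for i in range(1, len(counts) - 1):
        deep = counts[i] <= depth * min(counts[:i].max(), counts[i + 1:].max())
        if deep and not in_valley:                   # first bin of a new valley -> one cut
            cuts.append(round(float(edges[i] + edges[i + 1]) / 2, 2))
        in_valley = deep
    return cuts

print("fixed cut points:", fixed_cuts(counts, edges))
```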
APA, Harvard, Vancouver, ISO, and other styles
18

Aboud, Talal. "Développement d'un système interactif d'aide à la decision utilisant un raisonnement approximatif : dans le cadre de la problématique de tri." Paris 9, 1996. https://portail.bu.dauphine.fr/fileviewer/index.php?doc=1996PA090024.

Full text
Abstract:
The aim of this thesis is to design an interactive method for multicriteria sorting problems in which the performances of the actions are fuzzy, by applying a number of common-sense heuristics. This is done in two steps: first, remove the fuzziness from the problem and perform a single-criterion sorting; then, aggregate the single-criterion sortings into one. The latter operation is carried out by an interactive aggregation procedure. The procedure uses an assignment (made by the decision-maker) as a reference to assign a number of actions or to narrow the interval of possible assignments for other actions. Finally, we built an experimental software tool to test the method. The results are encouraging with regard to reducing the number of questions and handling fuzzy inputs.
APA, Harvard, Vancouver, ISO, and other styles
19

Keesmaat, Sylvia C. "Welcoming in the Gentiles: a Biblical Model for Decision Making." Anglican Book Centre, 2004. http://hdl.handle.net/10756/296292.

Full text
APA, Harvard, Vancouver, ISO, and other styles
20

Petrik, Marek. "Optimization-based approximate dynamic programming." 2010. https://scholarworks.umass.edu/dissertations/AAI3427564.

Full text
Abstract:
Reinforcement learning algorithms hold promise in many complex domains, such as resource management and planning under uncertainty. Most reinforcement learning algorithms are iterative—they successively approximate the solution based on a set of samples and features. Although these iterative algorithms can achieve impressive results in some domains, they are not sufficiently reliable for wide applicability; they often require extensive parameter tweaking to work well and provide only weak guarantees of solution quality. Some of the most interesting reinforcement learning algorithms are based on approximate dynamic programming (ADP). ADP, also known as value function approximation, approximates the value of being in each state. This thesis presents new reliable algorithms for ADP that use optimization instead of iterative improvement. Because these optimization–based algorithms explicitly seek solutions with favorable properties, they are easy to analyze, offer much stronger guarantees than iterative algorithms, and have few or no parameters to tweak. In particular, we improve on approximate linear programming — an existing method — and derive approximate bilinear programming — a new robust approximate method. The strong guarantees of optimization–based algorithms not only increase confidence in the solution quality, but also make it easier to combine the algorithms with other ADP components. The other components of ADP are samples and features used to approximate the value function. Relying on the simplified analysis of optimization–based methods, we derive new bounds on the error due to missing samples. These bounds are simpler, tighter, and more practical than the existing bounds for iterative algorithms and can be used to evaluate solution quality in practical settings. Finally, we propose homotopy methods that use the sampling bounds to automatically select good approximation features for optimization–based algorithms. Automatic feature selection significantly increases the flexibility and applicability of the proposed ADP methods. The methods presented in this thesis can potentially be used in many practical applications in artificial intelligence, operations research, and engineering. Our experimental results show that optimization–based methods may perform well on resource-management problems and standard benchmark problems and therefore represent an attractive alternative to traditional iterative methods.
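Approximate linear programming, the existing method the abstract starts from, restricts the value function to the span of a few features and solves a linear program over the Bellman inequalities. A tiny example on an invented three-state MDP is sketched below, assuming SciPy is available for the LP solve; it illustrates the general technique, not the thesis's specific algorithms.

```python
import numpy as np
from scipy.optimize import linprog

GAMMA = 0.9
# Invented 3-state, 2-action MDP: transition matrices P[a][s, s'] and rewards R[s, a].
P = np.array([[[0.8, 0.2, 0.0],
               [0.1, 0.8, 0.1],
               [0.0, 0.2, 0.8]],
              [[0.5, 0.5, 0.0],
               [0.0, 0.5, 0.5],
               [0.0, 0.0, 1.0]]])
R = np.array([[0.0, 1.0],
              [1.0, 2.0],
              [2.0, 0.0]])

Phi = np.array([[1.0, 0.0],      # two features: a constant and a "position" feature
                [1.0, 0.5],
                [1.0, 1.0]])
rho = np.ones(3) / 3             # state-relevance weights

# ALP: minimize rho' Phi w  subject to  Phi w >= R[:, a] + GAMMA * P[a] Phi w  for every action a.
A_ub, b_ub = [], []
for a in range(2):
    lhs = Phi - GAMMA * P[a] @ Phi          # one constraint row per state
    A_ub.append(-lhs)                       # linprog expects A_ub x <= b_ub
    b_ub.append(-R[:, a])
res = linprog(c=rho @ Phi, A_ub=np.vstack(A_ub), b_ub=np.concatenate(b_ub),
              bounds=[(None, None)] * Phi.shape[1])

print("weights:", np.round(res.x, 3))
print("approximate values:", np.round(Phi @ res.x, 3))
```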
APA, Harvard, Vancouver, ISO, and other styles
21

Poupart, Pascal. "Approximate value-directed belief state monitoring for partially observable Markov decision processes." Thesis, 2000. http://hdl.handle.net/2429/11462.

Full text
Abstract:
Partially observable Markov decision processes (POMDPs) provide a principled approach to planning under uncertainty. Unfortunately, several sources of intractability currently limit the application of POMDPs to simple problems. The following thesis is concerned with one source of intractability in particular, namely the belief state monitoring task. As an agent executes a plan, it must track the state of the world by updating its beliefs with respect to the current state. Then, based on its current beliefs, the agent can look up the next action to execute in its plan. In many situations, an agent may be required to decide in real time which action to execute next. Thus, efficient algorithms to update the current belief state would be desirable. Unfortunately, exact belief state monitoring turns out to be very time consuming for many domains. This thesis introduces a value-directed framework to analyze and design approximation methods that speed up the monitoring task. The goal of approximate belief state monitoring is to trade monitoring accuracy for efficiency. Thus, this framework outlines a principled approach to quantify the impact of approximating belief states on the original plan. Since at any point in time an action is executed based on the current belief state, a less desirable action may end up being executed as a result of the approximations used to infer the current belief state. The framework developed covers a wide range of approximation methods, including projection schemes and density trees. First, several bounds on the loss in decision quality due to approximate belief state monitoring are derived. Then, given a class of approximation methods, a few search algorithms are proposed to seek a relatively good approximation scheme within the given class. These algorithms essentially try to minimize the derived bounds. Next, a vector space analysis is performed to gain some insight into which properties of approximation methods are likely to ensure a minimal impact on decision quality. Finally, faster algorithms (than the previous ones) are designed to search for approximation methods that exhibit such properties.
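One simple projection scheme in this spirit keeps only the most probable states of the belief and renormalizes. The sketch below does that and, in the value-directed spirit, checks whether the approximate belief still selects the same action under a set of made-up α-vectors; the random model, vectors and sizes are all assumptions, not the thesis's methods.

```python
import numpy as np

rng = np.random.default_rng(3)
N_STATES, N_OBS = 8, 3                                 # sizes of an invented POMDP
T = rng.dirichlet(np.ones(N_STATES), size=N_STATES)    # T[s, s'] under the executed action
O = rng.dirichlet(np.ones(N_OBS), size=N_STATES)       # O[s', o] observation probabilities
alpha = rng.normal(size=(4, N_STATES))                 # made-up value (alpha) vectors of 4 actions

def exact_update(b, o):
    """Exact belief monitoring: predict with T, then condition on the observation o."""
    u = (b @ T) * O[:, o]
    return u / u.sum()

def project_top_k(b, k=3):
    """Projection scheme: keep the k most probable states and renormalize."""
    out = np.zeros_like(b)
    top = np.argsort(b)[-k:]
    out[top] = b[top]
    return out / out.sum()

b_exact = np.ones(N_STATES) / N_STATES
b_approx = b_exact.copy()
for o in rng.integers(0, N_OBS, size=8):
    b_exact = exact_update(b_exact, o)
    b_approx = project_top_k(exact_update(b_approx, o))
    same = (alpha @ b_exact).argmax() == (alpha @ b_approx).argmax()
    print(f"obs={o}  L1 error={np.abs(b_exact - b_approx).sum():.3f}  same action: {same}")
```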
APA, Harvard, Vancouver, ISO, and other styles
22

Zhang, Juan. "Higher order conditional inference using parallels with approximate Bayesian techniques." 2008. http://hdl.rutgers.edu/1782.2/rucore10001600001.ETD.000050480.

Full text
APA, Harvard, Vancouver, ISO, and other styles
23

Lin, Kuan-Hung, and 林冠宏. "An Approximate Square Prediction Criterion for H.264/AVC Mode Decision and Its VLSI Implementation." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/43285585003706455668.

Full text
Abstract:
Master's thesis
National Cheng Kung University
Department of Electrical Engineering (Master's and Ph.D. Program)
Academic year 96 (ROC calendar)
In this thesis, we propose an approximate square prediction criterion for H.264/AVC mode decision. The sum of absolute differences (SAD) and the sum of squared differences (SSD) are two popular prediction criteria in the spatial domain. SSD achieves better video quality than SAD because of the squaring operation, but the square operations incur high computational complexity and a large hardware cost. An efficient mode decision criterion is therefore proposed that maintains video quality comparable to SSD while greatly reducing computational complexity and improving hardware performance by using a first-one detector and a shifter. Simulation results show that the total number of logic gates is 31.2k and that the core size of the layout is 435,786 μm². The proposed intra prediction architecture takes 1,078 clock cycles to predict one macroblock. The proposed architecture using the SASD criterion reduces the critical time delay by more than 30% compared with an architecture using SSD. The maximum operating frequency is 133 MHz. To meet the real-time requirement, the architecture supports a maximum frame size of 720p HD (1280×720) at 30 frames/sec with a 4:2:0 sequence format.
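A software model of the criterion's core trick, replacing the exact square with a shift determined by a first-one (leading-one) detector, is sketched below and compared against SAD and SSD on a random block. The exact rounding and datapath details used in the thesis may differ; the block data here is synthetic.

```python
import random

def approx_square(x):
    """Approximate x*x by shifting x by the position of its leading one: x * 2^floor(log2 x)."""
    return 0 if x == 0 else x << (x.bit_length() - 1)

def block_costs(cur, ref):
    sad = sum(abs(c - r) for c, r in zip(cur, ref))
    ssd = sum((c - r) ** 2 for c, r in zip(cur, ref))
    sasd = sum(approx_square(abs(c - r)) for c, r in zip(cur, ref))   # approximate-square sum
    return sad, ssd, sasd

random.seed(0)
cur = [random.randrange(256) for _ in range(16 * 16)]                 # a 16x16 luma block
ref = [min(255, max(0, c + random.randint(-12, 12))) for c in cur]    # a slightly perturbed reference
sad, ssd, sasd = block_costs(cur, ref)
print(f"SAD={sad}  SSD={ssd}  SASD-style sum={sasd}  (approx/SSD = {sasd / ssd:.2f})")
```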
APA, Harvard, Vancouver, ISO, and other styles
24

Gormley, Kevin Jerome. "Research and development planning : selecting and scheduling projects with approximate solutions to a Markov decision model /." 2007. http://wwwlib.umi.com/dissertations/fullcit/3282493.

Full text
APA, Harvard, Vancouver, ISO, and other styles
25

Lakshminarayanan, Chandrashekar. "Approximate Dynamic Programming and Reinforcement Learning - Algorithms, Analysis and an Application." Thesis, 2015. http://etd.iisc.ernet.in/2005/3963.

Full text
Abstract:
Problems involving optimal sequential decision making in uncertain dynamic systems arise in domains such as engineering, science and economics. Such problems can often be cast in the framework of a Markov Decision Process (MDP). Solving an MDP requires computing the optimal value function and the optimal policy. The ideas of dynamic programming (DP) and the Bellman equation (BE) are at the heart of solution methods. The three important exact DP methods are value iteration, policy iteration and linear programming. The exact DP methods compute the optimal value function and the optimal policy. However, the exact DP methods are inadequate in practice because the state space is often large, and one might have to resort to approximate methods that compute sub-optimal policies. Further, in certain cases the system observations are known only in the form of noisy samples, and we need to design algorithms that learn from these samples. In this thesis we study interesting theoretical questions pertaining to approximate and learning algorithms, and also present an interesting application of MDPs in the domain of crowd sourcing. Approximate Dynamic Programming (ADP) methods handle the issue of a large state space by computing an approximate value function and/or a sub-optimal policy. In this thesis, we are concerned with conditions that result in provably good policies. Motivated by the limitations of the projected Bellman equation (PBE) in conventional linear algebra, we study the PBE in the (min, +) linear algebra. It is a well-known fact that deterministic optimal control problems with a cost/reward criterion are (min, +)/(max, +) linear, and ADP methods have been developed for such systems in the literature. However, it is straightforward to show that infinite-horizon discounted reward/cost MDPs are neither (min, +) nor (max, +) linear. We develop novel ADP schemes, namely Approximate Q Iteration (AQI) and Variational Approximate Q Iteration (VAQI), where the approximate solution is a (min, +) linear combination of a set of basis functions whose span constitutes a subsemimodule. We show that the new ADP methods are convergent, and we present a bound on the performance of the sub-optimal policy. The Approximate Linear Program (ALP) makes use of linear function approximation (LFA) and offers theoretical performance guarantees. Nevertheless, the ALP is difficult to solve due to the presence of a large number of constraints, and in practice a reduced linear program (RLP) is solved instead. The RLP has a tractable number of constraints sampled from the original constraints of the ALP. Though the RLP is known to perform well in experiments, theoretical guarantees are available only for a specific RLP obtained under idealized assumptions. In this thesis, we generalize the RLP to define a generalized reduced linear program (GRLP) which has a tractable number of constraints that are obtained as positive linear combinations of the original constraints of the ALP. The main contribution here is the novel theoretical framework developed to obtain error bounds for any given GRLP. Reinforcement Learning (RL) algorithms can be viewed as sample-trajectory-based solution methods for solving MDPs. Typically, RL algorithms that make use of stochastic approximation (SA) are iterative schemes taking small steps towards the desired value at each iteration. Actor-critic algorithms form an important sub-class of RL algorithms, wherein the critic is responsible for policy evaluation and the actor is responsible for policy improvement.
The actor and critic iterations have different step-size schedules; in particular, the step sizes used by the actor updates generally have to be much smaller than those used by the critic updates. Such SA schemes, which use different step-size schedules for different sets of iterates, are known as multi-timescale stochastic approximation schemes. One of the most important conditions required to ensure the convergence of the iterates of a multi-timescale SA scheme is that the iterates need to be stable, i.e., they should be uniformly bounded almost surely. However, the conditions that imply the stability of the iterates in a multi-timescale SA scheme have not been well established. In this thesis, we provide verifiable conditions that imply the stability of two-timescale stochastic approximation schemes. As an example, we also demonstrate that the stability of a widely used actor-critic RL algorithm follows from our analysis. Crowd sourcing (crowd) is a new mode of organizing work by breaking it into smaller chunks of tasks and outsourcing them to a large, distributed group of people in the form of an open call. Recently, crowd sourcing has become a major pool for human intelligence tasks (HITs) such as image labeling, form digitization, natural language processing, machine translation evaluation and user surveys. Large organizations/requesters are increasingly interested in crowd sourcing the HITs generated out of their internal requirements. Task starvation leads to huge variation in the completion times of the tasks posted to the crowd. This is an issue for frequent requesters who desire predictability in completion times, specified in terms of the percentage of tasks completed within a stipulated amount of time. An important task attribute that affects the completion time of a task is its price. However, a pricing policy that does not take the dynamics of the crowd into account might fail to achieve the desired predictability in completion times. Here, we make use of the MDP framework to compute a pricing policy that achieves predictable completion times in simulations as well as real-world experiments.
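Of the three exact DP methods named here, value iteration is the simplest. For reference, a compact implementation on a small invented MDP follows; the transition probabilities and rewards are made up for illustration.

```python
import numpy as np

GAMMA = 0.9
# Invented 3-state, 2-action MDP: transition tensor P[a, s, s'] and rewards R[s, a].
P = np.array([[[0.7, 0.3, 0.0],
               [0.0, 0.6, 0.4],
               [0.2, 0.0, 0.8]],
              [[0.1, 0.9, 0.0],
               [0.5, 0.0, 0.5],
               [0.0, 0.3, 0.7]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [3.0, 1.0]])

def value_iteration(tol=1e-8):
    """Compute the optimal value function and a greedy optimal policy."""
    V = np.zeros(P.shape[1])
    while True:
        Q = R + GAMMA * np.einsum("ast,t->sa", P, V)   # Bellman backup for every (s, a)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=1)
        V = V_new

V, policy = value_iteration()
print("optimal values:", np.round(V, 3), " greedy policy:", policy)
```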
APA, Harvard, Vancouver, ISO, and other styles