Theses on the topic "Reinforcement learning (Machine learning)"

The top 50 theses and dissertations for research on the topic "Reinforcement learning (Machine learning)".

1

Hengst, Bernhard, Computer Science & Engineering, Faculty of Engineering, UNSW. "Discovering hierarchy in reinforcement learning." Awarded by: University of New South Wales. Computer Science and Engineering, 2003. http://handle.unsw.edu.au/1959.4/20497.

Abstract:
This thesis addresses the open problem of automatically discovering hierarchical structure in reinforcement learning. Current algorithms for reinforcement learning fail to scale as problems become more complex. Many complex environments empirically exhibit hierarchy and can be modeled as interrelated subsystems, each in turn with hierarchic structure. Subsystems are often repetitive in time and space, meaning that they reoccur as components of different tasks or occur multiple times in different circumstances in the environment. A learning agent may sometimes scale to larger problems if it suc
2

Tabell, Johnsson Marco, and Ala Jafar. "Efficiency Comparison Between Curriculum Reinforcement Learning & Reinforcement Learning Using ML-Agents." Thesis, Blekinge Tekniska Högskola, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-20218.

3

Akrour, Riad. "Robust Preference Learning-based Reinforcement Learning." Thesis, Paris 11, 2014. http://www.theses.fr/2014PA112236/document.

Abstract:
The contributions of this thesis center on sequential decision making, and more specifically on Reinforcement Learning (RL). Rooted in statistical learning, like supervised and unsupervised learning, RL has gained popularity over the past two decades thanks to both applied and theoretical breakthroughs. RL assumes that the agent (the learner) and its environment follow a stochastic Markov decision process over a state space and an action space. The process is called a decision process because the agent is called upon to ch
4

Lee, Siu-keung, and 李少強. "Reinforcement learning for intelligent assembly automation." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2002. http://hub.hku.hk/bib/B31244397.

5

Tebbifakhr, Amirhossein. "Machine Translation For Machines." Doctoral thesis, Università degli studi di Trento, 2021. http://hdl.handle.net/11572/320504.

Abstract:
Traditionally, Machine Translation (MT) systems are developed by targeting fluency (i.e. output grammaticality) and adequacy (i.e. semantic equivalence with the source text) criteria that reflect the needs of human end-users. However, recent advancements in Natural Language Processing (NLP) and the introduction of NLP tools in commercial services have opened new opportunities for MT. A particularly relevant one is related to the application of NLP technologies in low-resource language settings, for which the paucity of training data reduces the possibility to train reliable services. In this s
6

Yang, Zhaoyuan Yang. "Adversarial Reinforcement Learning for Control System Design: A Deep Reinforcement Learning Approach." The Ohio State University, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=osu152411491981452.

7

Scholz, Jonathan. "Physics-based reinforcement learning for autonomous manipulation." Diss., Georgia Institute of Technology, 2015. http://hdl.handle.net/1853/54366.

Abstract:
With recent research advances, the dream of bringing domestic robots into our everyday lives has become more plausible than ever. Domestic robotics has grown dramatically in the past decade, with applications ranging from house cleaning to food service to health care. To date, the majority of the planning and control machinery for these systems are carefully designed by human engineers. A large portion of this effort goes into selecting the appropriate models and control techniques for each application, and these skills take years to master. Relieving the burden on human experts is therefo
8

Cleland, Andrew Lewis. "Bounding Box Improvement with Reinforcement Learning." PDXScholar, 2018. https://pdxscholar.library.pdx.edu/open_access_etds/4438.

Abstract:
In this thesis, I explore a reinforcement learning technique for improving bounding box localizations of objects in images. The model takes as input a bounding box already known to overlap an object and aims to improve the fit of the box through a series of transformations that shift the location of the box by translation, or change its size or aspect ratio. Over the course of these actions, the model adapts to new information extracted from the image. This active localization approach contrasts with existing bounding-box regression methods, which extract information from the image only once.
9

Piano, Francesco. "Deep Reinforcement Learning con PyTorch." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2022. http://amslaurea.unibo.it/25340/.

Abstract:
Reinforcement Learning is a research field within Machine Learning in which an agent solves problems by choosing the most suitable action to perform through an iterative learning process, in a dynamic environment that incentivizes it through rewards. Deep Learning, also a Machine Learning approach, uses an artificial neural network to apply representation learning methods in order to obtain a data structure better suited to processing. Only recently Deep Reinforcement Learning, created
10

Suggs, Sterling. "Reinforcement Learning with Auxiliary Memory." BYU ScholarsArchive, 2021. https://scholarsarchive.byu.edu/etd/9028.

Abstract:
Deep reinforcement learning algorithms typically require vast amounts of data to train to a useful level of performance. Each time new data is encountered, the network must inefficiently update all of its parameters. Auxiliary memory units can help deep neural networks train more efficiently by separating computation from storage, and providing a means to rapidly store and retrieve precise information. We present four deep reinforcement learning models augmented with external memory, and benchmark their performance on ten tasks from the Arcade Learning Environment. Our discussion and insights
11

Jesu, Alberto. "Reinforcement learning over encrypted data." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/23257/.

Abstract:
Reinforcement learning is a particular paradigm of machine learning that has recently proved time and time again to be a very effective and powerful approach. On the other hand, cryptography usually takes the opposite direction. While machine learning aims at analyzing data, cryptography aims at maintaining its privacy by hiding such data. However, the two techniques can be jointly used to create privacy-preserving models, able to make inferences on the data without leaking sensitive information. Despite the large number of studies performed on machine learning and cryptography, reinfor
12

Gustafsson, Robin, and Lucas Fröjdendahl. "Machine Learning for Traffic Control of Unmanned Mining Machines : Using the Q-learning and SARSA algorithms." Thesis, KTH, Hälsoinformatik och logistik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-260285.

Abstract:
Manual configuration of rules for unmanned mining machine traffic control can be time-consuming and therefore expensive. This paper presents a Machine Learning approach for automatic configuration of rules for traffic control in mines with autonomous mining machines by using Q-learning and SARSA. The results show that automation might be able to cut the time taken to configure traffic rules from 1-2 weeks to a maximum of approximately 6 hours which would decrease the cost of deployment. Tests show that in the worst case the developed solution is able to run continuously for 24 hours 82% of the
13

Mariani, Tommaso. "Deep reinforcement learning for industrial applications." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2020. http://amslaurea.unibo.it/20548/.

Abstract:
In recent years there has been a growing attention from the world of research and companies in the field of Machine Learning. This interest, thanks mainly to the increasing availability of large amounts of data, and the respective strengthening of the hardware sector useful for their analysis, has led to the birth of Deep Learning. The growing computing capacity and the use of mathematical optimization techniques, already studied in depth but with few applications due to a low computational power, have then allowed the development of a new approach called Reinforcement Learning. This thesis w
14

Cleland, Benjamin George. "Reinforcement Learning for Racecar Control." The University of Waikato, 2006. http://hdl.handle.net/10289/2507.

Abstract:
This thesis investigates the use of reinforcement learning to learn to drive a racecar in the simulated environment of the Robot Automobile Racing Simulator. Real-life race driving is known to be difficult for humans, and expert human drivers use complex sequences of actions. There are a large number of variables, some of which change stochastically and all of which may affect the outcome. This makes driving a promising domain for testing and developing Machine Learning techniques that have the potential to be robust enough to work in the real world. Therefore the principles of the algorithm
15

Suay, Halit Bener. "Reinforcement Learning from Demonstration." Digital WPI, 2016. https://digitalcommons.wpi.edu/etd-dissertations/173.

Abstract:
Off-the-shelf Reinforcement Learning (RL) algorithms suffer from slow learning performance, partly because they are expected to learn a task from scratch merely through an agent's own experience. In this thesis, we show that learning from scratch is a limiting factor for the learning performance, and that when prior knowledge is available RL agents can learn a task faster. We evaluate relevant previous work and our own algorithms in various experiments. Our first contribution is the first implementation and evaluation of an existing interactive RL algorithm in a real-world domain with a human
16

Pipe, Anthony Graham. "Reinforcement learning and knowledge transformation in mobile robotics." Thesis, University of the West of England, Bristol, 1997. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.364077.

17

Chalup, Stephan Konrad. "Incremental learning with neural networks, evolutionary computation and reinforcement learning algorithms." Thesis, Queensland University of Technology, 2001.

18

Le, Piane Fabio. "Training cognitivo adattativo mediante Reinforcement Learning." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2018. http://amslaurea.unibo.it/17289/.

Abstract:
Multiple sclerosis (MS) is an autoimmune disease that affects the central nervous system, causing various organic and functional alterations. In particular, a significant percentage of patients develop deficits in different cognitive domains. To limit the progression of these deficits, specialist teams have devised protocols for cognitive rehabilitation. To carry out rehabilitation sessions, patients must travel to specialized clinics, requiring the assistance of qualified staff and completing the exercises in writing on paper. Subsequently,
19

Rouet-Leduc, Bertrand. "Machine learning for materials science." Thesis, University of Cambridge, 2017. https://www.repository.cam.ac.uk/handle/1810/267987.

Abstract:
Machine learning is a branch of artificial intelligence that uses data to automatically build inferences and models designed to generalise and make predictions. In this thesis, the use of machine learning in materials science is explored, for two different problems: the optimisation of gallium nitride optoelectronic devices, and the prediction of material failure in the setting of laboratory earthquakes. Light emitting diodes based on III-nitrides quantum wells have become ubiquitous as a light source, owing to their direct band-gap that covers UV, visible and infra-red light, and their very h
20

Addis, Antonio. "Deep reinforcement learning optimization of video streaming." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2019.

Abstract:
This thesis deals with optimizing the performance of video streaming over the internet, which has become particularly problematic with the advent of new ultraHD resolutions and 360-degree videos for virtual reality. The performance obtained with current state-of-the-art algorithms will be compared, and a reinforcement learning model will be developed that can make choices to improve QoE (quality of experience) during a streaming session. For 360-degree videos, the snapchange technique will also be implemented; with this
21

Janagam, Anirudh, and Saddam Hossen. "Analysis of Network Intrusion Detection System with Machine Learning Algorithms (Deep Reinforcement Learning Algorithm)." Thesis, Blekinge Tekniska Högskola, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-17126.

22

Weideman, Ryan. "Robot Navigation in Cluttered Environments with Deep Reinforcement Learning." DigitalCommons@CalPoly, 2019. https://digitalcommons.calpoly.edu/theses/2011.

Abstract:
The application of robotics in cluttered and dynamic environments provides a wealth of challenges. This thesis proposes a deep reinforcement learning based system that determines collision free navigation robot velocities directly from a sequence of depth images and a desired direction of travel. The system is designed such that a real robot could be placed in an unmapped, cluttered environment and be able to navigate in a desired direction with no prior knowledge. Deep Q-learning, coupled with the innovations of double Q-learning and dueling Q-networks, is applied. Two modifications of this a
23

PASQUALINI, LUCA. "Real World Problems through Deep Reinforcement Learning." Doctoral thesis, Università di Siena, 2022. http://hdl.handle.net/11365/1192945.

Abstract:
Reinforcement Learning (RL) represents a very promising field under the umbrella of Machine Learning (ML). Using algorithms inspired by psychology, specifically by the Operant Conditioning of Behaviorism, RL makes it possible to solve problems from scratch, without any prior knowledge or data about the task at hand. When used in conjunction with Neural Networks (NNs), RL has proven to be especially effective: we call this Deep Reinforcement Learning (DRL). In the recent past, DRL proved super-human capabilities on many games, but its real world applications are varied and range from robotics to gener
24

Song, Yupu. "A Forex Trading System Using Evolutionary Reinforcement Learning." Digital WPI, 2017. https://digitalcommons.wpi.edu/etd-theses/1240.

Abstract:
Building automated trading systems has long been one of the most cutting-edge and exciting fields in the financial industry. In this research project, we built a trading system based on machine learning methods. We used the Recurrent Reinforcement Learning (RRL) algorithm as our fundamental algorithm, and by introducing Genetic Algorithms (GA) in the optimization procedure, we tackled the problems of picking good initial values of parameters and dynamically updating the learning speed in the original RRL algorithm. We call this optimization algorithm the Evolutionary Recurrent Reinforcement Le
25

Mitchell, Matthew Winston 1968. "An architecture for situated learning agents." Monash University, School of Computer Science and Software Engineering, 2003. http://arrow.monash.edu.au/hdl/1959.1/5553.

26

Tarbouriech, Jean. "Goal-oriented exploration for reinforcement learning." Electronic Thesis or Diss., Université de Lille (2022-....), 2022. http://www.theses.fr/2022ULILB014.

Abstract:
Learning to reach goals is a skill of great practical relevance for intelligent agents to acquire. For example, it encompasses many navigation problems (heading toward a given destination), robotic manipulation problems (reaching a given position of the robotic arm), and certain games (winning by accomplishing a given objective). As a living being interacting with the world, I am constantly motivated by reaching goals, which vary in scope and difficulty. Reinforcement Learning (RL) is a promising paradigm for formalizing and learning behav
27

Irani, Arya John. "Utilizing negative policy information to accelerate reinforcement learning." Diss., Georgia Institute of Technology, 2015. http://hdl.handle.net/1853/53481.

Abstract:
A pilot study by Subramanian et al. on Markov decision problem task decomposition by humans revealed that participants break down tasks into both short-term subgoals with a defined end-condition (such as "go to food") and long-term considerations and invariants with no end-condition (such as "avoid predators"). In the context of Markov decision problems, behaviors having clear start and end conditions are well-modeled by an abstraction known as options, but no abstraction exists in the literature for continuous constraints imposed on the agent's behavior. We propose two representations to
28

Tham, Chen Khong. "Modular on-line function approximation for scaling up reinforcement learning." Thesis, University of Cambridge, 1994. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.309702.

29

Dönmez, Halit Anil. "Collision Avoidance for Virtual Crowds Using Reinforcement Learning." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-210560.

Abstract:
Virtual crowd simulation is being used in a wide variety of applications such as video games, architectural designs and movies. It is important for creators to have a realistic crowd simulator that will be able to generate crowds that display the behaviours needed. It is important to provide an easy-to-use tool for crowd generation which is fast and realistic. Reinforcement Learning was proposed for training an agent to display a certain behaviour. In this thesis, a Reinforcement Learning approach was implemented and the generated virtual crowds were evaluated. The Q-learning method was selected
30

Svensson, Frida. "Scalable Distributed Reinforcement Learning for Radio Resource Management." Thesis, Linköpings universitet, Tillämpad matematik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-177822.

Abstract:
There is a large potential for automation and optimization in radio access networks (RANs) using a data-driven approach to efficiently handle the increase in complexity due to the steep growth in traffic and new technologies introduced with the development of 5G. Reinforcement learning (RL) has natural applications in RAN control loops such as link adaptation, interference management and power control at different timescales commonly occurring in the RAN context. Elevating the status of data-driven solutions in RAN and building a new, scalable, distributed and data-friendly RAN architecture wi
31

Larsson, Hannes. "Deep Reinforcement Learning for Cavity Filter Tuning." Thesis, Uppsala universitet, Avdelningen för beräkningsvetenskap, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-354815.

Abstract:
In this Master's thesis the option of using deep reinforcement learning for cavity filter tuning has been explored. Several reinforcement learning algorithms have been explained and discussed, and then the deep deterministic policy gradient algorithm has been used to solve a simulated filter tuning problem. Both the filter environment and the reinforcement learning agent were implemented, with the filter environment making use of existing circuit models. The reinforcement learning agent learned how to tune filters with four poles and one transmission zero, or eight tune-able screws in total. A
32

Renner, Michael Robert. "Machine Learning Simulation: Torso Dynamics of Robotic Biped." Thesis, Virginia Tech, 2007. http://hdl.handle.net/10919/34602.

Abstract:
Military, Medical, Exploratory, and Commercial robots have much to gain from exchanging wheels for legs. However, the equations of motion of dynamic bipedal walker models are highly coupled and non-linear, making the selection of an appropriate control scheme difficult. A temporal difference reinforcement learning method known as Q-learning develops complex control policies through environmental exploration and exploitation. As a proof of concept, Q-learning was applied through simulation to a benchmark single pendulum swing-up/balance task; the value function was first approximated with a loo
33

Nikolic, Marko. "Single asset trading: a recurrent reinforcement learning approach." Thesis, Mälardalens högskola, Akademin för utbildning, kultur och kommunikation, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-47505.

Abstract:
Asset trading using machine learning has become popular within the financial industry in recent years. This can for instance be seen in the large share of daily trading volume that is determined by automatic algorithms. This thesis presents a recurrent reinforcement learning model to trade an asset. The benefits, drawdowns and the derivations of the model are presented. Different parameters of the model are calibrated and tuned considering a traditional division between training and testing data sets and also with the help of nested cross validation. The results of the single asset tradin
34

Emenonye, Don-Roberts Ugochukwu. "Application of Machine Learning to Multi Antenna Transmission and Machine Type Resource Allocation." Thesis, Virginia Tech, 2020. http://hdl.handle.net/10919/99956.

Abstract:
Wireless communication systems are a well-researched area in electrical engineering that has continually evolved over the past decades. This constant evolution and development have led to well-formulated theoretical baselines in terms of reliability and efficiency. However, most communication baselines are derived by splitting the baseband communications into a series of modular blocks like modulation, coding, channel estimation, and orthogonal frequency modulation. Subsequently, these blocks are independently optimized. Although this has led to a very efficient and reliable process, a theoreti
35

Barkino, Iliam. "Summary Statistic Selection with Reinforcement Learning." Thesis, Uppsala universitet, Avdelningen för beräkningsvetenskap, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-390838.

Abstract:
Multi-armed bandit (MAB) algorithms could be used to select a subset of the k most informative summary statistics, from a pool of m possible summary statistics, by reformulating the subset selection problem as a MAB problem. This is suggested by experiments that tested five MAB algorithms (Direct, Halving, SAR, OCBA-m, and Racing) on the reformulated problem and compared the results to two established subset selection algorithms (Minimizing Entropy and Approximate Sufficiency). The MAB algorithms yielded errors on par with the established methods, but in only a fraction of the time. Establish
36

Cunha, João Alexandre da Silva Costa e. "Techniques for batch reinforcement learning in robotics." Doctoral thesis, Universidade de Aveiro, 2015. http://hdl.handle.net/10773/15735.

Abstract:
Doctorate in Computer Engineering. This thesis addresses Batch Reinforcement Learning methods in Robotics. This sub-class of Reinforcement Learning has shown promising results and has been the focus of recent research. Three contributions are proposed that aim to extend the state-of-the-art methods, allowing for a faster and more stable learning process, such as required for learning in Robotics. The Q-learning update rule is widely applied, since it allows learning without a model of the environment. However, this update rule is transition-based and does not take
37

Crandall, Jacob W. "Learning Successful Strategies in Repeated General-sum Games." Diss., 2005. http://contentdm.lib.byu.edu/ETD/image/etd1156.pdf.

38

Wingate, David. "Solving Large MDPs Quickly with Partitioned Value Iteration." Diss., 2004. http://contentdm.lib.byu.edu/ETD/image/etd437.pdf.

39

Beretta, Davide. "Experience Replay in Sparse Rewards Problems using Deep Reinforcement Techniques." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2019. http://amslaurea.unibo.it/17531/.

Abstract:
This work introduces the reader to Reinforcement Learning, an area of Machine Learning that has seen a great deal of research in recent years. It then presents several modifications to ACER, a well-known and very interesting algorithm that uses Experience Replay. The aim is to improve its performance on general problems, and especially on sparse reward problems. To assess the merit of the proposed ideas, Montezuma's Revenge is used, a game developed for the Atari 2600 and considered among the most difficult to tackle.
40

Vafaie, Parsa. "Learning in the Presence of Skew and Missing Labels Through Online Ensembles and Meta-reinforcement Learning." Thesis, Université d'Ottawa / University of Ottawa, 2021. http://hdl.handle.net/10393/42636.

Abstract:
Data streams are large sequences of data, possibly endless and temporally ordered, that are commonplace in Internet of Things (IoT) applications such as intrusion detection in computer networking, fraud detection in financial institutions, real-time tumor tracking in radiotherapy and social media analysis. Algorithms learning from such streams need to be able to construct near real-time models that continuously adapt to potential changes in patterns, in order to retain high performance throughout the stream. It follows that there are numerous challenges involved in supervised learning (or
41

Staffolani, Alessandro. "A Reinforcement Learning Agent for Distributed Task Allocation." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2020. http://amslaurea.unibo.it/20051/.

Abstract:
Nowadays reinforcement learning has proven to be very effective in machine learning across a variety of fields, such as games, speech recognition, and many others. We therefore decided to apply reinforcement learning to allocation problems, because they are a research area not yet studied with this technique and because these problems encompass in their formulation a broad set of sub-problems with similar characteristics, so that a solution for one of them extends to each of these sub-problems. In this project we
42

Ceylan, Hakan. "Using Reinforcement Learning in Partial Order Plan Space." Thesis, University of North Texas, 2006. https://digital.library.unt.edu/ark:/67531/metadc5232/.

Abstract:
Partial order planning is an important approach that solves planning problems without completely specifying the orderings between the actions in the plan. This property provides greater flexibility in executing plans; hence making the partial order planners a preferred choice over other planning methodologies. However, in order to find partially ordered plans, partial order planners perform a search in plan space rather than in space of world states and an uninformed search in plan space leads to poor efficiency. In this thesis, I discuss applying a reinforcement learning method, called First-
43

Dazeley, R. "Investigations into Playing Chess Endgames using Reinforcement Learning." Thesis, Honours thesis, University of Tasmania, 2001. https://eprints.utas.edu.au/62/1/Final_Thesis.pdf.

Abstract:
Research in computer game playing has relied primarily on brute force searching approaches rather than any formal AI method. However, these methods may not be able to exceed human ability, as they need human expert knowledge to perform as well as they do. One recently popularized field of research known as reinforcement learning has shown good prospects in overcoming these limitations when applied to non-deterministic games. This thesis investigated whether the TD(_) algorithm, one method of reinforcement learning, using standard back-propagation neural networks for function generalization,
44

Miller, Eric D. "Biased Exploration in Offline Hierarchical Reinforcement Learning." Case Western Reserve University School of Graduate Studies / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=case160768140424212.

45

Qi, Dehu. "Multi-agent systems : integrating reinforcement learning, bidding and genetic algorithms /." free to MU campus, to others for purchase, 2002. http://wwwlib.umi.com/cr/mo/fullcit?p3060133.

46

Sharma, Aakanksha. "Machine learning-based optimal load balancing in software-defined networks." Thesis, Federation University Australia, 2022. http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/188228.

Abstract:
The global advancement of the Internet of Things (IoT) has poised the existing network traffic for explosive growth. The prediction in the literature shows that in the future, trillions of smart devices will connect to transfer useful information. Accommodating such proliferation of devices in the existing network infrastructure, referred to as the traditional network, is a significant challenge due to the absence of centralized control, making it tedious to implement the device management and network protocol updates. In addition, due to their inherently distributed features, applying machine
47

Ngai, Chi-kit, and 魏智傑. "Reinforcement-learning-based autonomous vehicle navigation in a dynamically changing environment." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2007. http://hub.hku.hk/bib/B39707386.

48

Buzzoni, Michele. "Reinforcement Learning in problemi di controllo del bilanciamento." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2018. http://amslaurea.unibo.it/15539/.

Abstract:
The aim of this thesis is to study reinforcement learning algorithms capable of training an agent to interact correctly with the proposed environments in order to solve the problems presented. Specifically, the problems share a common theme: balancing, that is, problems related to equilibrium. In particular, three learning environments are presented: two are related to the well-known "cart-pole problem", in which the environment consists of a cart with a pole mounted on it. The agent, by moving the cart, must keep the pole balanced, preventing it from fall
49

Hayashi, Kazuki. "Reinforcement Learning for Optimal Design of Skeletal Structures." Doctoral thesis, Kyoto University, 2021. http://hdl.handle.net/2433/263614.

50

Kuurne, Uussilta Dennis, and Viktor Olsson. "Deep Reinforcement Learning in Cart Pole and Pong." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-293856.

Abstract:
In this project, we aim to reproduce previous results achieved with Deep Reinforcement Learning. We present the Markov Decision Process model as well as the algorithms Q-learning and Deep Q-learning Network (DQN). We implement a DQN agent, first in an environment called CartPole, and later in the game Pong. Our agent was able to solve the CartPole environment in less than 300 episodes. We assess the impact some of the parameters had on the agent's performance. The performance of the agent is particularly sensitive to the learning rate and seemingly proportional to the dimension of the neural network. Th