
Dissertations / Theses on the topic 'Actor-critic learning'

Consult the top 22 dissertations / theses for your research on the topic 'Actor-critic learning.'


1

Baheri, Betis. "MARS: Multi-Scalable Actor-Critic Reinforcement Learning Scheduler." Kent State University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=kent1595039454920637.

2

Hafez, Muhammad Burhan. "Intrinsically Motivated Actor-Critic for Robot Motor Learning." Doctoral thesis, supervised by Stefan Wermter. Hamburg: Staats- und Universitätsbibliothek Hamburg, 2020. http://d-nb.info/121148002X/34.

3

Andersson, Marcus. "Complexity and problem solving : A tale of two systems." Thesis, Umeå universitet, Institutionen för psykologi, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-150937.

Abstract:
The purpose of this thesis is to investigate whether increasing the complexity of a problem makes a difference for a learning system with two parts. The two parts of the learning system are modelled on the Actor and Critic components of the Actor-Critic algorithm, within the reinforcement learning framework. The results indicate that no difference can be found in the relative performance of the Actor and Critic parts as the complexity of the problem increases. These results could depend on technical difficulties in comparing the environments and the algorithms. The difference in complexity would…
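
For orientation, the two-part system this abstract refers to can be summarized in a minimal tabular actor-critic sketch (illustrative only; the toy problem size, step sizes, and variable names below are assumptions, not taken from the thesis):

```python
import numpy as np

# Minimal tabular actor-critic sketch (illustrative, not the thesis's code).
# The critic learns state values V(s); the actor keeps action preferences
# H(s, a) turned into a softmax policy. Both parts learn from one TD error.

n_states, n_actions = 10, 2              # assumed toy problem size
alpha_actor, alpha_critic, gamma = 0.1, 0.2, 0.99

V = np.zeros(n_states)                   # critic: state-value estimates
H = np.zeros((n_states, n_actions))      # actor: action preferences

def policy(s):
    """Softmax over the actor's preferences in state s."""
    p = np.exp(H[s] - H[s].max())
    return p / p.sum()

def step_update(s, a, r, s_next, done):
    """One actor-critic update driven by the TD error."""
    target = r + (0.0 if done else gamma * V[s_next])
    td_error = target - V[s]
    V[s] += alpha_critic * td_error      # critic moves toward the target
    grad_log_pi = -policy(s)             # d/dH log pi(a|s) for softmax prefs
    grad_log_pi[a] += 1.0
    H[s] += alpha_actor * td_error * grad_log_pi   # actor follows the critic
```
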
4

Ari, Evrim Onur. "Fuzzy Actor-critic Learning Based Intelligent Controller For High-level Motion Control Of Serpentine Robots." Master's thesis, METU, 2005. http://etd.lib.metu.edu.tr/upload/12606777/index.pdf.

Abstract:
In this thesis, an intelligent controller architecture for gait selection of a serpentine robot intended for use in search-and-rescue tasks is designed, developed, and simulated. The architecture is independent of the robot's configuration, and the robot is allowed to make different kinds of movements, similar to grasping. Moreover, the architecture is amenable to parallel processing in several respects, being an implementation of a controller network on top of the robot's segment network. In the architecture, several behaviours are defined for each of the segments. Every behaviour is realized in the form of a Fuzzy…
5

Thomas, Philip S. "A Reinforcement Learning Controller for Functional Electrical Stimulation of a Human Arm." Case Western Reserve University School of Graduate Studies / OhioLINK, 2009. http://rave.ohiolink.edu/etdc/view?acc_num=case1246922202.

6

Vassilo, Kyle. "Single Image Super Resolution with Infrared Imagery and Multi-Step Reinforcement Learning." University of Dayton / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1606146042238906.

7

Khamassi, Mehdi. "Rôles complémentaires du cortex préfrontal et du striatum dans l'apprentissage et le changement de stratégies de navigation fondées sur la récompense chez le rat : études électrophysiologiques et computationnelles : application à la robotique autonome simulée." Paris 6, 2007. https://tel.archives-ouvertes.fr/tel-00688927.

Abstract:
To reach resources efficiently, mammals are able to follow different "navigation strategies" and can switch from one strategy to another when the demands of the environment change. A distinction is drawn, in particular, between stimulus-response (S-R) strategies and more elaborate strategies that require building a model of the world. This thesis adopts a multidisciplinary approach (neurophysiology, computational modelling, robotic simulation) to clarify certain roles of the striatum and the medial prefrontal cortex (mPFC) in learning…
8

Sola, Yoann. "Contributions to the development of deep reinforcement learning-based controllers for AUV." Thesis, Brest, École nationale supérieure de techniques avancées Bretagne, 2021. http://www.theses.fr/2021ENTA0015.

Abstract:
The marine environment is a very hostile setting for robotics. It is highly unstructured, very uncertain, and includes many external disturbances that cannot easily be predicted or modelled. In this work, we try to control an autonomous underwater vehicle (AUV) performing a waypoint-tracking task, using a controller based on machine learning. Machine learning has enabled impressive progress in many different fields in recent years, and the subfield of…
9

Pergolini, Diego. "Reinforcement Learning: un caso di studio nell'ambito della Animal-AI Olympics." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2019. http://amslaurea.unibo.it/19415/.

Abstract:
Reinforcement Learning (RL) has achieved great results in recent years, surpassing humans at Atari games, at Go, and in e-sports such as StarCraft and Dota 2. But can RL take on complex challenges that demand a range of cognitive abilities to handle complex and varied situations, as an animal or a human would? Demonstrating this requires a robust and meaningful benchmark, such as the Animal-AI Olympics competition. Its organizers provide an arena in which an agent moves and interacts with various objects, with the goal of obtaining…
10

Barakat, Anas. "Contributions to non-convex stochastic optimization and reinforcement learning." Electronic Thesis or Diss., Institut polytechnique de Paris, 2021. http://www.theses.fr/2021IPPAT030.

Abstract:
This thesis centres on the convergence analysis of certain stochastic approximation algorithms used in machine learning, as applied to optimization and reinforcement learning. The first part of the thesis is devoted to a well-known deep learning algorithm called ADAM, used for training neural networks. This popular variant of stochastic gradient descent is used more generally to search for a local minimizer of a function. Assuming the objective function is differentiable and non-convex, we establish…
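
For reference, the ADAM recursion whose convergence such analyses address has the following standard textbook form (a sketch with the usual default hyperparameters; this is not the thesis's notation or code):

```python
import numpy as np

# One ADAM step in its standard form (sketch; defaults are the common ones).
def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad        # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias corrections, t >= 1
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```
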
11

Zimmer, Matthieu. "Apprentissage par renforcement développemental." Thesis, Université de Lorraine, 2018. http://www.theses.fr/2018LORR0008/document.

Abstract:
Reinforcement learning allows an agent to learn a behaviour that has never been defined beforehand by a human. The agent discovers the environment and the various consequences of its actions through its interactions with it: it learns from its own experience, without prior knowledge of the goals or of the effects of its actions. This thesis examines how deep learning can help reinforcement learning cope with continuous spaces and environments with many degrees of freedom, with a view to solving…
12

Zimmer, Matthieu. "Apprentissage par renforcement développemental." Electronic Thesis or Diss., Université de Lorraine, 2018. http://www.theses.fr/2018LORR0008.

Abstract:
Reinforcement learning allows an agent to learn a behaviour that has never been defined beforehand by a human. The agent discovers the environment and the various consequences of its actions through its interactions with it: it learns from its own experience, without prior knowledge of the goals or of the effects of its actions. This thesis examines how deep learning can help reinforcement learning cope with continuous spaces and environments with many degrees of freedom, with a view to solving…
13

Khamassi, Mehdi. "Rôles complémentaires du cortex préfrontal et du striatum dans l'apprentissage et le changement de stratégies de navigation basées sur la récompense chez le rat." Phd thesis, Université Pierre et Marie Curie - Paris VI, 2007. http://tel.archives-ouvertes.fr/tel-00688927.

Abstract:
Mammals are able to follow different navigation behaviours, defined as "strategies" that do not necessarily involve conscious processes, depending on the specific task they have to solve. In some cases, where a visual cue indicates the goal, they can follow a simple stimulus-response (S-R) strategy. In contrast, other tasks require the animal to deploy a more complex strategy based on building some representation of space that allows it to locate itself and the goal within the environment. In order to…
14

Niedzwiedz, Christopher Allen. "Vision-Based Reinforcement Learning Using A Consolidated Actor-Critic Model." 2009. http://trace.tennessee.edu/utk_gradthes/548.

Abstract:
Vision-based machine learning agents are tasked with making decisions based on high-dimensional, noisy input, placing a heavy load on available resources. Moreover, observations typically provide only partial information about the environment state, necessitating robust state inference by the agent. Reinforcement learning provides a framework for decision making with the goal of maximizing long-term reward. This thesis introduces a novel approach to vision-based reinforcement learning through the use of a consolidated actor-critic model (CACM). The approach takes advantage of artificial…
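
The "consolidated" aspect, a single network serving both roles, can be pictured with a shared-trunk sketch (a guess at the general shape only; layer sizes and names are assumptions, and the thesis's actual CACM architecture may differ):

```python
import torch
import torch.nn as nn

# Sketch of a consolidated actor-critic: one network whose shared trunk
# feeds both a policy head (actor) and a value head (critic), so the two
# parts share their learned features. Sizes are illustrative assumptions.

class ConsolidatedActorCritic(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.policy_head = nn.Linear(hidden, n_actions)   # actor logits
        self.value_head = nn.Linear(hidden, 1)            # critic value

    def forward(self, obs: torch.Tensor):
        h = self.trunk(obs)
        return self.policy_head(h), self.value_head(h).squeeze(-1)
```
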
15

Mazuecos, Perez Mauricio Diego. "Estudio de simplificación de oraciones con modelos actor-critic." Bachelor's thesis, 2019. http://hdl.handle.net/11086/11914.

Abstract:
Sentence simplification is a Natural Language Processing task that focuses on transforming text so that its grammar, structure, and vocabulary are easier to understand, without losing the semantics of the original sentence. As such, it is not a simple task to tackle, and it requires sophisticated methods capable of defining the features that make a sentence simple. At the same time, these models must hold an adequate representation of the meaning of the sentence, which must not be altered during the simplification process. This thesis explores the…
16

Saxena, Naman. "Average Reward Actor-Critic with Deterministic Policy Search." Thesis, 2023. https://etd.iisc.ac.in/handle/2005/6175.

Abstract:
The average reward criterion is relatively little studied, as most existing works in the Reinforcement Learning literature consider the discounted reward criterion. A few recent works present on-policy average reward actor-critic algorithms, but the average reward off-policy actor-critic setting remains relatively unexplored. In this work, we present both on-policy and off-policy deterministic policy gradient theorems for the average reward performance criterion. Using these theorems, we also present an Average Reward Off-Policy Deep Deterministic Policy Gradient (ARO-DDPG) algorithm. We first show…
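
The average reward criterion mentioned here replaces discounting with an estimate of the per-step average reward; in actor-critic form this shows up in the temporal-difference error, sketched below in its standard shape (not the thesis's ARO-DDPG implementation; the step size is an assumption):

```python
# Average-reward TD error in its standard form: rho estimates the long-run
# reward per step and takes the place of the discount factor.

def average_reward_td_error(r, v_s, v_s_next, rho):
    """delta = r - rho + V(s') - V(s); note there is no discount factor."""
    return r - rho + v_s_next - v_s

def update_rho(rho, td_error, beta=0.01):
    """Track the average reward with the same TD error (beta is assumed)."""
    return rho + beta * td_error
```
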
17

Chen, Hsin-Chang (陳信璋). "Actor-Critic Reinforcement Learning for Controller Design of Full Vehicle Active Suspension System." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/pxdd89.

Abstract:
Master's thesis, Feng Chia University, Department of Automatic Control Engineering, year 106 (ROC calendar). Advances in science and technology drive the evolution of intelligent vehicle technologies. To enhance ride comfort, handling performance, and safety, vehicle suspension systems have become increasingly sophisticated. From variable-stiffness springs to the air springs of passive suspensions, from fixed-gain dampers to the adjustable dynamic dampers of semi-active suspensions, and on to electromagnetic active suspensions, advanced suspension has become a trend in automotive engineering. This thesis focuses mainly on the development of the active control…
18

Pereira, Bruno Alexandre Barbosa. "Deep reinforcement learning for robotic manipulation tasks." Master's thesis, 2021. http://hdl.handle.net/10773/33654.

Abstract:
The recent advances in Artificial Intelligence (AI) present new opportunities for robotics on many fronts. Deep Reinforcement Learning (DRL) is a sub-field of AI which results from the combination of Deep Learning (DL) and Reinforcement Learning (RL). It categorizes machine learning algorithms that learn directly from experience, and it offers a comprehensive framework for studying the interplay among learning, representation, and decision-making. It has already been used successfully to solve tasks in many domains. Most notably, DRL agents have learned to play Atari 2600 video games directly from…
19

Diddigi, Raghuram Bharadwaj. "Reinforcement Learning Algorithms for Off-Policy, Multi-Agent Learning and Applications to Smart Grids." Thesis, 2022. https://etd.iisc.ac.in/handle/2005/5673.

Abstract:
Reinforcement Learning (RL) algorithms are a popular class of algorithms for training an agent to learn desired behaviour through interaction with an environment whose dynamics are unknown to the agent. RL algorithms combined with neural network architectures have enjoyed much success in various disciplines such as games, medicine, energy management, economics, and supply chain management. In this thesis, we study interesting extensions of the standard single-agent RL setting, namely off-policy and multi-agent settings. We discuss the motivation and importance of these settings and propose convergent…
20

Duarte, Ana Filipa de Sampaio Calçada. "Using Reinforcement Learning in the tuning of Central Pattern Generators." Master's thesis, 2012. http://hdl.handle.net/1822/28037.

Abstract:
Master's dissertation in Informatics Engineering. The aim of this work is to apply Reinforcement Learning techniques to robot learning and locomotion tasks. Reinforcement Learning is a useful learning technique for robot locomotion because of the emphasis it places on direct interaction between the agent and the environment, and because it requires neither supervision nor complete models, unlike classical approaches. The goal of this technique is to decide which actions to take so as to maximize a cumulative reward, taking into account the fact…
21

Lakshmanan, K. "Online Learning and Simulation Based Algorithms for Stochastic Optimization." Thesis, 2012. http://etd.iisc.ac.in/handle/2005/3245.

Abstract:
In many optimization problems, the relationship between the objective and the parameters is not known. The objective function itself may be stochastic, such as a long-run average over some random cost samples. In such cases, finding the gradient of the objective is not possible, and it is in this setting that stochastic approximation algorithms are used. These algorithms use estimates of the gradient and are stochastic in nature. Among gradient estimation techniques, Simultaneous Perturbation Stochastic Approximation (SPSA) and the Smoothed Functional (SF) scheme are widely used. In this thesis we have…
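
SPSA, named in the abstract, estimates a gradient from only two noisy function evaluations by perturbing every coordinate at once; a minimal sketch follows (standard two-measurement form with illustrative constants, not the thesis's algorithms):

```python
import numpy as np

# Standard two-measurement SPSA gradient estimate (sketch).
def spsa_gradient(f, theta, c=0.1, rng=np.random.default_rng(0)):
    delta = rng.choice([-1.0, 1.0], size=theta.shape)   # Rademacher directions
    # Elementwise division by delta is valid because each entry is +/-1.
    return (f(theta + c * delta) - f(theta - c * delta)) / (2.0 * c * delta)

# Usage (illustrative): theta -= step_size * spsa_gradient(noisy_cost, theta)
```
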
22

Lakshmanan, K. "Online Learning and Simulation Based Algorithms for Stochastic Optimization." Thesis, 2012. http://hdl.handle.net/2005/3245.

Abstract:
In many optimization problems, the relationship between the objective and the parameters is not known. The objective function itself may be stochastic, such as a long-run average over some random cost samples. In such cases, finding the gradient of the objective is not possible, and it is in this setting that stochastic approximation algorithms are used. These algorithms use estimates of the gradient and are stochastic in nature. Among gradient estimation techniques, Simultaneous Perturbation Stochastic Approximation (SPSA) and the Smoothed Functional (SF) scheme are widely used. In this thesis we have…