Academic literature on the topic 'Actor-critic learning'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Actor-critic learning.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "Actor-critic learning"

1. Zheng, Liyuan, Tanner Fiez, Zane Alumbaugh, Benjamin Chasnov, and Lillian J. Ratliff. "Stackelberg Actor-Critic: Game-Theoretic Reinforcement Learning Algorithms." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (2022): 9217–24. http://dx.doi.org/10.1609/aaai.v36i8.20908.

Abstract:
The hierarchical interaction between the actor and critic in actor-critic based reinforcement learning algorithms naturally lends itself to a game-theoretic interpretation. We adopt this viewpoint and model the actor and critic interaction as a two-player general-sum game with a leader-follower structure known as a Stackelberg game. Given this abstraction, we propose a meta-framework for Stackelberg actor-critic algorithms where the leader player follows the total derivative of its objective instead of the usual individual gradient. From a theoretical standpoint, we develop a policy gradient t…
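The leader-follower update described in this abstract is easy to illustrate on a toy two-player quadratic game. The sketch below is a minimal illustration, not the paper's code: the objectives, step sizes, and scalar parameters are all assumptions. The leader descends the total derivative of its loss, where the correction term comes from implicitly differentiating the follower's first-order condition.

```python
# Toy scalar Stackelberg game (illustrative objectives only, not the paper's).
# Leader x (actor) follows the TOTAL derivative of f1, accounting for how the
# follower y (critic) responds, instead of the usual individual gradient.

def stackelberg_step(x, y, lr=0.1):
    # f1(x, y) = 0.5*x**2 + x*y       (leader loss)
    # f2(x, y) = 0.5*(y - 2*x)**2     (follower loss, best response y*(x) = 2x)
    g1x, g1y = x + y, x               # partial gradients of f1
    g2y = y - 2.0 * x                 # follower's own gradient of f2
    h_yy, h_yx = 1.0, -2.0            # second derivatives of f2
    # implicit function theorem: dy*/dx = -(d2f2/dy2)^-1 * (d2f2/dydx) = 2
    total_g1 = g1x - (h_yx / h_yy) * g1y
    return x - lr * total_g1, y - lr * g2y

x, y = 1.0, 0.0
for _ in range(200):
    x, y = stackelberg_step(x, y)
print(x, y)  # both iterates spiral in toward the Stackelberg equilibrium (0, 0)
```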
2. Kim, Jong-Ho, Dae-Sung Kang, and Joo-Young Park. "Robot Locomotion via RLS-based Actor-Critic Learning." Journal of Fuzzy Logic and Intelligent Systems 15, no. 7 (2005): 893–98. http://dx.doi.org/10.5391/jkiis.2005.15.7.893.

3. Wang, Fulin, Bozhao Wang, and Wentao Tang. "Reinforcement Learning Optimization Scheduling of Micro-integrated Energy Systems Under Uncertain Factors." Journal of Physics: Conference Series 3012, no. 1 (2025): 012089. https://doi.org/10.1088/1742-6596/3012/1/012089.

Abstract:
This study focuses on the optimal scheduling of integrated energy systems under uncertain factors. In view of the impact of uncertainty in wind power and photovoltaic output on system scheduling, an optimization strategy based on the improved Soft Actor-Critic algorithm is proposed. This challenge is met by simulating uncertainty data and improving the Soft Actor-Critic objective function. A mathematical model of a micro-integrated energy system is constructed in this paper, covering equipment and uncertainty models, and an optimal scheduling model is constructed based on the Soft Act…
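For context, the standard Soft Actor-Critic objective that work of this kind modifies is the entropy-regularized return (this is the textbook form due to Haarnoja et al., not the paper's improved variant):

```latex
J(\pi) \;=\; \sum_{t}\,\mathbb{E}_{(s_t, a_t) \sim \rho_\pi}
\Big[\, r(s_t, a_t) \;+\; \alpha\,\mathcal{H}\big(\pi(\cdot \mid s_t)\big) \Big]
```

Here \(\alpha\) is the temperature that weighs the policy-entropy bonus \(\mathcal{H}\) against the reward; modifying this objective is one common lever for handling uncertainty.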
4. Grondman, Ivo, Maarten Vaandrager, Lucian Busoniu, Robert Babuska, and Erik Schuitema. "Actor-Critic Control with Reference Model Learning." IFAC Proceedings Volumes 44, no. 1 (2011): 14723–28. http://dx.doi.org/10.3182/20110828-6-it-1002.00759.

5. Chen, Ruidong, and Jesse H. Goldberg. "Actor-critic reinforcement learning in the songbird." Current Opinion in Neurobiology 65 (December 2020): 1–9. http://dx.doi.org/10.1016/j.conb.2020.08.005.

6. Wang, Xue-Song, Yu-Hu Cheng, and Jian-Qiang Yi. "A fuzzy Actor–Critic reinforcement learning network." Information Sciences 177, no. 18 (2007): 3764–81. http://dx.doi.org/10.1016/j.ins.2007.03.012.

7. Wang, Mingyi, Jianhao Tang, Haoli Zhao, Zhenni Li, and Shengli Xie. "Automatic Compression of Neural Network with Deep Reinforcement Learning Based on Proximal Gradient Method." Mathematics 11, no. 2 (2023): 338. http://dx.doi.org/10.3390/math11020338.

Abstract:
In recent years, model compression techniques have proven very effective for compressing deep neural networks. However, many existing model compression methods rely heavily on human experience to explore a compression strategy balancing network structure, speed, and accuracy, which is usually suboptimal and time-consuming. In this paper, we propose a framework for automatically compressing models through actor–critic structured deep reinforcement learning (DRL), which interacts with each layer in the neural network, where the actor network determines the compression strategy and the critic network e…
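A minimal sketch of the per-layer interaction this abstract describes might look as follows. The 5-dimensional layer encoding, the network sizes, and the use of a pruning ratio as the action are assumptions for illustration, not details from the paper.

```python
import torch
import torch.nn as nn

# Sketch: the actor proposes a compression (pruning) ratio for each layer and
# the critic scores each (layer features, ratio) pair. All sizes are assumed.

layer_feat_dim = 5  # e.g., layer index, #params, FLOPs, ... (assumed encoding)
actor = nn.Sequential(nn.Linear(layer_feat_dim, 32), nn.ReLU(),
                      nn.Linear(32, 1), nn.Sigmoid())          # ratio in (0, 1)
critic = nn.Sequential(nn.Linear(layer_feat_dim + 1, 32), nn.ReLU(),
                       nn.Linear(32, 1))                       # value estimate

features = torch.randn(4, layer_feat_dim)               # one row per layer
ratios = actor(features)                                 # compression strategy
values = critic(torch.cat([features, ratios], dim=1))   # critic's evaluation
```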
8. Tahami, Ehsan, Amir Homayoun Jafari, and Ali Fallah. "Application of an Evolutionary Actor–Critic Reinforcement Learning Method for the Control of a Three-Link Musculoskeletal Arm During a Reaching Movement." Journal of Mechanics in Medicine and Biology 13, no. 02 (2013): 1350040. http://dx.doi.org/10.1142/s0219519413500401.

Abstract:
In this paper, the control of a planar three-link musculoskeletal arm by using an evolutionary actor–critic reinforcement learning (RL) method during a reaching movement to a stationary target is presented. The arm model used in this study included three skeletal links (wrist, forearm, and upper arm), three joints (wrist, elbow, and shoulder without redundancy), and six non-linear monoarticular muscles (with redundancy), which were based on the Hill model. The learning control system was composed of actor, critic, and genetic algorithm (GA) parts. Two single-layer neural networks were used for…
9. Kong, Lingzhi. "A Vehicle Adaptive Cruise Control Method Based on Deep Reinforcement Learning Algorithm." Applied and Computational Engineering 125, no. 1 (2025): 102–9. https://doi.org/10.54254/2755-2721/2025.20196.

Abstract:
Adaptive cruise control (ACC) is an upgrade to the traditional cruise control system in vehicles. This paper presents an adaptive cruise control method based on the Deep Deterministic Policy Gradient (DDPG) algorithm. The Actor network computes actions based on the current state, adding noise to enhance exploration. The Critic network computes the Q-value of the current state-action pair. The Actor target network and Critic target network compute the Q-value of the next state-action pair. The gradient descent method is used to minimize the loss function of the Critic network, which includes th…
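The update loop this abstract walks through (actor with exploration noise, critic TD loss against the two target networks, then a policy-gradient step on the actor) can be sketched in a few lines of PyTorch. Dimensions, learning rates, and the soft target-update rate below are illustrative assumptions, not values from the paper.

```python
import copy
import torch
import torch.nn as nn

# Sketch of one DDPG update with the components the abstract names: actor and
# critic networks, their target copies, exploration noise, and gradient descent
# on the critic's TD loss. Sizes and hyperparameters are illustrative only.

state_dim, action_dim, gamma, tau = 4, 1, 0.99, 0.005

actor = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                      nn.Linear(64, action_dim), nn.Tanh())
critic = nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
                       nn.Linear(64, 1))
actor_tgt, critic_tgt = copy.deepcopy(actor), copy.deepcopy(critic)
opt_a = torch.optim.Adam(actor.parameters(), lr=1e-4)
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-3)

def act(state, noise_std=0.1):
    # actor computes the action; Gaussian noise enhances exploration
    with torch.no_grad():
        return (actor(state) + noise_std * torch.randn(action_dim)).clamp(-1, 1)

def update(s, a, r, s2, done):
    with torch.no_grad():  # TD target from the two target networks
        q_next = critic_tgt(torch.cat([s2, actor_tgt(s2)], dim=1))
        target = r + gamma * (1.0 - done) * q_next
    critic_loss = nn.functional.mse_loss(critic(torch.cat([s, a], dim=1)), target)
    opt_c.zero_grad(); critic_loss.backward(); opt_c.step()

    actor_loss = -critic(torch.cat([s, actor(s)], dim=1)).mean()
    opt_a.zero_grad(); actor_loss.backward(); opt_a.step()

    for net, tgt in ((actor, actor_tgt), (critic, critic_tgt)):  # Polyak update
        for p, pt in zip(net.parameters(), tgt.parameters()):
            pt.data.mul_(1.0 - tau).add_(tau * p.data)

b = 32  # one update on a random batch, just to show the shapes involved
update(torch.randn(b, state_dim), torch.rand(b, action_dim) * 2 - 1,
       torch.randn(b, 1), torch.randn(b, state_dim), torch.zeros(b, 1))
```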
10. Hatakeyama, Hiroyuki, Shingo Mabu, Kotaro Hirasawa, and Jinglu Hu. "Genetic Network Programming with Actor-Critic." Journal of Advanced Computational Intelligence and Intelligent Informatics 11, no. 1 (2007): 79–86. http://dx.doi.org/10.20965/jaciii.2007.p0079.

Abstract:
A new graph-based evolutionary algorithm named "Genetic Network Programming, GNP" has already been proposed. GNP represents its solutions as graph structures, which can improve the expression ability and performance. In addition, GNP with Reinforcement Learning (GNP-RL) was proposed a few years ago. Since GNP-RL can do reinforcement learning during task execution in addition to evolution after task execution, it can search for solutions efficiently. In this paper, GNP with Actor-Critic (GNP-AC), which is a new type of GNP-RL, is proposed. Originally, GNP deals with discrete information, but GNP-…

Dissertations / Theses on the topic "Actor-critic learning"

1. Baheri, Betis. "MARS: Multi-Scalable Actor-Critic Reinforcement Learning Scheduler." Kent State University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=kent1595039454920637.

2. Hafez, Muhammad Burhan. "Intrinsically Motivated Actor-Critic for Robot Motor Learning." Doctoral dissertation, supervised by Stefan Wermter. Hamburg: Staats- und Universitätsbibliothek Hamburg, 2020. http://d-nb.info/121148002X/34.

3. Andersson, Marcus. "Complexity and problem solving: A tale of two systems." Thesis, Umeå universitet, Institutionen för psykologi, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-150937.

Abstract:
The purpose of this thesis is to investigate whether increasing the complexity of a problem makes a difference for a learning system with dual parts. The dual parts of the learning system are modelled after the Actor and Critic parts of the Actor-Critic algorithm, using the reinforcement learning framework. The results indicate that no difference can be found in the relative performance of the Actor and Critic parts when increasing the complexity of a problem. These results could depend on technical difficulties in comparing the environments and the algorithms. The difference in complexity woul…
4. Ari, Evrim Onur. "Fuzzy Actor-Critic Learning Based Intelligent Controller for High-Level Motion Control of Serpentine Robots." Master's thesis, METU, 2005. http://etd.lib.metu.edu.tr/upload/12606777/index.pdf.

Abstract:
In this thesis, an intelligent controller architecture for gait selection of a serpentine robot intended to be used in search and rescue tasks is designed, developed, and simulated. The architecture is independent of the configuration of the robot, and the robot is allowed to make different kinds of movements, similar to grasping. Moreover, it is amenable to parallel processing in several aspects, and it implements a controller network on the robot's segment network. In the architecture, several behaviors are defined for each of the segments. Every behavior is realized in the form of Fuzzy…
5. Thomas, Philip S. "A Reinforcement Learning Controller for Functional Electrical Stimulation of a Human Arm." Case Western Reserve University School of Graduate Studies / OhioLINK, 2009. http://rave.ohiolink.edu/etdc/view?acc_num=case1246922202.

6. Vassilo, Kyle. "Single Image Super Resolution with Infrared Imagery and Multi-Step Reinforcement Learning." University of Dayton / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1606146042238906.

7. Khamassi, Mehdi. "Rôles complémentaires du cortex préfrontal et du striatum dans l'apprentissage et le changement de stratégies de navigation fondées sur la récompense chez le rat : études électrophysiologiques et computationnelles : application à la robotique autonome simulée." Paris 6, 2007. https://tel.archives-ouvertes.fr/tel-00688927.

Abstract:
To reach resources efficiently, mammals have the ability to follow different "navigation strategies" and can switch from one strategy to another when the demands of the environment change. A distinction is made in particular between stimulus-response (S-R) strategies and more elaborate strategies that require building a model of the world. This thesis adopts a multidisciplinary approach (neurophysiology, computational modeling, robotic simulation) to clarify certain roles of the striatum and the medial prefrontal cortex (mPFC) in learning…
8. Sola, Yoann. "Contributions to the development of deep reinforcement learning-based controllers for AUV." Thesis, Brest, École nationale supérieure de techniques avancées Bretagne, 2021. http://www.theses.fr/2021ENTA0015.

Abstract:
The marine environment is a very hostile setting for robotics. It is highly unstructured, highly uncertain, and includes many external disturbances that cannot easily be predicted or modeled. In this work, we attempt to control an autonomous underwater vehicle (AUV) to perform a waypoint-tracking task, using a controller based on machine learning. Machine learning has enabled impressive progress in many different fields in recent years, and the subfield of…
9. Pergolini, Diego. "Reinforcement Learning: un caso di studio nell'ambito della Animal-AI Olympics." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2019. http://amslaurea.unibo.it/19415/.

Abstract:
Reinforcement learning (RL) has achieved great results in recent years, surpassing humans in Atari games, in Go, and in e-sports such as StarCraft and Dota 2. But can RL tackle complex challenges that require a range of cognitive abilities to handle complex and varied situations successfully, as an animal or a human could? Demonstrating this requires a robust and meaningful benchmark, such as the Animal-AI Olympics competition. Its organizers provide an arena in which an agent moves and interacts with various objects, with the goal of obtaining…
10. Barakat, Anas. "Contributions to non-convex stochastic optimization and reinforcement learning." Electronic thesis or dissertation, Institut polytechnique de Paris, 2021. http://www.theses.fr/2021IPPAT030.

Abstract:
This thesis centers on the convergence analysis of certain stochastic approximation algorithms used in machine learning, applied to optimization and reinforcement learning. The first part of the thesis is devoted to a celebrated deep learning algorithm called ADAM, used to train neural networks. This well-known variant of stochastic gradient descent is used more generally to find a local minimizer of a function. Assuming that the objective function is differentiable and non-convex, we establish…

Book chapters on the topic "Actor-critic learning"

1. Peters, Jan, Sethu Vijayakumar, and Stefan Schaal. "Natural Actor-Critic." In Machine Learning: ECML 2005. Springer Berlin Heidelberg, 2005. http://dx.doi.org/10.1007/11564096_29.

2. Zhao, Shiyu. "Actor-Critic Methods." In Mathematical Foundations of Reinforcement Learning. Springer Nature Singapore, 2025. https://doi.org/10.1007/978-981-97-3944-8_10.

3. Xiao, Zhiqing. "AC: Actor–Critic." In Reinforcement Learning. Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-19-4933-3_8.

4. Sewak, Mohit. "Actor-Critic Models and the A3C." In Deep Reinforcement Learning. Springer Singapore, 2019. http://dx.doi.org/10.1007/978-981-13-8285-7_11.

5. Shang, Wenling, Douwe van der Wal, Herke van Hoof, and Max Welling. "Stochastic Activation Actor Critic Methods." In Machine Learning and Knowledge Discovery in Databases. Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-46133-1_7.

6. Tihomirov, Yunes, Roman Rybka, Alexey Serenko, and Alexander Sboev. "Actor-Critic Spiking Neural Network with RSTDP Actor Learning and TD-LTP Critic Learning." In Studies in Computational Intelligence. Springer Nature Switzerland, 2024. https://doi.org/10.1007/978-3-031-76516-2_41.

7. Röder, Frank, Manfred Eppe, Phuong D. H. Nguyen, and Stefan Wermter. "Curious Hierarchical Actor-Critic Reinforcement Learning." In Artificial Neural Networks and Machine Learning – ICANN 2020. Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-61616-8_33.

8. Zhang, Hongming, Tianyang Yu, and Ruitong Huang. "Combine Deep Q-Networks with Actor-Critic." In Deep Reinforcement Learning. Springer Singapore, 2020. http://dx.doi.org/10.1007/978-981-15-4095-0_6.

9. Zai, Alexander, and Brandon Brown. "Bewältigung komplexerer Probleme mit Actor-Critic-Methoden" [Tackling more complex problems with actor-critic methods]. In Einstieg in Deep Reinforcement Learning. Carl Hanser Verlag GmbH & Co. KG, 2020. http://dx.doi.org/10.3139/9783446466081.005.

10. Zheng, Jiaohao, Mehmet Necip Kurt, and Xiaodong Wang. "Integrated Actor-Critic for Deep Reinforcement Learning." In Lecture Notes in Computer Science. Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-86380-7_41.


Conference papers on the topic "Actor-critic learning"

1. Wang, Junqiu, Feng Xiang, and Bo Liu. "Ameliorating Learning Stability by Regularizing Soft Actor-Critic." In 2024 IEEE International Conference on Robotics and Biomimetics (ROBIO). IEEE, 2024. https://doi.org/10.1109/robio64047.2024.10907516.

2. Salaje, Amine, Thomas Chevet, and Nicolas Langlois. "Learning-based Nonlinear Model Predictive Control Using Deterministic Actor-Critic with Gradient Q-learning Critic." In 2024 18th International Conference on Control, Automation, Robotics and Vision (ICARCV). IEEE, 2024. https://doi.org/10.1109/icarcv63323.2024.10821506.

3. Nimo, Alejo Domínguez, Joaquín Mariano Piñeiro, Javier Esarte, Pablo Daniel Folino, and Sergio Alberino. "Learning Quadrupedal Motion Through Multi-Objective Soft Actor-Critic." In 2024 IEEE Biennial Congress of Argentina (ARGENCON). IEEE, 2024. http://dx.doi.org/10.1109/argencon62399.2024.10735806.

4. Li, Tongyue, Dianxi Shi, Songchang Jin, Zhen Wang, Huanhuan Yang, and Yang Chen. "Multi-Agent Hierarchical Graph Attention Actor-Critic Reinforcement Learning." In ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2025. https://doi.org/10.1109/icassp49660.2025.10888861.

5. Ai, Chen, Anzhi Xie, Suhao Feng, Hanjing Fu, and Zhigang Liu. "Design and Implementation of Event-Triggered Actor-Critic Reinforcement Learning Algorithm." In 2024 IEEE 3rd Industrial Electronics Society Annual On-Line Conference (ONCON). IEEE, 2024. https://doi.org/10.1109/oncon62778.2024.10931705.

6. Dong, Botao, Longyang Huang, Ning Pang, Ruonan Liu, Weidong Zhang, and Hongtian Chen. "Diffusion Actor with Behavior Critic Guidance Algorithm for Offline Reinforcement Learning." In 2024 7th International Conference on Robotics, Control and Automation Engineering (RCAE). IEEE, 2024. https://doi.org/10.1109/rcae62637.2024.10834229.

7. Masadeh, Ala'eddin, Zhengdao Wang, and Ahmed E. Kamal. "Selector-Actor-Critic and Tuner-Actor-Critic Algorithms for Reinforcement Learning." In 2019 11th International Conference on Wireless Communications and Signal Processing (WCSP). IEEE, 2019. http://dx.doi.org/10.1109/wcsp.2019.8928124.

8. Fan, Zhou, Rui Su, Weinan Zhang, and Yong Yu. "Hybrid Actor-Critic Reinforcement Learning in Parameterized Action Space." In Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19). International Joint Conferences on Artificial Intelligence Organization, 2019. http://dx.doi.org/10.24963/ijcai.2019/316.

Abstract:
In this paper we propose a hybrid architecture of actor-critic algorithms for reinforcement learning in parameterized action space, which consists of multiple parallel sub-actor networks to decompose the structured action space into simpler action spaces, along with a critic network to guide the training of all sub-actor networks. While this paper is mainly focused on parameterized action space, the proposed architecture, which we call hybrid actor-critic, can be extended to more general action spaces that have a hierarchical structure. We present an instance of the hybrid actor-critic archite…
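The decomposition described in this abstract can be sketched as one discrete sub-actor choosing among the high-level actions, a parallel sub-actor emitting the continuous parameters for each, and a single critic over the shared state encoding. The layer sizes and the tanh squashing of parameters below are illustrative assumptions, not the paper's architecture details.

```python
import torch
import torch.nn as nn

# Sketch of a hybrid actor-critic for a parameterized action space: parallel
# sub-actors decompose the structured action, one critic guides both.

class HybridActorCritic(nn.Module):
    def __init__(self, state_dim, n_discrete, param_dim):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU())
        self.discrete_actor = nn.Linear(64, n_discrete)            # action logits
        self.param_actor = nn.Linear(64, n_discrete * param_dim)   # per-action params
        self.critic = nn.Linear(64, 1)                             # shared state value
        self.n_discrete, self.param_dim = n_discrete, param_dim

    def forward(self, s):
        h = self.encoder(s)
        params = torch.tanh(self.param_actor(h)).view(-1, self.n_discrete, self.param_dim)
        return self.discrete_actor(h), params, self.critic(h)

model = HybridActorCritic(state_dim=8, n_discrete=3, param_dim=2)
logits, params, value = model(torch.randn(1, 8))
k = torch.distributions.Categorical(logits=logits).sample().item()  # discrete choice
action = (k, params[0, k])  # chosen action id plus its continuous parameters
```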
9. Miranda, Thiago S., and Heder S. Bernardino. "Distributional Safety Critic for Stochastic Latent Actor-Critic." In Encontro Nacional de Inteligência Artificial e Computacional. Sociedade Brasileira de Computação - SBC, 2023. http://dx.doi.org/10.5753/eniac.2023.234620.

Abstract:
When employing reinforcement learning techniques in real-world applications, one may desire to constrain the agent by limiting actions that lead to potential damage, harm, or unwanted scenarios. Particularly, recent approaches focus on developing safe behavior under partial observability conditions. In this vein, we develop a method that combines distributional reinforcement learning techniques with methods used to facilitate learning in partially observable environments, called distributional safe stochastic latent actor-critic (DS-SLAC). We evaluate the DS-SLAC performance on four Safety-Gym…
10. Peters, James F. "Granular Computing in Actor-Critic Learning." In 2007 IEEE Symposium on Foundations of Computational Intelligence (FOCI 2007). IEEE, 2007. http://dx.doi.org/10.1109/foci.2007.372148.


Reports on the topic "Actor-critic learning"

1. Pasupuleti, Murali Krishna. Optimal Control and Reinforcement Learning: Theory, Algorithms, and Robotics Applications. National Education Services, 2025. https://doi.org/10.62311/nesx/rriv225.

Abstract:
Optimal control and reinforcement learning (RL) are foundational techniques for intelligent decision-making in robotics, automation, and AI-driven control systems. This research explores the theoretical principles, computational algorithms, and real-world applications of optimal control and reinforcement learning, emphasizing their convergence for scalable and adaptive robotic automation. Key topics include dynamic programming, Hamilton-Jacobi-Bellman (HJB) equations, policy optimization, model-based RL, actor-critic methods, and deep RL architectures. The study also examines traject…
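For reference, the Hamilton-Jacobi-Bellman (HJB) equation listed among the report's key topics characterizes the optimal value function \(V^*\) of a continuous-time control problem with dynamics \(\dot{x} = f(x,u)\) and running cost \(\ell(x,u)\). This is the standard textbook statement, not a formula quoted from the report:

```latex
-\,\frac{\partial V^*(x,t)}{\partial t}
  \;=\; \min_{u}\Big\{\, \ell(x,u) \;+\; \nabla_x V^*(x,t)^{\top} f(x,u) \Big\}
```

Solving this nonlinear partial differential equation exactly is intractable for most systems, which is one motivation for the actor-critic and deep RL approximations the report surveys.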
2. Rinuado, Christina, William Leonard, Christopher Morey, Theresa Coumbe, Jaylen Hopson, and Robert Hilborn. Artificial intelligence (AI)–enabled wargaming agent training. Engineer Research and Development Center (U.S.), 2024. http://dx.doi.org/10.21079/11681/48419.

Abstract:
Fiscal Year 2021 (FY21) work from the Engineer Research and Development Center Institute for Systems Engineering Research leveraged deep reinforcement learning to develop intelligent systems (red team agents) capable of exhibiting credible behavior within a military course of action wargaming maritime framework infrastructure. Building from the FY21 research, this research effort sought to explore options to improve upon the wargaming framework infrastructure and to investigate opportunities to improve artificial intelligence (AI) agent behavior. Wargaming framework infrastructure enhancement…