Academic literature on the topic 'Safe RL'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Safe RL.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "Safe RL"

1. Carr, Steven, Nils Jansen, Sebastian Junges, and Ufuk Topcu. "Safe Reinforcement Learning via Shielding under Partial Observability." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 12 (2023): 14748–56. http://dx.doi.org/10.1609/aaai.v37i12.26723.

Abstract:
Safe exploration is a common problem in reinforcement learning (RL) that aims to prevent agents from making disastrous decisions while exploring their environment. A family of approaches to this problem assume domain knowledge in the form of a (partial) model of this environment to decide upon the safety of an action. A so-called shield forces the RL agent to select only safe actions. However, for adoption in various applications, one must look beyond enforcing safety and also ensure the applicability of RL with good performance. We extend the applicability of shields via tight integration with …
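As a rough illustration only (not the paper's algorithm, which additionally handles partial observability), the sketch below shows the basic shielding idea from the abstract: a shield built from partial domain knowledge lets the agent execute only actions whose possible successors avoid a known unsafe set. The names `successors` and `unsafe_states` are illustrative assumptions.

```python
# Illustrative sketch of action shielding; not the cited paper's method.
# `successors(state, action)` and `unsafe_states` stand in for the partial
# environment model mentioned in the abstract.

def is_allowed(state, action, successors, unsafe_states):
    """Allow an action only if no successor the model deems possible is unsafe."""
    return all(s not in unsafe_states for s in successors(state, action))

def shielded_action(state, ranked_actions, successors, unsafe_states):
    """Return the agent's most preferred action that passes the shield."""
    for action in ranked_actions:
        if is_allowed(state, action, successors, unsafe_states):
            return action
    raise RuntimeError("shield found no safe action in this state")
```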
2. Ma, Yecheng Jason, Andrew Shen, Osbert Bastani, and Dinesh Jayaraman. "Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 5 (2022): 5404–12. http://dx.doi.org/10.1609/aaai.v36i5.20478.

Abstract:
Reinforcement Learning (RL) agents in the real world must satisfy safety constraints in addition to maximizing a reward objective. Model-based RL algorithms hold promise for reducing unsafe real-world actions: they may synthesize policies that obey all constraints using simulated samples from a learned model. However, imperfect models can result in real-world constraint violations even for actions that are predicted to satisfy all constraints. We propose Conservative and Adaptive Penalty (CAP), a model-based safe RL framework that accounts for potential modeling errors by capturing model uncertainty …
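A minimal sketch of the conservative-and-adaptive-penalty idea summarized above, under assumed names: the model-predicted cost is inflated by an uncertainty margin weighted by kappa, and kappa is adapted from real-environment feedback. The proportional update below is an illustrative choice, not the paper's exact scheme.

```python
# Hedged sketch of a conservative, adaptive cost penalty; illustrative only.

def conservative_cost(predicted_cost, model_uncertainty, kappa):
    """Pessimistic cost estimate: model prediction plus a scaled uncertainty margin."""
    return predicted_cost + kappa * model_uncertainty

def adapt_kappa(kappa, observed_episode_cost, cost_budget, lr=0.05):
    """Grow the penalty weight when the real constraint is violated; relax it
    (never below zero) when there is slack."""
    return max(0.0, kappa + lr * (observed_episode_cost - cost_budget))
```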
3. Xu, Haoran, Xianyuan Zhan, and Xiangyu Zhu. "Constraints Penalized Q-learning for Safe Offline Reinforcement Learning." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (2022): 8753–60. http://dx.doi.org/10.1609/aaai.v36i8.20855.

Abstract:
We study the problem of safe offline reinforcement learning (RL), the goal is to learn a policy that maximizes long-term reward while satisfying safety constraints given only offline data, without further interaction with the environment. This problem is more appealing for real world RL applications, in which data collection is costly or dangerous. Enforcing constraint satisfaction is non-trivial, especially in offline settings, as there is a potential large discrepancy between the policy distribution and the data distribution, causing errors in estimating the value of safety constraints. …
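The following is a deliberately simplified sketch of the constraint-penalizing idea in an offline setting: the reward critic's Bellman target gets no bootstrapped value when the cost critic flags the next action as unsafe. The threshold and names are assumptions; see the paper for the actual CPQ update.

```python
# Simplified sketch of a constraints-penalized TD target; not the exact CPQ rule.

def penalized_td_target(reward, gamma, q_reward_next, q_cost_next, cost_limit):
    """Bootstrap the reward value only through next actions deemed safe."""
    next_is_safe = q_cost_next <= cost_limit
    return reward + gamma * (q_reward_next if next_is_safe else 0.0)
```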
4. Thananjeyan, Brijen, Ashwin Balakrishna, Suraj Nair, et al. "Recovery RL: Safe Reinforcement Learning With Learned Recovery Zones." IEEE Robotics and Automation Letters 6, no. 3 (2021): 4915–22. http://dx.doi.org/10.1109/lra.2021.3070252.

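No abstract is available in the metadata for this entry; the sketch below only illustrates the recovery-policy notion suggested by the title, as this editor understands it: a safety critic scores the proposed action's risk, and a recovery policy takes over when the score crosses a threshold. All names and the threshold are assumptions, not the paper's interface.

```python
# Illustrative risk-triggered switch between a task policy and a recovery policy.

def select_action(state, task_policy, recovery_policy, safety_critic, risk_threshold=0.3):
    action = task_policy(state)
    if safety_critic(state, action) > risk_threshold:  # predicted risk too high
        return recovery_policy(state)                  # steer back toward safety
    return action
```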
5. Serrano-Cuevas, Jonathan, Eduardo F. Morales, and Pablo Hernández-Leal. "Safe Reinforcement Learning Using Risk Mapping by Similarity." Adaptive Behavior 28, no. 4 (2019): 213–24. http://dx.doi.org/10.1177/1059712319859650.

Abstract:
Reinforcement learning (RL) has been used to successfully solve sequential decision problem. However, considering risk at the same time as the learning process is an open research problem. In this work, we are interested in the type of risk that can lead to a catastrophic state. Related works that aim to deal with risk propose complex models. In contrast, we follow a simple, yet effective, idea: similar states might lead to similar risk. Using this idea, we propose risk mapping by similarity (RMS), an algorithm for discrete scenarios which infers the risk of newly discovered states by analyzing …
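A toy sketch of the "similar states carry similar risk" idea from the abstract: estimate the risk of a newly discovered state from the recorded risk of its most similar known states. The k-nearest-neighbour averaging and the `similarity` function are illustrative stand-ins, not the paper's exact RMS procedure.

```python
# Illustrative k-nearest-neighbour risk estimate; not the paper's exact RMS algorithm.

def estimate_risk(new_state, known_risks, similarity, k=3, default_risk=0.0):
    """known_risks maps already-visited states to observed risk values in [0, 1]."""
    if not known_risks:
        return default_risk
    ranked = sorted(known_risks, key=lambda s: similarity(new_state, s), reverse=True)
    neighbours = ranked[:k]
    return sum(known_risks[s] for s in neighbours) / len(neighbours)
```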
6. Cheng, Richard, Gábor Orosz, Richard M. Murray, and Joel W. Burdick. "End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 3387–95. http://dx.doi.org/10.1609/aaai.v33i01.33013387.

Abstract:
Reinforcement Learning (RL) algorithms have found limited success beyond simulated applications, and one main reason is the absence of safety guarantees during the learning process. Real world systems would realistically fail or break before an optimal controller can be learned. To address this issue, we propose a controller architecture that combines (1) a model-free RL-based controller with (2) model-based controllers utilizing control barrier functions (CBFs) and (3) online learning of the unknown system dynamics, in order to ensure safety during learning. Our general framework leverages the …
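A minimal sketch of the safety-filter role that control barrier functions play in the architecture described above, reduced to a one-dimensional single integrator with barrier h(x) = x_max − x. The paper solves a QP over learned dynamics; the clamp below only conveys the idea and all parameters are assumed.

```python
# Toy CBF filter for x' = u with safe set {x <= x_max}; illustrative only.

def cbf_filter(x, u_rl, x_max=1.0, dt=0.05, alpha=0.5):
    """Return the input closest to u_rl that keeps h(x + dt*u) >= (1 - alpha) * h(x)."""
    h = x_max - x
    u_upper = alpha * h / dt   # largest input still satisfying the discrete CBF condition
    return min(u_rl, u_upper)
```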
7. Jurj, Sorin Liviu, Dominik Grundt, Tino Werner, Philipp Borchers, Karina Rothemann, and Eike Möhlmann. "Increasing the Safety of Adaptive Cruise Control Using Physics-Guided Reinforcement Learning." Energies 14, no. 22 (2021): 7572. http://dx.doi.org/10.3390/en14227572.

Abstract:
This paper presents a novel approach for improving the safety of vehicles equipped with Adaptive Cruise Control (ACC) by making use of Machine Learning (ML) and physical knowledge. More exactly, we train a Soft Actor-Critic (SAC) Reinforcement Learning (RL) algorithm that makes use of physical knowledge such as the jam-avoiding distance in order to automatically adjust the ideal longitudinal distance between the ego- and leading-vehicle, resulting in a safer solution. In our use case, the experimental results indicate that the physics-guided (PG) RL approach is better at avoiding collisions at …
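One plausible way (an editorial sketch, not taken from the paper) to encode the physical knowledge the abstract mentions is to shape the reward with a jam-avoiding spacing term; the constant-time-gap formula and the penalty weight below are assumptions.

```python
# Illustrative reward shaping around an assumed jam-avoiding distance.

def jam_avoiding_distance(ego_speed, time_gap=1.8, standstill_gap=2.0):
    """Constant-time-gap spacing policy (assumed form, in metres and m/s)."""
    return standstill_gap + time_gap * ego_speed

def shaped_reward(base_reward, gap_to_lead, ego_speed, weight=1.0):
    """Penalize following closer than the jam-avoiding distance."""
    deficit = max(0.0, jam_avoiding_distance(ego_speed) - gap_to_lead)
    return base_reward - weight * deficit
```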
8. Sakrihei, Helen. "Using Automatic Storage for ILL – Experiences from the National Repository Library in Norway." Interlending & Document Supply 44, no. 1 (2016): 14–16. http://dx.doi.org/10.1108/ilds-11-2015-0035.

Abstract:
Purpose – The purpose of this paper is to share the Norwegian Repository Library (RL)’s experiences with an automatic storage for interlibrary lending (ILL). Design/methodology/approach – This paper describes how the RL uses the automatic storage to deliver ILL services to Norwegian libraries. Chaos storage is the main principle for storage. Findings – Using automatic storage for ILL is efficient, cost-effective and safe. Originality/value – The RL has used automatic storage since 2003, and it is one of a few libraries using this technology.
9. Ding, Yuhao, and Javad Lavaei. "Provably Efficient Primal-Dual Reinforcement Learning for CMDPs with Non-stationary Objectives and Constraints." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 6 (2023): 7396–404. http://dx.doi.org/10.1609/aaai.v37i6.25900.

Abstract:
We consider primal-dual-based reinforcement learning (RL) in episodic constrained Markov decision processes (CMDPs) with non-stationary objectives and constraints, which plays a central role in ensuring the safety of RL in time-varying environments. In this problem, the reward/utility functions and the state transition functions are both allowed to vary arbitrarily over time as long as their cumulative variations do not exceed certain known variation budgets. Designing safe RL algorithms in time-varying environments is particularly challenging because of the need to integrate the constraint violation …
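A generic sketch of the primal-dual mechanics behind CMDP methods like this one, with the constraint written as "episode cost within a budget": the policy step maximizes a Lagrangian, and the multiplier step is projected gradient ascent on the violation. The paper's handling of non-stationarity (variation budgets, restarts) is omitted, and all names are illustrative.

```python
# Generic primal-dual updates for a cost-constrained MDP; illustrative only.

def primal_objective(reward_return, cost_return, cost_budget, lam):
    """Lagrangian the policy (primal variable) is updated to maximize."""
    return reward_return - lam * (cost_return - cost_budget)

def dual_update(lam, cost_return, cost_budget, step_size=0.01):
    """Grow the multiplier when the constraint is violated; never below zero."""
    return max(0.0, lam + step_size * (cost_return - cost_budget))
```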
10. Tubeuf, Carlotta, Felix Birkelbach, Anton Maly, and René Hofmann. "Increasing the Flexibility of Hydropower with Reinforcement Learning on a Digital Twin Platform." Energies 16, no. 4 (2023): 1796. http://dx.doi.org/10.3390/en16041796.

Abstract:
The increasing demand for flexibility in hydropower systems requires pumped storage power plants to change operating modes and compensate reactive power more frequently. In this work, we demonstrate the potential of applying reinforcement learning (RL) to control the blow-out process of a hydraulic machine during pump start-up and when operating in synchronous condenser mode. Even though RL is a promising method that is currently getting much attention, safety concerns are stalling research on RL for the control of energy systems. Therefore, we present a concept that enables process control …