
Journal articles on the topic 'Reinforcement Learning'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 journal articles for your research on the topic 'Reinforcement Learning.'

Next to every source in the list of references there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles across a wide variety of disciplines and organise your bibliography correctly.

1

Singh, Pranjal, Prasann Sharma, Yash Gupta, and Sampada Massey. "Reinforcement Learning for Portfolio Management." International Journal of Research Publication and Reviews 6, no. 4 (2025): 10374–77. https://doi.org/10.55248/gengpi.6.0425.1599.

2

Deora, Merin, and Sumit Mathur. "Reinforcement Learning." IJARCCE 6, no. 4 (2017): 178–81. http://dx.doi.org/10.17148/ijarcce.2017.6433.

3

Barto, Andrew G. "Reinforcement Learning." IFAC Proceedings Volumes 31, no. 29 (1998): 5. http://dx.doi.org/10.1016/s1474-6670(17)38315-5.

4

Woergoetter, Florentin, and Bernd Porr. "Reinforcement learning." Scholarpedia 3, no. 3 (2008): 1448. http://dx.doi.org/10.4249/scholarpedia.1448.

5

Moore, Brett L., Anthony G. Doufas, and Larry D. Pyeatt. "Reinforcement Learning." Anesthesia & Analgesia 112, no. 2 (2011): 360–67. http://dx.doi.org/10.1213/ane.0b013e31820334a7.

6

Likas, Aristidis. "A Reinforcement Learning Approach to Online Clustering." Neural Computation 11, no. 8 (1999): 1915–32. http://dx.doi.org/10.1162/089976699300016025.

Abstract:
A general technique is proposed for embedding online clustering algorithms based on competitive learning in a reinforcement learning framework. The basic idea is that the clustering system can be viewed as a reinforcement learning system that learns through reinforcements to follow the clustering strategy we wish to implement. In this sense, the reinforcement guided competitive learning (RGCL) algorithm is proposed that constitutes a reinforcement-based adaptation of learning vector quantization (LVQ) with enhanced clustering capabilities. In addition, we suggest extensions of RGCL and LVQ tha
7

Mardhatillah, Elsy. "Teacher’s Reinforcement in English Classroom in MTSS Darul Makmur Sungai Cubadak." Indonesian Research Journal On Education 3, no. 1 (2022): 825–32. http://dx.doi.org/10.31004/irje.v3i1.202.

Abstract:
This research was due to some problems found in MTsS Darul Makmur. First, some students were not motivated in learning. Second, sometimes the teacher still uses Indonesian in giving reinforcements. Third, some students did not care about the teacher's reinforcement. This study aimed to find out the types of reinforcement used by the teacher. Then, to find out the types of reinforcement often and rarely used by the teacher. Then, to find out the reasons the teacher used certain reinforcements. Last, to find out how the teacher understands the reinforcement. This research used a qualitative
8

Liaq, Mudassar, and Yungcheol Byun. "Autonomous UAV Navigation Using Reinforcement Learning." International Journal of Machine Learning and Computing 9, no. 6 (2019): 756–61. http://dx.doi.org/10.18178/ijmlc.2019.9.6.869.

9

Alrammal, Muath, and Munir Naveed. "Monte-Carlo Based Reinforcement Learning (MCRL)." International Journal of Machine Learning and Computing 10, no. 2 (2020): 227–32. http://dx.doi.org/10.18178/ijmlc.2020.10.2.924.

10

Nurmuhammet, Abdullayev. "Deep Reinforcement Learning on Stock Data." Alatoo Academic Studies 23, no. 2 (2023): 505–18. http://dx.doi.org/10.17015/aas.2023.232.49.

Abstract:
This study proposes using Deep Reinforcement Learning (DRL) for stock trading decisions and prediction. DRL is a machine learning technique that enables agents to learn optimal strategies by interacting with their environment. The proposed model surpasses traditional models and can make informed trading decisions in real-time. The study highlights the feasibility of applying DRL in financial markets and its advantages in strategic decision-making. The model's ability to learn from market dynamics makes it a promising approach for stock market forecasting. Overall, this paper provides valuable
11

Myers, Catherine. "Learning with Delayed Reinforcement through Attention-Driven Buffering." International Journal of Neural Systems 1, no. 4 (1991): 337–46. http://dx.doi.org/10.1142/s0129065791000376.

Abstract:
Learning with delayed reinforcement refers to situations where the reinforcement to a learning system occurs only at the end of a string of actions or outputs, and it must then be assigned back to the relevant actions. A method for accomplishing this is presented which buffers a small number of past actions based on the unpredictability of or attention to each as it occurs. This approach allows for the buffer size to be small, and yet learning can reach indefinitely far back into the past; it also allows the system to learn when reinforcement is not only delayed but also reinforcements from ot
12

Fan, ZiSheng. "An exploration of reinforcement learning and deep reinforcement learning." Applied and Computational Engineering 73, no. 1 (2024): 154–59. http://dx.doi.org/10.54254/2755-2721/73/20240386.

Abstract:
Today, machine learning is evolving so quickly that new algorithms are always appearing. Deep neural networks in particular have shown positive outcomes in a variety of areas, including computer vision, natural language processing, and time series prediction. Its development moves at a very sluggish pace due to the high threshold. Therefore, a thorough examination of the reinforcement learning field should be required. This essay examines both the deep learning algorithm and the reinforcement learning operational procedure. The study identifies information retrieval, data mining, intelligent s
13

Horie, Naoto, Tohgoroh Matsui, Koichi Moriyama, Atsuko Mutoh, and Nobuhiro Inuzuka. "Multi-objective safe reinforcement learning: the relationship between multi-objective reinforcement learning and safe reinforcement learning." Artificial Life and Robotics 24, no. 3 (2019): 352–59. http://dx.doi.org/10.1007/s10015-019-00523-3.

14

Lee, Dongsu, Chanin Eom, Sungwoo Choi, Sungkwan Kim, and Minhae Kwon. "Survey on Practical Reinforcement Learning : from Imitation Learning to Offline Reinforcement Learning." Journal of Korean Institute of Communications and Information Sciences 48, no. 11 (2023): 1405–17. http://dx.doi.org/10.7840/kics.2023.48.11.1405.

15

Osogami, Takayuki, and Rudy Raymond. "Determinantal Reinforcement Learning." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 4659–66. http://dx.doi.org/10.1609/aaai.v33i01.33014659.

Abstract:
We study reinforcement learning for controlling multiple agents in a collaborative manner. In some of those tasks, it is insufficient for the individual agents to take relevant actions, but those actions should also have diversity. We propose the approach of using the determinant of a positive semidefinite matrix to approximate the action-value function in reinforcement learning, where we learn the matrix in a way that it represents the relevance and diversity of the actions. Experimental results show that the proposed approach allows the agents to learn a nearly optimal policy approximately t
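In the spirit of the abstract above (generic, hypothetical notation, since only the abstract is shown here), the idea can be read as approximating the joint value of a set of actions A taken in state s by the determinant of a learned positive semidefinite matrix L(s):

\[ Q(s, A) \approx \log \det L_A(s), \]

where L_A(s) is the principal submatrix of L(s) indexed by the chosen actions; large diagonal entries favour individually relevant actions, while the off-diagonal structure penalizes redundant, non-diverse action sets. The paper's exact parameterization may differ.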
16

Pateria, Shubham, Budhitama Subagdja, Ah-hwee Tan, and Chai Quek. "Hierarchical Reinforcement Learning." ACM Computing Surveys 54, no. 5 (2021): 1–35. http://dx.doi.org/10.1145/3453160.

Abstract:
Hierarchical Reinforcement Learning (HRL) enables autonomous decomposition of challenging long-horizon decision-making tasks into simpler subtasks. During the past years, the landscape of HRL research has grown profoundly, resulting in copious approaches. A comprehensive overview of this vast landscape is necessary to study HRL in an organized manner. We provide a survey of the diverse HRL approaches concerning the challenges of learning hierarchical policies, subtask discovery, transfer learning, and multi-agent learning using HRL. The survey is presented according to a novel taxonomy of the
17

Matsui, Tohgoroh. "Compound Reinforcement Learning." Transactions of the Japanese Society for Artificial Intelligence 26 (2011): 330–34. http://dx.doi.org/10.1527/tjsai.26.330.

18

Dong, Daoyi, Chunlin Chen, Hanxiong Li, and Tzyh-Jong Tarn. "Quantum Reinforcement Learning." IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 38, no. 5 (2008): 1207–20. http://dx.doi.org/10.1109/tsmcb.2008.925743.

19

Farias, Vivek F., Ciamac C. Moallemi, Benjamin Van Roy, and Tsachy Weissman. "Universal Reinforcement Learning." IEEE Transactions on Information Theory 56, no. 5 (2010): 2441–54. http://dx.doi.org/10.1109/tit.2010.2043762.

20

Morimoto, Jun, and Kenji Doya. "Robust Reinforcement Learning." Neural Computation 17, no. 2 (2005): 335–59. http://dx.doi.org/10.1162/0899766053011528.

Abstract:
This letter proposes a new reinforcement learning (RL) paradigm that explicitly takes into account input disturbance as well as modeling errors. The use of environmental models in RL is quite popular for both off-line learning using simulations and for online action planning. However, the difference between the model and the real environment can lead to unpredictable, and often unwanted, results. Based on the theory of H∞ control, we consider a differential game in which a “disturbing” agent tries to make the worst possible disturbance while a “control” agent tries to make the best control inp
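For readers unfamiliar with the H∞ framing mentioned in this abstract, the standard soft-constrained differential game used in nonlinear H∞ design can be written (in generic notation, not necessarily the letter's own) as

\[ \min_{u(\cdot)} \max_{w(\cdot)} \int_0^\infty \left( q(x(t), u(t)) - \gamma^2 \lVert w(t) \rVert^2 \right) dt, \]

where u is the control, w the disturbance, q a running cost, and \gamma the disturbance-attenuation level; the cited work adapts this minimax idea to reinforcement learning, with the "disturbing" agent playing w and the "control" agent playing u.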
21

Weiß, Gerhard. "Distributed reinforcement learning." Robotics and Autonomous Systems 15, no. 1-2 (1995): 135–42. http://dx.doi.org/10.1016/0921-8890(95)00018-b.

22

Servedio, Maria R., Stein A. Sæther, and Glenn-Peter Sætre. "Reinforcement and learning." Evolutionary Ecology 23, no. 1 (2007): 109–23. http://dx.doi.org/10.1007/s10682-007-9188-2.

23

Andrecut, M., and M. K. Ali. "Fuzzy Reinforcement Learning." International Journal of Modern Physics C 13, no. 5 (2002): 659–74. http://dx.doi.org/10.1142/s0129183102003450.

Abstract:
Fuzzy logic represents an extension of classical logic, giving modes of approximate reasoning in an environment of uncertainty and imprecision. Fuzzy inference systems incorporate human knowledge into their knowledge base on the conclusions of the fuzzy rules, which are affected by subjective decisions. In this paper we show how the reinforcement learning technique can be used to tune the conclusion part of a fuzzy inference system. The fuzzy reinforcement learning technique is illustrated using two examples: the cart centering problem and the autonomous navigation problem.
24

Zhu, Ruoqing, Donglin Zeng, and Michael R. Kosorok. "Reinforcement Learning Trees." Journal of the American Statistical Association 110, no. 512 (2015): 1770–84. http://dx.doi.org/10.1080/01621459.2015.1036994.

25

Oku, Makito, and Kazuyuki Aihara. "Networked reinforcement learning." Artificial Life and Robotics 13, no. 1 (2008): 112–15. http://dx.doi.org/10.1007/s10015-008-0565-x.

26

Barto, Andrew G. "Reinforcement learning control." Current Opinion in Neurobiology 4, no. 6 (1994): 888–93. http://dx.doi.org/10.1016/0959-4388(94)90138-4.

27

Hernandez-Orallo, Jose. "Constructive reinforcement learning." International Journal of Intelligent Systems 15, no. 3 (2000): 241–64. http://dx.doi.org/10.1002/(sici)1098-111x(200003)15:3<241::aid-int6>3.0.co;2-z.

28

Aydin, Mehmet Emin, Rafet Durgut, and Abdur Rakib. "Why Reinforcement Learning?" Algorithms 17, no. 6 (2024): 269. http://dx.doi.org/10.3390/a17060269.

29

Muhammad Azhar, Mansoor Ahmed Khuhro, Muhammad Waqas, Umair Saeed, and Mehar Khan Niazi. "Comprehensive Study on Reinforcement Learning and Deep Reinforcement Learning Schemes." Sir Syed University Research Journal of Engineering & Technology 14, no. 2 (2024): 1–6. https://doi.org/10.33317/ssurj.638.

Abstract:
Reinforcement learning (RL) has emerged as a powerful tool for creating artificial intelligence systems (AIS) and solving problems which require sequential decision-making. Reinforcement learning has achieved some impressive results in recent years, surpassing humans in a variety of areas. According to recent research, deep learning (DL) techniques are used with techniques of reinforcement learning to recognize meaningful identification for a problem regarding high dimensional raw data input & enough to solve artificial general intelligence (AGI). In addition to the main concepts, thi
30

Schweighofer, Nicolas, and Kenji Doya. "Meta-learning in Reinforcement Learning." Neural Networks 16, no. 1 (2003): 5–9. http://dx.doi.org/10.1016/s0893-6080(02)00228-9.

31

Cetin, Edoardo, and Oya Celiktutan. "Learning Pessimism for Reinforcement Learning." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 6 (2023): 6971–79. http://dx.doi.org/10.1609/aaai.v37i6.25852.

Abstract:
Off-policy deep reinforcement learning algorithms commonly compensate for overestimation bias during temporal-difference learning by utilizing pessimistic estimates of the expected target returns. In this work, we propose Generalized Pessimism Learning (GPL), a strategy employing a novel learnable penalty to enact such pessimism. In particular, we propose to learn this penalty alongside the critic with dual TD-learning, a new procedure to estimate and minimize the magnitude of the target returns bias with trivial computational cost. GPL enables us to accurately counteract overestimation bias t
32

Pakhale, Devyani Vinod, and Samata V. Athawale. "Reinforcement Learning in Machine Learning." International Journal of Ingenious Research, Invention and Development (IJIRID) 3, no. 5 (2024): 531–36. https://doi.org/10.5281/zenodo.14230239.

Abstract:
Reinforcement Learning (RL) is a branch of machine learning focused on making decisions to maximize cumulative rewards in a given situation. Unlike supervised learning, which relies on a training dataset with predefined answers, RL involves learning through experience. In RL, an agent learns to achieve a goal in an uncertain, potentially complex environment by performing actions and receiving feedback through rewards or penalties. Reinforcement Learning (RL) has emerged as a transformative paradigm in artificial intelligence, enabling agents to learn optimal behaviors through interaction with
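As a minimal illustration of the agent-environment loop this abstract describes (a hypothetical sketch, not code from the cited paper; the env object and its reset/step/actions interface are assumed purely for illustration), a tabular Q-learning routine in Python might look like this:

    # Hypothetical tabular Q-learning sketch: an agent acts, receives rewards or
    # penalties from its environment, and updates value estimates from experience.
    import random
    from collections import defaultdict

    def q_learning(env, episodes=500, alpha=0.1, gamma=0.99, epsilon=0.1):
        q = defaultdict(float)  # q[(state, action)] -> estimated cumulative reward
        for _ in range(episodes):
            state = env.reset()
            done = False
            while not done:
                # epsilon-greedy choice: occasionally explore, otherwise exploit
                if random.random() < epsilon:
                    action = random.choice(env.actions)
                else:
                    action = max(env.actions, key=lambda a: q[(state, a)])
                # feedback (reward or penalty) comes from the environment
                next_state, reward, done = env.step(action)
                best_next = 0.0 if done else max(q[(next_state, a)] for a in env.actions)
                # temporal-difference update toward reward plus discounted future value
                q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
                state = next_state
        return q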
33

Pusparini, Desy. "Giving Reinforcement with 2.0 Framework by Teacher: A Photovoice of Undergraduate Students in the EFL Classroom." JSSH (Jurnal Sains Sosial dan Humaniora) 3, no. 1 (2019): 21. http://dx.doi.org/10.30595/jssh.v3i1.3841.

Abstract:
Reinforcement has been used in many areas of educational institutions. In the learning activity, reinforcements are given by the teacher as feedback for what students have done. By using reinforcement in the learning activity, the students are expected to feel comfortable to show themselves by responding to questions, giving feedback, and expressing their opinions in the class. This study aims to investigate the effect of giving reinforcement by the teacher towards students' learning motivation. This research used the photovoice method and SHOWeD Analysis. The participants are 27 studen
34

Chakraborty, Montosh, Shivakrishna Gouroju, Pinki Garg, and Karthikeyan P. "PBL: An Effective Method Of Reinforcement Learning." International Journal of Integrative Medical Sciences 2, no. 6 (2015): 134–38. http://dx.doi.org/10.16965/ijims.2015.119.

35

De, Ashis, Barun Mazumdar, Aritra Dhabal, Saikat Bhattacharjee, Aridip Maity, and Sourav Bandopadhyay. "Design of PID Controller using Reinforcement Learning." International Journal of Research Publication and Reviews 4, no. 11 (2023): 443–52. http://dx.doi.org/10.55248/gengpi.4.1123.113004.

36

Jha, Ashutosh Chandra. "Automated Firewall Policy Generation with Reinforcement Learning." International Journal of IoT 5, no. 1 (2025): 190–211. https://doi.org/10.55640/ijiot-05-01-10.

Abstract:
Network security would be incomplete without firewalls that control traffic flow through rule-based policies. The manual way to configure and manage firewall rules, however, is prone to various pitfalls; rules tend to become overly complex, human error occurs, and cyber threats continue to evolve. This work investigates the reinforcement learning (RL)-driven method for firewall policy generation, utilizing RL as an automated means for policy generation to increase adaptability and reduce administrative overhead. The proposed system utilizes RL agents that learn an optimal policy from real-ti
37

Vafashoar, Reza, and Mohammad Reza Meybodi. "Reinforcement learning in learning automata and cellular learning automata via multiple reinforcement signals." Knowledge-Based Systems 169 (April 2019): 1–27. http://dx.doi.org/10.1016/j.knosys.2019.01.021.

38

Sachin, Samrat Medavarapu. "Cutting-Edge Developments in Reinforcement Learning Algorithms." Journal of Scientific and Engineering Research 9, no. 6 (2022): 103–7. https://doi.org/10.5281/zenodo.13606561.

Abstract:
pushing the boundaries of what is possible in areas such as decision making and control. This paper reviews the latest developments in RL algorithms, focusing on their innovative aspects and practical applications. We present a comparative analysis of several state-of-the-art RL algorithms, discuss their strengths and limitations, and propose future research directions to address existing challenges.
39

Agrawal, Avinash J., Rashmi R. Welekar, Namita Parati, Pravin R. Satav, Uma Patel Thakur, and Archana V. Potnurwar. "Reinforcement Learning and Advanced Reinforcement Learning to Improve Autonomous Vehicle Planning." International Journal on Recent and Innovation Trends in Computing and Communication 11, no. 7s (2023): 652–60. http://dx.doi.org/10.17762/ijritcc.v11i7s.7526.

Abstract:
Planning for autonomous vehicles is a challenging process that involves navigating through dynamic and unpredictable surroundings while making judgments in real-time. Traditional planning methods sometimes rely on predetermined rules or customized heuristics, which could not generalize well to various driving conditions. In this article, we provide a unique framework to enhance autonomous vehicle planning by fusing conventional RL methods with cutting-edge reinforcement learning techniques. To handle many elements of planning issues, our system integrates cutting-edge algorithms including deep
40

Vamvoudakis, Kyriakos G., Yan Wan, and Frank L. Lewis. "Workshop on Distributed Reinforcement Learning and Reinforcement-Learning Games [Conference Reports]." IEEE Control Systems 39, no. 6 (2019): 122–24. http://dx.doi.org/10.1109/mcs.2019.2938053.

41

Liu, Shiyi. "Research of Multi-agent Deep Reinforcement Learning based on Value Factorization." Highlights in Science, Engineering and Technology 39 (April 1, 2023): 848–54. http://dx.doi.org/10.54097/hset.v39i.6655.

Abstract:
One of the numerous multi-agent deep reinforcement learning methods, and a hotspot for research in the field, is multi-agent deep reinforcement learning based on value factorization. In order to effectively address the issues of environmental instability and the exponential expansion of the action space in multi-agent systems, it uses some constraints to break down the joint action value function of the multi-agent system into a specific combination of individual action value functions. Firstly, in this paper, the reason for the factorization of the value function is explained. The fundamentals of mu
42

Bae, Jung Ho, Yun-Seong Kang, Sukmin Yoon, Yong-Duk Kim, and Sungho Kim. "Aircraft Reinforcement Learning using Curriculum Learning." Journal of KIISE 48, no. 6 (2021): 707–12. http://dx.doi.org/10.5626/jok.2021.48.6.707.

43

Matsubara, Takamitsu. "Learning Control Policies by Reinforcement Learning." Journal of the Robotics Society of Japan 36, no. 9 (2018): 597–600. http://dx.doi.org/10.7210/jrsj.36.597.

44

Fachantidis, Anestis, Matthew Taylor, and Ioannis Vlahavas. "Learning to Teach Reinforcement Learning Agents." Machine Learning and Knowledge Extraction 1, no. 1 (2017): 21–42. http://dx.doi.org/10.3390/make1010002.

Abstract:
In this article, we study the transfer learning model of action advice under a budget. We focus on reinforcement learning teachers providing action advice to heterogeneous students playing the game of Pac-Man under a limited advice budget. First, we examine several critical factors affecting advice quality in this setting, such as the average performance of the teacher, its variance and the importance of reward discounting in advising. The experiments show that the best performers are not always the best teachers and reveal the non-trivial importance of the coefficient of variation (CV) as a s
45

Nishizawa, Chieko, and Hirokazu Matsui. "Reinforcement learning with multiplex learning spaces." Proceedings of JSME Annual Conference on Robotics and Mechatronics (Robomec) 2016 (2016): 1P1–04b3. http://dx.doi.org/10.1299/jsmermd.2016.1p1-04b3.

46

Kim, Man-Je, Hyunsoo Park, and Chang Wook Ahn. "Nondominated Policy-Guided Learning in Multi-Objective Reinforcement Learning." Electronics 11, no. 7 (2022): 1069. http://dx.doi.org/10.3390/electronics11071069.

Abstract:
Control intelligence is a typical field where there is a trade-off between target objectives, and researchers in this field have longed for artificial intelligence that achieves the target objectives. Multi-objective deep reinforcement learning was sufficient to satisfy this need. In particular, multi-objective deep reinforcement learning methods based on policy optimization are leading the optimization of control intelligence. However, multi-objective reinforcement learning has difficulties when finding various Pareto optimals of multi-objectives due to the greedy nature of reinforcement lear
47

Li, Chengan. "Research advanced in the integration of federated learning and reinforcement learning." Applied and Computational Engineering 40, no. 1 (2024): 147–54. http://dx.doi.org/10.54254/2755-2721/40/20230641.

Abstract:
Reinforcement learning (RL) and federated learning (FL) are two important machine learning paradigms. Reinforcement learning is concerned with enabling agents to learn optimal policies when interacting with an environment, while federated learning is concerned with collaboratively training models on distributed devices while preserving data privacy. In recent years, the fusion and complementarity of reinforcement learning and federated learning have attracted increasing research interest, providing new directions for the development of the machine learning community. Focusing on the i
48

White, Devin, Mingkang Wu, Ellen Novoseller, Vernon J. Lawhern, Nicholas Waytowich, and Yongcan Cao. "Rating-Based Reinforcement Learning." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 9 (2024): 10207–15. http://dx.doi.org/10.1609/aaai.v38i9.28886.

Abstract:
This paper develops a novel rating-based reinforcement learning approach that uses human ratings to obtain human guidance in reinforcement learning. Different from the existing preference-based and ranking-based reinforcement learning paradigms, based on human relative preferences over sample pairs, the proposed rating-based reinforcement learning approach is based on human evaluation of individual trajectories without relative comparisons between sample pairs. The rating-based reinforcement learning approach builds on a new prediction model for human ratings and a novel multi-class loss funct
49

Yücesoy, Yiğit E., and M. Borahan Tümer. "Hierarchical Reinforcement Learning with Context Detection (HRL-CD)." International Journal of Machine Learning and Computing 5, no. 5 (2015): 353–58. http://dx.doi.org/10.7763/ijmlc.2015.v5.533.

50

Shrivastava, Soumya. "Role of Reinforcement Learning in Financial Management Strategy." International Journal of Science and Research (IJSR) 11, no. 1 (2022): 1556–62. http://dx.doi.org/10.21275/sr22128210442.
