Journal articles on the topic 'Policy gradient methods'
Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles
Consult the top 50 journal articles for your research on the topic 'Policy gradient methods.'
Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as a PDF and read its abstract online, whenever these are available in the metadata.
Browse journal articles from a wide variety of disciplines and organise your bibliography correctly.
Peters, Jan. "Policy gradient methods." Scholarpedia 5, no. 11 (2010): 3698. http://dx.doi.org/10.4249/scholarpedia.3698.
Cai, Qingpeng, Ling Pan, and Pingzhong Tang. "Deterministic Value-Policy Gradients." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 3316–23. http://dx.doi.org/10.1609/aaai.v34i04.5732.
Zhang, Matthew S., Murat A. Erdogdu, and Animesh Garg. "Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (June 28, 2022): 9066–73. http://dx.doi.org/10.1609/aaai.v36i8.20891.
Akella, Ravi Tej, Kamyar Azizzadenesheli, Mohammad Ghavamzadeh, Animashree Anandkumar, and Yisong Yue. "Deep Bayesian Quadrature Policy Optimization." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 8 (May 18, 2021): 6600–6608. http://dx.doi.org/10.1609/aaai.v35i8.16817.
Wang, Lin, Xingang Xu, Xuhui Zhao, Baozhu Li, Ruijuan Zheng, and Qingtao Wu. "A randomized block policy gradient algorithm with differential privacy in Content Centric Networks." International Journal of Distributed Sensor Networks 17, no. 12 (December 2021): 155014772110599. http://dx.doi.org/10.1177/15501477211059934.
Le, Hung, Majid Abdolshah, Thommen K. George, Kien Do, Dung Nguyen, and Svetha Venkatesh. "Episodic Policy Gradient Training." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 7 (June 28, 2022): 7317–25. http://dx.doi.org/10.1609/aaai.v36i7.20694.
Cohen, Andrew, Xingye Qiao, Lei Yu, Elliot Way, and Xiangrong Tong. "Diverse Exploration via Conjugate Policies for Policy Gradient Methods." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 3404–11. http://dx.doi.org/10.1609/aaai.v33i01.33013404.
Zhang, Junzi, Jongho Kim, Brendan O'Donoghue, and Stephen Boyd. "Sample Efficient Reinforcement Learning with REINFORCE." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 12 (May 18, 2021): 10887–95. http://dx.doi.org/10.1609/aaai.v35i12.17300.
Yu, Hai-Tao, Degen Huang, Fuji Ren, and Lishuang Li. "Diagnostic Evaluation of Policy-Gradient-Based Ranking." Electronics 11, no. 1 (December 23, 2021): 37. http://dx.doi.org/10.3390/electronics11010037.
Baxter, J., and P. L. Bartlett. "Infinite-Horizon Policy-Gradient Estimation." Journal of Artificial Intelligence Research 15 (November 1, 2001): 319–50. http://dx.doi.org/10.1613/jair.806.
Zhang, Kaiqing, Alec Koppel, Hao Zhu, and Tamer Başar. "Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies." SIAM Journal on Control and Optimization 58, no. 6 (January 2020): 3586–612. http://dx.doi.org/10.1137/19m1288012.
Zhang, Chuheng, Yuanqi Li, and Jian Li. "Policy Search by Target Distribution Learning for Continuous Control." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 6770–77. http://dx.doi.org/10.1609/aaai.v34i04.6156.
Yang, Long, Yu Zhang, Gang Zheng, Qian Zheng, Pengfei Li, Jianhang Huang, and Gang Pan. "Policy Optimization with Stochastic Mirror Descent." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (June 28, 2022): 8823–31. http://dx.doi.org/10.1609/aaai.v36i8.20863.
Ying, Donghao, Mengzi Amy Guo, Yuhao Ding, Javad Lavaei, and Zuo-Jun Shen. "Policy-Based Primal-Dual Methods for Convex Constrained Markov Decision Processes." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 9 (June 26, 2023): 10963–71. http://dx.doi.org/10.1609/aaai.v37i9.26299.
Jiang, Zhanhong, Xian Yeow Lee, Sin Yong Tan, Kai Liang Tan, Aditya Balu, Young M. Lee, Chinmay Hegde, and Soumik Sarkar. "MDPGT: Momentum-Based Decentralized Policy Gradient Tracking." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 9 (June 28, 2022): 9377–85. http://dx.doi.org/10.1609/aaai.v36i9.21169.
Melo, Francisco. "Differential Eligibility Vectors for Advantage Updating and Gradient Methods." Proceedings of the AAAI Conference on Artificial Intelligence 25, no. 1 (August 4, 2011): 441–46. http://dx.doi.org/10.1609/aaai.v25i1.7938.
Herrera-Martí, David A. "Policy Gradient Approach to Compilation of Variational Quantum Circuits." Quantum 6 (September 8, 2022): 797. http://dx.doi.org/10.22331/q-2022-09-08-797.
Hambly, Ben, Renyuan Xu, and Huining Yang. "Policy Gradient Methods for the Noisy Linear Quadratic Regulator over a Finite Horizon." SIAM Journal on Control and Optimization 59, no. 5 (January 2021): 3359–91. http://dx.doi.org/10.1137/20m1382386.
Zhao, Feiran, Xingyun Fu, and Keyou You. "Globally Convergent Policy Gradient Methods for Linear Quadratic Control of Partially Observed Systems." IFAC-PapersOnLine 56, no. 2 (2023): 5506–11. http://dx.doi.org/10.1016/j.ifacol.2023.10.208.
Chen, Yan, and Tao Li. "Convergence of Policy Gradient Methods for Nash Equilibria in General-sum Stochastic Games." IFAC-PapersOnLine 56, no. 2 (2023): 3435–40. http://dx.doi.org/10.1016/j.ifacol.2023.10.1494.
Giegrich, Michael, Christoph Reisinger, and Yufei Zhang. "Convergence of Policy Gradient Methods for Finite-Horizon Exploratory Linear-Quadratic Control Problems." SIAM Journal on Control and Optimization 62, no. 2 (March 22, 2024): 1060–92. http://dx.doi.org/10.1137/22m1533517.
Ecoffet, Paul, Nicolas Fontbonne, Jean-Baptiste André, and Nicolas Bredeche. "Policy search with rare significant events: Choosing the right partner to cooperate with." PLOS ONE 17, no. 4 (April 26, 2022): e0266841. http://dx.doi.org/10.1371/journal.pone.0266841.
Li, Shilei, Meng Li, Jiongming Su, Shaofei Chen, Zhimin Yuan, and Qing Ye. "PP-PG: Combining Parameter Perturbation with Policy Gradient Methods for Effective and Efficient Explorations in Deep Reinforcement Learning." ACM Transactions on Intelligent Systems and Technology 12, no. 3 (May 16, 2021): 1–21. http://dx.doi.org/10.1145/3452008.
Chen, Haokun, Xinyi Dai, Han Cai, Weinan Zhang, Xuejian Wang, Ruiming Tang, Yuzhou Zhang, and Yong Yu. "Large-Scale Interactive Recommendation with Tree-Structured Policy Gradient." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 3312–20. http://dx.doi.org/10.1609/aaai.v33i01.33013312.
Li, Chengzhengxu, Xiaoming Liu, Yichen Wang, Duyi Li, Yu Lan, and Chao Shen. "Dialogue for Prompting: A Policy-Gradient-Based Discrete Prompt Generation for Few-Shot Learning." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 16 (March 24, 2024): 18481–89. http://dx.doi.org/10.1609/aaai.v38i16.29809.
Chung, Hoon, Sung Joo Lee, Hyeong Bae Jeon, and Jeon Gue Park. "Semi-Supervised Speech Recognition Acoustic Model Training Using Policy Gradient." Applied Sciences 10, no. 10 (May 20, 2020): 3542. http://dx.doi.org/10.3390/app10103542.
Hu, Bin, Kaiqing Zhang, Na Li, Mehran Mesbahi, Maryam Fazel, and Tamer Başar. "Toward a Theoretical Foundation of Policy Optimization for Learning Control Policies." Annual Review of Control, Robotics, and Autonomous Systems 6, no. 1 (May 3, 2023): 123–58. http://dx.doi.org/10.1146/annurev-control-042920-020021.
Guo, Xin, Anran Hu, and Junzi Zhang. "Theoretical Guarantees of Fictitious Discount Algorithms for Episodic Reinforcement Learning and Global Convergence of Policy Gradient Methods." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 6 (June 28, 2022): 6774–82. http://dx.doi.org/10.1609/aaai.v36i6.20633.
Lou, Xingzhou, Junge Zhang, Timothy J. Norman, Kaiqi Huang, and Yali Du. "TAPE: Leveraging Agent Topology for Cooperative Multi-Agent Policy Gradient." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 16 (March 24, 2024): 17496–504. http://dx.doi.org/10.1609/aaai.v38i16.29699.
Zeng, Fanyu, and Chen Wang. "Visual Navigation with Asynchronous Proximal Policy Optimization in Artificial Agents." Journal of Robotics 2020 (October 14, 2020): 1–7. http://dx.doi.org/10.1155/2020/8702962.
Doya, Kenji. "Reinforcement Learning in Continuous Time and Space." Neural Computation 12, no. 1 (January 1, 2000): 219–45. http://dx.doi.org/10.1162/089976600300015961.
Morimura, Tetsuro, Eiji Uchibe, Junichiro Yoshimoto, Jan Peters, and Kenji Doya. "Derivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning." Neural Computation 22, no. 2 (February 2010): 342–76. http://dx.doi.org/10.1162/neco.2009.12-08-922.
Zhou, Zixian, Mengda Huang, Feiyang Pan, Jia He, Xiang Ao, Dandan Tu, and Qing He. "Gradient-Adaptive Pareto Optimization for Constrained Reinforcement Learning." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 9 (June 26, 2023): 11443–51. http://dx.doi.org/10.1609/aaai.v37i9.26353.
Kong, Rui, Chenyang Wu, and Zongzhang Zhang. "Generalizable Policy Improvement via Reinforcement Sampling (Student Abstract)." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 21 (March 24, 2024): 23546–47. http://dx.doi.org/10.1609/aaai.v38i21.30466.
Chen, Tianjian, Zhanpeng He, and Matei Ciocarlie. "Co-designing hardware and control for robot hands." Science Robotics 6, no. 54 (May 12, 2021): eabg2133. http://dx.doi.org/10.1126/scirobotics.abg2133.
Vasilaki, Eleni, Nicolas Frémaux, Robert Urbanczik, Walter Senn, and Wulfram Gerstner. "Spike-Based Reinforcement Learning in Continuous State and Action Space: When Policy Gradient Methods Fail." PLoS Computational Biology 5, no. 12 (December 4, 2009): e1000586. http://dx.doi.org/10.1371/journal.pcbi.1000586.
Lincoln, Richard, Stuart Galloway, Bruce Stephen, and Graeme Burt. "Comparing Policy Gradient and Value Function Based Reinforcement Learning Methods in Simulated Electrical Power Trade." IEEE Transactions on Power Systems 27, no. 1 (February 2012): 373–80. http://dx.doi.org/10.1109/tpwrs.2011.2166091.
Zhang, Haifei, Jian Xu, and Jianlin Qiu. "An Automatic Driving Control Method Based on Deep Deterministic Policy Gradient." Wireless Communications and Mobile Computing 2022 (January 24, 2022): 1–9. http://dx.doi.org/10.1155/2022/7739440.
Yang, Long, Qian Zheng, and Gang Pan. "Sample Complexity of Policy Gradient Finding Second-Order Stationary Points." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 12 (May 18, 2021): 10630–38. http://dx.doi.org/10.1609/aaai.v35i12.17271.
Wu, Xiaoxia, Yuege Xie, Simon Shaolei Du, and Rachel Ward. "AdaLoss: A Computationally-Efficient and Provably Convergent Adaptive Gradient Method." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (June 28, 2022): 8691–99. http://dx.doi.org/10.1609/aaai.v36i8.20848.
Sanghvi, Navyata, Shinnosuke Usami, Mohit Sharma, Joachim Groeger, and Kris Kitani. "Inverse Reinforcement Learning with Explicit Policy Estimates." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 11 (May 18, 2021): 9472–80. http://dx.doi.org/10.1609/aaai.v35i11.17141.
Farsang, Mónika, and Luca Szegletes. "Controlling Agents by Constrained Policy Updates." System Theory, Control and Computing Journal 1, no. 2 (December 31, 2021): 33–39. http://dx.doi.org/10.52846/stccj.2021.1.2.24.
Mutti, Mirco, Lorenzo Pratissoli, and Marcello Restelli. "Task-Agnostic Exploration via Policy Gradient of a Non-Parametric State Entropy Estimate." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 10 (May 18, 2021): 9028–36. http://dx.doi.org/10.1609/aaai.v35i10.17091.
Fosse, E., M. K. Helgesen, S. Hagen, and S. Torp. "Addressing the social determinants of health at the local level: Opportunities and challenges." Scandinavian Journal of Public Health 46, no. 20_suppl (February 2018): 47–52. http://dx.doi.org/10.1177/1403494817743896.
Wu, Runjia, Fangqing Gu, Hai-lin Liu, and Hongjian Shi. "UAV Path Planning Based on Multicritic-Delayed Deep Deterministic Policy Gradient." Wireless Communications and Mobile Computing 2022 (March 14, 2022): 1–12. http://dx.doi.org/10.1155/2022/9017079.
Zhou, Conghang, Jianxing Li, Yujing Shi, and Zhirui Lin. "Research on Multi-Robot Formation Control Based on MATD3 Algorithm." Applied Sciences 13, no. 3 (January 31, 2023): 1874. http://dx.doi.org/10.3390/app13031874.
Gao, Tianhan, Shen Gao, Jun Xu, and Qihui Zhao. "DDRCN: Deep Deterministic Policy Gradient Recommendation Framework Fused with Deep Cross Networks." Applied Sciences 13, no. 4 (February 16, 2023): 2555. http://dx.doi.org/10.3390/app13042555.
Long, Yun, Youfei Lu, Hongwei Zhao, Renbo Wu, Tao Bao, and Jun Liu. "Multilayer Deep Deterministic Policy Gradient for Static Safety and Stability Analysis of Novel Power Systems." International Transactions on Electrical Energy Systems 2023 (April 21, 2023): 1–14. http://dx.doi.org/10.1155/2023/4295384.