
Journal articles on the topic "Action algorithms"


Consult the top 50 journal articles for research on the topic "Action algorithms".


1

Moraes, Rubens O., Mario A. Nascimento, and Levi H. S. Lelis. "Asymmetric Action Abstractions for Planning in Real-Time Strategy Games." Journal of Artificial Intelligence Research 75 (November 30, 2022): 1103–37. http://dx.doi.org/10.1613/jair.1.13769.

Abstract:
Action abstractions restrict the number of legal actions available for real-time planning in zero-sum extensive-form games, thus allowing algorithms to focus their search on a set of promising actions. Even though unabstracted game trees can lead to optimal policies, due to real-time constraints and the tree size, they are not a practical choice. In this context, we introduce an action abstraction scheme which we call asymmetric action abstraction. Asymmetric abstractions allow search algorithms to “pay more attention” to some aspects of the game by unevenly dividing the algorithm’s search eff
2

Geißer, Florian, David Speck, and Thomas Keller. "Trial-Based Heuristic Tree Search for MDPs with Factored Action Spaces." Proceedings of the International Symposium on Combinatorial Search 11, no. 1 (2021): 38–47. http://dx.doi.org/10.1609/socs.v11i1.18533.

Abstract:
MDPs with factored action spaces, i.e., where actions are described as assignments to a set of action variables, allow reasoning over action variables instead of action states, yet most algorithms only consider a grounded action representation. This includes algorithms that are instantiations of the Trial-based Heuristic Tree Search (THTS) framework, such as AO* or UCT. To be able to reason over factored action spaces, we propose a generalization of THTS where nodes that branch over all applicable actions are replaced with subtrees that consist of nodes that represent the decision for a single
3

Gite, Shilpa, and Himanshu Agrawal. "Early Prediction of Driver's Action Using Deep Neural Networks." International Journal of Information Retrieval Research 9, no. 2 (2019): 11–27. http://dx.doi.org/10.4018/ijirr.2019040102.

Abstract:
Intelligent transportation systems (ITSs) are one of the most widely discussed and researched topics across the world. The researchers have focused on the early prediction of a driver's movements before drivers actually perform actions, which might prompt a driver to take a corrective action while driving and thus avoid the risk of an accident. This article presents an improved deep-learning technique to predict a driver's action before he performs that action, a few seconds in advance. This is considering both the inside context (of the driver) and the outside context (of the road), and fuse
4

Fathi, Yahya, and Craig Tovey. "Affirmative action algorithms." Mathematical Programming 34, no. 3 (1986): 292–301. http://dx.doi.org/10.1007/bf01582232.

5

Wu, Songjiao. "Image Recognition of Standard Actions in Sports Videos Based on Feature Fusion." Traitement du Signal 38, no. 6 (2021): 1801–7. http://dx.doi.org/10.18280/ts.380624.

Abstract:
Standard actions are crucial to sports training of athletes and daily exercise of ordinary people. There are two key issues in sports action recognition: the extraction of sports action features, and the classification of sports actions. The existing action recognition algorithms cannot work effectively on sports competitions, which feature high complexity, fine class granularity, and fast action speed. To solve the problem, this paper develops an image recognition method of standard actions in sports videos, which merges local and global features. Firstly, the authors combed through the funct
6

Moraes, Rubens, Julian Mariño, Levi Lelis, and Mario Nascimento. "Action Abstractions for Combinatorial Multi-Armed Bandit Tree Search." Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment 14, no. 1 (2018): 74–80. http://dx.doi.org/10.1609/aiide.v14i1.13018.

Abstract:
Search algorithms based on combinatorial multi-armed bandits (CMABs) are promising for dealing with state-space sequential decision problems. However, current CMAB-based algorithms do not scale to problem domains with very large action spaces, such as real-time strategy games played in large maps. In this paper we introduce CMAB-based search algorithms that use action abstraction schemes to reduce the action space considered during search. One of the approaches we introduce uses regular action abstractions (A1N), while the other two use asymmetric action abstractions (A2N and A3N). Empirical r
7

Le, Hai S., Brendan Juba, and Roni Stern. "Learning Safe Action Models with Partial Observability." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 18 (2024): 20159–67. http://dx.doi.org/10.1609/aaai.v38i18.29995.

Abstract:
A common approach for solving planning problems is to model them in a formal language such as the Planning Domain Definition Language (PDDL), and then use an appropriate PDDL planner. Several algorithms for learning PDDL models from observations have been proposed but plans created with these learned models may not be sound. We propose two algorithms for learning PDDL models that are guaranteed to be safe to use even when given observations that include partially observable states. We analyze these algorithms theoretically, characterizing the sample complexity each algorithm requires to guaran
8

Hou, Yueqi, Xiaolong Liang, Jiaqiang Zhang, Qisong Yang, Aiwu Yang, and Ning Wang. "Exploring the Use of Invalid Action Masking in Reinforcement Learning: A Comparative Study of On-Policy and Off-Policy Algorithms in Real-Time Strategy Games." Applied Sciences 13, no. 14 (2023): 8283. http://dx.doi.org/10.3390/app13148283.

Abstract:
Invalid action masking is a practical technique in deep reinforcement learning to prevent agents from taking invalid actions. Existing approaches rely on action masking during policy training and utilization. This study focuses on developing reinforcement learning algorithms that incorporate action masking during training but can be used without action masking during policy execution. The study begins by conducting a theoretical analysis to elucidate the distinction between naive policy gradient and invalid action policy gradient. Based on this analysis, we demonstrate that the naive policy gr
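As a rough illustration of the invalid action masking idea named in the abstract above, the following minimal sketch shows the commonly used form of the technique: logits of invalid actions are replaced with negative infinity before the softmax, so those actions receive zero probability. This is an assumption-laden sketch, not the method of Hou et al.; the function name `masked_softmax` and its parameters are hypothetical and do not come from the paper.

```python
import math


def masked_softmax(logits, valid_mask):
    """Return action probabilities with invalid actions forced to zero.

    `logits` is a list of raw policy scores and `valid_mask` a list of
    booleans of the same length (both names are illustrative assumptions).
    """
    if len(logits) != len(valid_mask):
        raise ValueError("logits and valid_mask must have the same length")
    if not any(valid_mask):
        raise ValueError("at least one action must be valid")
    # Invalid actions get a logit of -inf, so exp(...) evaluates to 0.0.
    masked = [l if ok else -math.inf for l, ok in zip(logits, valid_mask)]
    # Subtract the largest valid logit for numerical stability.
    top = max(masked)
    weights = [math.exp(l - top) for l in masked]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # The second action is marked invalid and ends up with probability 0.
    print(masked_softmax([2.0, 1.0, 0.5], [True, False, True]))
```

Masking at the logit level, rather than zeroing probabilities after the softmax, keeps the remaining probabilities normalized without a second renormalization step.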
9

Rani, Seema, and Saurabh Charaya. "Improving the Performance of OLSR in Wireless Networks using Reinforcement Learning Algorithms." International Journal on Recent and Innovation Trends in Computing and Communication 11, no. 7s (2023): 166–72. http://dx.doi.org/10.17762/ijritcc.v11i7s.6988.

Abstract:
The Optimized Link State Routing Protocol is a popular proactive routing protocol used in wireless mesh networks. However, like many routing protocols, OLSR can suffer from inefficiencies and suboptimal performance in certain network conditions. To address these issues, researchers have proposed using reinforcement learning algorithms to improve the routing decisions made by OLSR. This paper explores the use of three RL algorithms - Q-Learning, SARSA, and DQN - to improve the performance of OLSR. Each algorithm is described in detail, and their application to OLSR is explained. In particular,
10

Yang, Jianhua. "A Deep Learning and Clustering Extraction Mechanism for Recognizing the Actions of Athletes in Sports." Computational Intelligence and Neuroscience 2022 (March 24, 2022): 1–9. http://dx.doi.org/10.1155/2022/2663834.

Abstract:
In sports, the essence of a complete technical action is a complete information structure pattern and the athlete’s judgment of the action is actually the identification of the movement information structure pattern. Action recognition refers to the ability of the human brain to distinguish a perceived action from other actions and obtain predictive response information when it identifies and confirms it according to the constantly changing motion information on the field. Action recognition mainly includes two aspects: one is to obtain the required action information based on visual observati
11

Abdallah, S., and V. Lesser. "A Multiagent Reinforcement Learning Algorithm with Non-linear Dynamics." Journal of Artificial Intelligence Research 33 (December 17, 2008): 521–49. http://dx.doi.org/10.1613/jair.2628.

Abstract:
Several multiagent reinforcement learning (MARL) algorithms have been proposed to optimize agents' decisions. Due to the complexity of the problem, the majority of the previously developed MARL algorithms assumed agents either had some knowledge of the underlying game (such as Nash equilibria) and/or observed other agents' actions and the rewards they received. We introduce a new MARL algorithm called the Weighted Policy Learner (WPL), which allows agents to reach a Nash Equilibrium (NE) in benchmark 2-player-2-action games with minimum knowledge. Using WPL, the only feedback an agent needs is
12

Raghavan, Aswin, Saket Joshi, Alan Fern, Prasad Tadepalli, and Roni Khardon. "Planning in Factored Action Spaces with Symbolic Dynamic Programming." Proceedings of the AAAI Conference on Artificial Intelligence 26, no. 1 (2021): 1802–8. http://dx.doi.org/10.1609/aaai.v26i1.8364.

Abstract:
We consider symbolic dynamic programming (SDP) for solving Markov Decision Processes (MDP) with factored state and action spaces, where both states and actions are described by sets of discrete variables. Prior work on SDP has considered only the case of factored states and ignored structure in the action space, causing them to scale poorly in terms of the number of action variables. Our main contribution is to present the first SDP-based planning algorithm for leveraging both state and action space structure in order to compute compactly represented value functions and policies. Since our new
13

Alexander-Reindorf, Nii-Emil, and Paul Cotae. "Collaborative Cost Multi-Agent Decision-Making Algorithm with Factored-Value Monte Carlo Tree Search and Max-Plus." Games 14, no. 6 (2023): 75. http://dx.doi.org/10.3390/g14060075.

Abstract:
In this paper, we describe the Factored Value MCTS Hybrid Cost-Max-Plus algorithm, a collection of decision-making algorithms (centralized, decentralized, and hybrid) for a multi-agent system in a collaborative setting that considers action costs. Our proposed algorithm is made up of two steps. In the first step, each agent searches for the best individual actions with the lowest cost using the Monte Carlo Tree Search (MCTS) algorithm. Each agent’s most promising activities are chosen and presented to the team. The Hybrid Cost Max-Plus method is utilized for joint action selection in the secon
14

Abdelrazik, Mostafa A., Abdelhaliem Zekry, and Wael A. Mohamed. "Efficient Hybrid Algorithm for Human Action Recognition." Journal of Image and Graphics 11, no. 1 (2023): 72–81. http://dx.doi.org/10.18178/joig.11.1.72-81.

Abstract:
Recently, researchers have sought to find the ideal way to recognize human actions through video using artificial intelligence due to the multiplicity of applications that rely on it in many fields. In general, the methods have been divided into traditional methods and deep learning methods, which have provided a qualitative leap in the field of computer vision. Convolutional neural networks (CNNs) and recurrent neural networks (RNNs) are the most popular algorithms used with images and video. The researchers combined the two algorithms to search for the best results in a lot of research. In an attemp
15

Christiansen, Alan D., and Kenneth Y. Goldberg. "Comparing two algorithms for automatic planning by robots in stochastic environments." Robotica 13, no. 6 (1995): 565–73. http://dx.doi.org/10.1017/s0263574700018646.

Abstract:
Planning a sequence of robot actions is especially difficult when the outcome of actions is uncertain, as is inevitable when interacting with the physical environment. In this paper we consider the case of finite state and action spaces where actions can be modeled as Markov transitions. Finding a plan that achieves a desired state with maximum probability is known to be an NP-Complete problem. We consider two algorithms: an exponential-time algorithm that maximizes probability, and a polynomial-time algorithm that maximizes a lower bound on the probability. As these algorithms trade of
16

Yu, Xiaoyang, Youfang Lin, Shuo Wang, and Sheng Han. "Solving Action Semantic Conflict in Physically Heterogeneous Multi-Agent Reinforcement Learning with Generalized Action-Prediction Optimization." Applied Sciences 15, no. 5 (2025): 2580. https://doi.org/10.3390/app15052580.

Abstract:
Traditional multi-agent reinforcement learning (MARL) algorithms typically implement global parameter sharing across various types of heterogeneous agents without meticulously differentiating between different action semantics. This approach results in the action semantic conflict problem, which decreases the generalization ability of policy networks across heterogeneous types of agents and decreases the cooperation among agents in intricate scenarios. Conversely, completely independent agent parameters significantly escalate computational costs and training complexity. To address these challe
17

Guo, Yifan, and Zhiping Liu. "UAV Path Planning Based on Deep Reinforcement Learning." International Journal of Advanced Network, Monitoring and Controls 8, no. 3 (2023): 81–88. http://dx.doi.org/10.2478/ijanmc-2023-0068.

Abstract:
Path planning is one of the most important aspects of UAV navigation control; it refers to the UAV searching for an optimal or near-optimal route from the starting point to the end point according to performance indexes such as time, distance, etc. The path planning problem has a long history and a rich set of algorithms, but most of the current algorithms require a known environment; however, in most cases, the environment model is difficult to describe and obtain, and the algorithms perform less
18

Rodrigues, Nelson R. P., Nuno M. C. da Costa, César Melo, et al. "Fusion Object Detection and Action Recognition to Predict Violent Action." Sensors 23, no. 12 (2023): 5610. http://dx.doi.org/10.3390/s23125610.

Abstract:
In the context of Shared Autonomous Vehicles, the need to monitor the environment inside the car will be crucial. This article focuses on the application of deep learning algorithms to present a fusion monitoring solution that combines three different algorithms: a violent action detection system, which recognizes violent behaviors between passengers, a violent object detection system, and a lost items detection system. Public datasets were used for object detection algorithms (COCO and TAO) to train state-of-the-art algorithms such as YOLOv5. For violent action detection, the MoLa InCar dataset w
19

Wu, Yuchuan, Shengfeng Qi, Feng Hu, Shuangbao Ma, Wen Mao, and Wei Li. "Recognizing activities of the elderly using wearable sensors: a comparison of ensemble algorithms based on boosting." Sensor Review 39, no. 6 (2019): 743–51. http://dx.doi.org/10.1108/sr-11-2018-0309.

Abstract:
Purpose: In human action recognition based on wearable sensors, most previous studies have focused on a single type of sensor and single classifier. This study aims to use a wearable sensor based on flexible sensors and a tri-axial accelerometer to collect action data of elderly people. It uses a statistical modeling approach based on the ensemble algorithm to classify actions and verify its validity. Design/methodology/approach: Nine types of daily actions were collected by the wearable sensor device from a group of elderly volunteers, and the time-domain features of the action sequences were e
20

Amir, E., and A. Chang. "Learning Partially Observable Deterministic Action Models." Journal of Artificial Intelligence Research 33 (November 20, 2008): 349–402. http://dx.doi.org/10.1613/jair.2575.

Abstract:
We present exact algorithms for identifying deterministic-actions' effects and preconditions in dynamic partially observable domains. They apply when one does not know the action model (the way actions affect the world) of a domain and must learn it from partial observations over time. Such scenarios are common in real world applications. They are challenging for AI tasks because traditional domain structures that underlie tractability (e.g., conditional independence) fail there (e.g., world features become correlated). Our work departs from traditional assumptions about partial observations and
21

Mudge, Michael E., and J. P. Killingbeck. "Microcomputer Algorithms: Action for Algebra." Mathematical Gazette 76, no. 476 (1992): 305. http://dx.doi.org/10.2307/3619164.

22

Huang, Pan, Yanping Li, Xiaoyi Lv, Wen Chen, and Shuxian Liu. "Recognition of Common Non-Normal Walking Actions Based on Relief-F Feature Selection and Relief-Bagging-SVM." Sensors 20, no. 5 (2020): 1447. http://dx.doi.org/10.3390/s20051447.

Abstract:
Action recognition algorithms are widely used in the fields of medical health and pedestrian dead reckoning (PDR). The classification and recognition of non-normal walking actions and normal walking actions are very important for improving the accuracy of medical health indicators and PDR steps. Existing motion recognition algorithms focus on the recognition of normal walking actions, and the recognition of non-normal walking actions common to daily life is incomplete or inaccurate, resulting in a low overall recognition accuracy. This paper proposes a microelectromechanical system (MEMS) acti
23

Kim, Beomjoon, Kyungjae Lee, Sungbin Lim, Leslie Kaelbling, and Tomas Lozano-Perez. "Monte Carlo Tree Search in Continuous Spaces Using Voronoi Optimistic Optimization with Regret Bounds." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 06 (2020): 9916–24. http://dx.doi.org/10.1609/aaai.v34i06.6546.

Abstract:
Many important applications, including robotics, data-center management, and process control, require planning action sequences in domains with continuous state and action spaces and discontinuous objective functions. Monte Carlo tree search (MCTS) is an effective strategy for planning in discrete action spaces. We provide a novel MCTS algorithm (voot) for deterministic environments with continuous action spaces, which, in turn, is based on a novel black-box function-optimization algorithm (voo) to efficiently sample actions. The voo algorithm uses Voronoi partitioning to guide sampling, and i
24

Abduljabbar Ali, Mohammed, Abir Jaafar Hussain, and Ahmed T. Sadiq. "Deep Learning Algorithms for Human Fighting Action Recognition." International Journal of Online and Biomedical Engineering (iJOE) 18, no. 02 (2022): 71–87. http://dx.doi.org/10.3991/ijoe.v18i02.28019.

Abstract:
Human action recognition using skeletons has been employed in various applications, including healthcare robots, human-computer interaction, and surveillance systems. Recently, deep learning systems have been used in various applications, such as object classification. In contrast to conventional techniques, one of the most prominent convolutional neural network deep learning algorithms extracts image features from its operations. Machine learning in computer vision applications faces many challenges, including human action recognition in real time. Despite significant improvements, videos a
25

Xuan, Zifeng, Yunfei Liu, and Xinxin Peng. "Improved Q-learning Algorithm to Solve the Permutation Flow Shop Scheduling Problem." International Journal of Mechanical and Electrical Engineering 2, no. 3 (2024): 63–68. http://dx.doi.org/10.62051/ijmee.v2n3.07.

Abstract:
A modified Q-learning algorithm is proposed for the permutation flow shop scheduling problem. This algorithm initializes the environment with the job sequence and considers each processable job as an executable action. A reward function is defined as the reciprocal of the completion time. Moreover, the completion time is calculated using the principle of diagonalization of a two-dimensional matrix, significantly enhancing computational efficiency. The Boltzmann action exploration strategy is designed, where the probability of selecting an action decreases as the temperature coefficient T decre
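The abstract above mentions a Boltzmann action exploration strategy controlled by a temperature coefficient T, but it is truncated before the details. The sketch below shows only the textbook Boltzmann (softmax) exploration rule, under the assumption that actions are scored by Q-values; it is not the modified strategy of Xuan et al., and the names `boltzmann_probabilities` and `sample_action` are hypothetical.

```python
import math
import random


def boltzmann_probabilities(q_values, temperature):
    """Softmax over Q-values divided by a temperature T > 0.

    A high T gives near-uniform probabilities (more exploration);
    a low T concentrates probability on the highest Q-value.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum Q-value before exponentiating for stability.
    top = max(q_values)
    weights = [math.exp((q - top) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def sample_action(q_values, temperature, rng=random):
    """Draw one action index according to the Boltzmann probabilities."""
    probs = boltzmann_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


if __name__ == "__main__":
    q = [1.0, 2.0, 3.0]
    print(boltzmann_probabilities(q, 100.0))  # close to uniform
    print(boltzmann_probabilities(q, 0.1))    # dominated by the last action
    print(sample_action(q, 1.0))
```

In practice the temperature is usually lowered over the course of training, so that early episodes explore broadly and later episodes mostly exploit the learned Q-values.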
26

Chen, Chen, Hongyao Tang, Jianye Hao, Wulong Liu, and Zhaopeng Meng. "Addressing Action Oscillations through Learning Policy Inertia." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 8 (2021): 7020–27. http://dx.doi.org/10.1609/aaai.v35i8.16864.

Abstract:
Deep reinforcement learning (DRL) algorithms have been demonstrated to be effective on a wide range of challenging decision making and control tasks. However, these methods typically suffer from severe action oscillations in particular in discrete action setting, which means that agents select different actions within consecutive steps even though states only slightly differ. This issue is often neglected since we usually evaluate the quality of a policy using cumulative rewards only. Action oscillation strongly affects the user experience and even causes serious potential security menace espe
27

Langlois, Eric D., and Tom Everitt. "How RL Agents Behave When Their Actions Are Modified." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 13 (2021): 11586–94. http://dx.doi.org/10.1609/aaai.v35i13.17378.

Abstract:
Reinforcement learning in complex environments may require supervision to prevent the agent from attempting dangerous actions. As a result of supervisor intervention, the executed action may differ from the action specified by the policy. How does this affect learning? We present the Modified-Action Markov Decision Process, an extension of the MDP model that allows actions to differ from the policy. We analyze the asymptotic behaviours of common reinforcement learning algorithms in this setting and show that they adapt in different ways: some completely ignore modifications while others go to
28

Lee, Joongkyu, Seung Joon Park, Yunhao Tang, and Min-hwan Oh. "Learning Uncertainty-Aware Temporally-Extended Actions." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 12 (2024): 13391–99. http://dx.doi.org/10.1609/aaai.v38i12.29241.

Abstract:
In reinforcement learning, temporal abstraction in the action space, exemplified by action repetition, is a technique to facilitate policy learning through extended actions. However, a primary limitation in previous studies of action repetition is its potential to degrade performance, particularly when sub-optimal actions are repeated. This issue often negates the advantages of action repetition. To address this, we propose a novel algorithm named Uncertainty-aware Temporal Extension (UTE). UTE employs ensemble methods to accurately measure uncertainty during action extension. This feature all
29

Rai, Ankush, and Jagadeesh Kannan R. "A Review on Machine Learning Algorithms on Human Action Recognition." Asian Journal of Pharmaceutical and Clinical Research 10, no. 13 (2017): 406. http://dx.doi.org/10.22159/ajpcr.2017.v10s1.19977.

Abstract:
Human action recognition is a vital field of computer vision research. Its applications incorporate observation frameworks, patient monitoring frameworks, and an assortment of frameworks that include interactions between persons and electronic gadgets, for example, human-computer interfaces. The vast majority of these applications require an automated recognition of abnormal or anomalistic action states, made out of various straightforward (or nuclear) actions of persons. This study gives an overview of different best in class research papers on human movement recognition. Open datasets intend
30

Yamauchi, Sho, and Keiji Suzuki. "Algorithm for Base Action Set Generation Focusing on Undiscovered Sensor Values." Applied Sciences 9, no. 1 (2019): 161. http://dx.doi.org/10.3390/app9010161.

Abstract:
Previous machine learning algorithms use a given base action set designed by hand or enable locomotion for a complicated task through trial and error processes with a sophisticated reward function. These generated actions are designed for a specific task, which makes it difficult to apply them to other tasks. This paper proposes an algorithm to obtain a base action set that does not depend on specific tasks and that is usable universally. The proposed algorithm enables as much interoperability among multiple tasks and machine learning methods as possible. A base action set that effectively cha
31

Yuan, Yuyu, Pengqian Zhao, Ting Guo, and Hongpu Jiang. "Counterfactual-Based Action Evaluation Algorithm in Multi-Agent Reinforcement Learning." Applied Sciences 12, no. 7 (2022): 3439. http://dx.doi.org/10.3390/app12073439.

Abstract:
Multi-agent reinforcement learning (MARL) algorithms have made great achievements in various scenarios, but there are still many problems in solving sequential social dilemmas (SSDs). In SSDs, the agent’s actions not only change the instantaneous state of the environment but also affect the latent state which will, in turn, affect all agents. However, most of the current reinforcement learning algorithms focus on analyzing the value of instantaneous environment state while ignoring the study of the latent state, which is very important for establishing cooperation. Therefore, we propose a nove
32

Qiu, Xianxu, Haiming Huang, Weiwei Chen, Qiuzhen Lin, Wei-Neng Chen, and Fuchun Sun. "Evolutionary Reinforcement Learning with Parameterized Action Primitives for Diverse Manipulation Tasks." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 14 (2025): 14655–63. https://doi.org/10.1609/aaai.v39i14.33606.

Abstract:
Reinforcement learning (RL) has shown promising performance in tackling robotic manipulation tasks (RMTs), which require learning a prolonged sequence of manipulation actions to control robots efficiently. However, most RL algorithms often suffer from two problems when solving RMTs: inefficient exploration due to the extremely large action space and catastrophic forgetting due to the poor sampling efficiency. To alleviate these problems, this paper introduces an Evolutionary Reinforcement Learning algorithm with parameterized Action Primitives, called ERLAP, which combines the advantages of an
33

Li, Lei, and Tingting Yang. "Reconstruction of physical dance teaching content and movement recognition based on a machine learning model." 3C TIC: Cuadernos de desarrollo aplicados a las TIC 12, no. 1 (2023): 267–85. http://dx.doi.org/10.17993/3ctic.2023.121.267-285.

Abstract:
With the technological development of movement recognition based on machine learning model algorithms, the content and movements for physical dance teaching are also seeking changes and innovations. In this paper, a set of three-dimensional convolutional neural network recognition algorithms based on a machine learning model is constructed through the collection to recognition of sports dance movement data. By collecting the skeleton information of typical movements of physical dance, a typical movement dataset of physical dance is constructed, which is recognized by the improved 3D convolutio
34

Bonet, Blai, and Hector Geffner. "Action Selection for MDPs: Anytime AO* Versus UCT." Proceedings of the AAAI Conference on Artificial Intelligence 26, no. 1 (2021): 1749–55. http://dx.doi.org/10.1609/aaai.v26i1.8369.

Abstract:
In the presence of non-admissible heuristics, A* and other best-first algorithms can be converted into anytime optimal algorithms over OR graphs, by simply continuing the search after the first solution is found. The same trick, however, does not work for best-first algorithms over AND/OR graphs, that must be able to expand leaf nodes of the explicit graph that are not necessarily part of the best partial solution. Anytime optimal variants of AO* must thus address an exploration-exploitation tradeoff: they cannot just “exploit”, they must keep exploring as well. In this work, we develop one su
35

Lee, Jongmin, Wonseok Jeon, Geon-Hyeong Kim, and Kee-Eung Kim. "Monte-Carlo Tree Search in Continuous Action Spaces with Value Gradients." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (2020): 4561–68. http://dx.doi.org/10.1609/aaai.v34i04.5885.

Abstract:
Monte-Carlo Tree Search (MCTS) is the state-of-the-art online planning algorithm for large problems with discrete action spaces. However, many real-world problems involve continuous action spaces, where MCTS is not as effective as in discrete action spaces. This is mainly due to common practices such as coarse discretization of the entire action space and failure to exploit local smoothness. In this paper, we introduce Value-Gradient UCT (VG-UCT), which combines traditional MCTS with gradient-based optimization of action particles. VG-UCT simultaneously performs a global search via UCT with re
36

Avoundjian, Tigran, Julia C. Dombrowski, Matthew R. Golden, et al. "Comparing Methods for Record Linkage for Public Health Action: Matching Algorithm Validation Study." JMIR Public Health and Surveillance 6, no. 2 (2020): e15917. http://dx.doi.org/10.2196/15917.

Abstract:
Background: Many public health departments use record linkage between surveillance data and external data sources to inform public health interventions. However, little guidance is available to inform these activities, and many health departments rely on deterministic algorithms that may miss many true matches. In the context of public health action, these missed matches lead to missed opportunities to deliver interventions and may exacerbate existing health inequities. Objective: This study aimed to compare the performance of record linkage algorithms commonly used in public health practice. Me
37

Wadhai, Prajwal Ashok. "Algolizer Using ReactJS." INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 08, no. 04 (2024): 1–5. http://dx.doi.org/10.55041/ijsrem30733.

Abstract:
The Algorithm Visualizer Project is an interactive and educational tool designed to illustrate various algorithms' functionality and efficiency through visual representations. Algorithms are fundamental to computer science, but their abstract nature can be challenging to comprehend. This project aims to bridge that gap by providing a user-friendly interface that visually demonstrates algorithms in action. The visualizer offers a platform where users can select from a range of algorithms, such as sorting (e.g., Bubble Sort, Merge Sort). Each algorithm is showcased step-by-step, allowing users t
38

Gillies, E. A., A. G. Y. Johnston, and C. R. McInnes. "Action Selection Algorithms for Autonomous Microspacecraft." Journal of Guidance, Control, and Dynamics 22, no. 6 (1999): 914–16. http://dx.doi.org/10.2514/2.4473.

39

WIESE, U. J. "CLUSTER ALGORITHM SOLUTION OF SIGN AND COMPLEX ACTION PROBLEMS." International Journal of Modern Physics B 17, no. 28 (2003): 5435–47. http://dx.doi.org/10.1142/s0217979203020545.

Abstract:
Numerical simulations of numerous quantum systems suffer from notorious sign or complex action problems. In such cases, the Boltzmann factors contributing to the path integral are in general not positive. As a consequence, standard Monte Carlo algorithms based on importance sampling fail. Meron-cluster algorithms realize a general strategy for solving sign problems by canceling explicitly all negative contributions. The remaining uncancelled positive contributions are then generated using importance sampling. The general nature of the sign problem is discussed and its solution with a meron-clu
40

Sharp, Graham R. "Algorithmic Recognition of Group Actions on Orbitals." LMS Journal of Computation and Mathematics 2 (1999): 1–27. http://dx.doi.org/10.1112/s146115700000005x.

Abstract:
An algorithm is given that recognises (in O(lN^2 log N) time, where N is the size of the input and l the depth of a precalculated Schreier tree) when a transitive group (G, Ω) is the action on one orbit of the action of G on the set Γ^(2) of ordered pairs of distinct elements of some G-set Γ (that is, Ω is isomorphic to an orbital of (G, Γ)). This may be adapted to list all essentially different such actions in O(lN^4 log N) time, where N is the sum of sizes of the input and output. This will be a useful tool for reducing the degree of a permutation group as an aid to further study of the gr
41

Liu, Jinsong, Chenghan Xie, Qi Deng, Dongdong Ge, and Yinyu Ye. "Sketched Newton Value Iteration for Large-Scale Markov Decision Processes." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 12 (2024): 13936–44. http://dx.doi.org/10.1609/aaai.v38i12.29301.

Abstract:
Value Iteration (VI) is one of the most classic algorithms for solving Markov Decision Processes (MDPs), which lays the foundations for various more advanced reinforcement learning algorithms, such as Q-learning. VI may take a large number of iterations to converge as it is a first-order method. In this paper, we introduce the Newton Value Iteration (NVI) algorithm, which eliminates the impact of action space dimension compared to some previous second-order methods. Consequently, NVI can efficiently handle MDPs with large action spaces. Building upon NVI, we propose a novel approach called Ske
42

Wang, Yu, Xiaoqing Chen, Jiaoqun Li, and Zengxiang Lu. "Convolutional Block Attention Module–Multimodal Feature-Fusion Action Recognition: Enabling Miner Unsafe Action Recognition." Sensors 24, no. 14 (2024): 4557. http://dx.doi.org/10.3390/s24144557.

Abstract:
The unsafe action of miners is one of the main causes of mine accidents. Research on underground miner unsafe action recognition based on computer vision enables relatively accurate real-time recognition of unsafe action among underground miners. A dataset called unsafe actions of underground miners (UAUM) was constructed and included ten categories of such actions. Underground images were enhanced using spatial- and frequency-domain enhancement algorithms. A combination of the YOLOX object detection algorithm and the Lite-HRNet human key-point detection algorithm was utilized to obtain skelet
43

Merlis, Nadav, and Shie Mannor. "Lenient Regret for Multi-Armed Bandits." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 10 (2021): 8950–57. http://dx.doi.org/10.1609/aaai.v35i10.17082.

Abstract:
We consider the Multi-Armed Bandit (MAB) problem, where an agent sequentially chooses actions and observes rewards for the actions it took. While the majority of algorithms try to minimize the regret, i.e., the cumulative difference between the reward of the best action and the agent's action, this criterion might lead to undesirable results. For example, in large problems, or when the interaction with the environment is brief, finding an optimal arm is infeasible, and regret-minimizing algorithms tend to over-explore. To overcome this issue, algorithms for such settings should instead focus o
44

Bequette, B. Wayne. "Glucose Clamp Algorithms and Insulin Time-Action Profiles." Journal of Diabetes Science and Technology 3, no. 5 (2009): 1005–13. http://dx.doi.org/10.1177/193229680900300503.

Abstract:
Motivation: Most current insulin pumps include an insulin-on-board (IOB) feature to help subjects avoid problems associated with “insulin stacking.” In addition, many control algorithms proposed for a closed-loop artificial pancreas make use of IOB to reduce the probability of hypoglycemic events that often occur due to the integral action of the controller. The IOB curves are generated from the pharmacodynamic (time-activity profiles) actions of subcutaneous insulin, which are obtained from glycemic clamp studies. Methods: Glycemic clamp algorithms are reviewed and in silico studies are perfo
45

Ke, Fengyi, and Qian Zhang. "Research on aerobics action modal recognition algorithm based on fuzzy system and reinforcement learning." Molecular & Cellular Biomechanics 21, no. 3 (2024): 645. http://dx.doi.org/10.62617/mcb645.

Abstract:
Nowadays, human movement recognition technology has received a high degree of attention and has been used in a variety of fields such as intelligent security and motion analysis. The traditional action recognition method relies on artificial extraction of features, not only the recognition efficiency is low, and the recognition accuracy is not high, has been unable to meet the requirements of action recognition. The action recognition method based on reinforcement learning can automatically extract features, greatly simplifying the process of manual feature extraction in the traditional method
46

Yang, Hangqi. "Analysis and study on path planning algorithms in the further mobile action." Journal of Physics: Conference Series 2824, no. 1 (2024): 012006. http://dx.doi.org/10.1088/1742-6596/2824/1/012006.

Abstract:
This study investigates the use of four common path planning algorithms - RRT (Rapidly-Exploring Random Tree), Dijkstra, A*, and ACO (Ant Colony Optimization) - for autonomous endpoint search in pre-defined grid mazes. Path planning is a crucial problem in robot navigation, involving the ability to find optimal paths with the minimum running time in complex environments. In this research, it firstly introduces the basic principles and mechanisms of each algorithm. RRT is a random sampling-based algorithm that explores the search space by generating a tree-like structure randomly. Base
47

Jiménez, Sergio, Anders Jonsson, and Héctor Palacios. "Temporal Planning With Required Concurrency Using Classical Planning." Proceedings of the International Conference on Automated Planning and Scheduling 25 (April 8, 2015): 129–37. http://dx.doi.org/10.1609/icaps.v25i1.13731.

Abstract:
In this paper we describe two novel algorithms for temporal planning. The first algorithm, TP, is an adaptation of the TEMPO algorithm. It compiles each temporal action into two classical actions, corresponding to the start and end of the temporal action, but handles the temporal constraints on actions through a modification of the Fast Downward planning system. The second algorithm, TPSHE, is a pure compilation from temporal to classical planning for the case in which required concurrency only appears in the form of single hard envelopes. We describe novel classes of temporal planning instanc
48

Niazi, Abdolkarim, Norizah Redzuan, Raja Ishak Raja Hamzah, and Sara Esfandiari. "Improvement on Supporting Machine Learning Algorithm for Solving Problem in Immediate Decision Making." Advanced Materials Research 566 (September 2012): 572–79. http://dx.doi.org/10.4028/www.scientific.net/amr.566.572.

Abstract:
In this paper, a new algorithm based on case base reasoning and reinforcement learning (RL) is proposed to increase the convergence rate of the reinforcement learning algorithms. RL algorithms are very useful for solving wide variety decision problems when their models are not available and they must make decision correctly in every state of system, such as multi agent systems, artificial control systems, robotic, tool condition monitoring and etc. In the propose method, we investigate how making improved action selection in reinforcement learning (RL) algorithm. In the proposed method, the ne
49

Mehvish and Ravinder Pal Singh. "Random Forest and Extreme Learning Machine Algorithms for High Accuracy Credit Card Fraud Detection." International Journal for Research in Applied Science and Engineering Technology 11, no. 9 (2023): 892–95. http://dx.doi.org/10.22214/ijraset.2023.55752.

Abstract:
The banking sector is facing a huge issue with credit card fraud, and research has shown that machine learning algorithms are a useful tool for identifying fraudulent actions of this kind. In this investigation, we offer a method for detecting fraudulent use of credit cards that makes use of a hybrid of two machine learning algorithms known as Random Forest (RF) and Extreme Learning Machine (ELM). We compiled a dataset using information obtained from a wide variety of sources, and then we preprocessed it to eliminate any inconsistencies and errors. Following this, the RF and ELM algo
50

Mukherjee, Shohin, and Maxim Likhachev. "GePA*SE: Generalized Edge-Based Parallel A* for Slow Evaluations." Proceedings of the International Symposium on Combinatorial Search 16, no. 1 (2023): 153–57. http://dx.doi.org/10.1609/socs.v16i1.27295.

Abstract:
Parallel search algorithms have been shown to improve planning speed by harnessing the multithreading capability of modern processors. One such algorithm PA*SE achieves this by parallelizing state expansions, whereas another algorithm ePA*SE achieves this by effectively parallelizing edge evaluations. ePA*SE targets domains in which the action space comprises actions with expensive but similar evaluation times. However, in a number of robotics domains, the action space is heterogenous in the computational effort required to evaluate the cost of an action and its outcome. Motivated by this, we