Journal articles on the topic "Action algorithms"

Consult the top 50 journal articles for your research on the topic "Action algorithms".

Next to every source in the list of references you will find an "Add to bibliography" button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the scholarly publication as a .pdf file and read its abstract online, whenever these details are available in the work's metadata.

Browse journal articles across a wide range of disciplines and compile your bibliography correctly.

1

Moraes, Rubens O., Mario A. Nascimento, and Levi H. S. Lelis. "Asymmetric Action Abstractions for Planning in Real-Time Strategy Games." Journal of Artificial Intelligence Research 75 (November 30, 2022): 1103–37. http://dx.doi.org/10.1613/jair.1.13769.

Full text
Abstract:
Action abstractions restrict the number of legal actions available for real-time planning in zero-sum extensive-form games, thus allowing algorithms to focus their search on a set of promising actions. Even though unabstracted game trees can lead to optimal policies, due to real-time constraints and the tree size, they are not a practical choice. In this context, we introduce an action abstraction scheme which we call asymmetric action abstraction. Asymmetric abstractions allow search algorithms to “pay more attention” to some aspects of the game by unevenly dividing the algorithm’s search effort amongst different aspects of the game. We also introduce four algorithms that search in asymmetrically abstracted game trees to evaluate the effectiveness of our abstraction schemes. Two of our algorithms are adaptations of algorithms developed for searching in action-abstracted spaces, Portfolio Greedy Search and Stratified Strategy Selection, and the other two are adaptations of an algorithm developed for searching in unabstracted spaces, NaïveMCTS. An extensive set of experiments in a real-time strategy game shows that search algorithms using asymmetric abstractions are able to outperform all other search algorithms tested.
2

Geißer, Florian, David Speck, and Thomas Keller. "Trial-Based Heuristic Tree Search for MDPs with Factored Action Spaces." Proceedings of the International Symposium on Combinatorial Search 11, no. 1 (2021): 38–47. http://dx.doi.org/10.1609/socs.v11i1.18533.

Full text
Abstract:
MDPs with factored action spaces, i.e., where actions are described as assignments to a set of action variables, allow reasoning over action variables instead of action states, yet most algorithms only consider a grounded action representation. This includes algorithms that are instantiations of the Trial-based Heuristic Tree Search (THTS) framework, such as AO* or UCT. To be able to reason over factored action spaces, we propose a generalization of THTS where nodes that branch over all applicable actions are replaced with subtrees that consist of nodes that represent the decision for a single action variable. We show that many THTS algorithms retain their theoretical properties under the generalised framework, and show how to approximate any state-action heuristic to a heuristic for partial action assignments. This makes it possible to guide a UCT variant that creates exponentially fewer nodes than the same algorithm applied to ground actions. An empirical evaluation on the benchmark set of the probabilistic track of the latest International Planning Competition validates the benefits of the approach.
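
A minimal sketch (ours, not the authors' code) of the branching idea: instead of one node that branches over all 2^n ground joint actions, the factored tree decides one action variable at a time, so a heuristic over partial assignments can prune early. The three binary action variables below are invented for illustration.

```python
from itertools import product

# Toy factored action space: a joint action assigns a value to each variable.
ACTION_VARS = [("move", (0, 1)), ("load", (0, 1)), ("signal", (0, 1))]

def ground_branching():
    """Grounded THTS branching: one child per ground joint action (2^n children)."""
    return list(product(*(dom for _, dom in ACTION_VARS)))

def factored_branching(partial=()):
    """Factored branching: a chain of decision nodes, one per action variable.
    Each node fixes a single variable, so there are 2n decision nodes per
    level instead of 2^n children of a single node."""
    if len(partial) == len(ACTION_VARS):
        yield partial                       # a complete joint action
        return
    _, domain = ACTION_VARS[len(partial)]
    for value in domain:
        yield from factored_branching(partial + (value,))

print(len(ground_branching()))              # 8 joint actions expanded at once
print(list(factored_branching()))           # same 8 actions, fixed variable by variable
```
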
3

Gite, Shilpa, and Himanshu Agrawal. "Early Prediction of Driver's Action Using Deep Neural Networks." International Journal of Information Retrieval Research 9, no. 2 (2019): 11–27. http://dx.doi.org/10.4018/ijirr.2019040102.

Full text
Abstract:
Intelligent transportation systems (ITSs) are one of the most widely discussed and researched topics across the world. Researchers have focused on the early prediction of a driver's movements before the driver actually performs an action, which might prompt a corrective action while driving and thus avoid the risk of an accident. This article presents an improved deep-learning technique to predict a driver's action a few seconds before it is performed, considering both the inside context (of the driver) and the outside context (of the road) and fusing them together to anticipate the action. To predict the driver's action accurately, the proposed work is inspired by recent developments in recurrent neural networks (RNNs) with long short-term memory (LSTM) algorithms. The performance of the proposed algorithm is compared with four other algorithms, and the results suggest that the proposed algorithm outperforms them across a range of performance metrics.
4

Fathi, Yahya, and Craig Tovey. "Affirmative action algorithms." Mathematical Programming 34, no. 3 (1986): 292–301. http://dx.doi.org/10.1007/bf01582232.

Full text
5

Wu, Songjiao. "Image Recognition of Standard Actions in Sports Videos Based on Feature Fusion." Traitement du Signal 38, no. 6 (2021): 1801–7. http://dx.doi.org/10.18280/ts.380624.

Full text
Abstract:
Standard actions are crucial to the sports training of athletes and the daily exercise of ordinary people. There are two key issues in sports action recognition: the extraction of sports action features, and the classification of sports actions. The existing action recognition algorithms cannot work effectively on sports competitions, which feature high complexity, fine class granularity, and fast action speed. To solve the problem, this paper develops an image recognition method for standard actions in sports videos, which merges local and global features. Firstly, the authors combed through the functions and performance required for the recognition of standard sports actions, and proposed an attention-based local feature extraction algorithm for the frames of sports match videos. Next, a sampling algorithm was developed based on time-space compression, and a standard sports action recognition algorithm was designed based on time-space feature fusion, with the aim to fuse the time-space features of the standard actions in sports match videos, and to overcome the underfitting problem of direct fusion of time-space features extracted by the attention mechanism. The workflow of these algorithms is explained in detail. Experimental results confirm the effectiveness of our approach.
6

Moraes, Rubens, Julian Mariño, Levi Lelis, and Mario Nascimento. "Action Abstractions for Combinatorial Multi-Armed Bandit Tree Search." Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment 14, no. 1 (2018): 74–80. http://dx.doi.org/10.1609/aiide.v14i1.13018.

Full text
Abstract:
Search algorithms based on combinatorial multi-armed bandits (CMABs) are promising for dealing with state-space sequential decision problems. However, current CMAB-based algorithms do not scale to problem domains with very large action spaces, such as real-time strategy games played on large maps. In this paper we introduce CMAB-based search algorithms that use action abstraction schemes to reduce the action space considered during search. One of the approaches we introduce uses regular action abstractions (A1N), while the other two use asymmetric action abstractions (A2N and A3N). Empirical results on MicroRTS show that A1N, A2N, and A3N are able to outperform an existing CMAB-based algorithm in matches played on large maps, and A3N is able to outperform all state-of-the-art search algorithms tested.
7

Le, Hai S., Brendan Juba, and Roni Stern. "Learning Safe Action Models with Partial Observability." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 18 (2024): 20159–67. http://dx.doi.org/10.1609/aaai.v38i18.29995.

Full text
Abstract:
A common approach for solving planning problems is to model them in a formal language such as the Planning Domain Definition Language (PDDL), and then use an appropriate PDDL planner. Several algorithms for learning PDDL models from observations have been proposed but plans created with these learned models may not be sound. We propose two algorithms for learning PDDL models that are guaranteed to be safe to use even when given observations that include partially observable states. We analyze these algorithms theoretically, characterizing the sample complexity each algorithm requires to guarantee probabilistic completeness. We also show experimentally that our algorithms are often better than FAMA, a state-of-the-art PDDL learning algorithm.
8

Hou, Yueqi, Xiaolong Liang, Jiaqiang Zhang, Qisong Yang, Aiwu Yang, and Ning Wang. "Exploring the Use of Invalid Action Masking in Reinforcement Learning: A Comparative Study of On-Policy and Off-Policy Algorithms in Real-Time Strategy Games." Applied Sciences 13, no. 14 (2023): 8283. http://dx.doi.org/10.3390/app13148283.

Full text
Abstract:
Invalid action masking is a practical technique in deep reinforcement learning to prevent agents from taking invalid actions. Existing approaches rely on action masking during policy training and utilization. This study focuses on developing reinforcement learning algorithms that incorporate action masking during training but can be used without action masking during policy execution. The study begins by conducting a theoretical analysis to elucidate the distinction between naive policy gradient and invalid action policy gradient. Based on this analysis, we demonstrate that the naive policy gradient is a valid gradient and is equivalent to the proposed composite objective algorithm, which optimizes both the masked policy and the original policy in parallel. Moreover, we propose an off-policy algorithm for invalid action masking that employs the masked policy for sampling while optimizing the original policy. To compare the effectiveness of these algorithms, experiments are conducted using a simplified real-time strategy (RTS) game simulator called Gym-μRTS. Based on empirical findings, we recommend utilizing the off-policy algorithm for addressing most tasks while employing the composite objective algorithm for handling more complex tasks.
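
A minimal PyTorch sketch of the two ingredients the abstract contrasts: sampling with the masked policy while computing the gradient of the original, unmasked policy, as in the off-policy variant described above. The logits, mask, and constant advantage are toy assumptions, not the paper's setup.

```python
import torch
import torch.nn.functional as F

def masked_policy(logits: torch.Tensor, valid: torch.Tensor) -> torch.Tensor:
    """Standard invalid action masking: set logits of invalid actions to -inf
    before the softmax, so they receive zero probability mass."""
    return F.softmax(logits.masked_fill(~valid, float("-inf")), dim=-1)

logits = torch.tensor([1.0, 2.0, 0.5, -1.0], requires_grad=True)
valid = torch.tensor([True, False, True, True])

probs = masked_policy(logits, valid)
action = torch.multinomial(probs, 1).item()    # sampling uses the masked policy

# Policy-gradient term evaluated on the *original* (unmasked) policy:
# sample with the masked policy, optimize the unmasked one.
log_prob_unmasked = F.log_softmax(logits, dim=-1)[action]
advantage = 1.0                                # placeholder advantage estimate
loss = -advantage * log_prob_unmasked
loss.backward()
print(probs, action, logits.grad)
```
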
9

Rani, Seema, and Saurabh Charaya. "Improving the Performance of OLSR in Wireless Networks using Reinforcement Learning Algorithms." International Journal on Recent and Innovation Trends in Computing and Communication 11, no. 7s (2023): 166–72. http://dx.doi.org/10.17762/ijritcc.v11i7s.6988.

Full text
Abstract:
The Optimized Link State Routing (OLSR) protocol is a popular proactive routing protocol used in wireless mesh networks. However, like many routing protocols, OLSR can suffer from inefficiencies and suboptimal performance in certain network conditions. To address these issues, researchers have proposed using reinforcement learning (RL) algorithms to improve the routing decisions made by OLSR. This paper explores the use of three RL algorithms - Q-Learning, SARSA, and DQN - to improve the performance of OLSR. Each algorithm is described in detail, and their application to OLSR is explained. In particular, the network is represented as a Markov decision process, where each node is a state, and each link between nodes is an action. The reward for taking an action is determined by the quality of the link, and the goal is to maximize the cumulative reward over a sequence of actions. Q-Learning is a simple and effective algorithm that estimates the value of each possible action in a given state. SARSA is a similar algorithm that takes into account the current policy when estimating the value of each action. DQN uses a neural network to approximate the Q-values of each action in a given state, providing more accurate estimates in complex network environments. Overall, all three RL algorithms can be used to improve the routing decisions made by OLSR. This paper provides a comprehensive overview of the application of RL algorithms to OLSR and highlights the potential benefits of using these algorithms to improve the performance of wireless networks.
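
To make the MDP framing concrete (nodes as states, forwarding links as actions, link quality as reward), here is a toy tabular Q-learning sketch; the four-node topology and link qualities are invented, and a real deployment would learn over OLSR's own topology information instead.

```python
import random

links = {                                  # (node, neighbour) -> link quality
    ("A", "B"): 0.9, ("A", "C"): 0.4,
    ("B", "D"): 0.8, ("C", "D"): 0.9,
}
neighbours = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
Q = {(s, a): 0.0 for s in neighbours for a in neighbours[s]}

alpha, gamma, eps = 0.1, 0.9, 0.2          # learning rate, discount, exploration

def step(state):
    """One epsilon-greedy Q-learning update along a forwarding decision."""
    acts = neighbours[state]
    if not acts:
        return None
    if random.random() < eps:
        a = random.choice(acts)
    else:
        a = max(acts, key=lambda x: Q[(state, x)])
    reward = links[(state, a)]             # link quality is the reward
    future = max((Q[(a, n)] for n in neighbours[a]), default=0.0)
    Q[(state, a)] += alpha * (reward + gamma * future - Q[(state, a)])
    return a

for _ in range(2000):                      # episodes from source A to sink D
    s = "A"
    while s is not None and s != "D":
        s = step(s)
print(max(neighbours["A"], key=lambda a: Q[("A", a)]))  # learned next hop
```
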
10

Yang, Jianhua. "A Deep Learning and Clustering Extraction Mechanism for Recognizing the Actions of Athletes in Sports." Computational Intelligence and Neuroscience 2022 (March 24, 2022): 1–9. http://dx.doi.org/10.1155/2022/2663834.

Full text
Abstract:
In sports, the essence of a complete technical action is a complete information structure pattern, and the athlete's judgment of the action is actually the identification of the movement information structure pattern. Action recognition refers to the ability of the human brain to distinguish a perceived action from other actions and obtain predictive response information when it identifies and confirms it according to the constantly changing motion information on the field. Action recognition mainly includes two aspects: one is to obtain the required action information based on visual observation, and the other is to judge the action based on the obtained action information, but the neuropsychological mechanism of this process is still unknown. In this paper, a new key frame extraction method based on the clustering algorithm and multifeature fusion is proposed for sports videos with complex content, many scenes, and rich actions. First, a variety of features are fused, and similarity measurement can then be used to describe videos with complex content more completely and comprehensively; second, a clustering algorithm is used to cluster sports video sequences according to scenes, avoiding the difficult and complicated shot segmentation detection otherwise needed when there are many scenes; third, extracting key frames according to the minimum motion standard can more accurately represent video content with rich actions. At the same time, the clustering algorithm used in this paper is improved to enhance the offline computing efficiency of the key frame extraction system. Based on an analysis of the advantages and disadvantages of the classical convolutional and recurrent neural network algorithms in deep learning, this paper proposes an improved convolutional network, optimized for the recognition and analysis of human actions in complex scenes with complex and fast actions, and compares it with recurrent and hybrid neural network algorithms. Experiments show that the algorithm approaches human observation of athletes' training execution and completion. Compared with other algorithms, it is verified to achieve a very high learning rate and accuracy for athlete action recognition.
11

Abdallah, S., and V. Lesser. "A Multiagent Reinforcement Learning Algorithm with Non-linear Dynamics." Journal of Artificial Intelligence Research 33 (December 17, 2008): 521–49. http://dx.doi.org/10.1613/jair.2628.

Full text
Abstract:
Several multiagent reinforcement learning (MARL) algorithms have been proposed to optimize agents' decisions. Due to the complexity of the problem, the majority of the previously developed MARL algorithms assumed agents either had some knowledge of the underlying game (such as Nash equilibria) and/or observed other agents' actions and the rewards they received. We introduce a new MARL algorithm called the Weighted Policy Learner (WPL), which allows agents to reach a Nash Equilibrium (NE) in benchmark 2-player-2-action games with minimum knowledge. Using WPL, the only feedback an agent needs is its own local reward (the agent does not observe other agents' actions or rewards). Furthermore, WPL does not assume that agents know the underlying game or the corresponding Nash Equilibrium a priori. We experimentally show that our algorithm converges in benchmark two-player-two-action games. We also show that our algorithm converges in the challenging Shapley's game, where previous MARL algorithms failed to converge without knowing the underlying game or the NE. Furthermore, we show that WPL outperforms the state-of-the-art algorithms in a more realistic setting of 100 agents interacting and learning concurrently. An important aspect of understanding the behavior of a MARL algorithm is analyzing the dynamics of the algorithm: how the policies of multiple learning agents evolve over time as agents interact with one another. Such an analysis not only verifies whether agents using a given MARL algorithm will eventually converge, but also reveals the behavior of the MARL algorithm prior to convergence. We analyze our algorithm in two-player-two-action games and show that symbolically proving WPL's convergence is difficult, because of the non-linear nature of WPL's dynamics, unlike previous MARL algorithms that had either linear or piece-wise-linear dynamics. Instead, we numerically solve WPL's dynamics differential equations and compare the solution to the dynamics of previous MARL algorithms.
12

Raghavan, Aswin, Saket Joshi, Alan Fern, Prasad Tadepalli, and Roni Khardon. "Planning in Factored Action Spaces with Symbolic Dynamic Programming." Proceedings of the AAAI Conference on Artificial Intelligence 26, no. 1 (2021): 1802–8. http://dx.doi.org/10.1609/aaai.v26i1.8364.

Full text
Abstract:
We consider symbolic dynamic programming (SDP) for solving Markov Decision Processes (MDP) with factored state and action spaces, where both states and actions are described by sets of discrete variables. Prior work on SDP has considered only the case of factored states and ignored structure in the action space, causing them to scale poorly in terms of the number of action variables. Our main contribution is to present the first SDP-based planning algorithm for leveraging both state and action space structure in order to compute compactly represented value functions and policies. Since our new algorithm can potentially require more space than when action structure is ignored, our second contribution is to describe an approach for smoothly trading-off space versus time via recursive conditioning. Finally, our third contribution is to introduce a novel SDP approximation that often significantly reduces planning time with little loss in quality by exploiting action structure in weakly coupled MDPs. We present empirical results in three domains with factored action spaces that show that our algorithms scale much better with the number of action variables as compared to state-of-the-art SDP algorithms.
13

Alexander-Reindorf, Nii-Emil, and Paul Cotae. "Collaborative Cost Multi-Agent Decision-Making Algorithm with Factored-Value Monte Carlo Tree Search and Max-Plus." Games 14, no. 6 (2023): 75. http://dx.doi.org/10.3390/g14060075.

Full text
Abstract:
In this paper, we describe the Factored Value MCTS Hybrid Cost-Max-Plus algorithm, a collection of decision-making algorithms (centralized, decentralized, and hybrid) for a multi-agent system in a collaborative setting that considers action costs. Our proposed algorithm is made up of two steps. In the first step, each agent searches for the best individual actions with the lowest cost using the Monte Carlo Tree Search (MCTS) algorithm. Each agent’s most promising activities are chosen and presented to the team. The Hybrid Cost Max-Plus method is utilized for joint action selection in the second step. The Hybrid Cost Max-Plus algorithm improves the well-known centralized and distributed Max-Plus algorithm by incorporating the cost of actions in agent interactions. The Max-Plus algorithm employed the Coordination Graph framework, which exploits agent dependencies to decompose the global payoff function as the sum of local terms. In terms of the number of agents and their interactions, the suggested Factored Value MCTS-Hybrid Cost-Max-Plus method is online, anytime, distributed, and scalable. Our contribution competes with state-of-the-art methodologies and algorithms by leveraging the locality of agent interactions for planning and acting utilizing MCTS and Max-Plus algorithms.
14

Abdelrazik, Mostafa A., Abdelhaliem Zekry, and Wael A. Mohamed. "Efficient Hybrid Algorithm for Human Action Recognition." Journal of Image and Graphics 11, no. 1 (2023): 72–81. http://dx.doi.org/10.18178/joig.11.1.72-81.

Full text
Abstract:
Recently, researchers have sought the ideal way to recognize human actions in video using artificial intelligence, owing to the multiplicity of applications that rely on it in many fields. In general, the methods are divided into traditional methods and deep learning methods, the latter of which have provided a qualitative leap in the field of computer vision. The convolutional neural network (CNN) and recurrent neural network (RNN) are the most popular algorithms used with images and video, and researchers have combined the two in many studies in search of better results. In an attempt to obtain improved results in motion recognition from video, we present in this paper a combined algorithm, which is divided into two main parts, CNN and RNN. In the first part, a preprocessing stage makes the video frames suitable as input for both CNN networks, which consist of a fusion of Inception-ResNet-V2 and GoogleNet to obtain activations using the previously trained weights of Inception-ResNet-V2 and GoogleNet; these are then passed to deep Gated Recurrent Units (GRUs) connected to a fully connected SoftMax layer to recognize and distinguish the human action in the video. The results show that the proposed algorithm gives a better accuracy of 97.97% on the UCF101 dataset and 73.12% on the HMDB51 dataset compared with those reported in the related literature.
15

Christiansen, Alan D., and Kenneth Y. Goldberg. "Comparing two algorithms for automatic planning by robots in stochastic environments." Robotica 13, no. 6 (1995): 565–73. http://dx.doi.org/10.1017/s0263574700018646.

Full text
Abstract:
Planning a sequence of robot actions is especially difficult when the outcome of actions is uncertain, as is inevitable when interacting with the physical environment. In this paper we consider the case of finite state and action spaces where actions can be modeled as Markov transitions. Finding a plan that achieves a desired state with maximum probability is known to be an NP-Complete problem. We consider two algorithms: an exponential-time algorithm that maximizes probability, and a polynomial-time algorithm that maximizes a lower bound on the probability. As these algorithms trade off plan time for plan quality, we compare their performance on a mechanical system for orienting parts. Our results lead us to identify two properties of stochastic actions that can be used to choose between these planning algorithms for other applications.
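
The quantity both algorithms optimize, the probability that a fixed plan reaches the goal state under Markov transitions, can be evaluated exactly by pushing the start distribution through the transition matrices. A small NumPy sketch with an invented 3-state, 2-action model:

```python
import numpy as np

# Toy stochastic domain: 3 states, 2 actions; T[a][s, s'] is the probability
# of moving from s to s' when executing action a (a Markov transition model).
T = [
    np.array([[0.7, 0.3, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.2, 0.8]]),
    np.array([[0.9, 0.1, 0.0],
              [0.0, 0.5, 0.5],
              [0.0, 0.0, 1.0]]),
]

def plan_success_probability(plan, start=0, goal=2):
    """Probability that executing `plan` from `start` ends in `goal`:
    a product of transition matrices applied to the start distribution."""
    dist = np.zeros(len(T[0]))
    dist[start] = 1.0
    for a in plan:
        dist = dist @ T[a]
    return dist[goal]

print(plan_success_probability([0, 1, 1]))   # evaluate one candidate plan
```
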
16

Yu, Xiaoyang, Youfang Lin, Shuo Wang, and Sheng Han. "Solving Action Semantic Conflict in Physically Heterogeneous Multi-Agent Reinforcement Learning with Generalized Action-Prediction Optimization." Applied Sciences 15, no. 5 (2025): 2580. https://doi.org/10.3390/app15052580.

Full text
Abstract:
Traditional multi-agent reinforcement learning (MARL) algorithms typically implement global parameter sharing across various types of heterogeneous agents without meticulously differentiating between different action semantics. This approach results in the action semantic conflict problem, which decreases the generalization ability of policy networks across heterogeneous types of agents and decreases the cooperation among agents in intricate scenarios. Conversely, completely independent agent parameters significantly escalate computational costs and training complexity. To address these challenges, we introduce an adaptive MARL algorithm named Generalized Action-Prediction Optimization (GAPO). First, we introduce the Generalized Action Space (GAS), which represents the union of all agent actions with distinct semantics. All agents first compute their unified representation in the GAS, and then generate their heterogeneous action policies with different available action masks. Second, in order to further improve cooperation between heterogeneous groups, we propose a Cross-Group Prediction (CGP) loss, which adaptively predicts the action policies of other groups by leveraging trajectory information. We integrate the GAPO into both value-based and policy-based MARL algorithms, giving rise to two practical algorithms: G-QMIX and G-MAPPO. Experimental results obtained within the SMAC, MPE, MAMuJoCo, and RPE environments demonstrate the superiority of G-QMIX and G-MAPPO over several state-of-the-art MARL methods, validating the effectiveness of our proposed adaptive generalized MARL approach.
17

Guo, Yifan, and Zhiping Liu. "UAV Path Planning Based on Deep Reinforcement Learning." International Journal of Advanced Network, Monitoring and Controls 8, no. 3 (2023): 81–88. http://dx.doi.org/10.2478/ijanmc-2023-0068.

Full text
Abstract:
Path planning is one of the most important aspects of UAV navigation control: the UAV searches for an optimal or near-optimal route from a starting point to an end point according to performance indexes such as time and distance. The path planning problem has a long history and a rich set of algorithms, but most current algorithms require a known environment; in most cases, however, the environment model is difficult to describe and obtain, and the algorithms perform less satisfactorily. To address these problems, this paper proposes a UAV path planning method based on a deep reinforcement learning algorithm. Based on the OpenAI-GYM architecture, a 3D map environment model is constructed, with the map grid as the state set and 26 actions as the action set; the method needs no environment model and relies on the agent's own interaction with the environment to complete the path planning task. Based on stochastic process theory, the method models the path planning problem as a Markov Decision Process (MDP), fits the UAV path planning decision function and state-action function, and designs the DQN algorithm model according to the state space, action space, and network structure, enabling the agent to carry out policy iteration efficiently. Simulations verify that the DQN algorithm avoids obstacles and completes the path planning task in only about 160 rounds, which validates the effectiveness of the proposed path planning algorithm.
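
The 26-action set is presumably the full 3D grid neighbourhood, i.e. all 3^3 - 1 unit offsets around the current cell. A short sketch of that action space; the grid shape and the boundary clamping are our own assumptions, not details from the paper:

```python
from itertools import product

# All unit moves to the neighbouring cells of a 3D grid:
# 3^3 offset combinations minus staying put = 26 actions.
ACTIONS = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]
assert len(ACTIONS) == 26

def apply_action(cell, action, shape=(20, 20, 10)):
    """Move within the map grid, clamping at the boundary (an assumption;
    the paper could equally treat out-of-bounds moves as collisions)."""
    return tuple(min(max(c + d, 0), s - 1)
                 for c, d, s in zip(cell, action, shape))

print(apply_action((0, 0, 0), ACTIONS[0]))
```
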
18

Rodrigues, Nelson R. P., Nuno M. C. da Costa, César Melo, et al. "Fusion Object Detection and Action Recognition to Predict Violent Action." Sensors 23, no. 12 (2023): 5610. http://dx.doi.org/10.3390/s23125610.

Full text
Abstract:
In the context of Shared Autonomous Vehicles, the need to monitor the environment inside the car will be crucial. This article focuses on the application of deep learning algorithms to present a fused monitoring solution combining three different systems: a violent action detection system, which recognizes violent behaviors between passengers; a violent object detection system; and a lost items detection system. Public object detection datasets (COCO and TAO) were used to train state-of-the-art algorithms such as YOLOv5. For violent action detection, the MoLa InCar dataset was used to train state-of-the-art algorithms such as I3D, R(2+1)D, SlowFast, TSN, and TSM. Finally, an embedded automotive solution was used to demonstrate that both methods run in real time.
19

Wu, Yuchuan, Shengfeng Qi, Feng Hu, Shuangbao Ma, Wen Mao, and Wei Li. "Recognizing activities of the elderly using wearable sensors: a comparison of ensemble algorithms based on boosting." Sensor Review 39, no. 6 (2019): 743–51. http://dx.doi.org/10.1108/sr-11-2018-0309.

Full text
Abstract:
Purpose: In human action recognition based on wearable sensors, most previous studies have focused on a single type of sensor and a single classifier. This study aims to use a wearable sensor based on flexible sensors and a tri-axial accelerometer to collect action data of elderly people, and uses a statistical modeling approach based on an ensemble algorithm to classify actions and verify its validity. Design/methodology/approach: Nine types of daily actions were collected by the wearable sensor device from a group of elderly volunteers, and the time-domain features of the action sequences were extracted. The dimensionality of the feature vectors was reduced by linear discriminant analysis. An ensemble learning method based on XGBoost was used to build a model of elderly action recognition. Its performance was compared with the action recognition rate of other boosting-based ensemble algorithms, and with the accuracy of single classifier models. Findings: The effectiveness of the method was validated by three experiments. The results show that XGBoost is able to classify nine daily actions of the elderly and achieve an average recognition rate of 94.8 per cent, which is superior to single classifiers and to other ensemble algorithms. Practical implications: The research could have important implications for health care, including the treatment and rehabilitation of the elderly, and the prevention of falls. Originality/value: Instead of using a single type of sensor, this research used a wearable sensor to obtain daily action data of the elderly. The results show that, by using the appropriate method, the device can obtain detailed data of joint action at a low cost. Comparing differences in performance, it was concluded that XGBoost is the most suitable algorithm for building a model of elderly action recognition. This method, together with a wearable sensor, can provide key data and accurate feedback information to monitor the elderly in their rehabilitation activities.
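
A compact sketch of the classification pipeline described above (features reduced by linear discriminant analysis, then an XGBoost ensemble), with synthetic data standing in for the wearable-sensor features; the hyperparameters are illustrative, not the paper's:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Synthetic stand-in for time-domain features of 9 daily actions
# (the real features come from the flex sensors and accelerometer).
rng = np.random.default_rng(0)
X = rng.normal(size=(900, 40)) + np.repeat(np.arange(9), 100)[:, None] * 0.3
y = np.repeat(np.arange(9), 100)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# LDA reduces the feature vectors to at most (n_classes - 1) = 8 dimensions.
lda = LinearDiscriminantAnalysis(n_components=8).fit(X_tr, y_tr)

clf = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
clf.fit(lda.transform(X_tr), y_tr)
print("accuracy:", clf.score(lda.transform(X_te), y_te))
```
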
20

Amir, E., and A. Chang. "Learning Partially Observable Deterministic Action Models." Journal of Artificial Intelligence Research 33 (November 20, 2008): 349–402. http://dx.doi.org/10.1613/jair.2575.

Full text
Abstract:
We present exact algorithms for identifying deterministic actions' effects and preconditions in dynamic partially observable domains. They apply when one does not know the action model (the way actions affect the world) of a domain and must learn it from partial observations over time. Such scenarios are common in real-world applications. They are challenging for AI tasks because traditional domain structures that underlie tractability (e.g., conditional independence) fail there (e.g., world features become correlated). Our work departs from traditional assumptions about partial observations and action models. In particular, it focuses on problems in which actions are deterministic and of simple logical structure, and observation models have all features observed with some frequency. We yield tractable algorithms for the modified problem for such domains. Our algorithms take sequences of partial observations over time as input, and output deterministic action models that could have led to those observations. The algorithms output all or one of those models (depending on our choice), and are exact in that no model is misclassified given the observations. Our algorithms take polynomial time in the number of time steps and state features for some traditional action classes examined in the AI-planning literature, e.g., STRIPS actions. In contrast, traditional approaches for HMMs and Reinforcement Learning are inexact and exponentially intractable for such domains. Our experiments verify the theoretical tractability guarantees, and show that we identify action models exactly. Several applications in planning, autonomous exploration, and adventure-game playing already use these results. They are also promising for probabilistic settings, partially observable reinforcement learning, and diagnosis.
21

Mudge, Michael E., and J. P. Killingbeck. "Microcomputer Algorithms: Action for Algebra." Mathematical Gazette 76, no. 476 (1992): 305. http://dx.doi.org/10.2307/3619164.

Full text
22

Huang, Pan, Yanping Li, Xiaoyi Lv, Wen Chen, and Shuxian Liu. "Recognition of Common Non-Normal Walking Actions Based on Relief-F Feature Selection and Relief-Bagging-SVM." Sensors 20, no. 5 (2020): 1447. http://dx.doi.org/10.3390/s20051447.

Full text
Abstract:
Action recognition algorithms are widely used in the fields of medical health and pedestrian dead reckoning (PDR). The classification and recognition of non-normal walking actions and normal walking actions are very important for improving the accuracy of medical health indicators and PDR steps. Existing motion recognition algorithms focus on the recognition of normal walking actions, and the recognition of non-normal walking actions common to daily life is incomplete or inaccurate, resulting in a low overall recognition accuracy. This paper proposes a microelectromechanical system (MEMS) action recognition method based on Relief-F feature selection and relief-bagging-support vector machine (SVM). Feature selection using the Relief-F algorithm reduces the dimensions by 16 and reduces the optimization time by an average of 9.55 s. Experiments show that the improved algorithm for identifying non-normal walking actions has an accuracy of 96.63%; compared with Decision Tree (DT), it increased by 11.63%; compared with k-nearest neighbor (KNN), it increased by 26.62%; and compared with random forest (RF), it increased by 11.63%. The average Area Under Curve (AUC) of the improved algorithm improved by 0.1143 compared to KNN, by 0.0235 compared to DT, and by 0.04 compared to RF.
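
For orientation, a simplified two-class Relief-style weighting sketch; the paper uses the multi-class Relief-F variant and then feeds the selected features to a bagging ensemble of SVMs (e.g. scikit-learn's BaggingClassifier over SVC). Everything below is our own stand-in, not the authors' code:

```python
import numpy as np

def relief_binary(X, y, n_samples=200, rng=None):
    """Two-class Relief-style feature weighting (a simplified sketch of
    Relief-F): a feature scores higher when it agrees with the nearest
    same-class neighbour (hit) and differs from the nearest other-class
    neighbour (miss)."""
    rng = rng or np.random.default_rng(0)
    X = (X - X.min(axis=0)) / (np.ptp(X, axis=0) + 1e-12)  # scale to [0, 1]
    w = np.zeros(X.shape[1])
    for i in rng.integers(0, len(X), n_samples):
        dists = np.abs(X - X[i]).sum(axis=1)
        dists[i] = np.inf                       # exclude the instance itself
        same, diff = y == y[i], y != y[i]
        hit = np.where(same)[0][np.argmin(dists[same])]
        miss = np.where(diff)[0][np.argmin(dists[diff])]
        w += (np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])) / n_samples
    return w

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 10))
y = (X[:, 0] > 0).astype(int)                   # only feature 0 is informative
print(relief_binary(X, y).round(3))             # weight of feature 0 should dominate
```
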
23

Kim, Beomjoon, Kyungjae Lee, Sungbin Lim, Leslie Kaelbling, and Tomas Lozano-Perez. "Monte Carlo Tree Search in Continuous Spaces Using Voronoi Optimistic Optimization with Regret Bounds." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 06 (2020): 9916–24. http://dx.doi.org/10.1609/aaai.v34i06.6546.

Full text
Abstract:
Many important applications, including robotics, data-center management, and process control, require planning action sequences in domains with continuous state and action spaces and discontinuous objective functions. Monte Carlo tree search (MCTS) is an effective strategy for planning in discrete action spaces. We provide a novel MCTS algorithm (voot) for deterministic environments with continuous action spaces, which, in turn, is based on a novel black-box function-optimization algorithm (voo) to efficiently sample actions. The voo algorithm uses Voronoi partitioning to guide sampling, and is particularly efficient in high-dimensional spaces. The voot algorithm has an instance of voo at each node in the tree. We provide regret bounds for both algorithms and demonstrate their empirical effectiveness in several high-dimensional problems including two difficult robotics planning problems.
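
The voo step can be pictured as: with probability omega sample the action space uniformly, otherwise sample inside the Voronoi cell of the incumbent best point, approximated below by Gaussian proposals plus a nearest-neighbour acceptance test. A toy sketch with an invented quadratic objective; omega and the proposal scale are arbitrary choices, not the paper's settings:

```python
import numpy as np

rng = np.random.default_rng(0)

def voo_sample(evaluated, bounds, omega=0.3, sigma=0.1):
    """One Voronoi-guided draw: explore uniformly with probability omega,
    otherwise propose near the best point so far and accept only draws
    that are closer to it than to any other evaluated point."""
    lo, hi = bounds
    if not evaluated or rng.random() < omega:
        return rng.uniform(lo, hi)
    pts = np.array([p for p, _ in evaluated])
    best = pts[int(np.argmax([v for _, v in evaluated]))]
    while True:
        cand = np.clip(rng.normal(best, sigma * (hi - lo)), lo, hi)
        if np.linalg.norm(cand - best) <= np.linalg.norm(pts - cand, axis=1).min():
            return cand

f = lambda x: -np.sum((x - 0.7) ** 2)          # toy black-box objective
hist = []
for _ in range(50):
    x = voo_sample(hist, (np.zeros(2), np.ones(2)))
    hist.append((x, f(x)))
print(max(hist, key=lambda t: t[1])[0])        # should approach (0.7, 0.7)
```
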
24

Abduljabbar Ali, Mohammed, Abir Jaafar Hussain, and Ahmed T. Sadiq. "Deep Learning Algorithms for Human Fighting Action Recognition." International Journal of Online and Biomedical Engineering (iJOE) 18, no. 02 (2022): 71–87. http://dx.doi.org/10.3991/ijoe.v18i02.28019.

Full text
Abstract:
Human action recognition using skeletons has been employed in various applications, including healthcare robots, human-computer interaction, and surveillance systems. Recently, deep learning systems have been used in various applications, such as object classification. In contrast to conventional techniques, the convolutional neural network, one of the most prominent deep learning algorithms, extracts image features through its operations. Machine learning in computer vision applications faces many challenges, including human action recognition in real time. Despite significant improvements, videos are typically shot at 24 or more frames per second, so even the fastest classification technologies take time. Object detection algorithms must correctly identify and locate essential items, but they must also be speedy at prediction time to meet the real-time requirements of video processing. The fundamental goal of this research paper is to recognize the state of human fighting in real time, to provide security in organizations by discovering and identifying problems through video surveillance. First, the images in the videos are investigated to locate human fight scenes using the YOLOv3 algorithm, which has been updated in this work. Our improvements to the YOLOv3 algorithm allowed us to accelerate the detection of a group of humans in the images. The center locator feature of this algorithm was adopted as an essential indicator for measuring the safety distance between two persons; if it is less than a specific value specified in the code, they are tracked. Then, a deep sorting algorithm is used to track people, and the framework is filtered to process and classify whether these two people continue to exceed the programmatically defined minimum safety distance. Finally, the content of the filtered frame is categorized using OpenPose technology and a trained VGG-16 algorithm, which classifies the situation as walking, hugging, or fighting. A dataset was created to train these algorithms on the three categories of walking, hugging, and fighting. The proposed methodology proved successful, exhibiting classification accuracies for walking, hugging, and fighting of 95.0%, 87.4%, and 90.1%, respectively.
25

Xuan, Zifeng, Yunfei Liu, and Xinxin Peng. "Improved Q-learning Algorithm to Solve the Permutation Flow Shop Scheduling Problem." International Journal of Mechanical and Electrical Engineering 2, no. 3 (2024): 63–68. http://dx.doi.org/10.62051/ijmee.v2n3.07.

Full text
Abstract:
A modified Q-learning algorithm is proposed for the permutation flow shop scheduling problem. This algorithm initializes the environment with the job sequence and considers each processable job as an executable action. A reward function is defined as the reciprocal of the completion time. Moreover, the completion time is calculated using the principle of diagonalization of a two-dimensional matrix, significantly enhancing computational efficiency. A Boltzmann action exploration strategy is designed in which, as the temperature coefficient T decreases, the probability of randomly selecting an action decreases, favoring the selection of actions corresponding to larger Q values. Finally, the performance of the proposed algorithm is validated on instances of permutation flow shop scheduling problems of different scales. Comparison with standard instances and other algorithms demonstrates the accuracy of the algorithm.
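
The Boltzmann rule is easy to state precisely: actions are drawn with probability proportional to exp(Q/T), so a cooling temperature shifts selection from near-uniform exploration toward greedy exploitation. A small sketch (the Q-values and temperatures are illustrative; in the paper an action is a processable job and the reward is the reciprocal of the completion time):

```python
import numpy as np

rng = np.random.default_rng(0)

def boltzmann_action(q_values, T):
    """Boltzmann (softmax) exploration: as the temperature T decreases,
    selection concentrates on actions with larger Q-values."""
    prefs = np.asarray(q_values) / max(T, 1e-8)
    prefs -= prefs.max()                        # numerical stability
    probs = np.exp(prefs) / np.exp(prefs).sum()
    return rng.choice(len(probs), p=probs)

q = [1.0, 1.5, 0.5]
for T in (2.0, 0.5, 0.05):                      # cooling schedule
    picks = [boltzmann_action(q, T) for _ in range(1000)]
    print(T, np.bincount(picks, minlength=3) / 1000)
```
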
26

Chen, Chen, Hongyao Tang, Jianye Hao, Wulong Liu, and Zhaopeng Meng. "Addressing Action Oscillations through Learning Policy Inertia." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 8 (2021): 7020–27. http://dx.doi.org/10.1609/aaai.v35i8.16864.

Full text
Abstract:
Deep reinforcement learning (DRL) algorithms have been demonstrated to be effective on a wide range of challenging decision making and control tasks. However, these methods typically suffer from severe action oscillations, particularly in discrete action settings: agents select different actions within consecutive steps even though the states differ only slightly. This issue is often neglected because we usually evaluate the quality of a policy using cumulative rewards only. Action oscillation strongly affects the user experience and can even cause serious potential security menaces, especially in real-world domains where safety is the main concern, such as autonomous driving. In this paper, we introduce the Policy Inertia Controller (PIC), which serves as a generic plug-in framework for off-the-shelf DRL algorithms, to enable an adaptive balance between optimality and smoothness in a formal way. We propose Nested Policy Iteration as a general training algorithm for a PIC-augmented policy, which ensures monotonically non-decreasing updates. Further, we derive a practical DRL algorithm, namely Nested Soft Actor-Critic. Experiments on a collection of autonomous driving tasks and several Atari games suggest that our approach achieves substantial oscillation reduction compared to a range of commonly adopted baselines, with almost no performance degradation.
27

Langlois, Eric D., and Tom Everitt. "How RL Agents Behave When Their Actions Are Modified." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 13 (2021): 11586–94. http://dx.doi.org/10.1609/aaai.v35i13.17378.

Full text
Abstract:
Reinforcement learning in complex environments may require supervision to prevent the agent from attempting dangerous actions. As a result of supervisor intervention, the executed action may differ from the action specified by the policy. How does this affect learning? We present the Modified-Action Markov Decision Process, an extension of the MDP model that allows actions to differ from the policy. We analyze the asymptotic behaviours of common reinforcement learning algorithms in this setting and show that they adapt in different ways: some completely ignore modifications while others go to various lengths in trying to avoid action modifications that decrease reward. By choosing the right algorithm, developers can prevent their agents from learning to circumvent interruptions or constraints, and better control agent responses to other kinds of action modification, like self-damage.
28

Lee, Joongkyu, Seung Joon Park, Yunhao Tang, and Min-hwan Oh. "Learning Uncertainty-Aware Temporally-Extended Actions." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 12 (2024): 13391–99. http://dx.doi.org/10.1609/aaai.v38i12.29241.

Full text
Abstract:
In reinforcement learning, temporal abstraction in the action space, exemplified by action repetition, is a technique to facilitate policy learning through extended actions. However, a primary limitation in previous studies of action repetition is its potential to degrade performance, particularly when sub-optimal actions are repeated. This issue often negates the advantages of action repetition. To address this, we propose a novel algorithm named Uncertainty-aware Temporal Extension (UTE). UTE employs ensemble methods to accurately measure uncertainty during action extension. This feature allows policies to strategically choose between emphasizing exploration or adopting an uncertainty-averse approach, tailored to their specific needs. We demonstrate the effectiveness of UTE through experiments in Gridworld and Atari 2600 environments. Our findings show that UTE outperforms existing action repetition algorithms, effectively mitigating their inherent limitations and significantly enhancing policy learning efficiency.
29

Rai, Ankush, and Jagadeesh Kannan R. "A REVIEW ON MACHINE LEARNING ALGORITHMS ON HUMAN ACTION RECOGNITION." Asian Journal of Pharmaceutical and Clinical Research 10, no. 13 (2017): 406. http://dx.doi.org/10.22159/ajpcr.2017.v10s1.19977.

Full text
Abstract:
Human action recognition is a vital field of computer vision research. Its applications include surveillance systems, patient monitoring systems, and an assortment of systems involving interactions between persons and electronic devices, such as human-computer interfaces. The vast majority of these applications require automated recognition of abnormal or anomalous action states, composed of various simple (or atomic) actions of persons. This study gives an overview of state-of-the-art research papers on human movement recognition. Open datasets intended for the assessment of recognition procedures are also discussed, allowing the results of several methodologies to be compared on these datasets. We examine both the approaches developed for basic human actions and those for abnormal action states. These methodologies are taxonomically classified by comparing the advantages and constraints of each. Space-time volume approaches and sequential methodologies that represent actions and recognize such action sets directly from images are discussed. Next, hierarchical recognition approaches for abnormal action states are introduced and compared. Statistics-based, syntactic, and description-based methodologies for hierarchical recognition are examined in the paper.
30

Yamauchi, Sho, and Keiji Suzuki. "Algorithm for Base Action Set Generation Focusing on Undiscovered Sensor Values." Applied Sciences 9, no. 1 (2019): 161. http://dx.doi.org/10.3390/app9010161.

Full text
Abstract:
Previous machine learning algorithms use a given base action set designed by hand or enable locomotion for a complicated task through trial and error processes with a sophisticated reward function. These generated actions are designed for a specific task, which makes it difficult to apply them to other tasks. This paper proposes an algorithm to obtain a base action set that does not depend on specific tasks and that is usable universally. The proposed algorithm enables as much interoperability among multiple tasks and machine learning methods as possible. A base action set that effectively changes the external environment was chosen as a candidate. The algorithm obtains this base action set on the basis of the hypothesis that an action to effectively change the external environment can be found by observing events to find undiscovered sensor values. The process of obtaining a base action set was validated through a simulation experiment with a differential wheeled robot.
31

Yuan, Yuyu, Pengqian Zhao, Ting Guo, and Hongpu Jiang. "Counterfactual-Based Action Evaluation Algorithm in Multi-Agent Reinforcement Learning." Applied Sciences 12, no. 7 (2022): 3439. http://dx.doi.org/10.3390/app12073439.

Full text
Abstract:
Multi-agent reinforcement learning (MARL) algorithms have made great achievements in various scenarios, but there are still many problems in solving sequential social dilemmas (SSDs). In SSDs, an agent's actions not only change the instantaneous state of the environment but also affect the latent state, which will, in turn, affect all agents. However, most current reinforcement learning algorithms focus on analyzing the value of the instantaneous environment state while ignoring the latent state, which is very important for establishing cooperation. Therefore, we propose a novel counterfactual reasoning-based multi-agent reinforcement learning algorithm to evaluate the continuous contribution of agent actions to the latent state. We compute it using simulation reasoning and an action evaluation network; through counterfactual reasoning, we can then obtain a single agent's influence on the environment. Using this continuous contribution as an intrinsic reward enables the agent to consider the collective, thereby promoting cooperation. We conduct experiments in the SSDs environment, and the results show that the collective reward is increased by at least 25%, which demonstrates the excellent performance of our proposed algorithm compared to state-of-the-art algorithms.
32

Qiu, Xianxu, Haiming Huang, Weiwei Chen, Qiuzhen Lin, Wei-Neng Chen, and Fuchun Sun. "Evolutionary Reinforcement Learning with Parameterized Action Primitives for Diverse Manipulation Tasks." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 14 (2025): 14655–63. https://doi.org/10.1609/aaai.v39i14.33606.

Full text
Abstract:
Reinforcement learning (RL) has shown promising performance in tackling robotic manipulation tasks (RMTs), which require learning a prolonged sequence of manipulation actions to control robots efficiently. However, most RL algorithms often suffer from two problems when solving RMTs: inefficient exploration due to the extremely large action space and catastrophic forgetting due to the poor sampling efficiency. To alleviate these problems, this paper introduces an Evolutionary Reinforcement Learning algorithm with parameterized Action Primitives, called ERLAP, which combines the advantages of an evolutionary algorithm (EA) and hierarchical RL (HRL) to solve diverse RMTs. A library of heterogeneous action primitives is constructed in HRL to enhance the exploration efficiency of robots and dual populations with new evolutionary operators are run in EA to optimize these primitive sequences, which can diversify the distribution of replay buffer and avoid catastrophic forgetting. The experiments show that ERLAP outperforms four state-of-the-art RL algorithms in simulated RMTs with dense rewards and can effectively avoid catastrophic forgetting in a set of more challenging simulated RMTs with sparse rewards.
33

Li, Lei, and Tingting Yang. "Reconstruction of physical dance teaching content and movement recognition based on a machine learning model." 3C TIC: Cuadernos de desarrollo aplicados a las TIC 12, no. 1 (2023): 267–85. http://dx.doi.org/10.17993/3ctic.2023.121.267-285.

Full text
Abstract:
With the development of movement recognition based on machine learning algorithms, the content and movements of physical dance teaching are also open to change and innovation. In this paper, a three-dimensional convolutional neural network recognition algorithm based on a machine learning model is constructed, covering the collection through to the recognition of sports dance movement data. By collecting the skeleton information of typical physical dance movements, a typical movement dataset of physical dance is constructed; it is recognized by the improved 3D convolutional neural network recognition algorithm under the machine learning model, and the method is validated on a public dataset. The experimental results show that the 3D CNNs in this paper produce relatively satisfactory results for sports dance action recognition, with high action recognition accuracy, which verifies the feasibility of the 3D convolutional neural network action recognition algorithm under the machine learning model for the acquisition and recognition of sports dance actions. This suggests that machine learning models of this kind can open a new direction for physical dance teaching content in the future.
34

Bonet, Blai, and Hector Geffner. "Action Selection for MDPs: Anytime AO* Versus UCT." Proceedings of the AAAI Conference on Artificial Intelligence 26, no. 1 (2021): 1749–55. http://dx.doi.org/10.1609/aaai.v26i1.8369.

Full text
Abstract:
In the presence of non-admissible heuristics, A* and other best-first algorithms can be converted into anytime optimal algorithms over OR graphs, by simply continuing the search after the first solution is found. The same trick, however, does not work for best-first algorithms over AND/OR graphs, which must be able to expand leaf nodes of the explicit graph that are not necessarily part of the best partial solution. Anytime optimal variants of AO* must thus address an exploration-exploitation tradeoff: they cannot just "exploit", they must keep exploring as well. In this work, we develop one such variant of AO* and apply it to finite-horizon MDPs. This Anytime AO* algorithm eventually delivers an optimal policy while using non-admissible random heuristics that can be sampled, as when the heuristic is the cost of a base policy that can be sampled with rollouts. We then test Anytime AO* for action selection over large infinite-horizon MDPs that cannot be solved with existing off-line heuristic search and dynamic programming algorithms, and compare it with UCT.
35

Lee, Jongmin, Wonseok Jeon, Geon-Hyeong Kim, and Kee-Eung Kim. "Monte-Carlo Tree Search in Continuous Action Spaces with Value Gradients." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (2020): 4561–68. http://dx.doi.org/10.1609/aaai.v34i04.5885.

Full text
Abstract:
Monte-Carlo Tree Search (MCTS) is the state-of-the-art online planning algorithm for large problems with discrete action spaces. However, many real-world problems involve continuous action spaces, where MCTS is not as effective as in discrete action spaces. This is mainly due to common practices such as coarse discretization of the entire action space and failure to exploit local smoothness. In this paper, we introduce Value-Gradient UCT (VG-UCT), which combines traditional MCTS with gradient-based optimization of action particles. VG-UCT simultaneously performs a global search via UCT with respect to the finitely sampled set of actions and performs a local improvement via action value gradients. In the experiments, we demonstrate that our approach outperforms existing MCTS methods and other strong baseline algorithms for continuous action spaces.
36

Avoundjian, Tigran, Julia C. Dombrowski, Matthew R. Golden, et al. "Comparing Methods for Record Linkage for Public Health Action: Matching Algorithm Validation Study." JMIR Public Health and Surveillance 6, no. 2 (2020): e15917. http://dx.doi.org/10.2196/15917.

Full text
Abstract:
Background: Many public health departments use record linkage between surveillance data and external data sources to inform public health interventions. However, little guidance is available to inform these activities, and many health departments rely on deterministic algorithms that may miss many true matches. In the context of public health action, these missed matches lead to missed opportunities to deliver interventions and may exacerbate existing health inequities. Objective: This study aimed to compare the performance of record linkage algorithms commonly used in public health practice. Methods: We compared five deterministic (exact, Stenger, Ocampo 1, Ocampo 2, and Bosh) and two probabilistic record linkage algorithms (fastLink and beta record linkage [BRL]) using simulations and a real-world scenario. We simulated pairs of datasets with varying numbers of errors per record and the number of matching records between the two datasets (i.e., overlap). We matched the datasets using each algorithm and calculated their recall (i.e., sensitivity, the proportion of true matches identified by the algorithm) and precision (i.e., positive predictive value, the proportion of matches identified by the algorithm that were true matches). We estimated the average computation time by performing a match with each algorithm 20 times while varying the size of the datasets being matched. In a real-world scenario, HIV and sexually transmitted disease surveillance data from King County, Washington, were matched to identify people living with HIV who had a syphilis diagnosis in 2017. We calculated the recall and precision of each algorithm compared with a composite standard based on the agreement in matching decisions across all the algorithms and manual review. Results: In simulations, BRL and fastLink maintained a high recall at nearly all data quality levels, while being comparable with deterministic algorithms in terms of precision. Deterministic algorithms typically failed to identify matches in scenarios with low data quality. All the deterministic algorithms had a shorter average computation time than the probabilistic algorithms. BRL had the slowest overall computation time (14 min when both datasets contained 2000 records). In the real-world scenario, BRL had the lowest trade-off between recall (309/309, 100.0%) and precision (309/312, 99.0%). Conclusions: Probabilistic record linkage algorithms maximize the number of true matches identified, reducing gaps in the coverage of interventions and maximizing the reach of public health action.
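
The recall and precision the study reports reduce to simple set arithmetic once matches are expressed as record-ID pairs; a minimal sketch with invented links:

```python
def recall_precision(predicted: set, truth: set):
    """Recall and precision of a linkage run, with matches represented as
    (left_id, right_id) pairs and `truth` as the gold-standard links."""
    tp = len(predicted & truth)
    recall = tp / len(truth) if truth else 1.0
    precision = tp / len(predicted) if predicted else 1.0
    return recall, precision

truth = {(1, "a"), (2, "b"), (3, "c")}
exact = {(1, "a")}                       # a strict deterministic matcher
probabilistic = {(1, "a"), (2, "b"), (4, "d")}

print(recall_precision(exact, truth))          # high precision, low recall
print(recall_precision(probabilistic, truth))  # higher recall, one false match
```
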
37

Wadhai, Prajwal Ashok. "Algolizer Using ReactJS." International Journal of Scientific Research in Engineering and Management 08, no. 04 (2024): 1–5. http://dx.doi.org/10.55041/ijsrem30733.

Abstract:
The Algorithm Visualizer Project is an interactive and educational tool designed to illustrate various algorithms' functionality and efficiency through visual representations. Algorithms are fundamental to computer science, but their abstract nature can be challenging to comprehend. This project aims to bridge that gap by providing a user-friendly interface that visually demonstrates algorithms in action. The visualizer offers a platform where users can select from a range of algorithms, such as sorting (e.g., Bubble Sort, Merge Sort). Each algorithm is showcased step-by-step, allowing users to observe how data structures evolve and how the algorithms operate on them. Through dynamic visualizations, users can track the algorithm's progress, see how data is manipulated, and understand the underlying logic behind each step. Additionally, the tool provides options for adjusting parameters, such as input size or speed, enabling users to experiment with different scenarios and grasp the impact on algorithm performance. This project not only serves as a learning resource for students studying computer science and programming but also appeals to enthusiasts seeking a deeper understanding of algorithms. By offering an intuitive and engaging visual representation, the Algorithm Visualizer Project aims to make complex algorithms accessible and comprehensible to a wider audience.
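As an illustration of what such a visualizer consumes, a sorting algorithm can be written to emit a snapshot after every comparison. The Python sketch below (illustrative only; the project itself is built in ReactJS) generates that event stream for Bubble Sort:

```python
def bubble_sort_frames(values):
    """Yield (array_snapshot, compared_indices) after every comparison —
    the event stream a step-by-step visualizer would render."""
    a = list(values)
    n = len(a)
    for i in range(n - 1):
        for j in range(n - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
            yield list(a), (j, j + 1)

for frame, compared in bubble_sort_frames([5, 1, 4, 2]):
    print(frame, "compared", compared)
```

A front end then replays these frames at a user-chosen speed, which is how parameters like input size and animation speed affect what the learner sees.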
38

Gillies, E. A., A. G. Y. Johnston, and C. R. McInnes. "Action Selection Algorithms for Autonomous Microspacecraft." Journal of Guidance, Control, and Dynamics 22, no. 6 (1999): 914–16. http://dx.doi.org/10.2514/2.4473.

39

Wiese, U. J. "Cluster Algorithm Solution of Sign and Complex Action Problems." International Journal of Modern Physics B 17, no. 28 (2003): 5435–47. http://dx.doi.org/10.1142/s0217979203020545.

Abstract:
Numerical simulations of numerous quantum systems suffer from notorious sign or complex action problems. In such cases, the Boltzmann factors contributing to the path integral are in general not positive. As a consequence, standard Monte Carlo algorithms based on importance sampling fail. Meron-cluster algorithms realize a general strategy for solving sign problems by canceling explicitly all negative contributions. The remaining uncancelled positive contributions are then generated using importance sampling. The general nature of the sign problem is discussed and its solution with a meron-cluster algorithm is illustrated for staggered lattice fermions that undergo a chiral phase transition. A similar cluster algorithm is used to solve the complex action problem that arises in the Potts model approximation to dense Quantum Chromodynamics (QCD) with heavy quarks.
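The naive baseline that meron-cluster methods improve upon is easy to state in code: sample with respect to |W| and fold the sign into the observable. The toy sketch below (hypothetical weights and observable) shows this reweighting estimator; when the average sign is small, its variance explodes, which is exactly the problem that explicit cancellation of negative contributions avoids:

```python
import random

# Toy configuration space: (observable O(x), weight W(x)), weights not all positive.
configs = [(-1.0, 0.3), (0.0, -0.1), (1.0, 0.5), (2.0, -0.2)]

def reweighted_estimate(samples=100_000):
    """Estimate <O> = sum(O*W)/sum(W) by sampling from |W| and reweighting by sign."""
    z = sum(abs(w) for _, w in configs)
    probs = [abs(w) / z for _, w in configs]
    num = den = 0.0
    for _ in range(samples):
        o, w = random.choices(configs, weights=probs)[0]
        s = 1.0 if w > 0 else -1.0
        num += o * s
        den += s
    return num / den

exact = sum(o * w for o, w in configs) / sum(w for _, w in configs)
print(exact, reweighted_estimate())  # both near -0.4 on this toy example
```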
40

Sharp, Graham R. "Algorithmic Recognition of Group Actions on Orbitals." LMS Journal of Computation and Mathematics 2 (1999): 1–27. http://dx.doi.org/10.1112/s146115700000005x.

Abstract:
An algorithm is given that recognises (in O(lN² log N) time, where N is the size of the input and l the depth of a precalculated Schreier tree) when a transitive group (G, Ω) is the action on one orbit of the action of G on the set Γ(2) of ordered pairs of distinct elements of some G-set Γ (that is, Ω is isomorphic to an orbital of (G, Γ)). This may be adapted to list all essentially different such actions in O(lN⁴ log N) time, where N is the sum of the sizes of the input and output. This will be a useful tool for reducing the degree of a permutation group as an aid to further study of the group. The algorithm is then extended to one that recognises (in O(lN³ log N) time) when a transitive group is the action on one orbit of the action of G on the set Γ{2} of unordered pairs of distinct elements of some G-set Γ. An algorithm for finding all essentially different such actions is also provided, running in O(lN⁴ log N) time. (Again, N is the sum of the input and output sizes.) It is also indicated how these results may be applied to the more general problem of recognising when an intransitive group (G, Ω) is isomorphic to (G, Γ{2}) for some G-set Γ. All the algorithms are practical; most have been implemented in GAP, and the code is made available with this paper. In some cases the algorithms are considerably more practical than their asymptotic analyses would suggest.
41

Liu, Jinsong, Chenghan Xie, Qi Deng, Dongdong Ge, and Yinyu Ye. "Sketched Newton Value Iteration for Large-Scale Markov Decision Processes." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 12 (2024): 13936–44. http://dx.doi.org/10.1609/aaai.v38i12.29301.

Abstract:
Value Iteration (VI) is one of the most classic algorithms for solving Markov Decision Processes (MDPs), which lays the foundations for various more advanced reinforcement learning algorithms, such as Q-learning. VI may take a large number of iterations to converge as it is a first-order method. In this paper, we introduce the Newton Value Iteration (NVI) algorithm, which eliminates the impact of action space dimension compared to some previous second-order methods. Consequently, NVI can efficiently handle MDPs with large action spaces. Building upon NVI, we propose a novel approach called Sketched Newton Value Iteration (SNVI) to tackle MDPs with both large state and action spaces. SNVI not only inherits the stability and fast convergence advantages of second-order algorithms, but also significantly reduces computational complexity, making it highly scalable. Extensive experiments demonstrate the superiority of our algorithms over traditional VI and previously proposed second-order VI algorithms.
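For reference, the first-order baseline that the paper accelerates — classic Value Iteration with the Bellman optimality update V ← maxₐ [R(a) + γ P(a) V] — is a few lines of NumPy. The toy two-state MDP below is illustrative, not from the paper:

```python
import numpy as np

def value_iteration(P, R, gamma=0.95, tol=1e-8):
    """Classic first-order VI. P: (A, S, S) transition tensor, R: (A, S) rewards.
    Returns the optimal value function and a greedy policy."""
    V = np.zeros(P.shape[1])
    while True:
        Q = R + gamma * np.einsum("ast,t->as", P, V)  # (A, S) action values
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=0)
        V = V_new

# Toy MDP: two states, two actions.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.1, 0.9], [0.8, 0.2]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
V, policy = value_iteration(P, R)
print(V, policy)
```

The max over actions in each sweep is where large action spaces hurt; eliminating that dependence is the point of NVI.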
42

Wang, Yu, Xiaoqing Chen, Jiaoqun Li, and Zengxiang Lu. "Convolutional Block Attention Module–Multimodal Feature-Fusion Action Recognition: Enabling Miner Unsafe Action Recognition." Sensors 24, no. 14 (2024): 4557. http://dx.doi.org/10.3390/s24144557.

Abstract:
The unsafe action of miners is one of the main causes of mine accidents. Research on underground miner unsafe action recognition based on computer vision enables relatively accurate real-time recognition of unsafe action among underground miners. A dataset called unsafe actions of underground miners (UAUM) was constructed and included ten categories of such actions. Underground images were enhanced using spatial- and frequency-domain enhancement algorithms. A combination of the YOLOX object detection algorithm and the Lite-HRNet human key-point detection algorithm was utilized to obtain skeleton modal data. The CBAM-PoseC3D model, a skeleton modal action-recognition model incorporating the CBAM attention module, was proposed and combined with the RGB modal feature-extraction model CBAM-SlowOnly. Ultimately, this formed the Convolutional Block Attention Module–Multimodal Feature-Fusion Action Recognition (CBAM-MFFAR) model for recognizing unsafe actions of underground miners. The improved CBAM-MFFAR model achieved a recognition accuracy of 95.8% on the NTU60 RGB+D public dataset under the X-Sub benchmark. Compared to the CBAM-PoseC3D, PoseC3D, 2S-AGCN, and ST-GCN models, the recognition accuracy was improved by 2%, 2.7%, 7.3%, and 14.3%, respectively. On the UAUM dataset, the CBAM-MFFAR model achieved a recognition accuracy of 94.6%, with improvements of 2.6%, 4%, 12%, and 17.3% compared to the CBAM-PoseC3D, PoseC3D, 2S-AGCN, and ST-GCN models, respectively. In field validation at mining sites, the CBAM-MFFAR model accurately recognized similar and multiple unsafe actions among underground miners.
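The abstract does not spell out the fusion rule, but score-level fusion of two modality branches is commonly a weighted sum of per-class probabilities. A minimal sketch with assumed weights (not the paper's exact scheme):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def fuse_modalities(skeleton_logits, rgb_logits, w_skeleton=0.6, w_rgb=0.4):
    """Weighted late fusion of per-class scores from two modality branches."""
    return w_skeleton * softmax(skeleton_logits) + w_rgb * softmax(rgb_logits)

# Toy 3-class example: the branches disagree; fusion picks the consensus class.
skeleton = np.array([2.0, 0.5, 0.1])
rgb = np.array([1.5, 1.8, 0.2])
scores = fuse_modalities(skeleton, rgb)
print(scores.argmax(), scores.round(3))
```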
43

Merlis, Nadav, and Shie Mannor. "Lenient Regret for Multi-Armed Bandits." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 10 (2021): 8950–57. http://dx.doi.org/10.1609/aaai.v35i10.17082.

Abstract:
We consider the Multi-Armed Bandit (MAB) problem, where an agent sequentially chooses actions and observes rewards for the actions it took. While the majority of algorithms try to minimize the regret, i.e., the cumulative difference between the reward of the best action and the agent's action, this criterion might lead to undesirable results. For example, in large problems, or when the interaction with the environment is brief, finding an optimal arm is infeasible, and regret-minimizing algorithms tend to over-explore. To overcome this issue, algorithms for such settings should instead focus on playing near-optimal arms. To this end, we suggest a new, more lenient, regret criterion that ignores suboptimality gaps smaller than some ε. We then present a variant of the Thompson Sampling (TS) algorithm, called ε-TS, and prove its asymptotic optimality in terms of the lenient regret. Importantly, we show that when the mean of the optimal arm is high enough, the lenient regret of ε-TS is bounded by a constant. Finally, we show that ε-TS can be applied to improve the performance when the agent knows a lower bound of the suboptimality gaps.
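A quick way to see the criterion in action: run plain Beta–Bernoulli Thompson Sampling (not the paper's ε-TS variant) and charge regret only for arms whose suboptimality gap exceeds ε. The arm means and horizon below are assumptions for illustration:

```python
import random

def thompson_lenient(true_means, horizon=5000, eps=0.05):
    """Beta-Bernoulli Thompson Sampling; lenient regret ignores gaps below eps."""
    k = len(true_means)
    wins, losses = [1] * k, [1] * k   # Beta(1, 1) priors
    best = max(true_means)
    lenient_regret = 0.0
    for _ in range(horizon):
        samples = [random.betavariate(wins[i], losses[i]) for i in range(k)]
        a = samples.index(max(samples))
        reward = 1 if random.random() < true_means[a] else 0
        wins[a] += reward
        losses[a] += 1 - reward
        gap = best - true_means[a]
        if gap > eps:                  # near-optimal arms incur no lenient regret
            lenient_regret += gap
    return lenient_regret

print(thompson_lenient([0.50, 0.48, 0.20]))  # the 0.02 gap to arm 1 is forgiven
```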
44

Bequette, B. Wayne. "Glucose Clamp Algorithms and Insulin Time-Action Profiles." Journal of Diabetes Science and Technology 3, no. 5 (2009): 1005–13. http://dx.doi.org/10.1177/193229680900300503.

Abstract:
Motivation: Most current insulin pumps include an insulin-on-board (IOB) feature to help subjects avoid problems associated with “insulin stacking.” In addition, many control algorithms proposed for a closed-loop artificial pancreas make use of IOB to reduce the probability of hypoglycemic events that often occur due to the integral action of the controller. The IOB curves are generated from the pharmacodynamic (time-activity profiles) actions of subcutaneous insulin, which are obtained from glycemic clamp studies. Methods: Glycemic clamp algorithms are reviewed and in silico studies are performed to analyze the effect of glucose meter bias and noise on glycemic control and the manipulated glucose infusion rates. The glucose infusion rates are used to obtain insulin time-activity profiles, which are then used to generate IOB curves. Results: A model-based, three-step-ahead controller is shown to be equivalent to a proportional-integral control algorithm with time-delay compensation. A systematic glucose meter bias of +6 mg/dl results in a decrease in the glucose area under the curve of 3% but no change in the IOB profiles. Conclusions: Based on these preliminary simulation studies, a substantial amount of glucose meter bias and noise during a glycemic clamp can be tolerated with little net effect on the IOB curves. It is suggested that handheld glucose meters can therefore be used in clamp studies if the measurements are filtered (averaged) before processing by the control algorithm. Clinical studies are needed to confirm these preliminary results.
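How an IOB curve follows from a time–activity profile can be shown with a toy single-exponential profile: if the activity density is a(t) = e^(−t/τ)/τ, the fraction of a bolus still unacted at time t is e^(−t/τ). Real profiles come from clamp studies, and τ below is an assumed constant, not a clinical value:

```python
import math

def iob_curve(duration_min=300, step=5, tau=55.0):
    """Fraction of one bolus still active over time, under the toy
    single-exponential time-activity profile a(t) = exp(-t/tau)/tau."""
    return [(t, math.exp(-t / tau)) for t in range(0, duration_min + 1, step)]

def insulin_on_board(boluses, now, tau=55.0):
    """Stack IOB across past boluses, given as (time_given_min, units) —
    the quantity pumps use to warn against insulin stacking."""
    return sum(u * math.exp(-(now - t0) / tau) for t0, u in boluses if now >= t0)

print(iob_curve()[:3])   # (minutes since bolus, fraction still active)
print(round(insulin_on_board([(0, 4.0), (120, 2.0)], now=180), 2))
```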
45

Ke, Fengyi, and Qian Zhang. "Research on aerobics action modal recognition algorithm based on fuzzy system and reinforcement learning." Molecular & Cellular Biomechanics 21, no. 3 (2024): 645. http://dx.doi.org/10.62617/mcb645.

Abstract:
Human movement recognition technology has received a great deal of attention and is used in a variety of fields, such as intelligent security and motion analysis. Traditional action recognition methods rely on manually extracted features; their recognition efficiency is low, their accuracy is limited, and they can no longer meet the requirements of action recognition. Action recognition methods based on reinforcement learning can extract features automatically, greatly simplifying the manual feature extraction of traditional methods, but they also have drawbacks, such as susceptibility to interference from the external environment and complicated network training. In view of this, this paper takes aerobics action recognition as an example and proposes an action recognition algorithm based on a fuzzy least squares support vector machine, adopting a Fuzzy LS-SVM classifier to classify actions on the feature set. The results show that the proposed aerobics action recognition algorithm performs better than traditional recognition algorithms.
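The appeal of LS-SVM is that training reduces to one linear system rather than a quadratic program. The sketch below adds per-sample fuzzy memberships on the regularization diagonal, a common fuzzy-LS-SVM formulation that may differ in detail from the paper's; all data and hyperparameters are illustrative:

```python
import numpy as np

def rbf(X, Y, sigma=1.0):
    d = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d / (2 * sigma ** 2))

def fuzzy_lssvm_fit(X, y, membership, gamma=10.0, sigma=1.0):
    """Solve the LS-SVM dual [[0, 1^T], [1, K + diag(1/(gamma*m))]] [b; a] = [0; y].
    Memberships m_i in (0, 1] down-weight unreliable samples."""
    n = len(y)
    K = rbf(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.diag(1.0 / (gamma * membership))
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    b, alpha = sol[0], sol[1:]
    return lambda Xq: np.sign(rbf(Xq, X, sigma) @ alpha + b)

X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
clf = fuzzy_lssvm_fit(X, y, membership=np.array([1.0, 0.5, 1.0, 1.0]))
print(clf(X))  # should recover the training labels on this toy set
```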
46

Yang, Hangqi. "Analysis and study on path planning algorithms in the further mobile action." Journal of Physics: Conference Series 2824, no. 1 (2024): 012006. http://dx.doi.org/10.1088/1742-6596/2824/1/012006.

Abstract:
This study investigates the use of four common path planning algorithms - RRT (Rapidly-Exploring Random Tree), Dijkstra, A*, and ACO (Ant Colony Optimization) - for autonomous endpoint search in pre-defined grid mazes. Path planning is a crucial problem in robot navigation, involving the ability to find optimal paths with minimum running time in complex environments. The study first introduces the basic principles and mechanisms of each algorithm. RRT is a random-sampling algorithm that explores the search space by randomly growing a tree-like structure. Dijkstra's algorithm, based on graph search, determines the optimal solution by computing the shortest routes from the starting point to all other locations. The A* algorithm combines heuristic functions with the ideas of Dijkstra's algorithm to reduce the search space and improve search efficiency. The ACO algorithm simulates the behavior of ants searching for food, updating pheromone information and employing search strategies to find the optimal path. A series of experiments then evaluates the performance of the four algorithms in pre-defined mazes, recording each algorithm's search time, path length, and accuracy of the optimal path. The analysis and comparison of the results are given in the last two sections of the article; under the conditions set out there, the A* algorithm performs best.
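Of the four algorithms, A* is the most compact to state. A minimal grid-maze implementation with the admissible Manhattan heuristic (illustrative, not the paper's code; the maze is made up):

```python
import heapq, itertools

def astar(grid, start, goal):
    """A* on a 4-connected grid (0 = free, 1 = wall); returns a shortest path."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    tie = itertools.count()          # tiebreaker so heap never compares nodes
    heap = [(h(start), next(tie), 0, start, None)]
    parents, g_best = {}, {start: 0}
    while heap:
        _, _, g, node, parent = heapq.heappop(heap)
        if node in parents:          # already expanded via a cheaper path
            continue
        parents[node] = parent
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        r, c = node
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nxt[0] < rows and 0 <= nxt[1] < cols and grid[nxt[0]][nxt[1]] == 0:
                ng = g + 1
                if ng < g_best.get(nxt, float("inf")):
                    g_best[nxt] = ng
                    heapq.heappush(heap, (ng + h(nxt), next(tie), ng, nxt, node))
    return None  # goal unreachable

maze = [[0, 0, 0, 0],
        [1, 1, 0, 1],
        [0, 0, 0, 0],
        [0, 1, 1, 0]]
print(astar(maze, (0, 0), (3, 3)))
```

Setting h to zero in the same code yields Dijkstra's algorithm, which is the sense in which A* "combines heuristic functions with the ideas of Dijkstra's algorithm".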
47

Jiménez, Sergio, Anders Jonsson, and Héctor Palacios. "Temporal Planning With Required Concurrency Using Classical Planning." Proceedings of the International Conference on Automated Planning and Scheduling 25 (April 8, 2015): 129–37. http://dx.doi.org/10.1609/icaps.v25i1.13731.

Abstract:
In this paper we describe two novel algorithms for temporal planning. The first algorithm, TP, is an adaptation of the TEMPO algorithm. It compiles each temporal action into two classical actions, corresponding to the start and end of the temporal action, but handles the temporal constraints on actions through a modification of the Fast Downward planning system. The second algorithm, TPSHE, is a pure compilation from temporal to classical planning for the case in which required concurrency only appears in the form of single hard envelopes. We describe novel classes of temporal planning instances for which TPSHE is provably sound and complete. Compiling a temporal instance into a classical one gives a lot of freedom in terms of the planner or heuristic used to solve the instance. In experiments TPSHE significantly outperforms all planners from the temporal track of the International Planning Competition.
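The start/end compilation that TP performs can be sketched as follows. The STRIPS-style data structures and the running-token linking trick are simplifications of the real encoding, which also enforces durations and the planner-side temporal constraints; all action names and facts are made up:

```python
from dataclasses import dataclass, field

@dataclass
class Action:
    name: str
    pre: set = field(default_factory=set)
    add: set = field(default_factory=set)
    delete: set = field(default_factory=set)

def split_durative(name, at_start, at_end, over_all):
    """Compile one temporal action into start/end classical snap actions."""
    token = f"running-{name}"        # links the two halves of the action
    start = Action(f"{name}-start",
                   pre=set(at_start["pre"]) | set(over_all),
                   add=set(at_start["add"]) | {token},
                   delete=set(at_start["del"]))
    end = Action(f"{name}-end",
                 pre=set(at_end["pre"]) | set(over_all) | {token},
                 add=set(at_end["add"]),
                 delete=set(at_end["del"]) | {token})
    return start, end

s, e = split_durative(
    "move-truck",
    at_start={"pre": {"at-depot"}, "add": set(), "del": {"at-depot"}},
    at_end={"pre": set(), "add": {"at-market"}, "del": set()},
    over_all={"fuel-available"},
)
print(s.name, "->", e.name, "| linked:", "running-move-truck" in e.pre)
```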
48

Niazi, Abdolkarim, Norizah Redzuan, Raja Ishak Raja Hamzah, and Sara Esfandiari. "Improvement on Supporting Machine Learning Algorithm for Solving Problem in Immediate Decision Making." Advanced Materials Research 566 (September 2012): 572–79. http://dx.doi.org/10.4028/www.scientific.net/amr.566.572.

Abstract:
In this paper, a new algorithm based on case-based reasoning and reinforcement learning (RL) is proposed to increase the convergence rate of RL algorithms. RL algorithms are very useful for solving a wide variety of decision problems when models are unavailable and decisions must be made correctly in every state of the system, as in multi-agent systems, artificial control systems, robotics, and tool condition monitoring. In the proposed method, we investigate how to improve action selection in RL algorithms: a combined model using case-based reasoning and a newly optimized selection function is proposed to choose actions, which increases the convergence rate of Q-learning-based algorithms. The algorithm was applied to cooperative Markov games, one of the models of Markov-based multi-agent systems. Experimental results indicate that the proposed algorithm outperforms existing algorithms in terms of the speed and accuracy of reaching the optimal policy.
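The tabular Q-learning baseline whose action-selection step the paper augments looks like this; the chain MDP and hyperparameters are toy assumptions, and the ε-greedy branch is where a case-based selector would plug in:

```python
import random

def q_learning(env_step, n_states, n_actions, episodes=500,
               alpha=0.1, gamma=0.9, epsilon=0.1):
    """Tabular Q-learning with epsilon-greedy action selection."""
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            if random.random() < epsilon:          # explore
                a = random.randrange(n_actions)
            else:                                   # exploit current estimates
                a = max(range(n_actions), key=lambda x: Q[s][x])
            s2, r, done = env_step(s, a)
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

# Tiny chain MDP: action 1 moves right; reaching state 3 pays 1 and ends.
def chain(s, a):
    s2 = min(s + 1, 3) if a == 1 else max(s - 1, 0)
    return s2, (1.0 if s2 == 3 else 0.0), s2 == 3

Q = q_learning(chain, n_states=4, n_actions=2)
print([max(range(2), key=lambda a: Q[s][a]) for s in range(3)])  # greedy: go right
```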
49

Mehvish, and Ravinder Pal Singh. "Random Forest and Extreme Learning Machine Algorithms for High Accuracy Credit Card Fraud Detection." International Journal for Research in Applied Science and Engineering Technology 11, no. 9 (2023): 892–95. http://dx.doi.org/10.22214/ijraset.2023.55752.

Abstract:
The banking sector faces a huge issue with credit card fraud, and research has shown that machine learning algorithms are a useful tool for identifying fraudulent actions of this kind. In this investigation, we offer a method for detecting fraudulent use of credit cards that makes use of two machine learning algorithms, Random Forest (RF) and Extreme Learning Machine (ELM). We compiled a dataset from a wide variety of sources and preprocessed it to eliminate inconsistencies and errors. The RF and ELM algorithms were then implemented and trained on the dataset to forecast the occurrence of fraudulent activity, and their performance was evaluated with measures such as accuracy. According to the findings of our research, the ELM algorithm is more accurate than the RF algorithm when it comes to detecting fraudulent credit card activity.
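ELM's appeal is that only the output layer is trained, by a pseudo-inverse over random hidden features, with no backpropagation. A minimal sketch on synthetic two-class data (not the paper's dataset, preprocessing, or tuning):

```python
import numpy as np

def elm_train(X, y, hidden=50, seed=0):
    """Extreme Learning Machine: random hidden layer, least-squares output."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], hidden))
    b = rng.normal(size=hidden)
    H = np.tanh(X @ W + b)            # fixed random features
    beta = np.linalg.pinv(H) @ y      # only the output weights are learned
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy fraud-like data: two Gaussian blobs labelled +1 (fraud) / -1 (legitimate).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 4)), rng.normal(3, 1, (50, 4))])
y = np.concatenate([-np.ones(50), np.ones(50)])
W, b, beta = elm_train(X, y)
acc = np.mean(np.sign(elm_predict(X, W, b, beta)) == y)
print(f"training accuracy: {acc:.2f}")
```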
50

Mukherjee, Shohin, and Maxim Likhachev. "GePA*SE: Generalized Edge-Based Parallel A* for Slow Evaluations." Proceedings of the International Symposium on Combinatorial Search 16, no. 1 (2023): 153–57. http://dx.doi.org/10.1609/socs.v16i1.27295.

Abstract:
Parallel search algorithms have been shown to improve planning speed by harnessing the multithreading capability of modern processors. One such algorithm, PA*SE, achieves this by parallelizing state expansions, whereas another, ePA*SE, does so by effectively parallelizing edge evaluations. ePA*SE targets domains in which the action space comprises actions with expensive but similar evaluation times. However, in a number of robotics domains, the action space is heterogeneous in the computational effort required to evaluate the cost of an action and its outcome. Motivated by this, we introduce GePA*SE: Generalized Edge-based Parallel A* for Slow Evaluations, which generalizes the key ideas of PA*SE and ePA*SE, i.e., parallelization of state expansions and edge evaluations, respectively. This extends its applicability to domains that have actions requiring varying computational effort to evaluate them. The open-source code for GePA*SE, along with the baselines, is available here: https://github.com/shohinm/parallel_search
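The underlying device — evaluating a state's outgoing edges concurrently when their costs differ — can be sketched with a thread pool. This batch-per-state version is far simpler than GePA*SE, which interleaves evaluations with the A* open list rather than blocking per expansion; the evaluation times below are made up:

```python
from concurrent.futures import ThreadPoolExecutor
import time

def evaluate_edge(state, action):
    """Stand-in for an expensive edge evaluation (collision check, simulation).
    Per-action costs vary — the heterogeneity GePA*SE targets."""
    time.sleep(0.05 if action % 2 else 0.2)
    return state + action, action  # (successor state, edge cost)

def expand_in_parallel(state, actions, workers=4):
    """Evaluate all outgoing edges of one state concurrently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda a: evaluate_edge(state, a), actions))

t0 = time.time()
succs = expand_in_parallel(0, [1, 2, 3, 4])
print(succs, f"{time.time() - t0:.2f}s")  # ~0.2s instead of ~0.5s sequential
```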