Follow this link to see other types of publications on the topic: Partially Observable Markov Decision Processes (POMDPs).

Journal articles on the topic "Partially Observable Markov Decision Processes (POMDPs)"

Cite a source in APA, MLA, Chicago, Harvard, and many other citation styles

See the top 50 journal articles for your research on the topic "Partially Observable Markov Decision Processes (POMDPs)".

Next to every source in the list of references there is an "Add to bibliography" button. Press it, and we will automatically generate the bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the scholarly publication as a .pdf file and read the abstract (summary) of the work online if it is available in the metadata.

Browse journal articles from many fields of scholarship and compile an accurate bibliography.

1

Ni, Yaodong, and Zhi-Qiang Liu. "Bounded-Parameter Partially Observable Markov Decision Processes: Framework and Algorithm." International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 21, no. 06 (2013): 821–63. http://dx.doi.org/10.1142/s0218488513500396.

Full text
Abstract:
Partially observable Markov decision processes (POMDPs) are powerful for planning under uncertainty. However, it is usually impractical to employ a POMDP with exact parameters to model the real-life situation precisely, due to various reasons such as limited data for learning the model, inability of exact POMDPs to model dynamic situations, etc. In this paper, assuming that the parameters of POMDPs are imprecise but bounded, we formulate the framework of bounded-parameter partially observable Markov decision processes (BPOMDPs). A modified value iteration is proposed as a basic strategy for ta
APA, Harvard, Vancouver, ISO, and other styles
2

Tennenholtz, Guy, Uri Shalit, and Shie Mannor. "Off-Policy Evaluation in Partially Observable Environments." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 06 (2020): 10276–83. http://dx.doi.org/10.1609/aaai.v34i06.6590.

Full text
Abstract:
This work studies the problem of batch off-policy evaluation for Reinforcement Learning in partially observable environments. Off-policy evaluation under partial observability is inherently prone to bias, with risk of arbitrarily large errors. We define the problem of off-policy evaluation for Partially Observable Markov Decision Processes (POMDPs) and establish what we believe is the first off-policy evaluation result for POMDPs. In addition, we formulate a model in which observed and unobserved variables are decoupled into two dynamic processes, called a Decoupled POMDP. We show how off-poli
APA, Harvard, Vancouver, ISO, and other styles
3

Carr, Steven, Nils Jansen, and Ufuk Topcu. "Task-Aware Verifiable RNN-Based Policies for Partially Observable Markov Decision Processes." Journal of Artificial Intelligence Research 72 (November 18, 2021): 819–47. http://dx.doi.org/10.1613/jair.1.12963.

Full text
Abstract:
Partially observable Markov decision processes (POMDPs) are models for sequential decision-making under uncertainty and incomplete information. Machine learning methods typically train recurrent neural networks (RNN) as effective representations of POMDP policies that can efficiently process sequential data. However, it is hard to verify whether the POMDP driven by such RNN-based policies satisfies safety constraints, for instance, given by temporal logic specifications. We propose a novel method that combines techniques from machine learning with the field of formal methods: training an RNN-b
APA, Harvard, Vancouver, ISO, and other styles
4

Kim, Sung-Kyun, Oren Salzman, and Maxim Likhachev. "POMHDP: Search-Based Belief Space Planning Using Multiple Heuristics." Proceedings of the International Conference on Automated Planning and Scheduling 29 (May 25, 2021): 734–44. http://dx.doi.org/10.1609/icaps.v29i1.3542.

Full text
Abstract:
Robots operating in the real world encounter substantial uncertainty that cannot be modeled deterministically before the actual execution. This gives rise to the necessity of robust motion planning under uncertainty also known as belief space planning. Belief space planning can be formulated as Partially Observable Markov Decision Processes (POMDPs). However, computing optimal policies for non-trivial POMDPs is computationally intractable. Building upon recent progress from the search community, we propose a novel anytime POMDP solver, Partially Observable Multi-Heuristic Dynamic Programming (
APA, Harvard, Vancouver, ISO, and other styles
5

Wang, Chenggang, and Roni Khardon. "Relational Partially Observable MDPs." Proceedings of the AAAI Conference on Artificial Intelligence 24, no. 1 (2010): 1153–58. http://dx.doi.org/10.1609/aaai.v24i1.7742.

Full text
Abstract:
Relational Markov Decision Processes (MDP) are a useful abstraction for stochastic planning problems since one can develop abstract solutions for them that are independent of domain size or instantiation. While there has been an increased interest in developing relational fully observable MDPs, there has been very little work on relational partially observable MDPs (POMDP), which deal with uncertainty in problem states in addition to stochastic action effects. This paper provides a concrete formalization of relational POMDPs making several technical contributions toward their solution. First,
APA, Harvard, Vancouver, ISO, and other styles
6

Hauskrecht, M. "Value-Function Approximations for Partially Observable Markov Decision Processes." Journal of Artificial Intelligence Research 13 (August 1, 2000): 33–94. http://dx.doi.org/10.1613/jair.678.

Full text
Abstract:
Partially observable Markov decision processes (POMDPs) provide an elegant mathematical framework for modeling complex decision and planning problems in stochastic domains in which states of the system are observable only indirectly, via a set of imperfect or noisy observations. The modeling advantage of POMDPs, however, comes at a price -- exact methods for solving them are computationally very expensive and thus applicable in practice only to very simple problems. We focus on efficient approximation (heuristic) methods that attempt to alleviate the computational problem and trade off accurac
APA, Harvard, Vancouver, ISO, and other styles
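A note on the machinery behind this and several of the following entries: POMDP value functions are defined over beliefs maintained by a Bayes filter and, in exact and point-based methods, are represented by sets of alpha-vectors. The sketch below illustrates both ingredients with made-up NumPy arrays; it is a generic textbook construction, not code or notation from the paper above.

import numpy as np

def belief_update(b, a, o, T, Z):
    # Bayes filter: b'(s') is proportional to Z[a, s', o] * sum_s T[s, a, s'] * b(s)
    predicted = b @ T[:, a, :]
    unnormalized = Z[a, :, o] * predicted
    return unnormalized / unnormalized.sum()

def value(b, alpha_vectors):
    # Piecewise-linear convex value function: V(b) = max over alpha of (alpha . b)
    return max(float(alpha @ b) for alpha in alpha_vectors)

# Tiny 2-state, 2-action, 2-observation example with placeholder numbers.
T = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.7, 0.3], [0.4, 0.6]]]).transpose(1, 0, 2)  # T[s, a, s']
Z = np.array([[[0.8, 0.2], [0.3, 0.7]],
              [[0.6, 0.4], [0.1, 0.9]]])                      # Z[a, s', o]
alpha_vectors = [np.array([1.0, 0.0]), np.array([0.2, 0.9])]
b = np.array([0.5, 0.5])
b_next = belief_update(b, a=0, o=1, T=T, Z=Z)
print(b_next, value(b_next, alpha_vectors))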
7

Victorio-Meza, Hermilo, Manuel Mejía-Lavalle, Alicia Martínez Rebollar, Andrés Blanco Ortega, Obdulia Pichardo Lagunas, and Grigori Sidorov. "Searching for Cerebrovascular Disease Optimal Treatment Recommendations Applying Partially Observable Markov Decision Processes." International Journal of Pattern Recognition and Artificial Intelligence 32, no. 01 (2017): 1860015. http://dx.doi.org/10.1142/s0218001418600157.

Full text
Abstract:
Partially observable Markov decision processes (POMDPs) are mathematical models for the planning of action sequences under conditions of uncertainty. Uncertainty in POMDPs is manifested in two ways: uncertainty in the perception of model states and uncertainty in the effects of actions on states. The diagnosis and treatment of cerebral vascular diseases (CVD) present this double condition of uncertainty, so we think that POMDP is the most suitable method to model them. In this paper, we propose a model of CVD that is based on observations obtained from neuroimaging studies such as computed tom
APA, Harvard, Vancouver, ISO, and other styles
8

Zhang, N. L., and W. Liu. "A Model Approximation Scheme for Planning in Partially Observable Stochastic Domains." Journal of Artificial Intelligence Research 7 (November 1, 1997): 199–230. http://dx.doi.org/10.1613/jair.419.

Full text
Abstract:
Partially observable Markov decision processes (POMDPs) are a natural model for planning problems where effects of actions are nondeterministic and the state of the world is not completely observable. It is difficult to solve POMDPs exactly. This paper proposes a new approximation scheme. The basic idea is to transform a POMDP into another one where additional information is provided by an oracle. The oracle informs the planning agent that the current state of the world is in a certain region. The transformed POMDP is consequently said to be region observable. It is easier to solve than the or
APA, Harvard, Vancouver, ISO, and other styles
9

Nababan, Maxtulus Junedy. "Perkembangan Perilaku Pembelajaran Peserta Didik dengan Menggunakan Partially Observable Markov Decision Processes." Edukasi Elita: Jurnal Inovasi Pendidikan 2, no. 1 (2024): 289–97. https://doi.org/10.62383/edukasi.v2i1.1034.

Full text
Abstract:
The School Community can be considered as an agent in a multi-agent system because there is dynamic interaction in a school system. The important thing in a multi-agent system is managing relationships between agents to achieve coordinated behavior by developing knowledge, attitudes and practices, namely the factors that shape environmental behavior. This research extends the framework of partially observable Markov decision processes (POMDPs) to multi-agent settings by incorporating the idea of agent models into the state space. The results of this research show that (POMDP) is effective
APA, Harvard, Vancouver, ISO, and other styles
10

Omidshafiei, Shayegan, Ali-Akbar Agha-Mohammadi, Christopher Amato, Shih-Yuan Liu, Jonathan P. How, and John Vian. "Decentralized control of multi-robot partially observable Markov decision processes using belief space macro-actions." International Journal of Robotics Research 36, no. 2 (2017): 231–58. http://dx.doi.org/10.1177/0278364917692864.

Full text
Abstract:
This work focuses on solving general multi-robot planning problems in continuous spaces with partial observability given a high-level domain description. Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) are general models for multi-robot coordination problems. However, representing and solving Dec-POMDPs is often intractable for large problems. This work extends the Dec-POMDP model to the Decentralized Partially Observable Semi-Markov Decision Process (Dec-POSMDP) to take advantage of the high-level representations that are natural for multi-robot problems and to facil
APA, Harvard, Vancouver, ISO, and other styles
11

Rozek, Brandon, Junkyu Lee, Harsha Kokel, Michael Katz, and Shirin Sohrabi. "Partially Observable Hierarchical Reinforcement Learning with AI Planning (Student Abstract)." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 21 (2024): 23635–36. http://dx.doi.org/10.1609/aaai.v38i21.30504.

Full text
Abstract:
Partially observable Markov decision processes (POMDPs) challenge reinforcement learning agents due to incomplete knowledge of the environment. Even assuming monotonicity in uncertainty, it is difficult for an agent to know how and when to stop exploring for a given task. In this abstract, we discuss how to use hierarchical reinforcement learning (HRL) and AI Planning (AIP) to improve exploration when the agent knows possible valuations of unknown predicates and how to discover them. By encoding the uncertainty in an abstract planning model, the agent can derive a high-level plan which is then
APA, Harvard, Vancouver, ISO, and other styles
12

Theocharous, Georgios, and Sridhar Mahadevan. "Compressing POMDPs Using Locality Preserving Non-Negative Matrix Factorization." Proceedings of the AAAI Conference on Artificial Intelligence 24, no. 1 (2010): 1147–52. http://dx.doi.org/10.1609/aaai.v24i1.7750.

Full text
Abstract:
Partially Observable Markov Decision Processes (POMDPs) are a well-established and rigorous framework for sequential decision-making under uncertainty. POMDPs are well-known to be intractable to solve exactly, and there has been significant work on finding tractable approximation methods. One well-studied approach is to find a compression of the original POMDP by projecting the belief states to a lower-dimensional space. We present a novel dimensionality reduction method for POMDPs based on locality preserving non-negative matrix factorization. Unlike previous approaches, such as Krylov compre
APA, Harvard, Vancouver, ISO, and other styles
13

Walraven, Erwin, and Matthijs T. J. Spaan. "Point-Based Value Iteration for Finite-Horizon POMDPs." Journal of Artificial Intelligence Research 65 (July 11, 2019): 307–41. http://dx.doi.org/10.1613/jair.1.11324.

Full text
Abstract:
Partially Observable Markov Decision Processes (POMDPs) are a popular formalism for sequential decision making in partially observable environments. Since solving POMDPs to optimality is a difficult task, point-based value iteration methods are widely used. These methods compute an approximate POMDP solution, and in some cases they even provide guarantees on the solution quality, but these algorithms have been designed for problems with an infinite planning horizon. In this paper we discuss why state-of-the-art point-based algorithms cannot be easily applied to finite-horizon problems that do
APA, Harvard, Vancouver, ISO, and other styles
14

Zhang, N. L., and W. Zhang. "Speeding Up the Convergence of Value Iteration in Partially Observable Markov Decision Processes." Journal of Artificial Intelligence Research 14 (February 1, 2001): 29–51. http://dx.doi.org/10.1613/jair.761.

Full text
Abstract:
Partially observable Markov decision processes (POMDPs) have recently become popular among many AI researchers because they serve as a natural model for planning under uncertainty. Value iteration is a well-known algorithm for finding optimal policies for POMDPs. It typically takes a large number of iterations to converge. This paper proposes a method for accelerating the convergence of value iteration. The method has been evaluated on an array of benchmark problems and was found to be very effective: It enabled value iteration to converge after only a few iterations on all the test problems.
APA, Harvard, Vancouver, ISO, and other styles
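For reference, the value iteration that this entry (and entry 13 above) seeks to speed up repeatedly applies the standard Bellman backup on the belief-space MDP. The textbook form of that operator is written below, with tau(b, a, o) denoting the Bayes-filtered belief after taking action a and receiving observation o; this is generic notation, not notation taken from the paper.

V_{n+1}(b) = \max_{a \in A} \Big[ \sum_{s \in S} b(s)\, R(s,a) + \gamma \sum_{o \in O} \Pr(o \mid b, a)\, V_n\big(\tau(b, a, o)\big) \Big]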
15

Wang, Erli, Hanna Kurniawati, and Dirk Kroese. "An On-Line Planner for POMDPs with Large Discrete Action Space: A Quantile-Based Approach." Proceedings of the International Conference on Automated Planning and Scheduling 28 (June 15, 2018): 273–77. http://dx.doi.org/10.1609/icaps.v28i1.13906.

Full text
Abstract:
Making principled decisions in the presence of uncertainty is often facilitated by Partially Observable Markov Decision Processes (POMDPs). Despite tremendous advances in POMDP solvers, finding good policies with large action spaces remains difficult. To alleviate this difficulty, this paper presents an on-line approximate solver, called Quantile-Based Action Selector (QBASE). It uses quantile-statistics to adaptively evaluate a small subset of the action space without sacrificing the quality of the generated decision strategies by much. Experiments on four different robotics tasks with up to
APA, Harvard, Vancouver, ISO, and other styles
16

Zhang, Zongzhang, Michael Littman, and Xiaoping Chen. "Covering Number as a Complexity Measure for POMDP Planning and Learning." Proceedings of the AAAI Conference on Artificial Intelligence 26, no. 1 (2021): 1853–59. http://dx.doi.org/10.1609/aaai.v26i1.8360.

Full text
Abstract:
Finding a meaningful way of characterizing the difficulty of partially observable Markov decision processes (POMDPs) is a core theoretical problem in POMDP research. State-space size is often used as a proxy for POMDP difficulty, but it is a weak metric at best. Existing work has shown that the covering number for the reachable belief space, which is a set of belief points that are reachable from the initial belief point, has interesting links with the complexity of POMDP planning, theoretically. In this paper, we present empirical evidence that the covering number for the reachable belief spa
APA, Harvard, Vancouver, ISO, and other styles
17

Ross, S., J. Pineau, S. Paquet, and B. Chaib-draa. "Online Planning Algorithms for POMDPs." Journal of Artificial Intelligence Research 32 (July 29, 2008): 663–704. http://dx.doi.org/10.1613/jair.2567.

Full text
Abstract:
Partially Observable Markov Decision Processes (POMDPs) provide a rich framework for sequential decision-making under uncertainty in stochastic domains. However, solving a POMDP is often intractable except for small problems due to their complexity. Here, we focus on online approaches that alleviate the computational complexity by computing good local policies at each decision step during the execution. Online algorithms generally consist of a lookahead search to find the best action to execute at each time step in an environment. Our objectives here are to survey the various existing online P
APA, Harvard, Vancouver, ISO, and other styles
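The online planners surveyed in this entry share a common skeleton: from the current belief, expand a depth-bounded tree over actions and observations, evaluate the leaves with an offline bound or heuristic, and execute only the best root action before replanning. The function below is a generic depth-limited lookahead of that kind; the model callables (R, obs_prob, belief_update, leaf_value) are assumptions of this illustration rather than components described in the paper.

def lookahead(b, depth, actions, observations, R, obs_prob, belief_update, leaf_value, gamma=0.95):
    # Depth-limited expectimax over the belief MDP; returns (value, best root action).
    if depth == 0:
        return leaf_value(b), None
    best_value, best_action = float("-inf"), None
    for a in actions:
        q = R(b, a)  # expected immediate reward of action a at belief b
        for o in observations:
            p = obs_prob(b, a, o)  # Pr(o | b, a)
            if p > 0.0:
                v_next, _ = lookahead(belief_update(b, a, o), depth - 1, actions,
                                      observations, R, obs_prob, belief_update,
                                      leaf_value, gamma)
                q += gamma * p * v_next
        if q > best_value:
            best_value, best_action = q, a
    return best_value, best_action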
18

Ko, Li Ling, David Hsu, Wee Sun Lee, and Sylvie Ong. "Structured Parameter Elicitation." Proceedings of the AAAI Conference on Artificial Intelligence 24, no. 1 (2010): 1102–7. http://dx.doi.org/10.1609/aaai.v24i1.7744.

Full text
Abstract:
The behavior of a complex system often depends on parameters whose values are unknown in advance. To operate effectively, an autonomous agent must actively gather information on the parameter values while progressing towards its goal. We call this problem parameter elicitation. Partially observable Markov decision processes (POMDPs) provide a principled framework for such uncertainty planning tasks, but they suffer from high computational complexity. However, POMDPs for parameter elicitation often possess special structural properties, specifically, factorization and symmetry. This work identi
APA, Harvard, Vancouver, ISO, and other styles
19

Xiang, Yang, and Frank Hanshar. "Multiagent Expedition with Graphical Models." International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 19, no. 06 (2011): 939–76. http://dx.doi.org/10.1142/s0218488511007416.

Full text
Abstract:
We investigate a class of multiagent planning problems termed multiagent expedition, where agents move around an open, unknown, partially observable, stochastic, and physical environment, in pursuit of multiple and alternative goals of different utility. Optimal planning in multiagent expedition is highly intractable. We introduce the notion of conditional optimality, decompose the task into a set of semi-independent optimization subtasks, and apply a decision-theoretic multiagent graphical model to solve each subtask optimally. A set of techniques are proposed to enhance modeling so that the
APA, Harvard, Vancouver, ISO, and other styles
20

Lin, Yong, Xingjia Lu, and Fillia Makedon. "Approximate Planning in POMDPs with Weighted Graph Models." International Journal on Artificial Intelligence Tools 24, no. 04 (2015): 1550014. http://dx.doi.org/10.1142/s0218213015500141.

Full text
Abstract:
Markov decision process (MDP) based heuristic algorithms have been considered as simple, fast, but imprecise solutions for partially observable Markov decision processes (POMDPs). The main reason comes from how we approximate belief points. We use weighted graphs to model the state space and the belief space, in order for a detailed analysis of the MDP heuristic algorithm. As a result, we provide the prerequisite conditions to build up a robust belief graph. We further introduce a dynamic mechanism to manage belief space in the belief graph, so as to improve the efficiency and decrease the spa
APA, Harvard, Vancouver, ISO, and other styles
21

Sanner, Scott, and Kristian Kersting. "Symbolic Dynamic Programming for First-order POMDPs." Proceedings of the AAAI Conference on Artificial Intelligence 24, no. 1 (2010): 1140–46. http://dx.doi.org/10.1609/aaai.v24i1.7747.

Full text
Abstract:
Partially-observable Markov decision processes (POMDPs) provide a powerful model for sequential decision-making problems with partially-observed state and are known to have (approximately) optimal dynamic programming solutions. Much work in recent years has focused on improving the efficiency of these dynamic programming algorithms by exploiting symmetries and factored or relational representations. In this work, we show that it is also possible to exploit the full expressive power of first-order quantification to achieve state, action, and observation abstraction in a dynamic programming solu
APA, Harvard, Vancouver, ISO, and other styles
22

Capitan, Jesus, Matthijs Spaan, Luis Merino, and Anibal Ollero. "Decentralized Multi-Robot Cooperation with Auctioned POMDPs." Proceedings of the International Conference on Automated Planning and Scheduling 24 (May 11, 2014): 515–18. http://dx.doi.org/10.1609/icaps.v24i1.13658.

Full text
Abstract:
Planning under uncertainty faces a scalability problem when considering multi-robot teams, as the information space scales exponentially with the number of robots. To address this issue, this paper proposes to decentralize multi-robot Partially Observable Markov Decision Processes (POMDPs) while maintaining cooperation between robots by using POMDP policy auctions. Auctions provide a flexible way of coordinating individual policies modeled by POMDPs and have low communication requirements. Additionally, communication models in the multi-agent POMDP literature severely mismatch with real inter-
APA, Harvard, Vancouver, ISO, and other styles
23

Belly, Marius, Nathanaël Fijalkow, Hugo Gimbert, Florian Horn, Guillermo A. Pérez, and Pierre Vandenhove. "Revelations: A Decidable Class of POMDPs with Omega-Regular Objectives." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 25 (2025): 26454–62. https://doi.org/10.1609/aaai.v39i25.34845.

Full text
Abstract:
Partially observable Markov decision processes (POMDPs) form a prominent model for uncertainty in sequential decision making. We are interested in constructing algorithms with theoretical guarantees to determine whether the agent has a strategy ensuring a given specification with probability 1. This well-studied problem is known to be undecidable already for very simple omega-regular objectives, because of the difficulty of reasoning on uncertain events. We introduce a revelation mechanism which restricts information loss by requiring that almost surely the agent has eventually full informatio
APA, Harvard, Vancouver, ISO, and other styles
24

Aras, R., and A. Dutech. "An Investigation into Mathematical Programming for Finite Horizon Decentralized POMDPs." Journal of Artificial Intelligence Research 37 (March 26, 2010): 329–96. http://dx.doi.org/10.1613/jair.2915.

Full text
Abstract:
Decentralized planning in uncertain environments is a complex task generally dealt with by using a decision-theoretic approach, mainly through the framework of Decentralized Partially Observable Markov Decision Processes (DEC-POMDPs). Although DEC-POMDPS are a general and powerful modeling tool, solving them is a task with an overwhelming complexity that can be doubly exponential. In this paper, we study an alternate formulation of DEC-POMDPs relying on a sequence-form representation of policies. From this formulation, we show how to derive Mixed Integer Linear Programming (MILP) problems that
APA, Harvard, Vancouver, ISO, and other styles
25

Wen, Xian, Haifeng Huo, and Jinhua Cui. "The optimal probability of the risk for finite horizon partially observable Markov decision processes." AIMS Mathematics 8, no. 12 (2023): 28435–49. http://dx.doi.org/10.3934/math.20231455.

Full text
Abstract:
This paper investigates the optimality of the risk probability for finite horizon partially observable discrete-time Markov decision processes (POMDPs). The probability of the risk is optimized based on the criterion of total rewards not exceeding the preset goal value, which is different from the optimal problem of expected rewards. Based on the Bayes operator and the filter equations, the optimization problem of risk probability can be equivalently reformulated as filtered Markov decision processes. As an advantage of developing the value iteration technique, the opt
APA, Harvard, Vancouver, ISO, and other styles
26

Itoh, Hideaki, Hisao Fukumoto, Hiroshi Wakuya, and Tatsuya Furukawa. "Bottom-up learning of hierarchical models in a class of deterministic POMDP environments." International Journal of Applied Mathematics and Computer Science 25, no. 3 (2015): 597–615. http://dx.doi.org/10.1515/amcs-2015-0044.

Full text
Abstract:
The theory of partially observable Markov decision processes (POMDPs) is a useful tool for developing various intelligent agents, and learning hierarchical POMDP models is one of the key approaches for building such agents when the environments of the agents are unknown and large. To learn hierarchical models, bottom-up learning methods in which learning takes place in a layer-by-layer manner from the lowest to the highest layer are already extensively used in some research fields such as hidden Markov models and neural networks. However, little attention has been paid to bottom-up app
APA, Harvard, Vancouver, ISO, and other styles
27

Chatterjee, Krishnendu, Martin Chmelik, and Ufuk Topcu. "Sensor Synthesis for POMDPs with Reachability Objectives." Proceedings of the International Conference on Automated Planning and Scheduling 28 (June 15, 2018): 47–55. http://dx.doi.org/10.1609/icaps.v28i1.13875.

Full text
Abstract:
Partially observable Markov decision processes (POMDPs) are widely used in probabilistic planning problems in which an agent interacts with an environment using noisy and imprecise sensors. We study a setting in which the sensors are only partially defined and the goal is to synthesize “weakest” additional sensors, such that in the resulting POMDP, there is a small-memory policy for the agent that almost-surely (with probability 1) satisfies a reachability objective. We show that the problem is NP-complete, and present a symbolic algorithm by encoding the problem into SAT instances. We illustr
APA, Harvard, Vancouver, ISO, and other styles
28

Dressel, Louis, and Mykel Kochenderfer. "Efficient Decision-Theoretic Target Localization." Proceedings of the International Conference on Automated Planning and Scheduling 27 (June 5, 2017): 70–78. http://dx.doi.org/10.1609/icaps.v27i1.13832.

Full text
Abstract:
Partially observable Markov decision processes (POMDPs) offer a principled approach to control under uncertainty. However, POMDP solvers generally require rewards to depend only on the state and action. This limitation is unsuitable for information-gathering problems, where rewards are more naturally expressed as functions of belief. In this work, we consider target localization, an information-gathering task where an agent takes actions leading to informative observations and a concentrated belief over possible target locations. By leveraging recent theoretical and algorithmic advances, we in
APA, Harvard, Vancouver, ISO, and other styles
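The belief-dependent rewards mentioned in this entry are usually information measures; a common choice in target-localization work is negative belief entropy, which rewards concentrating the belief over possible target locations. The helper below illustrates that general idea only; it is not the reward function used by the authors.

import numpy as np

def negentropy_reward(belief, eps=1e-12):
    # Belief-dependent reward: largest (near zero) for a point-mass belief, most negative for a uniform one.
    b = np.asarray(belief, dtype=float)
    return float(np.sum(b * np.log(b + eps)))

# A concentrated belief scores higher than a spread-out one.
print(negentropy_reward([0.97, 0.01, 0.01, 0.01]) > negentropy_reward([0.25, 0.25, 0.25, 0.25]))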
29

Park, Jaeyoung, Kee-Eung Kim, and Yoon-Kyu Song. "A POMDP-Based Optimal Control of P300-Based Brain-Computer Interfaces." Proceedings of the AAAI Conference on Artificial Intelligence 25, no. 1 (2011): 1559–62. http://dx.doi.org/10.1609/aaai.v25i1.7956.

Full text
Abstract:
Most of the previous work on brain-computer interfaces (BCIs) exploiting the P300 in electroencephalography (EEG) has focused on low-level signal processing algorithms such as feature extraction and classification methods. Although a significant improvement has been made in the past, the accuracy of detecting P300 is limited by the inherently low signal-to-noise ratio in EEGs. In this paper, we present a systematic approach to optimize the interface using partially observable Markov decision processes (POMDPs). Through experiments involving human subjects, we show the P300 speller system that
APA, Harvard, Vancouver, ISO, and other styles
30

Doshi, P., and P. J. Gmytrasiewicz. "Monte Carlo Sampling Methods for Approximating Interactive POMDPs." Journal of Artificial Intelligence Research 34 (March 24, 2009): 297–337. http://dx.doi.org/10.1613/jair.2630.

Full text
Abstract:
Partially observable Markov decision processes (POMDPs) provide a principled framework for sequential planning in uncertain single agent settings. An extension of POMDPs to multiagent settings, called interactive POMDPs (I-POMDPs), replaces POMDP belief spaces with interactive hierarchical belief systems which represent an agent’s belief about the physical world, about beliefs of other agents, and about their beliefs about others’ beliefs. This modification makes the difficulties of obtaining solutions due to complexity of the belief and policy spaces even more acute. We describe a general met
APA, Harvard, Vancouver, ISO, and other styles
31

Shatkay, H., and L. P. Kaelbling. "Learning Geometrically-Constrained Hidden Markov Models for Robot Navigation: Bridging the Topological-Geometrical Gap." Journal of Artificial Intelligence Research 16 (March 1, 2002): 167–207. http://dx.doi.org/10.1613/jair.874.

Full text
Abstract:
Hidden Markov models (HMMs) and partially observable Markov decision processes (POMDPs) provide useful tools for modeling dynamical systems. They are particularly useful for representing the topology of environments such as road networks and office buildings, which are typical for robot navigation and planning. The work presented here describes a formal framework for incorporating readily available odometric information and geometrical constraints into both the models and the algorithm that learns them. By taking advantage of such information, learning HMMs/POMDPs can be made to generate bette
APA, Harvard, Vancouver, ISO, and other styles
32

Lim, Michael H., Tyler J. Becker, Mykel J. Kochenderfer, Claire J. Tomlin, and Zachary N. Sunberg. "Optimality Guarantees for Particle Belief Approximation of POMDPs." Journal of Artificial Intelligence Research 77 (August 27, 2023): 1591–636. http://dx.doi.org/10.1613/jair.1.14525.

Full text
Abstract:
Partially observable Markov decision processes (POMDPs) provide a flexible representation for real-world decision and control problems. However, POMDPs are notoriously difficult to solve, especially when the state and observation spaces are continuous or hybrid, which is often the case for physical systems. While recent online sampling-based POMDP algorithms that plan with observation likelihood weighting have shown practical effectiveness, a general theory characterizing the approximation error of the particle filtering techniques that these algorithms use has not previously been proposed. Ou
APA, Harvard, Vancouver, ISO, and other styles
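The observation-likelihood weighting analyzed in this entry is, at its core, the standard particle-filter belief update: propagate sampled states through a generative model and weight each sample by how likely the received observation is. A minimal version is sketched below; transition_sample and obs_likelihood stand in for whatever generative model a planner supplies and are assumptions of this illustration.

import random

def particle_belief_update(particles, action, observation, transition_sample, obs_likelihood):
    # Propagate particles, weight by observation likelihood, then resample to an unweighted set.
    propagated = [transition_sample(s, action) for s in particles]
    weights = [obs_likelihood(s_next, action, observation) for s_next in propagated]
    if sum(weights) == 0.0:
        return propagated  # degenerate case: no particle explains the observation
    return random.choices(propagated, weights=weights, k=len(particles))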
33

Spaan, M. T. J., and N. Vlassis. "Perseus: Randomized Point-based Value Iteration for POMDPs." Journal of Artificial Intelligence Research 24 (August 1, 2005): 195–220. http://dx.doi.org/10.1613/jair.1659.

Full text
Abstract:
Partially observable Markov decision processes (POMDPs) form an attractive and principled framework for agent planning under uncertainty. Point-based approximate techniques for POMDPs compute a policy based on a finite set of points collected in advance from the agent's belief space. We present a randomized point-based value iteration algorithm called Perseus. The algorithm performs approximate value backup stages, ensuring that in each backup stage the value of each point in the belief set is improved; the key observation is that a single backup may improve the value of many belief points. Co
APA, Harvard, Vancouver, ISO, and other styles
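The key observation quoted in this abstract, that a single backup may improve the value of many belief points, can be made concrete with the following schematic loop: sample a not-yet-improved belief, back it up against the current alpha-vector set, and drop every belief whose value has already improved. The point_backup routine is left abstract, and the loop is a reading of the Perseus idea for illustration, not the authors' implementation.

import random

def perseus_backup_stage(beliefs, alphas, point_backup):
    # One Perseus-style value backup stage over a fixed set of sampled belief points.
    def value(b, vectors):
        return max(sum(v_i * b_i for v_i, b_i in zip(v, b)) for v in vectors)

    new_alphas, remaining = [], list(beliefs)
    while remaining:
        b = random.choice(remaining)     # pick a belief whose value has not improved yet
        alpha = point_backup(b, alphas)  # point-based backup of b against the old value function
        if value(b, [alpha]) >= value(b, alphas):
            new_alphas.append(alpha)     # the new vector improves (or matches) the value at b
        else:
            new_alphas.append(max(alphas, key=lambda v: sum(x * y for x, y in zip(v, b))))
        # a single added vector may improve many beliefs; keep only the still-unimproved ones
        remaining = [bp for bp in remaining if value(bp, new_alphas) < value(bp, alphas)]
    return new_alphas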
34

Kraemer, Landon, and Bikramjit Banerjee. "Informed Initial Policies for Learning in Dec-POMDPs." Proceedings of the AAAI Conference on Artificial Intelligence 26, no. 1 (2021): 2433–34. http://dx.doi.org/10.1609/aaai.v26i1.8426.

Full text
Abstract:
Decentralized partially observable Markov decision processes (Dec-POMDPs) offer a formal model for planning in cooperative multiagent systems where agents operate with noisy sensors and actuators, and local information. Prevalent Dec-POMDP solution techniques have mostly been centralized and have assumed knowledge of the model. In real world scenarios, however, solving centrally may not be an option and model parameters maybe unknown. To address this, we propose a distributed, model-free algorithm for learning Dec-POMDP policies, in which agents take turns learning, with each agent not current
APA, Harvard, Vancouver, ISO, and other styles
35

Banerjee, Bikramjit, Jeremy Lyle, Landon Kraemer, and Rajesh Yellamraju. "Sample Bounded Distributed Reinforcement Learning for Decentralized POMDPs." Proceedings of the AAAI Conference on Artificial Intelligence 26, no. 1 (2021): 1256–62. http://dx.doi.org/10.1609/aaai.v26i1.8260.

Full text
Abstract:
Decentralized partially observable Markov decision processes (Dec-POMDPs) offer a powerful modeling technique for realistic multi-agent coordination problems under uncertainty. Prevalent solution techniques are centralized and assume prior knowledge of the model. We propose a distributed reinforcement learning approach, where agents take turns to learn best responses to each other’s policies. This promotes decentralization of the policy computation problem, and relaxes reliance on the full knowledge of the problem parameters. We derive the relation between the sample complexity of best respons
APA, Harvard, Vancouver, ISO, and other styles
36

Wu, Bo, Yan Peng Feng, and Hong Yan Zheng. "Point-Based Monte Carto Online Planning in POMDPs." Advanced Materials Research 846-847 (November 2013): 1388–91. http://dx.doi.org/10.4028/www.scientific.net/amr.846-847.1388.

Full text
Abstract:
The online planning and learning in partially observable Markov decision processes are often intractable because belief states space has two curses: dimensionality and history. In order to address this problem, this paper proposes a point-based Monte Carlo online planning approach in POMDPs. This approach involves performing value backup at specific reachable belief points, rather than over the entire belief simplex, to speed up computation processes. Then Monte Carlo tree search algorithm is exploited to share the value of actions across each subtree of the search tree so as to minimise the m
APA, Harvard, Vancouver, ISO, and other styles
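The Monte Carlo tree search step mentioned in this abstract typically selects actions inside the tree with a UCB-style rule that trades the estimated action value against how rarely the action has been tried. The helper below shows that selection rule in isolation; the exploration constant and the bookkeeping dictionaries are illustrative assumptions, not details from the paper.

import math

def ucb_select(actions, q_values, visit_counts, total_visits, c=1.0):
    # UCB1 action selection as commonly used inside MCTS tree nodes.
    def score(a):
        n = visit_counts.get(a, 0)
        if n == 0:
            return float("inf")  # try every action at least once
        return q_values[a] + c * math.sqrt(math.log(total_visits) / n)
    return max(actions, key=score)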
37

Bernstein, D. S., C. Amato, E. A. Hansen, and S. Zilberstein. "Policy Iteration for Decentralized Control of Markov Decision Processes." Journal of Artificial Intelligence Research 34 (March 1, 2009): 89–132. http://dx.doi.org/10.1613/jair.2667.

Full text
Abstract:
Coordination of distributed agents is required for problems arising in many areas, including multi-robot systems, networking and e-commerce. As a formal framework for such problems, we use the decentralized partially observable Markov decision process (DEC-POMDP). Though much work has been done on optimal dynamic programming algorithms for the single-agent version of the problem, optimal algorithms for the multiagent case have been elusive. The main contribution of this paper is an optimal policy iteration algorithm for solving DEC-POMDPs. The algorithm uses stochastic finite-state controllers
APA, Harvard, Vancouver, ISO, and other styles
38

Hahsler, Michael, and Anthony R. Cassandra. "Pomdp: A Computational Infrastructure for Partially Observable Markov Decision Processes." R Journal 16, no. 2 (2025): 116–33. https://doi.org/10.32614/rj-2024-021.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

Ajdarów, Michal, Šimon Brlej, and Petr Novotný. "Shielding in Resource-Constrained Goal POMDPs." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 12 (2023): 14674–82. http://dx.doi.org/10.1609/aaai.v37i12.26715.

Full text
Abstract:
We consider partially observable Markov decision processes (POMDPs) modeling an agent that needs a supply of a certain resource (e.g., electricity stored in batteries) to operate correctly. The resource is consumed by the agent's actions and can be replenished only in certain states. The agent aims to minimize the expected cost of reaching some goal while preventing resource exhaustion, a problem we call resource-constrained goal optimization (RSGO). We take a two-step approach to the RSGO problem. First, using formal methods techniques, we design an algorithm computing a shield for a given sc
APA, Harvard, Vancouver, ISO, and other styles
40

Simão, Thiago D., Marnix Suilen, and Nils Jansen. "Safe Policy Improvement for POMDPs via Finite-State Controllers." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 12 (2023): 15109–17. http://dx.doi.org/10.1609/aaai.v37i12.26763.

Full text
Abstract:
We study safe policy improvement (SPI) for partially observable Markov decision processes (POMDPs). SPI is an offline reinforcement learning (RL) problem that assumes access to (1) historical data about an environment, and (2) the so-called behavior policy that previously generated this data by interacting with the environment. SPI methods neither require access to a model nor the environment itself, and aim to reliably improve upon the behavior policy in an offline manner. Existing methods make the strong assumption that the environment is fully observable. In our novel approach to the SPI pr
APA, Harvard, Vancouver, ISO, and other styles
41

Zhang, Zongzhang, David Hsu, Wee Sun Lee, Zhan Wei Lim, and Aijun Bai. "PLEASE: Palm Leaf Search for POMDPs with Large Observation Spaces." Proceedings of the International Conference on Automated Planning and Scheduling 25 (April 8, 2015): 249–57. http://dx.doi.org/10.1609/icaps.v25i1.13706.

Full text
Abstract:
Trial-based asynchronous value iteration algorithms for large Partially Observable Markov Decision Processes (POMDPs), such as HSVI2, FSVI and SARSOP, have made impressive progress in the past decade. In the forward exploration phase of these algorithms, only the outcome that has the highest potential impact is searched. This paper provides a novel approach, called Palm LEAf SEarch (PLEASE), which allows the selection of more than one outcome when their potential impacts are close to the highest one. Compared with existing trial-based algorithms, PLEASE can save considerable time to propagate
APA, Harvard, Vancouver, ISO, and other styles
42

Sonu, Ekhlas, Yingke Chen, and Prashant Doshi. "Individual Planning in Agent Populations: Exploiting Anonymity and Frame-Action Hypergraphs." Proceedings of the International Conference on Automated Planning and Scheduling 25 (April 8, 2015): 202–10. http://dx.doi.org/10.1609/icaps.v25i1.13712.

Full text
Abstract:
Interactive partially observable Markov decision processes (I-POMDP) provide a formal framework for planning for a self-interested agent in multiagent settings. An agent operating in a multiagent environment must deliberate about the actions that other agents may take and the effect these actions have on the environment and the rewards it receives. Traditional I-POMDPs model this dependence on the actions of other agents using joint action and model spaces. Therefore, the solution complexity grows exponentially with the number of agents thereby complicating scalability. In this paper, we model
APA, Harvard, Vancouver, ISO, and other styles
43

Bouton, Maxime, Jana Tumova, and Mykel J. Kochenderfer. "Point-Based Methods for Model Checking in Partially Observable Markov Decision Processes." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 06 (2020): 10061–68. http://dx.doi.org/10.1609/aaai.v34i06.6563.

Full text
Abstract:
Autonomous systems are often required to operate in partially observable environments. They must reliably execute a specified objective even with incomplete information about the state of the environment. We propose a methodology to synthesize policies that satisfy a linear temporal logic formula in a partially observable Markov decision process (POMDP). By formulating a planning problem, we show how to use point-based value iteration methods to efficiently approximate the maximum probability of satisfying a desired logical formula and compute the associated belief state policy. We demonstrate
APA, Harvard, Vancouver, ISO, and other styles
44

Lemmel, Julian, and Radu Grosu. "Real-Time Recurrent Reinforcement Learning." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 17 (2025): 18189–97. https://doi.org/10.1609/aaai.v39i17.34001.

Full text
Abstract:
We introduce a biologically plausible RL framework for solving tasks in partially observable Markov decision processes (POMDPs). The proposed algorithm combines three integral parts: (1) A Meta-RL architecture, resembling the mammalian basal ganglia; (2) A biologically plausible reinforcement learning algorithm, exploiting temporal difference learning and eligibility traces to train the policy and the value-function; (3) An online automatic differentiation algorithm for computing the gradients with respect to parameters of a shared recurrent network backbone. Our experimental results show that
APA, Harvard, Vancouver, ISO, and other styles
45

Petrik, Marek, and Shlomo Zilberstein. "Linear Dynamic Programs for Resource Management." Proceedings of the AAAI Conference on Artificial Intelligence 25, no. 1 (2011): 1377–83. http://dx.doi.org/10.1609/aaai.v25i1.7794.

Full text
Abstract:
Sustainable resource management in many domains presents large continuous stochastic optimization problems, which can often be modeled as Markov decision processes (MDPs). To solve such large MDPs, we identify and leverage linearity in state and action sets that is common in resource management. In particular, we introduce linear dynamic programs (LDPs) that generalize resource management problems and partially observable MDPs (POMDPs). We show that the LDP framework makes it possible to adapt point-based methods--the state of the art in solving POMDPs--to solving LDPs. The experimental result
APA, Harvard, Vancouver, ISO, and other styles
46

Dibangoye, Jilles Steeve, Christopher Amato, Olivier Buffet, and François Charpillet. "Optimally Solving Dec-POMDPs as Continuous-State MDPs." Journal of Artificial Intelligence Research 55 (February 24, 2016): 443–97. http://dx.doi.org/10.1613/jair.4623.

Full text
Abstract:
Decentralized partially observable Markov decision processes (Dec-POMDPs) provide a general model for decision-making under uncertainty in decentralized settings, but are difficult to solve optimally (NEXP-Complete). As a new way of solving these problems, we introduce the idea of transforming a Dec-POMDP into a continuous-state deterministic MDP with a piecewise-linear and convex value function. This approach makes use of the fact that planning can be accomplished in a centralized offline manner, while execution can still be decentralized. This new Dec-POMDP formulation, which we call an occu
APA, Harvard, Vancouver, ISO, and other styles
47

Ng, Brenda, Carol Meyers, Kofi Boakye, and John Nitao. "Towards Applying Interactive POMDPs to Real-World Adversary Modeling." Proceedings of the AAAI Conference on Artificial Intelligence 24, no. 2 (2021): 1814–20. http://dx.doi.org/10.1609/aaai.v24i2.18818.

Full text
Abstract:
We examine the suitability of using decision processes to model real-world systems of intelligent adversaries. Decision processes have long been used to study cooperative multiagent interactions, but their practical applicability to adversarial problems has received minimal study. We address the pros and cons of applying sequential decision-making in this area, using the crime of money laundering as a specific example. Motivated by case studies, we abstract out a model of the money laundering process, using the framework of interactive partially observable Markov decision processes (I-POMDPs).
APA, Harvard, Vancouver, ISO, and other styles
48

Boots, Byron, and Geoffrey Gordon. "An Online Spectral Learning Algorithm for Partially Observable Nonlinear Dynamical Systems." Proceedings of the AAAI Conference on Artificial Intelligence 25, no. 1 (2011): 293–300. http://dx.doi.org/10.1609/aaai.v25i1.7924.

Full text
Abstract:
Recently, a number of researchers have proposed spectral algorithms for learning models of dynamical systems — for example, Hidden Markov Models (HMMs), Partially Observable Markov Decision Processes (POMDPs), and Transformed Predictive State Representations (TPSRs). These algorithms are attractive since they are statistically consistent and not subject to local optima. However, they are batch methods: they need to store their entire training data set in memory at once and operate on it as a large matrix, and so they cannot scale to extremely large data sets (either many examples or many featu
APA, Harvard, Vancouver, ISO, and other styles
49

Andriushchenko, Roman, Milan Češka, Filip Macák, Sebastian Junges, and Joost-Pieter Katoen. "An Oracle-Guided Approach to Constrained Policy Synthesis Under Uncertainty." Journal of Artificial Intelligence Research 82 (February 3, 2025): 433–69. https://doi.org/10.1613/jair.1.16593.

Full text
Abstract:
Dealing with aleatoric uncertainty is key in many domains involving sequential decision making, e.g., planning in AI, network protocols, and symbolic program synthesis. This paper presents a general-purpose model-based framework to obtain policies operating in uncertain environments in a fully automated manner. The new concept of coloured Markov Decision Processes (MDPs) enables a succinct representation of a wide range of synthesis problems. A coloured MDP describes a collection of possible policy configurations with their structural dependencies. The framework covers the synthesis of (a) pro
APA, Harvard, Vancouver, ISO, and other styles
50

Banerjee, Bikramjit. "Pruning for Monte Carlo Distributed Reinforcement Learning in Decentralized POMDPs." Proceedings of the AAAI Conference on Artificial Intelligence 27, no. 1 (2013): 88–94. http://dx.doi.org/10.1609/aaai.v27i1.8670.

Full text
Abstract:
Decentralized partially observable Markov decision processes (Dec-POMDPs) offer a powerful modeling technique for realistic multi-agent coordination problems under uncertainty. Prevalent solution techniques are centralized and assume prior knowledge of the model. Recently a Monte Carlo based distributed reinforcement learning approach was proposed, where agents take turns to learn best responses to each other’s policies. This promotes decentralization of the policy computation problem, and relaxes reliance on the full knowledge of the problem parameters. However, this Monte Carlo approach has
APA, Harvard, Vancouver, ISO, and other styles