Academic literature on the topic 'Apprentissage par renforcement profond'
Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles
Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Apprentissage par renforcement profond.'
Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.
Journal articles on the topic "Apprentissage par renforcement profond"
Griffon, L., M. Chennaoui, D. Leger, and M. Strauss. "Apprentissage par renforcement dans la narcolepsie de type 1." Médecine du Sommeil 15, no. 1 (March 2018): 60. http://dx.doi.org/10.1016/j.msom.2018.01.164.
Fillières-Riveau, Gauthier, Jean-Marie Favreau, Vincent Barra, and Guillaume Touya. "Génération de cartes tactiles photoréalistes pour personnes déficientes visuelles par apprentissage profond." Revue Internationale de Géomatique 30, no. 1-2 (January 2020): 105–26. http://dx.doi.org/10.3166/rig.2020.00104.
Garcia, Pascal. "Exploration guidée en apprentissage par renforcement. Connaissances a priori et relaxation de contraintes." Revue d'intelligence artificielle 20, no. 2-3 (June 1, 2006): 235–75. http://dx.doi.org/10.3166/ria.20.235-275.
Degris, Thomas, Olivier Sigaud, and Pierre-Henri Wuillemin. "Apprentissage par renforcement factorisé pour le comportement de personnages non joueurs." Revue d'intelligence artificielle 23, no. 2-3 (May 13, 2009): 221–51. http://dx.doi.org/10.3166/ria.23.221-251.
Host, Shirley, and Nicolas Sabouret. "Apprentissage par renforcement d'actes de communication dans un système multi-agent." Revue d'intelligence artificielle 24, no. 2 (April 17, 2010): 159–88. http://dx.doi.org/10.3166/ria.24.159-188.
Pouliquen, Geoffroy, and Catherine Oppenheim. "Débruitage par apprentissage profond: impact sur les biomarqueurs quantitatifs des tumeurs cérébrales." Journal of Neuroradiology 49, no. 2 (March 2022): 136. http://dx.doi.org/10.1016/j.neurad.2022.01.040.
Villatte, Matthieu, David Scholiers, and Esteve Freixa i Baqué. "Apprentissage du comportement optimal par exposition aux contingences dans le dilemme de Monty Hall." ACTA COMPORTAMENTALIA 12, no. 1 (June 1, 2004): 5–24. http://dx.doi.org/10.32870/ac.v12i1.14548.
Fouquet, Guillaume. "60 ans démunis devant 30 ans !" Gestalt 59, no. 2 (July 7, 2023): 103–14. http://dx.doi.org/10.3917/gest.059.0103.
Caccamo, Emmanuelle, and Fabien Richert. "Les procédés algorithmiques au prisme des approches sémiotiques." Cygne noir, no. 7 (June 1, 2022): 1–16. http://dx.doi.org/10.7202/1089327ar.
Choplin, Arnaud, and Julie Laporte. "Comparaison de deux stratégies pédagogiques dans l'apprentissage du toucher thérapeutique." Revue des sciences de l'éducation 42, no. 3 (June 7, 2017): 187–210. http://dx.doi.org/10.7202/1040089ar.
Full textDissertations / Theses on the topic "Apprentissage par renforcement profond"
Zimmer, Matthieu. "Apprentissage par renforcement développemental." Thesis, Université de Lorraine, 2018. http://www.theses.fr/2018LORR0008/document.
Reinforcement learning allows an agent to learn a behavior that has never been previously defined by humans. The agent discovers the environment and the consequences of its actions through interaction: it learns from its own experience, without pre-established knowledge of the goals or effects of its actions. This thesis examines how deep learning can help reinforcement learning handle continuous spaces and environments with many degrees of freedom, in order to solve problems closer to reality. Neural networks scale well and are highly expressive: they make it possible to approximate functions over continuous spaces and enable a developmental approach, because they require little a priori knowledge of the domain. We seek to reduce the amount of interaction the agent needs in order to reach acceptable behavior. To this end, we propose the Neural Fitted Actor-Critic framework, which defines several data-efficient actor-critic algorithms. We examine how the agent can fully exploit the transitions generated by previous behaviors by integrating off-policy data into the proposed framework. Finally, we study how the agent can learn faster by taking advantage of the development of its body, in particular through a gradual increase in the dimensionality of its sensorimotor space.
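To make the actor-critic machinery concrete, here is a minimal sketch of a replay-based actor-critic update in the spirit of the data-efficient, off-policy reuse described above; the linear models, 1-D toy dynamics, and reward are illustrative assumptions, not the Neural Fitted Actor-Critic of the thesis.

```python
import random
import numpy as np

# A replay-based actor-critic in miniature: the agent stores transitions and
# reuses them off-policy, as in the data-efficient framework described above.
# The linear models, toy dynamics, and reward are illustrative assumptions.

rng = np.random.default_rng(0)
w_critic = np.zeros(2)              # value weights over features [s, 1]
w_actor = np.zeros(2)               # mean-action weights over the same features
replay = []                         # stored transitions (s, a, r, s')
gamma, lr = 0.95, 0.01

def features(s):
    return np.array([s, 1.0])

s = 0.0
for step in range(2000):
    a = float(w_actor @ features(s)) + 0.3 * rng.standard_normal()
    s_next = float(np.clip(s + a, -1.0, 1.0))     # toy dynamics
    r = -abs(s_next - 0.5)                        # goal: settle at s = 0.5
    replay.append((s, a, r, s_next))
    s = s_next

    # Off-policy updates from a sampled minibatch of past transitions.
    for (si, ai, ri, sn) in random.sample(replay, min(32, len(replay))):
        phi, phi_n = features(si), features(sn)
        td = ri + gamma * (w_critic @ phi_n) - (w_critic @ phi)
        w_critic += lr * td * phi
        # Push the mean action toward actions with positive TD error.
        w_actor += lr * td * (ai - w_actor @ phi) * phi
```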
Martinez, Coralie. "Classification précoce de séquences temporelles par de l'apprentissage par renforcement profond." Thesis, Université Grenoble Alpes (ComUE), 2019. http://www.theses.fr/2019GREAT123.
Early classification (EC) of time series is a recent research topic in the field of sequential data analysis. It consists in assigning a label to data that are collected sequentially, with new points arriving over time, and the prediction has to be made using as few data points as possible. The EC problem is of paramount importance for supporting decision makers in many real-world applications, ranging from process control to fraud detection. It is particularly interesting for applications concerned with the costs induced by the acquisition of data points, or for applications which seek rapid label prediction in order to take early actions. This is for example the case in the field of health, where it is necessary to provide a medical diagnosis as soon as possible from the sequence of medical observations collected over time. Another example is predictive maintenance, with the objective of anticipating the breakdown of a machine from its sensor signals. In this doctoral work, we developed a new approach to this problem based on the formulation of a sequential decision-making problem: the EC model has to decide between classifying an incomplete sequence and delaying the prediction to collect additional data points. Specifically, we describe this problem as a Partially Observable Markov Decision Process, denoted EC-POMDP. The approach consists in training an EC agent with Deep Reinforcement Learning (DRL) in an environment characterized by the EC-POMDP. The main motivation for this approach was to offer an end-to-end model for EC which simultaneously learns optimal patterns in the sequences for classification and optimal strategic decisions for the time of prediction. The method also allows the importance of time relative to classification accuracy to be set in the definition of rewards, according to the application and its willingness to make this compromise. In order to solve the EC-POMDP and model the policy of the EC agent, we applied an existing DRL algorithm, Double Deep Q-Network, whose general principle is to update the policy of the agent during training episodes using a replay memory of past experiences. We show that applying the original algorithm to the EC problem leads to imbalanced-memory issues which can weaken the training of the agent. Consequently, to cope with these issues and offer more robust training, we adapted the algorithm to the specificities of the EC-POMDP and introduced strategies for memory and episode management. In experiments, we show that these contributions improve the performance of the agent over the original algorithm, and that we are able to train an EC agent which compromises between speed and accuracy on each sequence individually. We were also able to train EC agents on public datasets for which we have no expertise, showing that the method is applicable to various domains. Finally, we propose strategies to interpret the decisions of the agent and to validate or reject them. In experiments, we show how these solutions can help gain insight into the agent's choice of action.
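As an illustration of the sequential decision problem described above, the sketch below frames early classification as a choice between labeling now and waiting for more data, with a Double-DQN-style target; the action encoding, time penalty, and tabular stand-ins for the online and target networks are assumptions, not the architecture of the thesis.

```python
import numpy as np

# Early classification as sequential decision making: at each time step the
# agent either assigns one of K labels (and stops) or waits for the next data
# point. The time penalty, episode layout, and tabular Q-functions are
# illustrative assumptions.

K, T = 3, 10                        # K classes, sequences of length T
WAIT = K                            # actions 0..K-1 classify, action K waits
time_penalty = 0.05

def step(t, label, action):
    """Reward and termination for one decision on a partial sequence."""
    if action == WAIT:
        return -time_penalty, t + 1 >= T          # small cost for waiting
    return (1.0 if action == label else -1.0), True

# Double-DQN-style target: the online table picks the argmax action,
# the target table evaluates it.
rng = np.random.default_rng(0)
q_online = rng.random((T, K + 1))
q_target = rng.random((T, K + 1))

def double_dqn_target(r, t_next, done, gamma=0.99):
    if done:
        return r
    a_star = int(np.argmax(q_online[t_next]))
    return r + gamma * q_target[t_next, a_star]

r, done = step(t=2, label=1, action=WAIT)
print(double_dqn_target(r, t_next=3, done=done))
```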
Paumard, Marie-Morgane. "Résolution automatique de puzzles par apprentissage profond." Thesis, CY Cergy Paris Université, 2020. http://www.theses.fr/2020CYUN1067.
The objective of this thesis is to develop semantic reassembly methods in the complicated setting of heritage collections, where some blocks are eroded or missing. The reassembly of archaeological remains is an important task for heritage sciences: it improves the understanding and conservation of ancient vestiges and artifacts. However, some sets of fragments cannot be reassembled with techniques relying on contour information or visual continuities. It is then necessary to extract semantic information from the fragments and to interpret it. These tasks can be performed automatically with deep learning techniques coupled with a solver, i.e., a constrained decision-making algorithm. This thesis proposes two semantic reassembly methods for 2D fragments with erosion, along with a new dataset and evaluation metrics. The first method, Deepzzle, combines a neural network with a solver. The neural network is composed of two Siamese convolutional networks trained to predict the relative position of two fragments, a 9-class classification problem. The solver uses Dijkstra's algorithm to maximize the joint probability. Deepzzle can handle missing and supernumerary fragments, processes about 15 fragments per puzzle, and performs 25% better than the state of the art. The second method, Alphazzle, is based on AlphaZero and single-player Monte Carlo Tree Search (MCTS). It is an iterative method that uses deep reinforcement learning: at each step, a fragment is placed on the current reassembly. Two neural networks guide the MCTS: an action predictor, which uses the fragment and the current reassembly to propose a strategy, and an evaluator, trained to predict the quality of the future result from the current reassembly. Alphazzle takes into account the relationships between all fragments and scales to puzzles larger than those solved by Deepzzle. Moreover, Alphazzle is compatible with constraints imposed by the heritage setting: unlike AlphaZero, the MCTS does not access the reward at the end of reassembly. Indeed, the reward, which indicates whether a puzzle is well solved, can only be estimated by the algorithm, because only a curator can be sure of the quality of a reassembly.
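The solver stage of Deepzzle lends itself to a small illustration: maximizing the joint probability of placements is equivalent to minimizing a sum of negative log-probabilities, which can be searched with a Dijkstra-style priority queue. In the sketch below, a random probability table stands in for the 9-class Siamese network, and the left-to-right filling order is an assumption made for brevity.

```python
import heapq
import itertools
import numpy as np

# Dijkstra over partial assignments: the first complete assignment popped
# from the queue minimizes the sum of -log p, i.e. maximizes the joint
# placement probability. All edge costs are >= 0 since p <= 1.

rng = np.random.default_rng(0)
n = 4                                        # fragments and board positions
p = rng.dirichlet(np.ones(n), size=n)        # p[f, pos]: prob. fragment f fits pos
cost = -np.log(p)

tie = itertools.count()                      # tie-breaker so the heap never compares sets
heap = [(0.0, next(tie), (), frozenset(range(n)))]
while heap:
    c, _, placed, unused = heapq.heappop(heap)
    if len(placed) == n:                     # first complete assignment is optimal
        print("fragment per position:", placed)
        break
    pos = len(placed)                        # fill positions left to right
    for f in unused:
        heapq.heappush(heap, (c + cost[f, pos], next(tie),
                              placed + (f,), unused - {f}))
```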
Jneid, Khoder. "Apprentissage par Renforcement Profond pour l'Optimisation du Contrôle et de la Gestion des Bâtiments." Electronic Thesis or Diss., Université Grenoble Alpes, 2023. http://www.theses.fr/2023GRALM062.
Heating, ventilation, and air-conditioning (HVAC) systems account for a large share of the energy consumed in buildings. Conventional approaches to controlling HVAC systems rely on rule-based control (RBC), which consists of predefined rules set by an expert. Model-predictive control (MPC), widely explored in the literature, has not been adopted by industry: as a model-based approach, it first requires building models of the building for use in the optimization phase, which is time-consuming and expensive. In this thesis, we investigate reinforcement learning (RL) to optimize the energy consumption of HVAC systems while maintaining good thermal comfort and good air quality. Specifically, we focus on model-free RL algorithms that learn through interaction with the environment (the building, including its HVAC system) and thus do not require accurate models of it; in addition, we consider online approaches. The main challenge of online model-free RL is the number of days the algorithm needs to acquire enough data and action feedback to start acting properly. Hence, the research subject of this thesis is making model-free RL algorithms converge faster, so that they become applicable to a real-world task: HVAC control. Two approaches were explored: the first combines RBC with value-based RL, and the second combines fuzzy rules with policy-based RL. Both aim to speed up the convergence of RL by guiding the policy, but they are completely different: the first exploits the RBC rules during training, while in the second the fuzzy rules are injected directly into the policy. Tests are performed on a simulated office in winter, a replica of a real office at Grenoble INP.
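One plausible reading of the first approach (combining RBC with value-based RL) is to let the expert rule drive exploration, so that early behavior stays sensible while the Q-values are still inaccurate; the sketch below illustrates this idea with an assumed thermostat rule, state discretization, and tabular Q-function, and does not claim to reproduce the method of the thesis.

```python
import numpy as np

# Rule-guided value-based RL: during exploration the agent follows the expert
# RBC rule instead of acting at random. The rule, the temperature buckets,
# and the reward convention are invented for illustration.

rng = np.random.default_rng(0)
q = np.zeros((40, 2))                   # temperature bucket x {heat_off, heat_on}

def rbc_rule(temp):
    """Expert rule: heat when below a 21 degC setpoint."""
    return 1 if temp < 21.0 else 0

def select_action(temp, epsilon=0.2):
    bucket = int(np.clip(temp, 0, 39))
    if rng.random() < epsilon:
        return rbc_rule(temp)           # exploration guided by the RBC rule
    return int(np.argmax(q[bucket]))

def update(temp, action, reward, next_temp, alpha=0.1, gamma=0.99):
    """Standard Q-learning update on the discretized state."""
    b, bn = int(np.clip(temp, 0, 39)), int(np.clip(next_temp, 0, 39))
    q[b, action] += alpha * (reward + gamma * q[bn].max() - q[b, action])
```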
Léon, Aurélia. "Apprentissage séquentiel budgétisé pour la classification extrême et la découverte de hiérarchie en apprentissage par renforcement." Electronic Thesis or Diss., Sorbonne université, 2019. http://www.theses.fr/2019SORUS226.
This thesis uses the notion of budget to study problems of complexity, whether computational complexity, a complex task for an agent, or complexity due to a small amount of data. The main goal of current machine learning techniques is usually to obtain the best accuracy, without worrying about the cost of the task; the concept of budget makes it possible to take this cost into account while maintaining good performance. We first focus on classification problems with a large number of classes: the complexity of such algorithms can be reduced by using decision trees (here learned through budgeted reinforcement learning techniques) or by associating each class with a (binary) code. We then deal with reinforcement learning problems and the discovery of a hierarchy that breaks a (complex) task down into simpler tasks to facilitate learning and generalization. Here, this discovery is driven by reducing the cognitive effort of the agent (considered in this work as equivalent to the use of an additional observation). Finally, we address the understanding and generation of natural-language instructions, where data are available only in small quantities: for this purpose, we test the simultaneous use of an agent that understands instructions and an agent that generates them.
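The class-code idea mentioned above can be illustrated briefly: each class receives a short binary code, a model predicts the code bits, and decoding picks the class with the nearest code in Hamming distance, so the prediction cost grows with the code length rather than with the number of classes. In the sketch below, random codes stand in for learned ones.

```python
import numpy as np

# Class codes for extreme classification: decoding returns the class whose
# code is nearest (Hamming distance) to the predicted bits.

rng = np.random.default_rng(0)
C, bits = 100, 32
codes = rng.integers(0, 2, size=(C, bits))

def decode(predicted_bits):
    distances = np.abs(codes - predicted_bits).sum(axis=1)   # Hamming distance
    return int(np.argmin(distances))

noisy = codes[42].copy()
noisy[3] ^= 1                           # one corrupted bit
print(decode(noisy))                    # still decodes to class 42
```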
Younes, Mohamed. "Apprentissage et simulation des stratégies de sport (boxe) pour l'entraînement en réalité virtuelle." Electronic Thesis or Diss., Université de Rennes (2023-....), 2024. http://www.theses.fr/2024URENS014.
This thesis investigates the extraction and simulation of fighter interactions, mainly for boxing, using deep learning techniques: human motion estimation from videos, reinforcement-learning-based imitation learning, and physics-based character simulation. For sport analysis from videos, a benchmark protocol is proposed in which various contemporary 2D human-pose extraction methods are evaluated for their precision in deriving positional information from RGB video recordings of boxers, under complex movements and unfavorable filming conditions. The second part of the thesis focuses on replicating realistic fighter interactions from motion and interaction data, through a novel methodology for imitating interactions and motions among multiple physically simulated characters, derived from unorganized motion-capture data. Initially, this technique was demonstrated on light shadow-boxing between two fighters without significant physical contact. Subsequently, it was extended to additional interaction data featuring boxing with actual physical contact and other combat activities, as well as to handling user instructions and interaction constraints.
Israilov, Sardor. "De l'identification basée apprentissage profond à la commande basée modèle." Electronic Thesis or Diss., Université Côte d'Azur, 2024. http://www.theses.fr/2024COAZ4003.
Fish swimming remains a complex subject that is not yet fully understood, lying at the intersection of biology and fluid dynamics. Through years of evolution, organisms in nature have perfected their biological mechanisms to navigate efficiently in their environment and adapt to particular situations. Throughout history, mankind has been inspired by nature to innovate and develop nature-like systems. The biomimetic robotic fish, in particular, has a number of real-world applications, and its control is yet to be optimized. Deep reinforcement learning has shown excellent results in the control of robotic systems whose dynamics are too complex to be fully modeled and analyzed. In this thesis, we explore new avenues for controlling a biomimetic fish via reinforcement learning, so as to maximize its thrust and speed. To fully understand these recently emerged data-driven algorithms, we first study their application to a standard control-theory benchmark, the inverted pendulum on a cart. We demonstrate that deep reinforcement learning can control the system without any prior knowledge of it, achieving performance comparable to traditional model-based control methods. In the third chapter, we focus on the undulatory swimming of a robotic fish, exploring various objectives and information sources for control. Our studies indicate that the thrust force of a robotic fish can be optimized using inputs from both force sensors and cameras as control feedback. Our findings show that a square-wave control at a particular frequency maximizes thrust, and we rationalize this using Pontryagin's maximum principle. An appropriate model is established that shows excellent agreement between simulation and experimental results. We then concentrate on maximizing the speed of a robotic fish, both in several virtual environments and in experiments using visual data. Once again, deep reinforcement learning finds an excellent swimming gait, with a square-wave control that maximizes swimming speed.
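The square-wave result has a clean control-theoretic reading: with a bounded input and dynamics affine in the control, Pontryagin's maximum principle yields bang-bang solutions, which a square wave realizes. The snippet below generates such a command; the frequency and amplitude are placeholders, not the experimentally optimal values reported in the thesis.

```python
import numpy as np

# Bang-bang actuation: the optimal control sits on the input bounds and
# switches between them, which sign(sin(...)) produces as a square wave.

u_max, freq = 1.0, 2.0                               # amplitude bound, switching frequency (Hz)
t = np.linspace(0.0, 2.0, 1000)
u = u_max * np.sign(np.sin(2 * np.pi * freq * t))    # square-wave tail command
```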
Brenon, Alexis. "Modèle profond pour le contrôle vocal adaptatif d'un habitat intelligent." Thesis, Université Grenoble Alpes (ComUE), 2017. http://www.theses.fr/2017GREAM057/document.
Smart homes, at the confluence of home automation, ubiquitous computing, and artificial intelligence, support inhabitants in their activities of daily living to improve their quality of life. By allowing dependent and aged people to live at home longer, they provide a first answer to societal problems such as the dependency tied to an aging population. In a voice-controlled home, the home has to answer user requests covering a range of automated actions (lights, blinds, multimedia control, etc.). To achieve this, the home's control system needs to be aware of the context in which a request is made, and also to know the user's habits and preferences. The system must thus be able to aggregate information from a heterogeneous network of home-automation sensors and take (variable) user behavior into account. Developing smart-home control systems is hard because of the huge variability in home topology and user habits; furthermore, the whole set of contextual information needs to be represented in a common space in order to reason about it and make decisions. To address these problems, we propose a system that continuously updates its model to adapt to the user and that uses raw sensor data through a graphical representation. This method is particularly interesting because it requires no prior inference step to extract the context. Our system relies on deep reinforcement learning: a convolutional neural network extracts contextual information, and reinforcement learning handles decision-making. This thesis presents two systems: a first one based only on reinforcement learning, which shows the limits of this approach in realistic environments with thousands of possible states, and a second one, ARCADES, made possible by the introduction of deep learning, whose good performance proves that this approach is relevant and opens many avenues for improvement.
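A minimal sketch of the graphical state representation described above: heterogeneous sensor readings are rendered onto a 2-D grid that a convolutional Q-network can consume directly, avoiding a hand-crafted context-inference step. The sensor names and grid coordinates are invented for illustration.

```python
import numpy as np

# Render a dict of normalized sensor values onto a grid "image"; each
# sensor owns one cell. The CNN input convention (channel-first) is an
# assumption for illustration.

GRID = (8, 8)
SENSOR_CELLS = {"kitchen_light": (1, 2),
                "living_motion": (4, 4),
                "bedroom_temp": (6, 1)}

def render_state(readings):
    """Map normalized sensor values onto the grid 'image'."""
    img = np.zeros(GRID, dtype=np.float32)
    for name, value in readings.items():
        img[SENSOR_CELLS[name]] = value
    return img[None, ...]                   # channel axis for the CNN

state = render_state({"kitchen_light": 1.0, "living_motion": 0.3})
print(state.shape)                          # (1, 8, 8)
```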
Mesnard, Thomas. "Attribution de crédit pour l'apprentissage par renforcement dans des réseaux profonds." Electronic Thesis or Diss., Institut polytechnique de Paris, 2023. http://www.theses.fr/2023IPPAX155.
Deep reinforcement learning has been at the heart of many revolutionary results in artificial intelligence in the last few years. The agents involved rely on credit-assignment techniques that try to establish correlations between past actions and future events, and use these correlations to become effective at a given task. This problem is at the heart of the current limitations of deep reinforcement learning, and the credit-assignment techniques used today remain relatively rudimentary and incapable of inductive reasoning. This thesis therefore focuses on the study and formulation of new credit-assignment methods for deep reinforcement learning. Such techniques could speed up learning, improve generalization when agents are trained on multiple tasks, and perhaps even allow the emergence of abstraction and reasoning.
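The baseline against which such work is measured is the standard, admittedly rudimentary, form of credit assignment: crediting each action with the discounted sum of the rewards that follow it, as in REINFORCE-style policy gradients. A minimal sketch:

```python
import numpy as np

# Each action at step t is credited with the discounted return
# G[t] = r[t] + gamma * r[t+1] + gamma^2 * r[t+2] + ...

def discounted_returns(rewards, gamma=0.99):
    G = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        G[t] = running
    return G

print(discounted_returns([0.0, 0.0, 1.0]))  # earlier actions receive discounted credit
```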
Books on the topic "Apprentissage par renforcement profond"
Sutton, Richard S., and Andrew G. Barto. Reinforcement learning: An introduction. Cambridge, Mass.: MIT Press, 1998.
Ontario. Esquisse de cours 12e année: Sciences de l'activité physique pse4u cours préuniversitaire. Vanier, Ont.: CFORP, 2002.
Ontario. Esquisse de cours 12e année: Technologie de l'information en affaires btx4e cours préemploi. Vanier, Ont.: CFORP, 2002.
Ontario. Esquisse de cours 12e année: Études informatiques ics4m cours préuniversitaire. Vanier, Ont.: CFORP, 2002.
Ontario. Esquisse de cours 12e année: Mathématiques de la technologie au collège mct4c cours précollégial. Vanier, Ont.: CFORP, 2002.
Ontario. Esquisse de cours 12e année: Sciences snc4m cours préuniversitaire. Vanier, Ont.: CFORP, 2002.
Ontario. Esquisse de cours 12e année: English eae4e cours préemploi. Vanier, Ont.: CFORP, 2002.
Ontario. Esquisse de cours 12e année: Le Canada et le monde: une analyse géographique cgw4u cours préuniversitaire. Vanier, Ont.: CFORP, 2002.
Ontario. Esquisse de cours 12e année: Environnement et gestion des ressources cgr4e cours préemploi. Vanier, Ont.: CFORP, 2002.
Ontario. Esquisse de cours 12e année: Histoire de l'Occident et du monde chy4c cours précollégial. Vanier, Ont.: CFORP, 2002.
Book chapters on the topic "Apprentissage par renforcement profond"
Tazdaït, Tarik, and Rabia Nessah. "5. Vote et apprentissage par renforcement." In Le paradoxe du vote, 157–77. Éditions de l’École des hautes études en sciences sociales, 2013. http://dx.doi.org/10.4000/books.editionsehess.1931.
Jacquemont, Mikaël, Thomas Vuillaume, Alexandre Benoit, Gilles Maurin, and Patrick Lambert. "Analyse d'images Cherenkov monotélescope par apprentissage profond." In Inversion et assimilation de données de télédétection, 303–35. ISTE Group, 2023. http://dx.doi.org/10.51926/iste.9142.ch9.
Hadjadj-Aoul, Yassine, and Soraya Ait-Chellouche. "Utilisation de l'apprentissage par renforcement pour la gestion des accès massifs dans les réseaux NB-IoT." In La gestion et le contrôle intelligents des performances et de la sécurité dans l'IoT, 27–55. ISTE Group, 2022. http://dx.doi.org/10.51926/iste.9053.ch2.
Bendella, Mohammed Salih, and Badr Benmammar. "Impact de la radio cognitive sur le green networking : approche par apprentissage par renforcement." In Gestion du niveau de service dans les environnements émergents. ISTE Group, 2020. http://dx.doi.org/10.51926/iste.9002.ch8.
Full textConference papers on the topic "Apprentissage par renforcement profond"
Fourcade, A. "Apprentissage profond : un troisième oeil pour les praticiens." In 66ème Congrès de la SFCO. Les Ulis, France: EDP Sciences, 2020. http://dx.doi.org/10.1051/sfco/20206601014.
Reports on the topic "Apprentissage par renforcement profond"
Melloni, Gian. Le leadership des autorités locales en matière d'assainissement et d'hygiène : expériences et apprentissage de l'Afrique de l'Ouest. Institute of Development Studies (IDS), January 2022. http://dx.doi.org/10.19088/slh.2022.002.