Academic literature on the topic 'Epsilon greedy'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Epsilon greedy.'

Journal articles on the topic "Epsilon greedy"

1

Liu, Yang, Qiuyu Lu, Zhenfan Yu, Yue Chen, and Yinguo Yang. "Reinforcement Learning-Enhanced Adaptive Scheduling of Battery Energy Storage Systems in Energy Markets." Energies 17, no. 21 (2024): 5425. http://dx.doi.org/10.3390/en17215425.

Abstract:
Battery Energy Storage Systems (BESSs) play a vital role in modern power grids by optimally dispatching energy according to the price signal. This paper proposes a reinforcement learning-based model that optimizes BESS scheduling with the proposed Q-learning algorithm combined with an epsilon-greedy strategy. The proposed epsilon-greedy strategy-based Q-learning algorithm can efficiently manage energy dispatching under uncertain price signals and multi-day operations without retraining. Simulations are conducted under different scenarios, considering electricity price fluctuations and battery
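For orientation only, the pairing of tabular Q-learning with an epsilon-greedy action rule that this abstract names can be sketched roughly as follows. This is not the authors' BESS scheduling model: the five-state corridor environment, the function name `q_learning_chain`, and every hyperparameter value are assumptions chosen purely for illustration.

```python
import random


def q_learning_chain(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
                     epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical corridor (toy environment).

    Actions: 0 = step left, 1 = step right. Reaching the last state
    yields reward 1 and ends the episode; all other steps give reward 0.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # step cap so every episode terminates
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore: uniform random action
            else:
                best = max(q[state])       # exploit: greedy, random tie-break
                action = rng.choice([a for a in range(2) if q[state][a] == best])
            next_state = max(0, state - 1) if action == 0 else state + 1
            done = next_state == n_states - 1
            reward = 1.0 if done else 0.0
            # Q-learning target bootstraps from the greedy value of the next state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = q_learning_chain()
for s, row in enumerate(q_table):
    print(s, [round(v, 3) for v in row])
```

In this toy setting the learned values for "right" should end up above those for "left" in the states next to the goal; a real scheduling problem would need its own state, action, and reward definitions.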
2

Kyoung, Dohyun, and Yunsick Sung. "Transformer Decoder-Based Enhanced Exploration Method to Alleviate Initial Exploration Problems in Reinforcement Learning." Sensors 23, no. 17 (2023): 7411. http://dx.doi.org/10.3390/s23177411.

Abstract:
In reinforcement learning, the epsilon (ε)-greedy strategy is commonly employed as an exploration technique. This method, however, leads to extensive initial exploration and prolonged learning periods. Existing approaches to mitigate this issue involve constraining the exploration range using expert data or utilizing pretrained models. Nevertheless, these methods do not effectively reduce the initial exploration range, as the exploration by the agent is limited to states adjacent to those included in the expert data. This paper proposes a method to reduce the initial exploration range in reinfo
3

KURNIAWATI, NAZMIA, YULI KURNIA NINGSIH, SOFIA DEBI PUSPA, and TRI SWASONO ADI. "Algoritma Epsilon Greedy pada Reinforcement Learning untuk Modulasi Adaptif Komunikasi Vehicle to Infrastructure (V2I)." ELKOMIKA: Jurnal Teknik Energi Elektrik, Teknik Telekomunikasi, & Teknik Elektronika 9, no. 3 (2021): 716. http://dx.doi.org/10.26760/elkomika.v9i3.716.

Abstract:
Vehicle-to-Infrastructure (V2I) communication allows vehicles to connect to various kinds of infrastructure. Because the vehicle is moving, the environment it passes through affects the communication parameters. Implementing adaptive modulation in a V2I scheme allows the system to use different modulation schemes to accommodate changes in environmental conditions. This study uses the QPSK, 8PSK, and 16-QAM modulation schemes, employing reinforcement learning and the epsilon-greedy algorithm to determine which modulation scheme to use based on the level of AWG
4

Liu, Zizhuo. "Investigation of progress and application related to Multi-Armed Bandit algorithms." Applied and Computational Engineering 37, no. 1 (2024): 155–59. http://dx.doi.org/10.54254/2755-2721/37/20230496.

Abstract:
This paper discusses four Multi-armed Bandit algorithms: Explore-then-Commit (ETC), Epsilon-Greedy, Upper Confidence Bound (UCB), and Thompson Sampling algorithm. ETC algorithm aims to spend the majority of rounds on the best arm, but it can lead to a suboptimal outcome if the environment changes rapidly. The Epsilon-Greedy algorithm is designed to explore and exploit simultaneously, while it often tries sub-optimal arm even after the algorithm finds the best arm. Thus, the Epsilon-Greedy algorithm performs well when the environment continuously changes. UCB algorithm is one of the most used M
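As a rough illustration of the Epsilon-Greedy behaviour described in this abstract (it keeps trying sub-optimal arms with a fixed probability even after the best arm has been found), a minimal bandit sketch might look like the following. The Bernoulli arm means, the function name `epsilon_greedy_bandit`, and the parameter values are hypothetical and are not taken from the cited paper.

```python
import random


def epsilon_greedy_bandit(arm_means, epsilon=0.1, rounds=5000, seed=0):
    """Run epsilon-greedy on Bernoulli arms (toy setup).

    Returns (pull counts per arm, estimated mean reward per arm).
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    for _ in range(rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick any arm uniformly
        else:
            # exploit: pick the arm with the highest current estimate
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        # incremental update of the sample mean for the pulled arm
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return counts, estimates


counts, estimates = epsilon_greedy_bandit([0.2, 0.5, 0.8])
print("pulls per arm:", counts)
print("estimated means:", [round(e, 3) for e in estimates])
```

Because exploration never stops, every arm keeps receiving a share of the pulls; that constant exploration is what lets the method track a changing environment, at the cost of some reward in a stationary one.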
5

Yashiki, Koudai, Masayuki Wajima, Takashi Kawakami, Takahumi Oohori, and Masahiro Kinoshita. "2A1-J10 The group behavior using a epsilon-greedy." Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec) 2007 (2007): _2A1—J10_1—_2A1—J10_2. http://dx.doi.org/10.1299/jsmermd.2007._2a1-j10_1.

6

Liu, Qiaojia. "Optimizing Short and Long Term Investment Returns Using Multi-Armed Slot Machine Algorithms." Applied and Computational Engineering 83, no. 1 (2024): 110–19. http://dx.doi.org/10.54254/2755-2721/83/2024glg0068.

Abstract:
This study focuses on comparing the effectiveness of UCB, Thompson sampling and Epsilon-Greedy algorithms in multi-armed slot machine algorithms for short-term and long-term investment return optimization in financial markets. This analysis examines stock performance data from Tesla, General Motors, and Ford over specific periods: five years of weekly data (2019-2024) and six months of daily data (February to August 2024). From the results, it is shown that for long-term investments, Over the five-year period, Thompson Sampling outperformed UCB and Epsilon-Greedy algorithms in terms of stabili
7

Dell'Aversana, Paolo. "Reinforcement learning in optimization problems. Applications to geophysical data inversion." AIMS Geosciences 8, no. 3 (2022): 488–502. http://dx.doi.org/10.3934/geosci.2022027.

Abstract:
In this paper, we introduce a novel inversion methodology that combines the benefits offered by Reinforcement-Learning techniques with the advantages of the Epsilon-Greedy method for an expanded exploration of the model space. Among the various Reinforcement Learning approaches, we applied the set of algorithms included in the category of the Q-Learning methods. We show that the Temporal Difference algorithm offers an effective iterative approach that allows finding an optimal solution in geophysical inverse problems. Furthermore, the Epsilon-Greedy method properly co
8

You, Xinhong, Pengping Zhang, Minglin Liu, Lingqi Lin, and Shuai Li. "Epsilon-Greedy-Based MQTT QoS Mode Selection and Power Control Algorithm for Power Distribution IoT." International Journal of Mobile Computing and Multimedia Communications 14, no. 1 (2023): 1–18. http://dx.doi.org/10.4018/ijmcmc.306976.

Abstract:
Employing message queuing telemetry transport (MQTT) in the power distribution internet of things (PD-IoT) can meet the demands of reliable data transmission while significantly reducing energy consumption through the dynamic and flexible selection of three different quality of service (QoS) modes and power control. However, there are still some challenges, including incomplete information, coupling of optimization variables, and dynamic tradeoff between packet-loss ratio and energy consumption. In this paper, the authors propose a joint optimization algorithm named EMMA for MQTT QoS mode sele
9

Zhang, Lingxiang. "Analyzing the strengths and weaknesses of diverse algorithms for solving Multi-Armed Bandit problems using Python." Applied and Computational Engineering 68, no. 1 (2024): 205–14. http://dx.doi.org/10.54254/2755-2721/68/20241407.

Abstract:
With the rapid advancement of science and technology, the internet has become an integral part of daily life, revolutionizing how people access information and make decisions. In this context, algorithms play a pivotal role in helping individuals make informed choices tailored to their preferences across various domains. Utilizing the MovieLens dataset (https://grouplens.org/datasets/movielens/1m/), which contains a rich compilation of movie ratings and metadata, this study conducts a thorough analysis using Python to assess the performance of four distinct algorithms: Explore-then-Commit (ETC
10

Malon, Krzysztof. "Evaluation of Radio Channel Utility using Epsilon-Greedy Action Selection." Journal of Telecommunications and Information Technology 3, no. 2021 (2021): 10–17. http://dx.doi.org/10.26636/jtit.2021.153621.


Dissertations / Theses on the topic "Epsilon greedy"

1

Fadlallah, Sami. "Green chemistry in polymerisation : elaboration and development of novel organometallic complexes of the rare-earth metals for their application in (Co)-polymerisation catalysis." Thesis, Lille 1, 2017. http://www.theses.fr/2017LIL10069/document.

Abstract:
New trivalent rare-earth allyl-borohydride complexes, RE(BH4)2(C3H5)(THF)x (RE = Sc, x = 2; Y, La, Nd, Sm, x = 3), were synthesized. The complexes were characterized, including by X-ray diffraction, and their reactivity toward the insertion of small organic molecules is described, which involves, comparatively, the metal-borohydride and metal-allyl bonds. In this thesis work, it was shown that the neodymium complex is able to initiate the polymerization of isoprene, alone or combined with a magnesium-type co-catalyst, leading to trans
2

Ebadi, Nasim. "Estimating Costs of Reducing Environmental Emissions From a Dairy Farm: Multi-objective epsilon-constraint Optimization Versus Single Objective Constrained Optimization." Thesis, Virginia Tech, 2020. http://hdl.handle.net/10919/99304.

Abstract:
Agricultural production is an important source of environmental emissions. While water quality concerns related to animal agriculture have been studied extensively, air quality issues have become an increasing concern. Due to the transfer of nutrients between air, water, and soil, emissions to air can harm water quality. We conduct a multi-objective optimization analysis for a representative dairy farm with two different approaches: nonlinear programming (NLP) and ϵ-constraint optimization to evaluate trade-offs among reduction of multiple pollutants including nitrogen (N), phosphorus (P), gre
3

Kaddour, Karim. "Traduction commentée du Grand commentaire d' Averroès aux livres petit Alpha, grand Alpha, Gamma et Epsilon de la Métaphysique d' Aristote." Thesis, Paris 1, 2018. http://www.theses.fr/2018PA01H233/document.

Abstract:
The aim of the present work is an annotated translation of Averroes's Great Commentary on Aristotle's Metaphysics, based on the Arabic text established by Father Maurice Bouyges. This translation mainly concerns the books Great Alpha, Little Alpha, Gamma, and Epsilon. This work stems from our interest in the transmission of Greek thought among the Muslim authors of the Middle Ages, and more particularly in the rendering of Aristotle's metaphysical thought in Averroes. Through this translation, the stakes are multiple: translating the Arabic text of the
4

Baheti, Payal. "Clean synthesis of novel green surfactants." Thesis, Montpellier, Ecole nationale supérieure de chimie, 2018. http://www.theses.fr/2018ENCM0012.

Abstract:
Star polymers are attracting growing interest because of their inimitable thermal and mechanical properties. Starting from the observation that, in parallel, sustainable chemistry is developing at an unprecedented pace, we propose in this thesis to develop a "greener" strategy for the synthesis of D-sorbitol-poly(ε-caprolactone) star polymers (star PCL-OHx). These will be synthesized without solvent (in bulk) or in "clean" solvents (supercritical CO2) and in the presence of the metal catalyst Sn(Oct)2 (which has been approved by the FDA) or of an enzym
5

HOANG, QUYNH NGOC, and 黃瓊玉. "Integrating Epsilon-Based and Grey Forecasting Model to Evaluate the Performance of Packaging Industry." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/39ze6e.

Abstract:
Master's thesis, National Kaohsiung University of Science and Technology, Department of Industrial Engineering and Management, academic year 107. Nowadays, Vietnamese packaging is considered to be the fastest growing industries, especially when the domestic goods production demand for local market and export is increasing strongly, leading to a tremendous need of packaging developed. This research aims to calculate the productivity performance of 12 packaging companies in Vietnam that listed in Viet’s stock exchange from 2014 to 2017, by combined Grey forecasting model to predict financial future values for the next 4 years’ period (2018-2021) and an epsilon-based measure of efficiency in DEA (EBM)

Books on the topic "Epsilon greedy"

1

Nikolaou, Thanasēs. Hoi epsilon. Ekdotikos Organismos Livan̄e, 2007.

2

Young, Jock. Epsilon Zeta: A novel. Harbor House, 2006.

3

Comoth, Katharina. Hestia: Zur Bedeutung des mystischen [Epsilon]. Universitätsverlag C. Winter, 1998.

4

Harrison, Bryan, Steve Slotnick, and Lin Hanson. Omicron, renaissant at 150. Omicron, 2005.

5

Baker, Barbara White. Making a difference through timeless service: The Iota Epsilon Omega story, 1971-2013. Iota Epsilon Omega Chapter, 2014.

6

Goutard, Geneviève-Germaine. L' énigme de l'epsilon de Delphes: Le passé dévoilé. Dossiers d'Aquitaine et d'ailleurs, 1997.

7

Griffin, J. Calvin, and Otto G. Reumann. Explanation of Uniform System of Accounting for Sigma Phi Epsilon Chapter. Creative Media Partners, LLC, 2018.

8

Aristóteles. Metaphysics: Books Gamma, Delta, and Epsilon (Clarendon Aristotle Series). Oxford University Press, USA, 1993.

9

Aristóteles. Metaphysics: Books Gamma, Delta, and Epsilon (Clarendon Aristotle Series). 2nd ed. Oxford University Press, USA, 1993.

10

Βασίλης Λάσκας. Εύα και Αδάμ, το μήλο που έγινε χρήματα.: Greece. Independently Published, 2022.


Book chapters on the topic "Epsilon greedy"

1

Tokic, Michel, and Günther Palm. "Value-Difference Based Exploration: Adaptive Control between Epsilon-Greedy and Softmax." In KI 2011: Advances in Artificial Intelligence. Springer Berlin Heidelberg, 2011. http://dx.doi.org/10.1007/978-3-642-24455-1_33.

2

da Silva Lima, Gabriel, Vinícius Rosa Cota, and Wallace Moreira Bessa. "In Silico Application of the Epsilon-Greedy Algorithm for Frequency Optimization of Electrical Neurostimulation for Hypersynchronous Disorders." In Communications in Computer and Information Science. Springer Nature Switzerland, 2024. http://dx.doi.org/10.1007/978-3-031-63848-0_5.

3

Jeffery, L. H. "The Peloponnese." In The Local Scripts of Archaic Greece. Oxford University Press, 1990. http://dx.doi.org/10.1093/oso/9780198140610.003.0008.

Abstract:
The Corinthian beta has been called an artificial derivation from pi, to avoid confusion with the epsilon. I incline to think rather that it arose from an incorrect rendering of the primitive beta-type in which the ‘hooks’ are not closed (as in Thera, Naxos-Paros, Argos, Gortyn in Crete), the lower hook of the Corinthian beta being twisted in the reverse direction. This may have been done deliberately, because of the epsilon; but in Melos, where the epsilon is normal, the twisted beta is also used, and so it is perhaps more likely that it was an original error, and that the curious
4

Brubaker, Ben, Daniel Bump, and Solomon Friedberg. "The Reduction to Statement D." In Weyl Group Multiple Dirichlet Series. Princeton University Press, 2011. http://dx.doi.org/10.23943/princeton/9780691150659.003.0013.

Abstract:
This chapter focuses on the language of resotopes and assumes that γL_ε and γR_ε are multiples of n for every totally resonant episode. It also recalls that s and d are the weights of the accordions under consideration. It begins with the proposition that Statement D is equivalent to Statement C, and that Statement D is true if n ≠ s. It then describes the case of a totally resonant short Gelfand-Tsetlin pattern before presenting the proof that Statement D implies Statement B. It shows that the reduction to Statement D is st
5

Macintosh Wilson, Alistair. "Apollonius The Great Geometer." In The Infinite In The Finite. Oxford University Press, 1995. http://dx.doi.org/10.1093/oso/9780198539506.003.0014.

Abstract:
Apollonius was born at Perga in Pamphylia, near Antalya in present-day Turkey, around 262 BC. He is believed to have studied at Alexandria, under the famous astronomer Aristarchus of Samos. Whilst studying with Aristarchus, Apollonius was given the nickname ‘Epsilon’, because of his interest in the theory of the Moon, whose crescent looks like that Greek letter. One of the major problems of Greek astronomy was why the planets sometimes appear to move backwards in the sky. For example, Fig. 14.1 shows the backward, or retrograde, motion of the planet Mars.
6

Pickover, Clifford A. "The Huascarán Box." In Wonders of Numbers. Oxford University Press, 2001. http://dx.doi.org/10.1093/oso/9780195133424.003.0106.

Abstract:
For the first problem, turn on the red finger for 10 seconds. Turn off the red finger and turn on the green finger. Quickly open the box. If the fan is continually spinning, then the green finger is the one. If the fan is spinning but slows down, then it is connected to the red finger. Otherwise, it is the yellow finger. (Physicist Dick Hess of Rancho Palos Verdes, California, proposed a similar problem in the 1998 Pi Mu Epsilon Journal, vol. 10, no. 8, p. 660.)
7

Brubaker, Ben, Daniel Bump, and Solomon Friedberg. "Types." In Weyl Group Multiple Dirichlet Series. Princeton University Press, 2011. http://dx.doi.org/10.23943/princeton/9780691150659.003.0011.

Abstract:
This chapter divides the prototypes into much smaller units called types. It fixes a top and bottom row, and therefore a cartoon. For each episode ε of the cartoon, the chapter fixes an integer κ_ε. Then the set of all short Gelfand-Tsetlin patterns with the given top and bottom rows is called a type. Thus two patterns are in the same type if and only if they have the same top and bottom rows (and hence the same cartoon), and if the sum of the first (middle) row elements in each episode is the same for both patterns. The possible episodes may be grouped into
8

Hoenig, Alan. "About TeX and LaTeX." In TeX Unbound. Oxford University Press, 1998. http://dx.doi.org/10.1093/oso/9780195096859.003.0001.

Abstract:
It is surprising how difficult it is to automate the process of typesetting. The TeX (ideally pronounced to rhyme with blec-c-c-ch) typesetting system is (arguably) the best way to do this by computer. TeX was designed to cope with the intricacies of mathematical and technical typesetting and especially to deliver beautifully typeset documents. The source of the “TeX” logo is the Greek root τεχ (tau epsilon chi), which forms part of words like “technology.” What is LaTeX, and how does it differ from TeX? In ways we will discuss, TeX typesetting commands have been used to create many n
9

Wang, Ziyuan, Zichen Wang, Zhilei Tan, Jiandong Cui, and Shiru Jia. "Application of ε-poly-L-lysine in Improving Food Quality and Safety." In Bio-Based Antimicrobial Agents to Improve Agricultural and Food Safety. BENTHAM SCIENCE PUBLISHERS, 2024. http://dx.doi.org/10.2174/9789815256239124010009.

Abstract:
Each year, economic losses in the food industry due to spoilage of grain, aquatic products and fruit are huge. People now express more concern about food safety and nutrition, therefore, the need for green preservatives is also growing. Epsilon-poly-L-lysine (ε-PL), a cationic polyamino acid with 25–35 L-lysine residues, possesses broad-spectrum antimicrobial activity, biodegradable properties, resistance to high temperature, and non-toxicity and can dissolve in water. So, it has been extensively applied in the field of preservatives for foodstuffs, agriculture and biomedicine. Thus, the chapt
10

Gullino, Silvia. "Avicenna’s Interpretation of Aristotle’s Metaphysics (Ε1, 1026a13-16)." In Proceedings of the XXIII World Congress of Philosophy. Philosophy Documentation Center, 2018. http://dx.doi.org/10.5840/wcp232018221297.

Abstract:
During the 9th century Aristotle’s Metaphysics was translated for the first time from Greek into Arabic by Ustâth, at the request of al-Kindî and, afterwards, the interest of the Arab world in this oeuvre grew with the production of several translations, comments and paraphrases of the work. Among the books which compose the Metaphysics, one of the most studied was book Epsilon. In particular Arab philosophers focused their interest on the passage of Ε1, which contains a classification of the theoretical sciences (1026a13-1026a16), founded on the degree of immateriality and of separation from

Conference papers on the topic "Epsilon greedy"

1

Yeole, Anuradha, Ami Desai, and Kriti Srivastava. "Optimizing Fashion Recommendations for Diverse Body Types: An Epsilon Greedy Approach." In 2024 15th International Conference on Computing Communication and Networking Technologies (ICCCNT). IEEE, 2024. http://dx.doi.org/10.1109/icccnt61001.2024.10724611.

2

Wei, Wenhong, Yi Meng, and Qingxia Li. "A Novel Reinforcement Learning Multi-Objective Community Detection Algorithm with $\epsilon$-Gradient-Greedy Strategy." In 2024 IEEE International Conference on Systems, Man, and Cybernetics (SMC). IEEE, 2024. https://doi.org/10.1109/smc54092.2024.10831561.

3

Rodríguez-Fragoso, Martín, Octavio Elizalde-Solis, and Edgar Ramírez-Jiménez. "Reaction Pathway Optimization Using Reinforcement Learning in Steam Methane Reforming and Associated Parallel Reactions." In The 35th European Symposium on Computer Aided Process Engineering. PSE Press, 2025. https://doi.org/10.69997/sct.128960.

Abstract:
This study presents the application of a Q-learning algorithm to optimize the selection of chemical reactions for methane reforming processes. Starting with a set of 11 candidate reactions, the algorithm identified three key reactions. These reactions effectively represent the experimental data while aligning with the underlying physics of the process and previously reported findings. The algorithm employed an epsilon-greedy policy to balance exploration and exploitation during the training process. Furthermore, simulations based on the identified reactions revealed trends consistent with expe
4

Kim, Chyon Hae, Kanta Watanabe, Shun Nishide, and Manabu Gouko. "Epsilon-greedy babbling." In 2017 Joint IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob). IEEE, 2017. http://dx.doi.org/10.1109/devlrn.2017.8329812.

5

Deng, Yue, Zirui Wang, and Yin Zhang. "Improving Multi-agent Reinforcement Learning with Stable Prefix Policy." In Thirty-Third International Joint Conference on Artificial Intelligence {IJCAI-24}. International Joint Conferences on Artificial Intelligence Organization, 2024. http://dx.doi.org/10.24963/ijcai.2024/6.

Abstract:
In multi-agent reinforcement learning (MARL), the epsilon-greedy method plays an important role in balancing exploration and exploitation during the decision-making process in value-based algorithms. However, the epsilon-greedy exploration process will introduce conservativeness when calculating the expected state value when the agents are more in need of exploitation during the approximate policy convergence, which may result in a suboptimal policy convergence. Besides, eliminating the epsilon-greedy algorithm leaves no exploration and may lead to unacceptable local optimal policies. To addre
6

Guzel, Basak Esin Kokturk, Bora Mocan, Busra Arslan, Gokce Polat, and Tarik Kavusan. "Demographic Targeting With Epsilon-greedy Exploration in Digital Advertising." In 2021 6th International Conference on Computer Science and Engineering (UBMK). IEEE, 2021. http://dx.doi.org/10.1109/ubmk52708.2021.9558951.

8

Kuang, Nikki Lijing, and Clement H. C. Leung. "Performance Effectiveness of Multimedia Information Search Using the Epsilon-Greedy Algorithm." In 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA). IEEE, 2019. http://dx.doi.org/10.1109/icmla.2019.00160.

9

Crawford, Victoria G. "Faster Guarantees of Evolutionary Algorithms for Maximization of Monotone Submodular Functions." In Thirtieth International Joint Conference on Artificial Intelligence {IJCAI-21}. International Joint Conferences on Artificial Intelligence Organization, 2021. http://dx.doi.org/10.24963/ijcai.2021/229.

Abstract:
In this paper, the monotone submodular maximization problem (SM) is studied. SM is to find a subset of size kappa from a universe of size n that maximizes a monotone submodular objective function f. We show using a novel analysis that the Pareto optimization algorithm achieves a worst-case ratio of (1 − epsilon)(1 − 1/e) in expectation for every cardinality constraint kappa < P, where P ≤ n + 1 is an input, in O(nP ln(1/epsilon)) queries of f. In addition, a novel evolutionary algorithm called the biased Pareto optimization algorithm, is proposed that achieves a worst-case ratio of (1 −
10

Cai, Fuxi, and Wei D. Lu. "Epsilon-greedy strategy for online dictionary learning with realistic memristor array constraints." In 2017 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH). IEEE, 2017. http://dx.doi.org/10.1109/nanoarch.2017.8053730.
