EWRL 2024 Workshop Submissions
Recurrent Natural Policy Gradient for POMDPs
Semih Cayci, Atilla Eryilmaz
Published: 01 Aug 2024, Last Modified: 09 Oct 2024
EWRL17
Readers:
Everyone
Combining Automated Optimisation of Hyperparameters and Reward Shape
Julian Dierkes, Emma Cramer, Sebastian Trimpe, Holger Hoos
Published: 01 Aug 2024, Last Modified: 09 Oct 2024
EWRL17
Readers:
Everyone
Countering Reward Over-optimization in LLM with Demonstration-Guided Reinforcement Learning
Mathieu Rita, Florian Strub, Rahma Chaabouni, Paul Michel, Emmanuel Dupoux, Olivier Pietquin
Published: 01 Aug 2024, Last Modified: 09 Oct 2024
EWRL17
Readers:
Everyone
Robust Chain of Thoughts Preference Optimization
Eugene Choi, Arash Ahmadian, Olivier Pietquin, Matthieu Geist, Mohammad Gheshlaghi Azar
Published: 01 Aug 2024, Last Modified: 09 Oct 2024
EWRL17
Readers:
Everyone
Stochastic Q-learning for Large Discrete Action Spaces
Fares Fourati, Vaneet Aggarwal, Mohamed-Slim Alouini
Published: 01 Aug 2024, Last Modified: 09 Oct 2024
EWRL17
Readers:
Everyone
Environment Complexity and Nash Equilibria in a Sequential Social Dilemma
Mustafa Yasir, Andrew Howes, Vasilios Mavroudis, Chris Hicks
Published: 01 Aug 2024, Last Modified: 09 Oct 2024
EWRL17
Readers:
Everyone
Private Online Learning in Adversarial MDPs: Full-Information and Bandit
Shaojie Bai, Lanting Zeng, Chengcheng Zhao, Xiaoming Duan, Mohammad Sadegh Talebi, Peng Cheng, Jiming Chen
Published: 01 Aug 2024, Last Modified: 09 Oct 2024
EWRL17
Readers:
Everyone
Tractable Offline Learning of Regular Decision Processes
Ahana Deb, Roberto Cipollone, Anders Jonsson, Alessandro Ronca, Mohammad Sadegh Talebi
Published: 01 Aug 2024, Last Modified: 09 Oct 2024
EWRL17
Readers:
Everyone
Using a Learned Policy Basis to Optimally Solve Reward Machines
Guillermo Infante, David Kuric, Vicenç Gómez, Anders Jonsson, Herke van Hoof
Published: 01 Aug 2024, Last Modified: 09 Oct 2024
EWRL17
Readers:
Everyone
Dual-Force: Enhanced Offline Diversity Maximization under Imitation Constraints
Pavel Kolev, Marin Vlastelica, Georg Martius
Published: 01 Aug 2024, Last Modified: 09 Oct 2024
EWRL17
Readers:
Everyone
Denoised Predictive Imagination: An Information-theoretic approach for learning World Models
Vedant Dave, Elmar Rueckert
Published: 01 Aug 2024, Last Modified: 09 Oct 2024
EWRL17
Readers:
Everyone
Viability of Future Actions: Robust Reinforcement Learning via Entropy Regularization
Pierre-François Massiani, Alexander von Rohr, Lukas Haverbeck, Sebastian Trimpe
Published: 01 Aug 2024, Last Modified: 09 Oct 2024
EWRL17
Readers:
Everyone
Hadamard Representations: Augmenting Hyperbolic Tangents in RL
Jacob Eeuwe Kooi, Mark Hoogendoorn, Vincent Francois-Lavet
Published: 01 Aug 2024, Last Modified: 09 Oct 2024
EWRL17
Readers:
Everyone
Trust the Model Where It Trusts Itself - Model-Based Actor-Critic with Uncertainty-Aware Rollout Adaption
Bernd Frauenknecht, Artur Eisele, Devdutt Subhasish, Friedrich Solowjow, Sebastian Trimpe
Published: 01 Aug 2024, Last Modified: 09 Oct 2024
EWRL17
Readers:
Everyone
Can Decentralized Q-learning learn to collude?
Janusz M Meylahn
Published: 01 Aug 2024, Last Modified: 09 Oct 2024
EWRL17
Readers:
Everyone
Contextualized Hybrid Ensemble Q-learning: Learning Fast with Control Priors
Emma Cramer, Bernd Frauenknecht, Ramil Sabirov, Sebastian Trimpe
Published: 01 Aug 2024, Last Modified: 09 Oct 2024
EWRL17
Readers:
Everyone
Offline Reinforcement Learning with Pessimistic Value Priors
Filippo Valdettaro, Aldo A. Faisal
Published: 01 Aug 2024, Last Modified: 09 Oct 2024
EWRL17
Readers:
Everyone
Sample-efficient reinforcement learning for environments with rare high-reward states
Daniel G Mastropietro, Urtzi Ayesta, Matthieu Jonckheere
Published: 01 Aug 2024, Last Modified: 09 Oct 2024
EWRL17
Readers:
Everyone
Understanding the Gaps in Satisficing Bandits
Chloé Rouyer, Ronald Ortner, Peter Auer
Published: 01 Aug 2024, Last Modified: 09 Oct 2024
EWRL17
Readers:
Everyone
Linear Bandits with Memory
Giulia Clerici, Pierre Laforgue, Nicolò Cesa-Bianchi
Published: 01 Aug 2024, Last Modified: 09 Oct 2024
EWRL17
Readers:
Everyone
Directed Exploration in Reinforcement Learning from Linear Temporal Logic
Marco Bagatella, Andreas Krause, Georg Martius
Published: 01 Aug 2024, Last Modified: 09 Oct 2024
EWRL17
Readers:
Everyone
Revisiting On-Policy Deep Reinforcement Learning
Mahdi Kallel, Samuele Tosatto, Carlo D'Eramo
Published: 01 Aug 2024, Last Modified: 09 Oct 2024
EWRL17
Readers:
Everyone
Controller Synthesis from Deep Reinforcement Learning Policies
Florent Delgrange, Guy Avni, Anna Lukina, Christian Schilling, Ann Nowe, Guillermo Perez
Published: 01 Aug 2024, Last Modified: 09 Oct 2024
EWRL17
Readers:
Everyone
A Minimax-Bayes Approach to Ad Hoc Teamwork
Victor Villin, Christos Dimitrakakis, Thomas Kleine Buening
Published: 01 Aug 2024, Last Modified: 09 Oct 2024
EWRL17
Readers:
Everyone
Learning to Explore with Lagrangians for Bandits under Unknown Constraints
Udvas Das, Debabrota Basu
Published: 01 Aug 2024, Last Modified: 09 Oct 2024
EWRL17
Readers:
Everyone