Workshop
Mon Dec 13 08:15 AM -- 06:00 PM (PST)
Deep Reinforcement Learning
In recent years, the use of deep neural networks as function approximators has enabled researchers to extend reinforcement learning techniques to solve increasingly complex control tasks. The emerging field of deep reinforcement learning has led to remarkable empirical results in rich and varied domains such as robotics, strategy games, and multiagent interactions. This workshop will bring together researchers working at the intersection of deep learning and reinforcement learning, and it will help interested researchers outside the field gain perspective on the current state of the art and potential directions for future contributions.
Welcome and Introduction (Welcoming Notes) | |
Invited Talk: Anna Harutyunyan (Talk) | |
Anna Harutyunyan Talk Q&A (Q&A) | |
Implicit Behavioral Cloning (Oral) | |
Implicit Behavioral Cloning Q&A (Q&A) | |
DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization (Oral) | |
DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization Q&A (Q&A) | |
HyAR: Addressing Discrete-Continuous Action Reinforcement Learning via Hybrid Action Representation (Oral) | |
HyAR: Addressing Discrete-Continuous Action Reinforcement Learning via Hybrid Action Representation Q&A (Q&A) | |
Benchmarking the Spectrum of Agent Capabilities (Oral) | |
Benchmarking the Spectrum of Agent Capabilities Q&A (Q&A) | |
Invited Talk: Laura Schulz (Talk) | |
Laura Schulz Talk Q&A (Q&A) | |
Break | |
Opinion Contributed Talk: Wilka Carvalho (Talk) | |
Wilka Carvalho Talk Q&A (Q&A) | |
Adaptive Scheduling of Data Augmentation for Deep Reinforcement Learning (Oral) | |
Adaptive Scheduling of Data Augmentation for Deep Reinforcement Learning Q&A (Q&A) | |
Offline Meta-Reinforcement Learning with Online Self-Supervision (Oral) | |
Offline Meta-Reinforcement Learning with Online Self-Supervision Q&A (Q&A) | |
Invited Talk: George Konidaris (Talk) | |
George Konidaris Talk Q&A (Q&A) | |
Poster Session (in Gather Town) (Poster Session) | |
Opinion Contributed Talk: Sergey Levine (Talk) | |
Sergey Levine Talk Q&A (Q&A) | |
Panel Discussion 1 (Panel Discussion) | |
Invited Talk: Dale Schuurmans (Talk) | |
Dale Schuurmans Talk Q&A (Q&A) | |
Break | |
Invited Talk: Karol Hausman (Talk) | |
Karol Hausman Talk Q&A (Q&A) | |
NeurIPS RL Competitions Results Presentations (Presentations) | |
Invited Talk: Kenji Doya (Talk) | |
Kenji Doya Talk Q&A (Q&A) | |
Panel Discussion 2 (Panel Discussion) | |
Learning a Subspace of Policies for Online Adaptation in Reinforcement Learning (Poster) | |
Communication-Efficient Actor-Critic Methods for Homogeneous Markov Games (Poster) | |
Implicit Behavioral Cloning (Poster) | |
Block Contextual MDPs for Continual Learning (Poster) | |
Mismatched No More: Joint Model-Policy Optimization for Model-Based RL (Poster) | |
Hybrid Imitative Planning with Geometric and Predictive Costs in Offroad Environments (Poster) | |
Continuous Control With Ensemble Deep Deterministic Policy Gradients (Poster) | |
Graph Backup: Data Efficient Backup Exploiting Markovian Data (Poster) | |
Investigation of Independent Reinforcement Learning Algorithms in Multi-Agent Environments (Poster) | |
Policy Optimization via Optimal Policy Evaluation (Poster) | |
Math Programming based Reinforcement Learning for Multi-Echelon Inventory Management (Poster) | |
Distributional Decision Transformer for Offline Hindsight Information Matching (Poster) | |
Learning Transferable Motor Skills with Hierarchical Latent Mixture Policies (Poster) | |
Exponential Family Model-Based Reinforcement Learning via Score Matching (Poster) | |
From One Hand to Multiple Hands: Imitation Learning for Dexterous Manipulation from Single-Camera Teleoperation (Poster) | |
StarCraft II Unplugged: Large Scale Offline Reinforcement Learning (Poster) | |
Wish you were here: Hindsight Goal Selection for long-horizon dexterous manipulation (Poster) | |
Meta Arcade: A Configurable Environment Suite for Deep Reinforcement Learning and Meta-Learning (Poster) | |
Reward Uncertainty for Exploration in Preference-based Reinforcement Learning (Poster) | |
Embodiment perspective of reward definition for behavioural homeostasis (Poster) | |
Grounding Aleatoric Uncertainty in Unsupervised Environment Design (Poster) | |
Learning compositional tasks from language instructions (Poster) | |
Status-quo policy gradient in Multi-Agent Reinforcement Learning (Poster) | |
Adaptive Scheduling of Data Augmentation for Deep Reinforcement Learning (Poster) | |
Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon Reasoning (Poster) | |
Follow the Object: Curriculum Learning for Manipulation Tasks with Imagined Goals (Poster) | |
Understanding the Effects of Dataset Composition on Offline Reinforcement Learning (Poster) | |
Covariate Shift of Latent Confounders in Imitation and Reinforcement Learning (Poster) | |
Robust Robotic Control from Pixels using Contrastive Recurrent State-Space Models (Poster) | |
Strength Through Diversity: Robust Behavior Learning via Mixture Policies (Poster) | |
OVD-Explorer: A General Information-theoretic Exploration Approach for Reinforcement Learning (Poster) | |
Self-Imitation Learning from Demonstrations (Poster) | |
Fast and Data-Efficient Training of Rainbow: an Experimental Study on Atari (Poster) | |
The Information Geometry of Unsupervised Reinforcement Learning (Poster) | |
Targeted Environment Design from Offline Data (Poster) | |
GrASP: Gradient-Based Affordance Selection for Planning (Poster) | |
The Reflective Explorer: Online Meta-Exploration from Offline Data in Realistic Robotic Tasks (Poster) | |
Transfer RL across Observation Feature Spaces via Model-Based Regularization (Poster) | |
Count-Based Temperature Scheduling for Maximum Entropy Reinforcement Learning (Poster) | |
Augmenting Reinforcement Learning with Behavior Primitives for Diverse Manipulation Tasks (Poster) | |
Stability Analysis in Mixed-Autonomous Traffic with Deep Reinforcement Learning (Poster) | |
Target Entropy Annealing for Discrete Soft Actor-Critic (Poster) | |
Modern Hopfield Networks for Return Decomposition for Delayed Rewards (Poster) | |
CIC: Contrastive Intrinsic Control for Unsupervised Skill Discovery (Poster) | |
General Characterization of Agents by States they Visit (Poster) | |
Learning Efficient Multi-Agent Cooperative Visual Exploration (Poster) | |
Task-driven Discovery of Perceptual Schemas for Generalization in Reinforcement Learning (Poster) | |
Attention-based Partial Decoupling of Policy and Value for Generalization in Reinforcement Learning (Poster) | |
Off-Policy Correction For Multi-Agent Reinforcement Learning (Poster) | |
A Family of Cognitively Realistic Parsing Environments for Deep Reinforcement Learning (Poster) | |
Long-Term Credit Assignment via Model-based Temporal Shortcuts (Poster) | |
Variance-Seeking Meta-Exploration to Handle Out-of-Distribution Tasks (Poster) | |
C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks (Poster) | |
An Empirical Study of Non-Uniform Sampling in Off-Policy Reinforcement Learning for Continuous Control (Poster) | |
BLAST: Latent Dynamics Models from Bootstrapping (Poster) | |
Neighborhood Mixup Experience Replay: Local Convex Interpolation for Improved Sample Efficiency in Continuous Control Tasks (Poster) | |
Learning Robust Dynamics through Variational Sparse Gating (Poster) | |
A Closer Look at Gradient Estimators with Reinforcement Learning as Inference (Poster) | |
A Framework for Efficient Robotic Manipulation (Poster) | |
Deep RePReL--Combining Planning and Deep RL for acting in relational domains (Poster) | |
Learning from demonstrations with SACR2: Soft Actor-Critic with Reward Relabeling (Poster) | |
Accelerating Robotic Reinforcement Learning via Parameterized Action Primitives (Poster) | |
Generalisation in Lifelong Reinforcement Learning through Logical Composition (Poster) | |
A Modern Self-Referential Weight Matrix That Learns to Modify Itself (Poster) | |
DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization (Poster) | |
Should I Run Offline Reinforcement Learning or Behavioral Cloning? (Poster) | |
Learning Value Functions from Undirected State-only Experience (Poster) | |
HyAR: Addressing Discrete-Continuous Action Reinforcement Learning via Hybrid Action Representation (Poster) | |
Large Scale Coordination Transfer for Cooperative Multi-Agent Reinforcement Learning (Poster) | |
ShinRL: A Library for Evaluating RL Algorithms from Theoretical and Practical Perspectives (Poster) | |
Deep Reinforcement Learning Explanation via Model Transforms (Poster) | |
CoMPS: Continual Meta Policy Search (Poster) | |
Maximum Entropy Model-based Reinforcement Learning (Poster) | |
Return Dispersion as an Estimator of Learning Potential for Prioritized Level Replay (Poster) | |
TempoRL: Temporal Priors for Exploration in Off-Policy Reinforcement Learning (Poster) | |
Recurrent Off-policy Baselines for Memory-based Continuous Control (Poster) | |
Offline Policy Selection under Uncertainty (Poster) | |
Learning Vision-Guided Quadrupedal Locomotion End-to-End with Cross-Modal Transformers (Poster) | |
A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning (Poster) | |
That Escalated Quickly: Compounding Complexity by Editing Levels at the Frontier of Agent Capabilities (Poster) | |
A Meta-Gradient Approach to Learning Cooperative Multi-Agent Communication Topology (Poster) | |
Towards Automatic Actor-Critic Solutions to Continuous Control (Poster) | |
Continuous Control with Action Quantization from Demonstrations (Poster) | |
Bayesian Exploration for Lifelong Reinforcement Learning (Poster) | |
Temporal-Difference Value Estimation via Uncertainty-Guided Soft Updates (Poster) | |
Task-Induced Representation Learning (Poster) | |
Cross-Domain Imitation Learning via Optimal Transport (Poster) | |
Learning Parameterized Task Structure for Generalization to Unseen Entities (Poster) | |
Mean-Variance Efficient Reinforcement Learning by Expected Quadratic Utility Maximization (Poster) | |
Expert Human-Level Driving in Gran Turismo Sport Using Deep Reinforcement Learning with Image-based Representation (Poster) | |
Hierarchical Few-Shot Imitation with Skill Transition Models (Poster) | |
Offline Meta-Reinforcement Learning with Online Self-Supervision (Poster) | |
Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning (Poster) | |
Learning from Guided Play: A Scheduled Hierarchical Approach for Improving Exploration in Adversarial Imitation Learning (Poster) | |
Beyond Target Networks: Improving Deep $Q$-learning with Functional Regularization (Poster) | |
Dynamic Mirror Descent based Model Predictive Control for Accelerating Robot Learning (Poster) | |
On the Transferability of Deep-Q Networks (Poster) | |
Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning (Poster) | |
Transferring Dexterous Manipulation from GPU Simulation to a Remote Real-World Trifinger (Poster) | |
Who Is the Strongest Enemy? Towards Optimal and Efficient Evasion Attacks in Deep RL (Poster) | |
Introducing Symmetries to Black Box Meta Reinforcement Learning (Poster) | |
Component Transfer Learning for Deep RL Based on Abstract Representations (Poster) | |
Unsupervised Learning of Temporal Abstractions using Slot-based Transformers (Poster) | |
Skill Preferences: Learning to Extract and Execute Robotic Skills from Human Feedback (Poster) | |
Behavior Predictive Representations for Generalization in Reinforcement Learning (Poster) | |
Policy Gradients Incorporating the Future (Poster) | |
Offline Reinforcement Learning with In-sample Q-Learning (Poster) | |
A Graph Policy Network Approach for Volt-Var Control in Power Distribution Systems (Poster) | |
Skill-based Meta-Reinforcement Learning (Poster) | |
Exploring through Random Curiosity with General Value Functions (Poster) | |
URLB: Unsupervised Reinforcement Learning Benchmark (Poster) | |
Learning Action Translator for Meta Reinforcement Learning on Sparse-Reward Tasks (Poster) | |
Automatic Curricula via Expert Demonstrations (Poster) | |
Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning (Poster) | |
The Effects of Reward Misspecification: Mapping and Mitigating Misaligned Models (Poster) | |
OstrichRL: A Musculoskeletal Ostrich Simulation to Study Bio-mechanical Locomotion (Poster) | |
Discriminator Augmented Model-Based Reinforcement Learning (Poster) | |
Accelerated Deep Reinforcement Learning of Terrain-Adaptive Locomotion Skills (Poster) | |
Latent Geodesics of Model Dynamics for Offline Reinforcement Learning (Poster) | |
Wasserstein Distance Maximizing Intrinsic Control (Poster) | |
SURF: Semi-supervised Reward Learning with Data Augmentation for Feedback-efficient Preference-based Reinforcement Learning (Poster) | |
Continuously Discovering Novel Strategies via Reward-Switching Policy Optimization (Poster) | |
DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with Prototypical Representations (Poster) | |
Benchmark for Out-of-Distribution Detection in Deep Reinforcement Learning (Poster) | |
Improving Actor-Critic Reinforcement Learning via Hamiltonian Monte Carlo Method (Poster) | |
Implicitly Regularized RL with Implicit Q-values (Poster) | |
Imitation Learning from Pixel Observations for Continuous Control (Poster) | |
Vision-Guided Quadrupedal Locomotion in the Wild with Multi-Modal Delay Randomization (Poster) | |
Learning Two-Player Mixture Markov Games: Kernel Function Approximation and Correlated Equilibrium (Poster) | |
Fast Inference and Transfer of Compositional Task for Few-shot Task Generalization (Poster) | |
What Would the Expert $do(\cdot)$?: Causal Imitation Learning (Poster) | |
GPU-Podracer: Scalable and Elastic Library for Cloud-Native Deep Reinforcement Learning (Poster) | |
Hindsight Foresight Relabeling for Meta-Reinforcement Learning (Poster) | |
No DICE: An Investigation of the Bias-Variance Tradeoff in Meta-Gradients (Poster) | |
MHER: Model-based Hindsight Experience Replay (Poster) | |
Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations (Poster) | |
Lifting the veil on hyper-parameters for value-based deep reinforcement learning (Poster) | |
Understanding and Preventing Capacity Loss in Reinforcement Learning (Poster) | |
Interactive Robust Policy Optimization for Multi-Agent Reinforcement Learning (Poster) | |
Benchmarking the Spectrum of Agent Capabilities (Poster) | |
Behavioral Priors and Dynamics Models: Improving Performance and Domain Transfer in Offline RL (Poster) | |
Imitation Learning from Observations under Transition Model Disparity (Poster) | |
PFPN: Continuous Control of Physically Simulated Characters using Particle Filtering Policy Network (Poster) | |
On Using Hamiltonian Monte Carlo Sampling for Reinforcement Learning Problems in High-dimension (Poster) | |
Plan Better Amid Conservatism: Offline Multi-Agent Reinforcement Learning with Actor Rectification (Poster) | |
Look Closer: Bridging Egocentric and Third-Person Views with Transformers for Robotic Manipulation (Poster) | |
Data Sharing without Rewards in Multi-Task Offline Reinforcement Learning (Poster) | |
TransDreamer: Reinforcement Learning with Transformer World Models (Poster) | |