(α-β) indicates alphabetical ordering; * indicates equal contribution.
-
Corruption-Robust Linear Bandits: Minimax Optimality and Gap-Dependent Misspecification
(α-β) Haolin Liu, Artin Tajdini, Andrew Wagenmaker, Chen-Yu Wei
NeurIPS 2024
[arXiv]
-
Beating Adversarial Low-Rank MDPs with Unknown Transition and Bandit Feedback
(α-β) Haolin Liu, Zakaria Mhammedi, Chen-Yu Wei, Julian Zimmert
NeurIPS 2024
[arXiv]
-
How Does Variance Shape the Regret in Contextual Bandits?
(α-β) Zeyu Jia, Jian Qian, Alexander Rakhlin, Chen-Yu Wei
NeurIPS 2024
[arXiv]
-
On Tractable Φ-Equilibria in Non-Concave Games
(α-β) Yang Cai, Constantinos Daskalakis, Haipeng Luo, Chen-Yu Wei, Weiqiang Zheng
NeurIPS 2024
[arXiv]
-
Offline Reinforcement Learning: Role of State Aggregation and Trajectory Data
(α-β) Zeyu Jia, Alexander Rakhlin, Ayush Sekhari, Chen-Yu Wei
COLT 2024
[arXiv] [video]
-
Near-Optimal Policy Optimization for Correlated Equilibrium in General-Sum Markov Games
(α-β) Yang Cai, Haipeng Luo, Chen-Yu Wei, Weiqiang Zheng
AISTATS 2024 (Oral)
[arXiv]
-
Towards Optimal Regret in Adversarial Linear MDPs with Bandit Feedback
(α-β) Haolin Liu, Chen-Yu Wei, Julian Zimmert
ICLR 2024 (Spotlight)
[arXiv]
-
Bypassing the Simulator: Near-Optimal Adversarial Linear Contextual Bandits
(α-β) Haolin Liu, Chen-Yu Wei, Julian Zimmert
NeurIPS 2023
[arXiv] [video]
-
Last-Iterate Convergent Policy Gradient Primal-Dual Methods for Constrained MDPs
Dongsheng Ding*, Chen-Yu Wei*, Kaiqing Zhang*, Alejandro Ribeiro
NeurIPS 2023
[arXiv] [video]
-
No-Regret Online Reinforcement Learning with Adversarial Losses and Transitions
Tiancheng Jin*, Junyan Liu*, Chloe Rouyer, William Chang, Chen-Yu Wei, Haipeng Luo
NeurIPS 2023
[arXiv] [video]
-
First- and Second-Order Bounds for Adversarial Linear Contextual Bandits
Julia Olkhovskaya, Jack Mayo, Tim van Erven, Gergely Neu, Chen-Yu Wei
NeurIPS 2023
[arXiv] [video]
-
Uncoupled and Convergent Learning in Two-Player Zero-Sum Markov Games
(α-β) Yang Cai, Haipeng Luo, Chen-Yu Wei, Weiqiang Zheng
NeurIPS 2023
[arXiv] [video]
-
A Blackbox Approach to Best of Both Worlds in Bandits and Beyond
(α-β) Christoph Dann, Chen-Yu Wei, Julian Zimmert
COLT 2023
[arXiv] [slides]
-
Best of Both Worlds Policy Optimization
(α-β) Christoph Dann, Chen-Yu Wei, Julian Zimmert
ICML 2023 (Long Talk)
[arXiv] [slides] [video]
-
Refined Regret for Adversarial MDPs with Linear Function Approximation
(α-β) Yan Dai, Haipeng Luo, Chen-Yu Wei, Julian Zimmert
ICML 2023
[arXiv] [slides] [video]
-
A Unified Algorithm for Stochastic Path Problems
(α-β) Christoph Dann, Chen-Yu Wei, Julian Zimmert
ALT 2023
[arXiv] [slides]
-
Independent Policy Gradient for Large-Scale Markov Potential Games: Sharper Rates, Function Approximation, and Game-Agnostic Convergence
Dongsheng Ding*, Chen-Yu Wei*, Kaiqing Zhang*, Mihailo Jovanovic
ICML 2022 (Long Talk)
[arXiv] [video]
-
Personalization Improves Privacy-Accuracy Tradeoffs in Federated Optimization
Alberto Bietti, Chen-Yu Wei, Miroslav Dudik, John Langford, Zhiwei Steven Wu
ICML 2022
[arXiv] [video]
-
Decentralized Cooperative Reinforcement Learning with Hierarchical Information Structure
Hsu Kao, Chen-Yu Wei, Vijay Subramanian
ALT 2022
[arXiv] [slides]
-
A Model Selection Approach for Corruption Robust Reinforcement Learning
Chen-Yu Wei, Christoph Dann, Julian Zimmert
ALT 2022 (Best Paper Award)
[arXiv] [slides]
-
Policy Optimization in Adversarial MDPs: Improved Exploration via Dilated Bonuses
Haipeng Luo*, Chen-Yu Wei*, Chung-Wei Lee
NeurIPS 2021
[arXiv] [slides] [slides] [video]
-
Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously
(α-β) Chung-Wei Lee, Haipeng Luo, Chen-Yu Wei, Mengxiao Zhang, Xiaojin Zhang
ICML 2021
[arXiv] [slides] [video]
-
Non-stationary Reinforcement Learning without Prior Knowledge: An Optimal Black-Box Approach
Chen-Yu Wei, Haipeng Luo
COLT 2021 (Best Paper Award)
[arXiv] [slides] [slides] [video]
-
Last-iterate Convergence of Decentralized Optimistic Gradient Descent/Ascent in Infinite-Horizon Competitive Markov Games
Chen-Yu Wei, Chung-Wei Lee*, Mengxiao Zhang*, Haipeng Luo
COLT 2021
[arXiv] [slides]
-
Impossible Tuning Made Possible: A New Expert Algorithm and Its Applications
(α-β) Liyu Chen, Haipeng Luo, Chen-Yu Wei
COLT 2021
[arXiv] [slides]
-
Minimax Regret for Stochastic Shortest Path with Adversarial Costs and Known Transition
Liyu Chen, Haipeng Luo, Chen-Yu Wei
COLT 2021
[arXiv] [slides] [video]
-
Learning Infinite-Horizon Average-Reward MDPs with Linear Function Approximation
Chen-Yu Wei, Mehdi Jafarnia-Jahromi, Haipeng Luo, Rahul Jain
AISTATS 2021
[arXiv] [slides] [video]
-
Linear Last-Iterate Convergence in Constrained Saddle-Point Optimization
Chen-Yu Wei, Chung-Wei Lee, Mengxiao Zhang, Haipeng Luo
ICLR 2021
[arXiv] [code] [slides] [video]
-
Adversarial Online Learning with Changing Action Sets: Efficient Algorithms with Approximate Regret Bounds
Ehsan Emamjomeh-Zadeh*, Chen-Yu Wei*, Haipeng Luo, David Kempe
ALT 2021
[arXiv] [slides] [video]
-
Bias No More: High-Probability Data-Dependent Regret Bounds for Adversarial Bandits and MDPs
(α-β) Chung-Wei Lee, Haipeng Luo, Chen-Yu Wei, Mengxiao Zhang
NeurIPS 2020 (Oral)
[arXiv] [slides] [video]
-
Federated Residual Learning
Chen-Yu Wei, Alekh Agarwal, John Langford
NeurIPS Workshop on Scalability, Privacy, and Security in Federated Learning 2020
[arXiv]
-
Taking a Hint: How to Leverage Loss Predictors in Contextual Bandits?
Chen-Yu Wei, Haipeng Luo, Alekh Agarwal
COLT 2020
[arXiv] [slides] [video]
-
Model-free Reinforcement Learning in Infinite-Horizon Average-Reward Markov Decision Processes
Chen-Yu Wei, Mehdi Jafarnia-Jahromi, Haipeng Luo, Hiteshi Sharma, Rahul Jain
ICML 2020
[arXiv] [code] [slides] [video]
-
Analyzing the Variance of Policy Gradient Estimators for the Linear-Quadratic Regulator
James Preiss*, Sebastien Arnold*, Chen-Yu Wei*, Marius Kloft
NeurIPS Workshop on Optimization Foundations for Reinforcement Learning 2019 [arXiv]
SoCal Machine Learning Symposium 2019 (Best Poster Award)
-
A New Algorithm for Non-Stationary Contextual Bandits: Efficient, Optimal, and Parameter-Free
(α-β) Yifang Chen, Chung-Wei Lee, Haipeng Luo, Chen-Yu Wei
COLT 2019
[arXiv] [a joint extended abstract with Auer, Gajane, and Ortner] [slides]
-
Improved Path-Length Regret Bounds for Bandits
(α-β) Sebastien Bubeck, Yuanzhi Li, Haipeng Luo, Chen-Yu Wei
COLT 2019
[arXiv] [slides]
-
Beating Stochastic and Adversarial Semi-Bandits Optimally and Simultaneously
Julian Zimmert, Haipeng Luo, Chen-Yu Wei
ICML 2019 (Long Talk)
[arXiv] [slides]
-
Bandit Multiclass Linear Classification: Efficient Algorithms for the Separable Case
(α-β) Alina Beygelzimer, David Pal, Balazs Szorenyi, Devanathan Thiruvenkatachari, Chen-Yu Wei, Chicheng Zhang
ICML 2019
[arXiv] [code] [slides]