Wenhao Zhan


I am a Ph.D. student at Princeton University, advised by Professors Jason D. Lee and Yuxin Chen.
Before that, I received my Bachelor's degree in Electronic Engineering from Tsinghua University.

Office: Friend Center 306, Princeton, NJ.
Email: wenhao.zhan@princeton.edu
Google Scholar

Research

My research interests include

  • Reinforcement Learning

  • Statistics

Publications

(* = equal contribution, + = equal contribution and random order)

  1. W. Zhan, S. Fujimoto, Z. Zhu, J. D. Lee, D. R. Jiang, Y. Efroni, "Exploiting Structure in Offline Multi-Agent RL: The Benefits of Low Interaction Rank", Preprint.

  2. A. Huang, W. Zhan, T. Xie, J. D. Lee, W. Sun, A. Krishnamurthy, D. J. Foster, "Correcting the Mythos of KL-Regularization: Direct Alignment without Overoptimization via Chi-squared Preference Optimization", Preprint.

  3. Z. Gao, W. Zhan, J. D. Chang, G. Swamy, K. Brantley, J. D. Lee, W. Sun, "Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF", Preprint.

  4. J. D. Chang*, W. Zhan*, O. Oertell, K. Brantley, D. Misra, J. D. Lee, W. Sun, "Dataset Reset Policy Optimization for RLHF", Preprint.

  5. Z. Gao, J. D. Chang, W. Zhan, O. Oertell, G. Swamy, K. Brantley, T. Joachims, J. A. Bagnell, J. D. Lee, W. Sun, "REBEL: Reinforcement Learning via Regressing Relative Rewards", NeurIPS 2024.

  6. Z. Zhang, W. Zhan, Y. Chen, S. S. Du, J. D. Lee, "Optimal Multi-Distribution Learning", COLT 2024.

  7. W. Zhan, M. Uehara, W. Sun, J. D. Lee, "Provable Reward-Agnostic Preference-Based Reinforcement Learning", ICLR 2024 Spotlight.

  8. W. Zhan*, M. Uehara*, N. Kallus, J. D. Lee, W. Sun, "Provable Offline Preference-Based Reinforcement Learning", ICLR 2024 Spotlight.

  9. Y. Zhao+, W. Zhan+, X. Hu+, H. Leung, F. Farnia, W. Sun, J. D. Lee, "Provably Efficient CVaR RL in Low-rank MDPs", ICLR 2024.

  10. G. Li*, W. Zhan*, J. D. Lee, Y. Chi, Y. Chen, "Reward-agnostic Fine-tuning: Provable Statistical Benefits of Hybrid Reinforcement Learning", NeurIPS 2023.

  11. W. Zhan*, S. Cen*, B. Huang, Y. Chen, J. D. Lee, Y. Chi, "Policy Mirror Descent for Regularized Reinforcement Learning: A Generalized Framework with Linear Convergence", SIAM Journal on Optimization, 2023.

  12. W. Zhan, M. Uehara, W. Sun, J. D. Lee, "PAC Reinforcement Learning for Predictive State Representations", ICLR 2023.

  13. W. Zhan, J. D. Lee, Z. Yang, "Decentralized Optimistic Hyperpolicy Mirror Descent: Provably No-Regret Learning in Markov Games", ICLR 2023.

  14. W. Zhan, B. Huang, A. Huang, N. Jiang, J. D. Lee, "Offline Reinforcement Learning with Realizability and Single-policy Concentrability", COLT 2022.

  15. C. Z. Lee, L. P. Barnes, W. Zhan, A. Özgür, "Over-the-Air Statistical Estimation of Sparse Models", GLOBECOM 2021.

  16. W. Zhan, H. Tang, J. Wang, "Delay Optimal Cross-Layer Scheduling Over Markov Channels with Power Constraint", BMSB 2020.

Work Experience

Meta

Research Intern
Jun 2024 – Sep 2024
Efficient Multi-Agent Offline Reinforcement Learning

Teaching

  • Spring 2024: Teaching Assistant for Foundations of Reinforcement Learning (Princeton, Instructor: Prof. Chi Jin).

  • Fall 2022: Teaching Assistant for Theory of Weakly Supervised Learning (Princeton, Instructor: Prof. Jason D. Lee).

Honors

  • 2024 Award for Excellence from Princeton SEAS

  • Honorable Mention, 2023 Jane Street Graduate Research Fellowship