|
Lanxiang Hu
I'm a PhD candidate at UCSD, fortunately advised by Prof. Hao Zhang and Prof. Tajana Šimunić Rosing. My research interest is in building efficient and reliable AI models and systems.
Previously, I was a research intern at Snowflake AI Research, where I worked on agents and efficient inference. Before joining UCSD, I spent a wonderful time working as a visiting research intern with Prof. Song Han. I completed my undergraduate degree at UC Berkeley with majors in CS and Physics.
My work focuses on efficient AI and AI system evaluation. Some of my projects are highlighted below.
Email /
Scholar /
Twitter /
Linkedin /
Github
|
News
- [2026/05] Our papers on Jacobi Forcing, d3LLM, and AMA-Bench are accepted to ICML 2026.
- [2026/04] I am honored to be selected as one of the 2026 ML and Systems Rising Stars!
- [2026/02] We are releasing VideoScience-Bench, a benchmark for evaluating video models' scientific reasoning and world modeling capabilities.
- [2026/01] Our papers on Lmgame Bench and Stronger-MAS are accepted to ICLR 2026.
- [2025/12] Jacobi Forcing, our new method for training native causal parallel decoders, is released! Try our models for AR quality with up to 4.5x speedup.
- [2025/10] Our MAS training framework PettingLLMs (Stronger-MAS) is released! Try it out if you are working on agents!
- [2025/09] I wrapped up my internship at Snowflake AI Research. Go Snowflakes!
- [2025/06] Lmgame Bench now supports model evaluation both with and without a gaming harness.
- [2025/04] Our latest Lmgame Bench leaderboard and GamingAgent are now live! An exciting milestone toward evaluating multi-turn interactive VLM agents.
- [2025/01] Our papers on Game Arena and LongPack are accepted to ICLR 2025.
- [2024/06] I joined Snowflake AI Research for a summer internship.
- [2024/05] Our papers on CLLMs and OSD are accepted to ICML 2024.
- [2024/03] A new family of parallel decoders, CLLMs, is released. Try our codebase for 2-3x inference speedup!
|
|
* denotes equal contribution.
|
|
Fast and Accurate Causal Parallel Decoding using Jacobi Forcing
Lanxiang Hu*, Siqi Kou*, Yichao Fu, Samyam Rajbhandari, Tajana Rosing, Yuxiong He, Zhijie Deng, Hao Zhang
ICML, 2026
arxiv /
code /
website /
Jacobi Forcing trains causal parallel decoders that generate high-quality drafts conditioned on noisy context. On coding and math tasks, Jacobi Forcing achieves up to 4.5x speedup with minimal to no performance loss.
|
|
Lmgame-Bench: How Good are LLMs at Playing Games?
Lanxiang Hu*, Mingjia Huo*, Yuxuan Zhang†, Haoyang Yu†, Eric P. Xing, Ion Stoica, Tajana Rosing, Haojian Jin, Hao Zhang
ICLR, 2026
arxiv /
code /
website /
We introduce Lmgame-Bench, which evaluates the latest large models with games and addresses evaluation challenges by providing scaffolds. We present a quantitative analysis of the relationship between models' gaming performance and their results on existing benchmarks.
|
|
Stronger-MAS: Multi-Agent Reinforcement Learning for Collaborative LLMs
Yujie Zhao, Lanxiang Hu, Yang Wang, Minmin Hou, Hao Zhang, Ke Ding, Jishen Zhao
ICLR, 2026
arxiv /
code /
website /
We introduce AT-GRPO, an on-policy RL algorithm, together with a training system for multi-agent systems. Stronger-MAS boosts planning task accuracy from 14.0–47.0% with a single-agent RL baseline to 96.0–99.5%. It also improves reasoning performance, with an average gain of 7.62% on coding and 17.93% on math.
|
|
GameArena: Evaluating LLM Reasoning through Live Computer Games
Lanxiang Hu*, Qiyu Li*, Anze Xie*, Nan Jiang, Ion Stoica, Haojian Jin, Hao Zhang
ICLR, 2025
arxiv /
code /
website /
We design and build an incentivized, dynamic benchmark to evaluate AI reasoning abilities extending beyond math and coding.
|
|
Scaling Long Context Training Data by Long-Distance Referrals
Yonghao Zhuang*, Lanxiang Hu*, Longfei Yun, Souvik Kundu, Zhengzhong Liu, Eric P. Xing, Hao Zhang
ICLR, 2025
arxiv /
We show that long-distance referrals are important to long-context training, and design a data pipeline to scale up the construction of such data.
|
|
TrimLLM: Progressive Layer Dropping for Domain-Specific LLMs
Lanxiang Hu, Tajana Rosing, Hao Zhang
ACL, 2025
arxiv /
code /
We introduce an algorithm that progressively prunes MHA and MLP layers during domain-specific SFT, achieving up to 5.7x speedup and 60% less memory consumption compared with state-of-the-art model compression algorithms.
|
|
CLLMs: Consistency Large Language Models
Siqi Kou*, Lanxiang Hu*, Zhezhi He, Zhijie Deng, Hao Zhang
ICML, 2024
arxiv /
code /
website /
We show LLMs can be trained to operate as highly efficient parallel decoders, achieving 2.4x to 3.4x speedup across a variety of benchmarks.
|
|
Online Speculative Decoding
Xiaoxuan Liu, Lanxiang Hu, Peter Bailis, Ion Stoica, Zhijie Deng, Alvin Cheung, Hao Zhang
ICML, 2024
arxiv /
code /
We introduce an online speculative decoding algorithm (OSD) with improved responsiveness, speculation accuracy, and compatibility with LLM serving systems.
|
Academic Services
Conference reviewer: ICML, ICLR, NeurIPS, COLM, ACL, ECCV.
|
|