Xiaohuan Pei (Terry) is a PhD student in Computer Science at the University of Sydney (USYD), supervised by Prof. Chang Xu. He collaborates with A/Prof. Tao Huang at SJTU, RA/Prof. Yanxi Li at NTU, A/Prof. Minjing Dong at CityU, Yuheng Shi at USYD, and researcher Pichao Wang at NVIDIA. He is a visiting graduate researcher at the University of California, Los Angeles (UCLA), hosted by Prof. Cho-Jui Hsieh.

Currently, he is working on pretraining foundation models from scratch at the billion-parameter scale, with a primary focus on recipes for Stage-1 (Alignment), Stage-2 (SFT), and a self-designed Stage-3. In parallel, he is also exploring foundation models for autonomous driving, with an emphasis on scalable pretraining and efficient inference.

Currently open to one research intern position working on efficiency for world models.


📝 Selected Work

ICLR 2026

Action-aware Dynamic Pruning for Efficient Vision-Language-Action Manipulation

Xiaohuan Pei*, Yuxing Chen*, Siyu Xu, Yunke Wang, Yuheng Shi, Chang Xu

[Paper] [Code]

ICLR 2026

Self-Distilled RoI Predictors for Fine-Grained MLLM Perception

Yuheng Shi, Xiaohuan Pei, Minjing Dong, Chang Xu

[Paper] [Code]

ICLR 2026

Light Future-aware Masking for Vision-Language Inference

Xiaohuan Pei, Tao Huang, Yanxiang Ma, Chang Xu

[Paper] [Code]

AAAI 2025

EfficientVMamba: Atrous Selective Scan for Light Weight Visual Mamba

Xiaohuan Pei, Tao Huang, Chang Xu

Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2025

[Paper] [Code] [Tutorial]

ECCV W 2024

LocalMamba: Visual State Space Model with Windowed Selective Scan

Tao Huang, Xiaohuan Pei, Chang Xu

The European Conference on Computer Vision (ECCV), Workshop, 2024

[Paper] [Code]

🧑🏻‍💻 Preprints

Cross-Self KV Cache Pruning for Efficient Vision-Language Inference.
Xiaohuan Pei, Tao Huang, Chang Xu.
arXiv preprint [arXiv:2412.04652].

GPT self-supervision for a better data annotator.
Xiaohuan Pei, Yanxi Li, Chang Xu.
arXiv preprint [arXiv:2306.04349] (2023).

Text-driven Neural Architecture Embeddings and Retrieval.
Xiaohuan Pei, Yanxi Li, Minjing Dong, Chang Xu.

🧑🏻‍💻 Academic Publications

Action-aware Dynamic Pruning for Efficient Vision-Language-Action Manipulation.
Xiaohuan Pei, Yuxing Chen, Siyu Xu, Yunke Wang, Yuheng Shi, Chang Xu.
The Fourteenth International Conference on Learning Representations (ICLR 2026). (Core Rank A*)

Catching the details: self-distilled ROI predictors for fine-grained Vision-Language-Model perception.
Yuheng Shi, Xiaohuan Pei, Minjing Dong, Chang Xu.
The Fourteenth International Conference on Learning Representations (ICLR 2026). (Core Rank A*)

Rethinking Causal Mask Attention for Vision-Language Inference.
Xiaohuan Pei, Tao Huang, Yanxiang Ma, Chang Xu.
The Fourteenth International Conference on Learning Representations (ICLR 2026). (Core Rank A*)

EfficientVMamba: Atrous selective scan for light weight visual Mamba.
Xiaohuan Pei, Tao Huang, Chang Xu.
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2025). (Core Rank A*)

LocalMamba: Visual state space model with windowed selective scan.
Tao Huang, Xiaohuan Pei, Chang Xu.
European Conference on Computer Vision (ECCV 2024), Workshop. (Core Rank A*)

Neural Architecture Retrieval.
Xiaohuan Pei, Yanxi Li, Minjing Dong, Chang Xu.
The Twelfth International Conference on Learning Representations (ICLR 2024). (Core Rank A*)

Contrastive code-comment pre-training.
Xiaohuan Pei, Daochang Liu, Qian Luo, Chang Xu.
IEEE International Conference on Data Mining (ICDM 2022). (Core Rank A*)

Self-attention gated cognitive diagnosis for faster adaptive educational assessments.
Xiaohuan Pei, Shuo Yang, Jiajun Huang, Chang Xu.
IEEE International Conference on Data Mining (ICDM 2022). (Core Rank A*)

TCNAS: Transformer Architecture Evolving in Clone Detection.
Hongyan Xu, Xiaohuan Pei, Shan You, Chang Xu.
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2024). (Core Rank A)

📖 Teaching

  • 2024, Guest Lecture, Artificial Intelligence, The University of Sydney

  • 2023, 2025, Tutor for COMP5329 Deep Learning, The University of Sydney

🎖 Awards

  • ICDM Best Student Paper Award
  • Two Full Scholarship Awards
  • Outstanding Graduate
  • National Second Prize in Mathematics Competition
  • Provincial Prize in C++ Programming Competition

🌟 Funding Grants

  • The National Computational Infrastructure (NCI) Adapter Scheme, Australia

Services

  • Reviewer for TPAMI, ICML, NeurIPS, ICLR, CVPR, ICCV, KDD, ICDM.

Contact: xiaohuan.pei at sydney.edu.au, terrypei123 at gmail.com