Xiaohuan Pei (Terry) is a PhD student in Computer Science at the University of Sydney (USYD), supervised by Prof. Chang Xu. He collaborates with A/Prof. Tao Huang at SJTU, RA/Prof. Yanxi Li at NTU, A/Prof. Minjing Dong at CityU, Yuheng Shi at USYD and researcher Pichao Wang at NVIDIA. He is a visiting graduate researcher at the University of California, Los Angeles (UCLA), hosted by Prof. Cho-Jui Hsieh.
Currently, he is working on pretraining foundation models from scratch at the billion-parameter scale, with a primary focus on recipes for Stage-1 (Alignment), Stage-2 (SFT) and a custom-designed Stage-3. In parallel, he is also exploring foundation models for autonomous driving, with an emphasis on scalable pretraining and efficient inference.
Currently open to one research intern position working on efficiency for world models.
📝 Selected Work

Action-aware Dynamic Pruning for Efficient Vision-Language-Action Manipulation
Xiaohuan Pei*, Yuxing Chen*, Siyu Xu, Yunke Wang, Yuheng Shi, Chang Xu

Self-Distilled RoI Predictors for Fine-Grained MLLM Perception
Yuheng Shi, Xiaohuan Pei, Minjing Dong, Chang Xu

Light Future-aware Masking for Vision-Language Inference
Xiaohuan Pei, Tao Huang, Yanxiang Ma, Chang Xu

Cross-Self KV Cache Pruning for Efficient Vision-Language Inference
Xiaohuan Pei, Tao Huang, Chang Xu

EfficientVMamba: Atrous Selective Scan for Light Weight Visual Mamba
Xiaohuan Pei, Tao Huang, Chang Xu
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2025

LocalMamba: Visual State Space Model with Windowed Selective Scan
Tao Huang, Xiaohuan Pei, Chang Xu
The European Conference on Computer Vision (ECCV), Workshop, 2024
🧑🏻💻 Preprints
Cross-Self KV Cache Pruning for Efficient Vision-Language Inference.
Xiaohuan Pei, Tao Huang, Chang Xu.
arXiv preprint [arXiv:2412.04652].
GPT self-supervision for a better data annotator.
Xiaohuan Pei, Yanxi Li, Chang Xu.
arXiv preprint [arXiv:2306.04349] (2023).
Text-driven Neural Architecture Embeddings and Retrieval.
Xiaohuan Pei, Yanxi Li, Minjing Dong, Chang Xu.
🧑🏻💻 Academic Publications
Action-aware Dynamic Pruning for Efficient Vision-Language-Action Manipulation.
Xiaohuan Pei, Yuxing Chen, Siyu Xu, Yunke Wang, Yuheng Shi, Chang Xu
The Fourteenth International Conference on Learning Representations (ICLR 2026). (Core Rank A*)
Catching the details: self-distilled ROI predictors for fine-grained Vision-Language-Model perception.
Yuheng Shi, Xiaohuan Pei, Minjing Dong, Chang Xu.
The Fourteenth International Conference on Learning Representations (ICLR 2026). (Core Rank A*)
Rethinking Causal Mask Attention for Vision-Language Inference.
Xiaohuan Pei, Tao Huang, Yanxiang Ma, Chang Xu.
The Fourteenth International Conference on Learning Representations (ICLR 2026). (Core Rank A*)
EfficientVMamba: Atrous Selective Scan for Light Weight Visual Mamba.
Xiaohuan Pei, Tao Huang, Chang Xu.
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2025). (Core Rank A*)
LocalMamba: Visual State Space Model with Windowed Selective Scan.
Tao Huang, Xiaohuan Pei, Chang Xu.
European Conference on Computer Vision (ECCV 2024), Workshop. (Core Rank A*)
Neural Architecture Retrieval.
Xiaohuan Pei, Yanxi Li, Minjing Dong, Chang Xu.
The Twelfth International Conference on Learning Representations (ICLR 2024). (Core Rank A*)
Contrastive code-comment pre-training.
Xiaohuan Pei, Daochang Liu, Qian Luo, Chang Xu.
IEEE International Conference on Data Mining (ICDM 2022). (Core Rank A*)
Self-attention gated cognitive diagnosis for faster adaptive educational assessments.
Xiaohuan Pei, Shuo Yang, Jiajun Huang, Chang Xu.
IEEE International Conference on Data Mining (ICDM 2022). (Core Rank A*)
TCNAS: Transformer Architecture Evolving in Clone Detection.
Hongyan Xu, Xiaohuan Pei, Shan You, Chang Xu.
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2024). (Core Rank A)
📖 Teaching
- 2024, Guest Lecture, Artificial Intelligence, The University of Sydney
- 2023, 2025, Tutor for COMP5329 Deep Learning, The University of Sydney
🎖 Awards
- ICDM Best Student Paper Award
- Two Full Scholarship Awards
- Outstanding Graduate
- National Second Prize in Mathematics Competition
- Provincial Prize in C++ Programming Competition
🌟 Funding Grants
- The National Computational Infrastructure (NCI) Adapter Scheme, Australia
Services
- Reviewer for TPAMI, ICML, NeurIPS, ICLR, CVPR, ICCV, KDD, ICDM.
Contact: xiaohuan.pei at sydney.edu.au, terrypei123 at gmail.com