I am about to join Nanyang Technological University as a Research Assistant, working under the supervision of Jianwei Bian. Previously, I worked as an algorithm researcher at the AI startup GigaAI. In June 2025, I obtained my Bachelor’s degree from Xidian University. During my undergraduate studies, I was a research intern at the NUS-HPC-AI-Lab, and I also interned at DataGrand with Dr. Nuo Xu.

Research Interests: world models, visual generation, and embodied intelligence.

🔥 News

  • 2025.07:   Two papers accepted to CVPR 2025 (HumanDreamer, SpeeD)
  • 2024.07:   One paper accepted to ECCV 2024 (InfoGrowth)
  • 2024.05:   Open-Sora has been released. Congratulations!

📝 Selected Works

Selected publications are listed below; the full list is available via My Google Scholar.

Publications

CVPR 2025

A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training

Kai Wang, Mingjia Shi, Yukun Zhou, et al.

Project   Code   Paper

  • As a plug-and-play and architecture-agnostic approach, SpeeD consistently achieves 3× acceleration across various diffusion architectures, datasets, and tasks.
ECCV 2024

Dataset Growth

Ziheng Qin, Zhaopan Xu, Yukun Zhou, et al.

Code   Paper

  • InfoGrowth improves data quality and efficiency on both single-modal and multi-modal tasks, with an efficient and scalable design.

Projects

Technical Report

GigaWorld-0: World Models as Data Engine to Empower Embodied AI

I was responsible for training and evaluating the video foundation model in GigaWorld-0, as well as for downstream fine-tuning and inference acceleration.

Code   Paper   Model

  • A unified world model framework designed explicitly as a data engine for Vision-Language-Action learning.
Open-Source Project

Open-Sora: Democratizing Efficient Video Production for All

Participated in this project as an intern at Luchen Technology.

Code   Paper   Model   Dataset

  • Open-Sora is an open-source video generation model designed to produce high-fidelity video content. We democratize full access to the training, inference, and data-preparation code, as well as the model weights.