Ph.D. student at Renmin University of China (since fall 2023, @ML-GSAI). I'm interested in scalability and optimization in deep learning.
- Renmin University of China
- Beijing, China
- https://chen-yu-zheng.github.io/
Pinned
- ML-GSAI/Scaling-Diffusion-Transformers-muP (Public): [NeurIPS 2025] Official implementation for our paper "Scaling Diffusion Transformers Efficiently via μP".
- ML-GSAI/Width-Depth-muP (Public): Official implementation for our paper "Spectral Condition for μP under Width–Depth Scaling".
Python 7
- ML-GSAI/Revisiting-Dis-vs-Gen-Classifiers (Public): Official implementation for "Revisiting Discriminative vs. Generative Classifiers: Theory and Implications".
- ML-GSAI/Understanding-GDA (Public): [NeurIPS 2023] Official implementation for our paper "Toward Understanding Generative Data Augmentation".
- ML-GSAI/MesaOpt-AR-Transformer (Public): [NeurIPS 2024] Official implementation for our paper "On Mesa-Optimization in Autoregressively Trained Transformers: Emergence and Capability".
