Undergraduate @ Tongji University · MLSys Fullstack · LLM Inference · Operating Systems
Email · GitHub · Personal site
I'm an undergraduate at Tongji University interested in the intersection of systems programming and AI infrastructure.
My recent work focuses on LLM inference optimization (KV caching and paged attention), operating systems, and memory-efficient systems design.
- SieveKV — semantics-aware KV cache eviction for long-context LLM inference
- Paged KV Cache CUDA Kernels — fused CUDA kernels for efficient LLM decoding
- NovaOS — a Rust-based POSIX-compatible kernel for RISC-V64
- Distributed Semantic Retrieval System — Chord-based distributed dense retrieval and RAG pipeline
- National First Prize — Global Campus AI Algorithm Challenge
- International Silver Medal — iGEM
Building systems software for efficient AI.



