I am a final-year master's student at Tsinghua University, advised by Prof. Yangdong Deng. Before that, I received my bachelor's degree from Wuhan University in June 2022.

Efficiency lies at the heart of computer science. My current focus is on efficient system software, where I explore approaches that bridge algorithmic advances and system-level optimizations. I firmly believe that significant leaps in efficiency can only be achieved by integrating these two perspectives.

Outside of research, I enjoy running, swimming, and reading books on history and sociology.

Email  /  Google Scholar  /  GitHub  /  Zhihu

InstCache: A Predictive Cache for LLM Serving
Longwei Zou, Tingfeng Liu, Kai Chen, Jiangang Kong, Yangdong Deng
Preprint, 2025
arXiv

CQIL: Inference Latency Optimization with Concurrent Computation of Quasi-Independent Layers
Longwei Zou, Qingyang Wang, Han Zhao, Jiangang Kong, Yi Yang, Yangdong Deng
ACL, 2024
GitHub / arXiv

A Multi-Level Framework for Accelerating Training Transformer Models
Longwei Zou, Han Zhang, Yangdong Deng
ICLR, 2024
GitHub / arXiv

Design and source code from Jon Barron's website.