News

Jun 18, 2025 The first version of Mirage Persistent Kernel is released :sparkles:! We developed a compiler that automatically transforms LLM inference into a single megakernel: a fused GPU kernel that performs all necessary computation and communication in one launch. This end-to-end GPU fusion approach reduces LLM inference latency by 1.2-6.7x. Have a look at the blog and the codebase.
May 31, 2025 Excited to share that our paper Mirage: A Multi-Level Superoptimizer for Tensor Programs has been accepted to OSDI '25 :confetti_ball:!