Our Research
Our research goal is to advance the state of the art in emerging large-scale system platforms by making them more efficient, responsive, intelligent, and programmable. Our current research topics lie in the following areas and continue to evolve as new challenges emerge:
Efficient LLM Serving Systems: We design inference and serving systems that meet strict latency and cost targets at production scale.
Continual and On-Device Learning: We build methods and system support for continual adaptation on resource-constrained devices and in edge environments.
Large-Scale Distributed Training: We develop techniques to train models efficiently across heterogeneous and geo-distributed clusters.
Fast and Scalable Big Data Analytics: We build data systems that accelerate iterative analytics and scalable ML data pipelines.
