
Han Cai
hancai [at] mit (dot) edu
I am a fifth-year Ph.D. student at MIT, advised by Prof. Song Han. Before coming to MIT, I received my Master's and Bachelor's degrees from Shanghai Jiao Tong University (SJTU), where I was advised by Prof. Yong Yu. At SJTU, I also worked closely with Prof. Weinan Zhang and Prof. Jun Wang.
My research interests lie in machine learning, particularly efficient deep learning and AutoML, and their applications in real-world scenarios such as computer vision and natural language processing.
Email / Google Scholar / GitHub / Twitter
Awards
Competition Awards
- First Place in the CVPR 2020 Workshop of Low-Power Computer Vision Challenge, CPU detection and FPGA track
- First Place in the 2019 IEEE Low-Power Image Recognition Challenge, classification and detection track
- First Place in the Low-Power Computer Vision Workshop at ICCV 2019, DSP track
- Third Place in the CVPR 2019 Workshop Low-Power Image Recognition Challenge, classification track
News
- Sep 12 2023: EfficientViT is highlighted on the MIT home page and by MIT News.
- Jul 18 2023: EfficientViT: Multi-Scale Linear Attention for High-Resolution Dense Prediction is accepted by ICCV 2023.
- Mar 2022: Lite Pose: Efficient Architecture Design for 2D Human Pose Estimation is accepted by CVPR'2022.
- Jan 2022: Network Augmentation for Tiny Deep Learning is accepted by ICLR'2022.
- Oct 2021: MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning is accepted by NeurIPS'2021.
- Sep 2020: TinyTL: Reduce Activations, Not Trainable Parameters for Efficient On-Device Learning is accepted by NeurIPS'2020.
- Aug 2020: Once-for-All (OFA) won first place in the CVPR 2020 Low-Power Computer Vision Challenge, CPU detection and FPGA track.
- May 2020: HAT: Hardware-Aware Transformer for Efficient Natural Language Processing to appear at ACL'2020.
- Feb 2020: APQ: Joint Search for Network Architecture, Pruning and Quantization Policy is accepted by CVPR'2020.
- Dec 2019: Once-For-All Network (OFA) is accepted by ICLR'2020.
- Nov 2019: AutoML for Architecting Efficient and Specialized Neural Networks to appear at IEEE Micro.
- Dec 2018: ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware is accepted by ICLR'2019.
Selected Projects

EfficientViT: Multi-Scale Linear Attention for High-Resolution Dense Prediction
EfficientViT is a new family of vision models for high-resolution dense prediction. It achieves a global receptive field and multi-scale learning using only hardware-efficient operations. EfficientViT delivers performance gains over previous models, along with speedups on diverse hardware platforms, including mobile CPU, edge GPU, and cloud GPU. [Media: MIT home page, MIT News]

Once for All: Train One Network and Specialize it for Efficient Deployment
OFA is an efficient AutoML technique that decouples model training from architecture search: the network is trained only once and then specialized for many hardware platforms, from CPU/GPU to hardware accelerators. OFA consistently outperforms SOTA NAS methods while reducing GPU hours and CO2 emissions by orders of magnitude. In particular, OFA achieves a new SOTA 80.0% ImageNet top-1 accuracy under the mobile setting (<600M FLOPs). OFA is the winning solution for the CVPR 2020 Workshop of Low-Power Computer Vision Challenge (FPGA track), the 2019 IEEE Low-Power Image Recognition Challenge (classification and detection track), and the Low-Power Computer Vision Workshop at ICCV 2019 (DSP track).
[Media: MIT News, Qualcomm, Xilinx, VentureBeat, SingularityHub]
[Industry Integration: PytorchHub@Meta, nnabla-nas@Sony]
[Code: GitHub (1.7k stars), Colab]

ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
ProxylessNAS is an efficient hardware-aware neural architecture search method that can search directly on large-scale datasets. It can design specialized neural network architectures for different hardware platforms. At >74.5% top-1 accuracy, the ProxylessNAS model runs 1.8x faster than MobileNetV2.
[Media: MIT News, IEEE Spectrum]
[Industry Integration: PytorchHub@Meta, AutoGluon@Amazon, NNI@Microsoft]
Academic Services
- Reviewer for ICML, NeurIPS, ICLR, CVPR, ICCV, ECCV, TPAMI, IJCV, ACL, and EMNLP