ZeRO-Offload: Training Multi-Billion Parameter Models on a Single GPU

Source: Synced | AI Technology & Industry Review

Researchers from the University of California, Merced and Microsoft have introduced ZeRO-Offload, a novel heterogeneous deep learning training technology that enables training of multi-billion parameter models on a single GPU without any model refactoring.
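ZeRO-Offload is available as part of Microsoft's open-source DeepSpeed library, where it is enabled through the training configuration rather than code changes. As a rough sketch (the exact keys and values here are illustrative, not taken from the article), a DeepSpeed JSON config that offloads optimizer state to CPU memory might look like this:

```json
{
  "train_batch_size": 8,
  "fp16": {
    "enabled": true
  },
  "zero_optimization": {
    "stage": 2,
    "offload_optimizer": {
      "device": "cpu",
      "pin_memory": true
    }
  }
}
```

In this setup, gradients and optimizer state are partitioned (ZeRO stage 2) and the optimizer state is held in pinned host memory, which is what lets a single GPU train models whose optimizer footprint would otherwise exceed device memory.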