Introducing Alpa: A Compiler Architecture for Automated Model-Parallel Distributed Training That Outperforms Hand-Tuned Strategies | Synced
A research team from UC Berkeley, Amazon Web Services, Google, Shanghai Jiao Tong University and Duke University proposes Alpa, a compiler system for distributed deep learning on GPU clusters that ...
Source: Synced | AI Technology & Industry Review
A research team from UC Berkeley, Amazon Web Services, Google, Shanghai Jiao Tong University and Duke University proposes Alpa, a compiler system for distributed deep learning on GPU clusters that automatically generates parallelization plans that match or outperform hand-tuned model-parallel training systems even on the models they were designed for.