Megatron-DeepSpeed

Ongoing research on training transformer language models at scale, including BERT and GPT-2.

Stars: 1255
Forks: 209
Language: Python
Last Updated: May 24, 2024
