Microsoft’s updated DeepSpeed can train trillion-parameter AI models with fewer GPUs


The company claims the technique, dubbed 3D parallelism, adapts to the varying requirements of a workload, powering extremely large models while maintaining scaling efficiency.
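In broad terms, 3D parallelism composes three axes: data parallelism (replicating the model across groups of GPUs), pipeline parallelism (splitting layers into sequential stages), and model parallelism (splitting individual layers across devices). The sketch below is a minimal, hypothetical illustration of how a global GPU rank might map onto those three axes; it is not DeepSpeed's actual API, and the axis sizes are made up for the example.

```python
# Hypothetical illustration of 3D parallelism rank mapping.
# Axis sizes are assumptions for the example, not DeepSpeed defaults.
DATA_PARALLEL = 2      # replicas of the full model
PIPELINE_STAGES = 4    # layers split into sequential stages
MODEL_PARALLEL = 4     # each layer's tensors split across GPUs

# Total GPUs is the product of the three axis sizes.
WORLD_SIZE = DATA_PARALLEL * PIPELINE_STAGES * MODEL_PARALLEL  # 32 GPUs


def rank_to_coords(rank: int) -> tuple[int, int, int]:
    """Map a global GPU rank to (data, pipeline, model) coordinates."""
    model = rank % MODEL_PARALLEL
    pipeline = (rank // MODEL_PARALLEL) % PIPELINE_STAGES
    data = rank // (MODEL_PARALLEL * PIPELINE_STAGES)
    return data, pipeline, model


if __name__ == "__main__":
    for r in range(WORLD_SIZE):
        print(r, rank_to_coords(r))
```

GPUs sharing a coordinate on one axis form a communication group for that axis, which is how the technique balances memory use against communication cost.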
Read more at VentureBeat…