GPT-J-6B: 6B JAX-Based Transformer


Summary: We have released GPT-J-6B, 6B JAX-based (Mesh) Transformer LM (Github).GPT-J-6B performs nearly on par with 6.7B GPT-3 (or Curie) on various zero-shot down-streaming tasks. You can try out …
Read more at Aran Komatsuzaki…