Mistral 7B

Mistral AI has released Mistral 7B, a 7.3B-parameter model that outperforms Llama 2 13B and Llama 1 34B on various benchmarks. It uses grouped-query attention (GQA) for faster inference and sliding window attention (SWA) to handle longer sequences at lower cost. The model is released under the Apache 2.0 license and can be fine-tuned for downstream tasks. It also performs strongly on code and reasoning benchmarks.
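To make the two attention mechanisms named above more concrete, here is a minimal sketch in plain Python. It is not Mistral's implementation: the function names `sliding_window_mask` and `kv_head_for_query_head` are hypothetical, and the sizes used are toy values chosen for readability, not the model's actual configuration. The first function builds a causal mask in which each token attends only to itself and a fixed number of preceding tokens; the second shows how grouped-query attention lets several query heads share one key/value head.

```python
def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Causal sliding-window mask (illustrative helper, not a library API).

    mask[i][j] is True when query position i may attend to key position j,
    i.e. when j is not in the future and lies within the last `window`
    positions up to and including i.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


def kv_head_for_query_head(q_head: int, n_q_heads: int, n_kv_heads: int) -> int:
    """Map a query head to the key/value head it shares under GQA.

    Assumes consecutive query heads are grouped together, which is one
    common layout; other layouts are possible.
    """
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a multiple of n_kv_heads")
    group_size = n_q_heads // n_kv_heads
    return q_head // group_size


# Toy sizes: 5 tokens, window of 3, 8 query heads sharing 2 key/value heads.
mask = sliding_window_mask(seq_len=5, window=3)
print(mask[4])  # -> [False, False, True, True, True]
print([kv_head_for_query_head(h, 8, 2) for h in range(8)])  # -> [0, 0, 0, 0, 1, 1, 1, 1]
```

With a window of 3, the last token sees only positions 2 through 4, so per-token attention cost stays bounded as the sequence grows. With 8 query heads and 2 key/value heads, only 2 sets of keys and values need to be stored and read at inference time, which is where the speedup comes from.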