Groq, an AI hardware startup, has introduced two new open-source language models, Llama-3-Groq-Tool-Use 8B and 70B, that have demonstrated superior performance in tool use capabilities on the Berkeley Function Calling Leaderboard (BFCL). The 70B model achieved a 90.76% accuracy, claiming the top position, while the 8B model secured the third spot with 89.06% accuracy, outshining proprietary models from established tech giants like OpenAI, Google, and Anthropic.
The models were developed in collaboration with AI research company Glaive, using a base model from Meta’s Llama-3. Groq’s approach involved full fine-tuning and Direct Preference Optimization (DPO), leveraging ethically generated synthetic data. This development strategy not only addresses privacy and data overfitting concerns but also challenges the conventional reliance on massive real-world data sets for training cutting-edge AI, potentially reducing environmental impacts and privacy risks.
These models are now accessible through the Groq API and on Hugging Face, promoting open-source accessibility and fostering innovation in domains that benefit from advanced tool use and function calling abilities. Groq’s release also features a public demo on Hugging Face Spaces, developed in collaboration with Gradio, enhancing community engagement and further exploration of the models’ capabilities.
This open-source initiative by Groq stands to reshape the competitive landscape of the AI industry, encouraging transparency and accelerating AI development across the board. As the AI community continues to evaluate and adopt these models, the implications for AI accessibility and innovation are profound, suggesting a shift towards more democratic and diverse AI ecosystems.
