Unsloth Fixes Gemma Bugs


Unsloth developers Daniel and Michael Han have spent the past week fixing a series of bugs in Google’s Gemma, an AI model that showed promise but was plagued by technical issues. The fixes range from simple typos affecting token generation to deeper problems such as incorrect casting in Keras and precision errors in RoPE (Rotary Position Embedding) calculations. Notably, they improved the handling of layer normalization and the GELU activation function, ensuring both are computed at the appropriate precision to avoid losing information. The fixes have been implemented in Colab notebooks and are also reflected in Hugging Face transformers version 4.38.2. The brothers continue to work on further improvements, including a pull request for the approximate GELU function, and they encourage the community to support their efforts through donations and engagement on their Discord server and social media platforms.
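To make the precision issues concrete, here is a minimal PyTorch-style sketch of the kinds of fixes described: computing RoPE angles in float32 rather than bfloat16, upcasting Gemma’s (1 + weight) layer norm before normalizing, and using the tanh-approximate GELU. The function names and exact details are illustrative assumptions, not Unsloth’s actual code.

```python
# Illustrative sketch of the precision fixes described above
# (not Unsloth's actual implementation).
import torch

def rope_angles(dim: int, seq_len: int, base: float = 10000.0):
    # Compute RoPE angles in float32: doing this in bfloat16
    # loses precision at large position indices.
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim))
    positions = torch.arange(seq_len, dtype=torch.float32)
    angles = torch.outer(positions, inv_freq)
    return angles.cos(), angles.sin()

def gemma_rmsnorm(x: torch.Tensor, weight: torch.Tensor, eps: float = 1e-6):
    # Upcast to float32 before normalizing; Gemma's norm scales by
    # (1 + weight), and mixing low-precision dtypes here silently
    # drops information.
    x32 = x.float()
    x32 = x32 * torch.rsqrt(x32.pow(2).mean(-1, keepdim=True) + eps)
    return (x32 * (1.0 + weight.float())).to(x.dtype)

def gelu_tanh(x: torch.Tensor):
    # Gemma uses the tanh-approximate GELU, not the exact
    # erf-based variant.
    return torch.nn.functional.gelu(x, approximate="tanh")
```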
Read more at Unsloth – Unslow finetuning for AI & LLMs…
