GitHub - openai/transformer-debugger

OpenAI’s Superalignment team has released the Transformer Debugger (TDB), a tool for investigating specific behaviors of small language models. TDB combines automated interpretability techniques with sparse autoencoders to help explain model decisions, such as why the model outputs a particular token or why an attention head attends to a particular token. It supports exploration before any code needs to be written: users can intervene in the model’s forward pass and trace the connections between model components.
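To make the idea of intervening in a forward pass concrete, here is a minimal sketch on a toy two-layer network. It is not TDB code and does not use the TDB API: the weights, the input, and the helper names (`forward`, `logit_diff`, `ablation_effects`) are all hypothetical, and zeroing a hidden neuron is assumed here as one simple kind of intervention. The sketch measures how much each neuron contributes to the model preferring token A over token B.

```python
# Toy illustration only: hand-picked weights, no real language model.
# None of these names come from the TDB codebase.

def relu(x):
    return x if x > 0.0 else 0.0

def forward(inputs, w_in, w_out, ablate=None):
    """Return one logit per output token; optionally zero one hidden neuron."""
    hidden = [relu(sum(x * w for x, w in zip(inputs, row))) for row in w_in]
    if ablate is not None:
        hidden[ablate] = 0.0  # the intervention: knock out a single neuron
    return [sum(h * w for h, w in zip(hidden, row)) for row in w_out]

def logit_diff(logits, token_a, token_b):
    """How strongly the model prefers token_a over token_b."""
    return logits[token_a] - logits[token_b]

def ablation_effects(inputs, w_in, w_out, token_a=0, token_b=1):
    """Baseline logit difference, plus the drop caused by ablating each neuron."""
    base = logit_diff(forward(inputs, w_in, w_out), token_a, token_b)
    effects = {}
    for i in range(len(w_in)):
        ablated = forward(inputs, w_in, w_out, ablate=i)
        effects[i] = base - logit_diff(ablated, token_a, token_b)
    return base, effects

W_IN = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 3 hidden neurons, 2 inputs
W_OUT = [[2.0, 0.0, 0.5], [0.0, 1.0, 0.5]]    # 2 output tokens, 3 hidden neurons
X = [2.0, 1.0]

if __name__ == "__main__":
    base, effects = ablation_effects(X, W_IN, W_OUT)
    print("baseline logit difference:", base)
    for neuron, effect in effects.items():
        print(f"neuron {neuron}: ablation changes the difference by {effect}")
```

In this toy setup, a positive effect means the neuron pushes the model toward token A, a negative effect means it pushes toward token B, and zero means it affects both tokens equally.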

The release includes a neuron viewer React app, an activation server for model inference, a simple inference library for GPT-2 models, and datasets of top-activating examples. Setup requires Python, pip, Node, and npm. The repository also provides instructions for running the TDB app, which involves starting the activation server backend and the neuron viewer frontend.

For those making changes to TDB, the repository outlines steps to validate updates: run pytest and mypy, then confirm that the server and viewer still work. The tool and its components can be cited in research; the repository provides a citation format and a BibTeX entry.
