Constant-Time Breakthrough Raises the Hash-Table Speed Limit

Did you know there’s now a hash table whose access time can stay constant on average — and only logarithmic in the worst case — no matter how close it is to being completely full?

That claim, which overturns a belief dating back to Andrew Yao’s 1985 Uniform Hashing is Optimal conjecture, is proven in Optimal Bounds for Open Addressing Without Reordering (Jan 2025) by Andrew Krapivin, Martín Farach-Colton and William Kuszmaul. Their open-addressed design (nick-named funnel hashing) achieves three key bounds when the table is $1-\delta$ full (so $x=\delta^{-1}$ ):

Amortized expected probes: $O(1)$ — average queries run in fixed time, independent of fullness.
Worst-case expected probes (non-greedy strategy): $O(\log x)$ , beating Yao’s linear-in- $x$ lower bound by an exponential factor.
Tight bound for **greedy strategies:** they exhibit a scheme with $O\bigl((\log x)^2\bigr)$ worst-case expected cost and prove that no greedy algorithm can do asymptotically better.

In practical terms, even if your hash table is 99.9 % full ( $x=1000$ ), every lookup in the non-greedy version needs roughly $\log 1000 \approx 10$ probes on average, and never more than about ten times that — a dramatic drop from the hundreds of probes previously considered unavoidable.

The work also supplies a constant-time insertion/lookup on average, irrespective of load. That constant bound surprised even specialists: earlier proofs showed $\Theta(\log x)$ was the floor for any “greedy” table (one that always takes the first free slot). Relax that single rule and the authors show you can do better than anyone thought possible.

These results matter because open-addressed tables back some of the most latency-sensitive components in systems code: language runtimes, kernel symbol tables, in-memory DB indexes, even GPU hash-based acceleration. Replacing linear-probe or uniform-probe tables with an $O(1)\!/O(\log x)$ design could cut tail-latency spikes at high load and shrink memory overhead devoted to over-provisioning.

If you want the full technical story, the paper is open access on arXiv:
Optimal Bounds for Open Addressing Without Reordering

For a lively narrative of how an undergraduate stumbled into dismantling four decades of consensus, Quanta Magazine captured the back-story in their recent feature.

The upshot: twenty-odd pages of analysis have redrawn the map for one of computing’s oldest data structures, and any code hitting the performance wall at high table-load now has a fresh avenue for speed-ups.

Constant-Time Breakthrough Raises the Hash-Table Speed Limit

Related

Leave a ReplyCancel reply

Constant-Time Breakthrough Raises the Hash-Table Speed Limit

Star Wars Reimagined: China’s Laser Satellite Outpaces Starlink

Court Rules AI’s Use of Books as Fair Use but Slams Pirated Collection Storage

Introducing the OWASP AI Testing Guide: A New Standard for AI Security Testing

The Low-Background Steel Problem of AI

Chinese AI Firms Dodge US Chip Bans with Cross-Border Data Smuggling to Malaysia

OpenAI open-sources a demo of a UI testing agent

Financial Dynamics in Agentic AI: Cursor’s Rise Versus GitHub Copilot

Mistral AI Releases Codestral Embed: A Specialized Code Embedding Model