Diffusion language models

Diffusion models have completely taken over generative modelling of perceptual signals — why is autoregression still the name of the game for language modelling? Can we do anything about that?

GPT-4 says:
Diffusion models have revolutionized generative modeling for perceptual signals like images, audio, and video. However, autoregression remains dominant in language modeling. This article explores the potential of diffusion models in language modeling, discussing the challenges and advantages of iterative refinement techniques. It also examines the use of continuous Gaussian diffusion for discrete data and the possibility of learning higher-level continuous representations for language modeling. While autoregression remains a tough baseline to beat, further exploration of diffusion models in language modeling could yield significant benefits.
Read more at Sander Dieleman…