To be clear, I’m not saying this isn’t a hugely impactful idea; I actually think it’s a revolution. Everything has changed fast since Attention Is All You Need came out in 2017. It’s just that I also feel another breakthrough from research could appear in a similar way.
Transformer models are already reaching the peak of their capabilities relative to the cost of running them. What I see currently is basically the 'burn' phase, which is a necessary step toward the 'sustain' phase.
With new developments coming literally every week, I don’t think it will be long before transformers are either replaced or succeeded by a far more efficient variant.
Alternatively, one can imagine GPUs eventually integrating qubits.