Prediction: Claude 5 will be a major regression

At this point it should be obvious to everyone that there is an approximately linear relationship between model cost and model performance. Anthropic is claiming that Claude 5 Sonnet will cost about half as much as their current SOTA models. Therefore, expect about half the performance. This is Anthropic’s version of GPT-5, i.e. a way to fool their customers into using a less compute-intensive model, almost purely for the benefit of the company. But as usual, they will rig the benchmarks and make it appear as though the model is better at certain things, like coding.

It’s an illusion, folks. You’re being played. Wake the hell up.

Also, I can’t believe that people still talk about SWE-bench when there is a paper proving that the benchmark is completely useless because models regurgitate memorized answers.

Again, please, wake up.

https://arxiv.org/abs/2506.12286
