Maybe a lot of the difference we see between people's comments about how useful AI is for their coding is a function of what language they're using. Python coders may love it; Go coders, not much at all.
I mean that there is a possibility that SWE-bench is being specifically targeted in training, so its results may not reflect real-world performance.
Look at the results from Multi-SWE-bench - https://multi-swe-bench.github.io/#/
SWE-PolyBench - https://amazon-science.github.io/SWE-PolyBench/
Kotlin-bench - https://firebender.com/leaderboard