So is Gemini tbh. It's the only agent I've used that gets itself stuck in ridiculous loops repeating "ok. I'm done. I'm ready to commit the changes. There are no bugs. I'm done."
Google somehow manages to fumble the easiest layups. I think Anthropic et al have a real chance here.
Google's product management and discipline are absolute horsesh*t. But they have a moat, and it's extreme technical competence. They own their infra from the hardware (custom ASICs, their own data centers, global intranet, etc.) all the way up to the models and the product platforms to deploy them in. To the extent that making LLMs work on real-world problems is a technical problem, landing Gemini is absolutely in Google's wheelhouse.
You are stating generalities when more specific information is easily available.
Google has AI infrastructure that it has created itself as well as competitive models, demonstrating technical competence in not-legacy-at-all areas, plus a track record of technical excellence in many areas both practical and research-heavy. So yes, technical competence is definitely an advantage for Google.
I use Claude every day. I cannot get Gemini to do anything useful, at all. Every time I've tried to use it, it has just failed to do what was required.
Three subthreads up you have someone saying Gemini did what Claude couldn't for them on a 14-year-old legacy code issue. It seems you can't really use people's prior success on their problem as an estimate of what your success will be like with your problem and a tool.
People and benchmarks are using pretty specific, narrow tests to judge the quality of LLMs. People have biases, benchmarks get gamed. In my own experience, Gemini seems to be lazy and scatter-brained compared to Claude, but shows higher general-purpose reasoning abilities. Anthropic is also obviously massively focusing on making their models good at coding.
So it is reasonable that Claude might show significantly better coding ability for most tasks, but the better general reasoning ability proves useful in coding tasks that are complicated and obscure.
Hard to bet against Hassabis + Google's resources. This is in their wheelhouse, and it's eating their search business and refactoring their cloud business. G+ seemed like a way to get more people to Google for login and tracking.
That's pretty telling: on web search and ad placement, where it matters, OpenAI has had no impact, or the impact is muted and offset by Google's continued market power and increased demand for its ad space on the web.
A couple of months ago things were different. Try their stronger models. Gemini recently saved me from a needle-in-a-haystack problem with buildpacks and Linux dependencies on a 14-year-old B2B SaaS app I was fixing a major problem for; after I had worked on it for hours with Claude Code, Gemini figured out the solution quickly. I know it's just one story where Gemini won, and I have really enjoyed using Claude Code, but Google is having some success with the serious effort they're putting into this fight.
I think they had no choice but to release that AI before it was ready for prime time. Their search traffic started dropping after ChatGPT came out, and they risked not looking like a serious player in AI.
They recently replaced “define: word” (or “word meaning”) results with an “ai summary” and it’s decidedly worse. It used to just give you the definition(s) and synonyms for each one. Now it gives some rambling paragraphs.
My Google gives me the OUP data for "word meaning" and doesn't show any AI. Searching "word meaning" plus a language opens up the translator. It's really fast and convenient.