In my unmeasured empirical observation, Google has amazing speech recognition.

jeffbee · on Sept 21, 2022

I tried feeding the four examples from this announcement into Google as dictation inputs and it just sits there blankly. On the JFK speech test file in the repo, Google understands perfectly. The samples in the announcement are clearly outside the capabilities of anything Google has launched publicly, but I don't know how that translates to overall utility in everyday applications.

The5thElephant · on Sept 21, 2022

I agree they have the best compared to Apple, Amazon, and Microsoft. However, I don't think it is as good as what is being shown here by OpenAI.

Vetch · on Sept 21, 2022

My experience with the APIs is that Google is excellent and Microsoft is slightly better. The offline model I've been using, which is nearly as good as both, is Facebook's wav2vec2-large-960h-lv60-self.

Don't believe what's on marketing pages; the results rarely transfer to the real world. I'll have to make time to try it and see. In theory, given the task diversity and sheer number of hours, it should be a lot more robust, but I'll wait on evidence before believing any claims of SoTA.

KingMob · on Sept 22, 2022

Weird. I started working on an ASR SaaS in my spare time, and at least on the test podcasts, Google was the worst: https://www.sammaspeech.com/blogs/post/speech-recognition-ac...