Then you need a lot of people to listen to those 12B hours of audio, with multiple listeners agreeing, for each chunk, that what is spoken matches the transcript.


Many machine learning systems can use unsupervised or semi-supervised learning, so nobody has to listen to and annotate all that audio.
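To make that concrete, here is a minimal pseudo-labeling sketch, one common semi-supervised recipe: a seed model transcribes unlabeled clips, and only high-confidence transcripts are kept as new training pairs. The `transcribe` callable is a hypothetical stand-in for a real ASR model, not any particular library's API.

    # Pseudo-labeling sketch for semi-supervised ASR. `transcribe` is a
    # hypothetical seed model returning (text, confidence); only the
    # selection logic is the point here.
    from typing import Callable, Iterable

    def pseudo_label(
        unlabeled_audio: Iterable[bytes],
        transcribe: Callable[[bytes], tuple[str, float]],
        threshold: float = 0.9,
    ) -> list[tuple[bytes, str]]:
        labeled = []
        for clip in unlabeled_audio:
            text, confidence = transcribe(clip)
            if confidence >= threshold:  # trade dataset size for label noise
                labeled.append((clip, text))
        return labeled

The threshold trades quantity against label noise; in practice you would retrain the seed model on the accepted pairs and repeat.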


Yes, but then you don't need Mozilla collecting read speech samples. You can just scrape any audio out there, run speech activity detection, and there you go.
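For the detection step, here is a sketch using the webrtcvad package (pip install webrtcvad), assuming 16 kHz, 16-bit mono PCM; note that library only accepts 10, 20, or 30 ms frames:

    import webrtcvad

    def speech_frames(pcm: bytes, sample_rate: int = 16000, frame_ms: int = 30):
        """Yield only the frames the VAD classifies as speech."""
        vad = webrtcvad.Vad(2)  # aggressiveness: 0 (lenient) to 3 (strict)
        frame_bytes = sample_rate * frame_ms // 1000 * 2  # 2 bytes per sample
        for offset in range(0, len(pcm) - frame_bytes + 1, frame_bytes):
            frame = pcm[offset:offset + frame_bytes]
            if vad.is_speech(frame, sample_rate):
                yield frame

Aggressiveness 2 is a middle setting; higher values reject more non-speech at the cost of clipping quiet speech.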



