
$6,000 for the Gigaword dataset they used to train...

https://catalog.ldc.upenn.edu/LDC2012T21



There's not even a way to buy it as an individual who's not part of an organization! They don't even state this as a possibility.


Funny how Google is paying for data


The code can be used to train on other data. All you really need is a collection of news articles. I think there are some free ones available.

This dataset was only used to benchmark against other published results. That benchmark setup was first proposed in https://arxiv.org/abs/1509.00685.
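
For anyone wanting to try this with their own articles, here is a rough sketch of how training pairs might be built. It assumes the model is trained to map an article's first sentence to its headline, which is my reading of the setup in the linked paper and not something confirmed in this thread. The `headline` and `body` field names, the `make_pairs` helper, and the sample article are all made up for illustration, and the sentence splitter is deliberately naive.

```python
import re

def first_sentence(text):
    """Return the first sentence of text, using a naive punctuation-based split."""
    text = " ".join(text.split())  # collapse newlines and repeated spaces
    parts = re.split(r"(?<=[.!?])\s+", text, maxsplit=1)
    return parts[0]

def make_pairs(articles, min_words=3):
    """Build (source, target) pairs: first body sentence -> headline.

    `articles` is a list of dicts with hypothetical keys "headline" and "body".
    Pairs where either side has fewer than `min_words` words are dropped.
    """
    pairs = []
    for art in articles:
        headline = " ".join(art.get("headline", "").split())
        source = first_sentence(art.get("body", ""))
        if len(source.split()) < min_words or len(headline.split()) < min_words:
            continue
        pairs.append((source.lower(), headline.lower()))
    return pairs

# Made-up sample data, only to show the expected input shape.
articles = [
    {"headline": "City council approves new budget",
     "body": "The city council approved a new budget on Monday. Critics say it falls short."},
    {"headline": "", "body": "No headline here. Skipped."},
]
pairs = make_pairs(articles)
```

The resulting pairs would still need to be converted into whatever input format the training code expects, which I have not checked.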



