"If Google could find a way to take that corpus, sliced and diced by genre, topic, time period, all the ways you can divide it, and make that available to machine-learning researchers and hobbyists at universities and out in the wild, I’ll bet there’s some really interesting work that could come out of that. Nobody knows what,” Sloan says. He assumes Google is already doing this internally. Jaskiewicz and others at Google would not say."
For books that are scanned, but with no extra licensing, would Google be allowed to do anything with the data? Create a very delocalized n-gram set (sketched below)? Use it as the "test" set (not even cross-validation, where it might influence hyperparameters) for an ML algorithm?
Edit: I'd love to know where Google's authorization derives from, with the n-gram set. Somewhere in the judge's orders? A negotiated fee with the Authors Guild?
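Edit 2: to clarify what I mean by a "delocalized" n-gram set: aggregate frequency counts only, with any record of which book, page, or position an n-gram came from thrown away. A minimal Python sketch (the sample strings are placeholders standing in for scanned book text, not anything Google has confirmed doing):

    from collections import Counter

    def ngram_counts(tokens, n=3):
        # Only window-level co-occurrence survives; which book (or where in
        # it) an n-gram appeared is deliberately not recorded.
        return Counter(tuple(tokens[i:i + n])
                       for i in range(len(tokens) - n + 1))

    corpus = Counter()
    for text in ["the quick brown fox", "the quick red fox"]:  # placeholder "books"
        corpus.update(ngram_counts(text.split()))

    print(corpus.most_common(3))  # e.g. [(('the', 'quick', 'brown'), 1), ...]

Once everything is folded into one Counter like this, you can recover usage frequencies (roughly what the Ngram Viewer exposes) but not reconstruct the original text.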
OK, here is one of the important opinions in the Google Books case, by Judge Chin in 2013 [0]. He basically says (paraphrasing): "I'm going to assume Google has violated copyright by creating digital copies and serving them. But it's fair use, because the new products are transformative."
For example, re: n-grams:
"""
Similarly, Google Books is also transformative in the sense that it has transformed book text into data for purposes of substantive research, including data mining and text mining in new areas, thereby opening up new fields of research. Words in books are being used in a way they have not been used before. Google Books has created something new in the use of book text - the frequency of words and trends in their usage provide substantive information. [...]
On the other hand, fair use has been found even where a defendant benefitted commercially from the unlicensed use of copyrighted works.
"""
Data mining, indexing, quotations, metadata: all of these have been extracted before. It seems the issue is more the degree to which Google does/wants to do it, rather than the idea of doing it at all?
If I get the same treatment as Google before the law, doesn't that mean I can copy any whole corpus of work, use it, recopy it, share it, make derivative works, etc., all as long as at the end I produce something new, say, a music track inspired by the original? That appears to be what the judge is saying when applied to other kinds of works?