
> Shakespeare’s vocabulary: across his entire corpus, he uses 28,829 words, suggesting he knew over 100,000 words

Why does that suggest he knew over 100k words? Maybe it means he knew 28,829 and used all of them? Would he really know over 70,000 words he never used in his works? What would those 70,000 words be? Probably very obscure ones. How can you know that many obscure ones?



Vocabulary has for a while been considered in terms of 'receptive' and 'productive' capacity, with the assumption being that one's 'receptive' vocabulary can be larger, since it is easier to hear/read/understand a word than it is to use it correctly in speech/writing (this is not necessarily the popular opinion anymore [http://www.readingconnect.net/web/FILES/english-language-and...] but may provide the context for the claim about Shakespeare). The notion is that you are able to understand more words than you commonly use in your own speech/writing. That is intuitive on some level, although of course it is an empirical question.


Assumptions tend to break down at the extremes.

E.g. Shakespeare actually invented a lot of the obscure words he used.


I imagine that it would be something similar to the German Tank Problem (http://en.wikipedia.org/wiki/German_tank_problem). Treating each work as a sample drawn from the words he knew would allow for an estimate of the total number of words known. I imagine that this would need to be modified to account for the non-uniform distribution of word use, but the principle would be the same.
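
To make that concrete, here is a rough sketch rather than whatever method produced the 100,000 figure. The first function is the textbook German tank estimator, which assumes uniform sampling of serial numbers. The second swaps in the Chao1 species-richness estimator, a different technique that uses how many words appear exactly once or twice to guess how many were never seen. Both function names and the toy inputs are made up for illustration.

```python
from collections import Counter


def german_tank_estimate(serials):
    """Classic estimate of the population size N from k distinct serial
    numbers drawn uniformly without replacement from 1..N: m + m/k - 1,
    where m is the largest serial observed."""
    k = len(serials)
    m = max(serials)
    return m + m / k - 1


def chao1_estimate(tokens):
    """Chao1 estimate of the number of distinct word types, including
    unseen ones. It is usually read as a lower bound, and it does not
    assume that all words are equally likely to be used."""
    counts = Counter(tokens)
    s_obs = len(counts)                                  # distinct words seen
    f1 = sum(1 for c in counts.values() if c == 1)       # words seen once
    f2 = sum(1 for c in counts.values() if c == 2)       # words seen twice
    if f2 == 0:
        # Bias-corrected form, used when no word occurs exactly twice.
        return s_obs + f1 * (f1 - 1) / 2
    return s_obs + f1 * f1 / (2 * f2)


if __name__ == "__main__":
    print(german_tank_estimate([19, 40, 42, 60]))
    print(chao1_estimate("a a a b b c d".split()))
```

The tank formula only works because serial numbers are ordered and equally likely to be captured. Words are neither, which is why an estimator based on the counts of rare words is the closer analogue here.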


Again, I'd like to point out that Shakespeare also made up words: http://www.shakespeare-online.com/biography/wordsinvented.ht...


Relevant video:

Kate Tempest - My Shakespeare: https://www.youtube.com/watch?v=i_auc2Z67OM

;-)


The 28K figure is achieved by counting multiple spellings of the same word. Shakespeare lived before dictionaries, so there was never a single standard way to spell a word.


I'm curious if words that Shakespeare invented count. There are many words that we see first used by Shakespeare, though some of them were probably coined by others during his time, with him merely being the first to record them (in documents that survived until today).


The latter is probably more the case. The OED, for instance, has had a bias in favor of using Shakespeare as a word's origin since that dictionary's first edition. The number of words attributed to Shakespeare in the OED has dwindled over time.


Just as you or I know tens of thousands of words but only use some small subset of them in any given work, you wouldn't expect Shakespeare to use his entire productive vocabulary in producing the limited corpus of his literary works.



