Author of post here. I'd say most of the examples generated from the best model were good. However, we chose examples that were not too gruesome, as news can be :)
We encourage you to try the code and see for yourself.
How does the model deal with dangling anaphora[1]? I wrote a summarizer for Spanish following a recent paper as a side project, and it looks as if I'll need a month of work to solve the issue.
[1] That is, the problem of selecting a sentence such as "He approved the motion" and then realising that "he" is now undefined.
Wouldn't it suffice to do a coreference pass before extracting sentences? Obviously you'll compound coref errors with the errors in your main logic, but that seems somewhat unavoidable.
I am working on this in my kbsportal.com NLP demo. With accurate coreference substitutions (e.g., substituting a previous NP like 'San Francisco' for 'there' in a later sentence, substituting full previously mentioned names for pronouns, etc.) extractive summarization should provide better results, and my intuition is that this preprocessing should help abstractive summarization also.
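A minimal sketch of that preprocessing step (hand-rolled for illustration; in practice the substitution map would come from a coreference resolver rather than being hard-coded, and real substitutions would need case and grammar handling):

```python
import re

def resolve_coref(sentence, coref_map):
    """Replace anaphoric tokens (whole-word, case-insensitive match)
    with their antecedent noun phrases before sentence extraction."""
    def sub(match):
        word = match.group(0)
        return coref_map.get(word.lower(), word)
    return re.sub(r"\b\w+\b", sub, sentence)

# Hypothetical map a coref resolver might produce for a short article.
coref_map = {"he": "the mayor", "there": "in San Francisco"}

print(resolve_coref("He approved the motion.", coref_map))
# -> "the mayor approved the motion."
```

With substitutions like these applied first, a sentence such as "He approved the motion" can be extracted on its own without leaving "he" undefined.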
>>"In those tasks training from scratch with this model architecture does not do as well as some other techniques we're researching, but it serves as a baseline."
Can you elaborate a little on that? Is the training the problem or is the model just not good at longer texts?
Agreed, it seems they really hand-picked some shining examples for this post, and it would have been more interesting to see the full spectrum of when it works and when it doesn't. Perhaps the README in the Github repo is a bit more honest in terms of representativeness, though it only has 4 examples; one of them is an interesting failure:
article: novell inc. chief executive officer eric schmidt has been named chairman of the internet search-engine company google .
human: novell ceo named google chairman
machine: novell chief executive named to head internet company
I don't see that as a failure. It did produce a sentence that is shorter, grammatical (though "named to head" is a bit weird) and essentially true — calling Google an "internet company" would make sense in its early days (back when Google would be prefixed by "internet search-engine company").
I didn't think it was a failure either until I realized that I was letting future knowledge leak into the past! There is more than one internet company, so upon reading that headline, given that it must describe a novel event, my question would be: "Which company?". Now I have to read the article to find out. The human summary is better because I don't have to ask that question.
Most humans would interpret "novell chief executive named to head internet company" to mean "novell is an internet company and its chief executive just became its head", which is incorrect (and a little nonsensical, since the CEO already is in charge).
That's pretty interesting. It took me 5 readings of that sentence, including once out loud, to get your reading of it.
I thought the generated summary was really, really good. But I knew that Novell wasn't considered an internet company, so it wasn't until I made myself ignore that fact that I could see the other reading.
I think the holiday pay example is more glaring. If I saw it in isolation, I would be confused as to what on earth it was getting at. Furthermore, the summary is no good, and the abstract isn't either. My summary would be: British Gas continues to fight eu court's decision that commission be included in holiday pay. "Continues" is used here to emphasize that the case is not yet over.
On the other hand, the football summary is exemplary; better than the provided abstract.
IMO at least the second example shown is already poor, or at least not much better than what sites like SMMRY[1] have been providing for years.
> hainan to curb spread of diseases
That sentence conveys virtually no useful information: every city wants to "curb spread of diseases", so what has actually changed? The news here is about restrictions on livestock, and even a student journalist would be expected to do better than this headline.
To be clear, I'm excited about the idea and believe machine learning has far more potential for refinement than SMMRY's method (as described by them[2]); I just don't think it's as "done" as a lot of people here seem to assume.
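For comparison, here is a toy version of that kind of frequency-based extractive approach: a simplification of the idea SMMRY describes (rank sentences by how many high-frequency content words they contain), not their actual code.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "has", "been"}

def summarize(text, n=1):
    """Return the n sentences richest in frequent content words,
    in their original order."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    words = [w for w in re.findall(r"\w+", text.lower()) if w not in STOPWORDS]
    freq = Counter(words)

    def score(sent):
        toks = [w for w in re.findall(r"\w+", sent.lower()) if w not in STOPWORDS]
        return sum(freq[w] for w in toks) / (len(toks) or 1)

    # Pick the n highest-scoring sentences, then restore document order.
    top = sorted(sorted(sentences, key=score, reverse=True)[:n],
                 key=sentences.index)
    return " ".join(top)
```

The point of the contrast: a scorer like this can only copy sentences that already exist, so it can never produce a compressed paraphrase like "novell chief executive named to head internet company"; that is where the learned abstractive models have room to improve.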