
How does the model deal with dangling anaphora[1]? I wrote a summarizer for Spanish following a recent paper as a side project, and it looks as if I'll need a month of work to solve the issue.

[1] That is, the problem of selecting a sentence such as "He approved the motion" and then realising that "he" is now undefined.
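A toy sketch of the failure mode (not the model under discussion, just an illustration of the footnote): an extractive summarizer that keeps only the "most salient" sentence can strand a pronoun whose antecedent lived in a dropped sentence.

```python
source = [
    "Senator Ortega introduced the motion on Tuesday.",
    "He approved the amendment after a short debate.",
]

# Pretend the scorer picks only the second sentence as the summary:
summary = [source[1]]

# "He" now has no antecedent anywhere in the summary.
pronouns = {"he", "she", "it", "they"}
dangling = [w for w in summary[0].rstrip(".").split()
            if w.lower() in pronouns]
print(dangling)  # ['He']
```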



We're not "selecting" sentences as an extractive summarizer might. The sentences are generated.

As for how the model deals with co-reference: there's no special logic for that.


Wouldn't it suffice to do a coreference pass before extracting sentences? Obviously you'll compound coref errors with the errors in your main logic, but that seems somewhat unavoidable.


I am working on this in my kbsportal.com NLP demo. With accurate coreference substitutions (e.g., substituting a previously mentioned NP like 'San Francisco' for 'there' in a later sentence, substituting full previously mentioned names for pronouns, etc.), extractive summarization should produce better results, and my intuition is that this preprocessing should help abstractive summarization as well.
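A minimal sketch of that substitution idea. This is a toy heuristic, not the kbsportal.com demo's actual logic: real systems use a trained coreference resolver, while this pass only handles a hard-coded pronoun list and replaces each pronoun with the last capitalized non-sentence-initial token seen.

```python
PRONOUNS = {"he", "she", "it", "they", "there"}

def resolve_pronouns(sentences):
    """Replace pronouns with the last-seen capitalized token (toy heuristic)."""
    last_entity = None
    resolved = []
    for sent in sentences:
        out = []
        for w in sent.split():
            stripped = w.strip(".,")
            if stripped.lower() in PRONOUNS and last_entity:
                # Keep any trailing punctuation from the original token.
                out.append(last_entity + w[len(stripped):])
            else:
                # Skip sentence-initial capitals; they are usually not names.
                if stripped[:1].isupper() and out:
                    last_entity = stripped
                out.append(w)
        resolved.append(" ".join(out))
    return resolved

sents = ["The committee heard Ortega speak.", "He approved the motion."]
print(resolve_pronouns(sents))
# ['The committee heard Ortega speak.', 'Ortega approved the motion.']
```

After a pass like this, an extractor that picks only the second sentence no longer emits a dangling "He". A production version would swap the heuristic for a real resolver and handle multi-word entities like "San Francisco".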


That's inter-sentence logic, though? Even humans have trouble with that kind of ambiguity in certain cases.



