Sunday, June 19, 2016

IBM Watson at NAACL 2016

There were several Twitter NLP flare-ups recently, triggered by the contrast between academic NLP and industry NLP. I'm not going to re-litigate those arguments, but I will note that one IBM Watson question answering team anticipated this very thing in their paper for the NAACL HLT 2016 Workshop on Human-Computer Question Answering.

The paper is titled Watson Discovery Advisor: Question-answering in an industrial setting.

The Abstract
This work discusses a mix of challenges arising from Watson Discovery Advisor (WDA), an industrial strength descendant of the Watson Jeopardy! Question Answering system currently used in production in industry settings. Typical challenges include generation of appropriate training questions, adaptation to new industry domains, and iterative improvement of the system through manual error analyses.
The paper's topic is not surprising given that four of the authors are PhDs (Charley, Graham, Allen, and Kristen). This is largely a group of fish out of water: they have an academic bent, but they wrestle daily with the real-world challenges of paying customers and very messy data.

Here are five take-aways:

  1. Real-world questions and answers are far more ambiguous and domain-specific than academic training sets.
  2. Domain tuning involves far more than just retraining ML models.
  3. Useful error analysis requires deep dives into specific QA failures (as opposed to broad statistical generalizations).
  4. Defining what counts as an error is itself embedded in the context of the customer's needs and the domain data. What counts as an error to one customer may be acceptable to another.
  5. Quiz-Bowl evaluations are highly constrained special cases of general QA, a point I made in 2014 here (pats self on back). Their lessons learned are of little value to the industry QA world (for now, at least).

I do hope you will read the brief paper in full (as well as the other excellent papers in the workshop).
