Document Mining 2.0: meaning extraction


A very interesting post on “Why meaning extraction?” examines recent developments in search and document mining, applied in that case to intelligence reports.

It looks back at a few areas where search engines have progressed over the last 10 years: increasing the size of the index database, expanding the search semantically (more precisely, this is query expansion, because the semantic level is limited to synonymy at best), parsing the query with Natural Language Processing, and providing an interactive UI to massage and refine the results. The author claims that none of these techniques has brought any significant improvement to the search experience over the last few years.
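To make the "synonymy at best" point concrete, here is a minimal sketch of synonym-based query expansion. It is only an illustration, not how any particular search engine does it: the `SYNONYMS` table and the `expand_query` and `matches` helpers are hypothetical names invented for this example, and a real system would draw its synonyms from a thesaurus or lexical database.

```python
# Hypothetical synonym table, invented for this illustration.
SYNONYMS = {
    "car": ["automobile", "vehicle"],
    "report": ["briefing", "memo"],
}


def expand_query(query, synonyms=SYNONYMS):
    """Return one group of alternatives per query term:
    the term itself followed by any synonyms listed for it."""
    groups = []
    for term in query.lower().split():
        groups.append([term] + synonyms.get(term, []))
    return groups


def matches(document, groups):
    """A document matches when every group has at least one
    alternative among the document's whitespace-separated words."""
    words = set(document.lower().split())
    return all(any(alt in words for alt in group) for group in groups)


groups = expand_query("car report")
print(matches("Briefing on the stolen automobile", groups))  # True
print(matches("Report on the stolen bicycle", groups))       # False
```

The expanded query finds the first document even though it shares no word with the original query, but the matching is still word substitution. Nothing here knows what the report is about, which is the gap that meaning extraction is supposed to fill.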

Although I do not necessarily agree that none of these technologies has helped, I strongly concur that in isolation they are not enough: more efficient search comes from a mix of search improvements, analytics tools and a good UI built to address the specific needs of specific verticals. But it’s time to “Stop searching, start finding”, as the author points out.

“Search engines must evolve to have in-depth understanding of the searched material. Beyond search, categorization, faceted navigation, and entity extraction, which we all understand by this point, the future of search is meaning extraction”, also known as semantic documents, one of the promises of Document 2.0 (or beyond).
