Tuesday, April 29, 2008

MOSS Search feature: definition extraction

Recently I discovered an interesting new feature within MOSS Search. (Apparently this guy had the same experience: "What people are saying - definition extraction".) Take a look at the last search result in the next screenshot ...


So how does this work? I found only a couple of online references; the most complete one is in the MSDN forums: "Discovered definitions/What people are saying about ..."

The Definition Extraction feature finds definitions for candidate terms, and identifies acronyms and their expansions, by examining the grammatical structure of sentences that have been indexed (for example, for terms such as NASA, radar, and modem). It is only available for English. In MOSS 2007, the feature extracts the meaning of a term from indexed text: the user enters a search query “X”, and search returns from the document index a ranked list of sentences containing definitions of “X”, such as “X is Y”, with links to the documents in which the definitions were found. In the MOSS implementation, definitions are extracted from free text rather than from glossaries.
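To make the “X is Y” idea a bit more concrete, here is a minimal Python sketch of pattern-based definition extraction. This is not how MOSS implements it (the actual algorithm is not documented in the sources I found); the regular expression, the naive sentence splitter, and the function names are all my own assumptions, purely for illustration.

```python
import re

# Hypothetical pattern (an assumption, not the MOSS algorithm):
# an optional leading article, a short term, "is"/"are", then a
# definition that itself starts with an article.
DEFINITION_PATTERN = re.compile(
    r"^(?:(?:An?|The) )?"
    r"(?P<term>[\w-]+(?: [\w-]+){0,2}?) "
    r"(?:is|are) "
    r"(?P<definition>(?:an?|the) .+?)[.!?]?$"
)


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_definitions(text):
    """Return (term, sentence) pairs for sentences shaped like 'X is Y'."""
    results = []
    for sentence in split_sentences(text):
        match = DEFINITION_PATTERN.match(sentence)
        if match:
            results.append((match.group("term"), sentence))
    return results


sample = (
    "A modem is a device that converts signals. "
    "We bought three of them. "
    "NASA is the United States space agency."
)
for term, sentence in extract_definitions(sample):
    print(term, "->", sentence)
```

A real implementation would need proper linguistic analysis rather than a single regular expression, which is probably one reason the feature is limited to English.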

The Definition Extraction feature is integrated with the Search feature at crawl/indexing time and at query time. During the crawl, tokens with alternate definitions are added to the search database. At query time, the search token that was passed in is compared with the existing entries in the definitions database. If a match is found, the definitions link is populated at the bottom of the search results page. When collapsed, the link shows the number of definitions.
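The crawl-time/query-time split described above can be sketched roughly as follows. Again, this is only an illustration under my own assumptions: `DefinitionStore`, `render_definitions_link`, the in-memory dictionary, the case-insensitive matching, and the exact link text are hypothetical and say nothing about the real MOSS search database schema.

```python
from collections import defaultdict


class DefinitionStore:
    """Toy in-memory stand-in for the definitions database (hypothetical)."""

    def __init__(self):
        self._by_term = defaultdict(list)

    def add(self, term, sentence, document_url):
        """Crawl time: record a definition sentence under its term."""
        # Lower-casing the key is an assumption made for this sketch.
        self._by_term[term.lower()].append((sentence, document_url))

    def lookup(self, query):
        """Query time: return stored definitions, or an empty list."""
        return list(self._by_term.get(query.strip().lower(), []))


def render_definitions_link(store, query):
    """Return the link text for the results page, or None if no match."""
    definitions = store.lookup(query)
    if not definitions:
        return None
    # Collapsed link: show only how many definitions were found.
    return f"What people are saying about {query} ({len(definitions)})"


store = DefinitionStore()
store.add("NASA", "NASA is the United States space agency.",
          "http://example.invalid/doc1")
print(render_definitions_link(store, "nasa"))
print(render_definitions_link(store, "radar"))
```

The point is simply that the expensive work (finding definition sentences) happens once during the crawl, so the query only needs a cheap lookup.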

It’s a default setting and cannot be customized. You can, however, turn off Definition Extraction by editing the Search Centre results page in question, modifying the Search Core Results web part, and turning off ‘Display Discovered Definition’.

Unfortunately, the white paper "Plan for building multilingual solutions" seems to confirm that definition extraction only works for English.

