Wednesday, January 19, 2005

Musings about search - Search on the server

It is a nice feature to be able to search everything on your desktop, but what about all the data in the other data stores at your company (file shares, websites, Exchange public folders, Notes databases, ...)? This is an area which I think is still largely underserved. This year I mainly focused on two search alternatives, SharePoint Portal Search and MondoSearch, and I actually managed to do some projects with these search engines as well.

About MondoSearch
I already posted an evaluation of MondoSearch a couple of months ago (check out CMS Search with Mondosoft), but let's add some remarks:
  • Be careful when you edit the host that you want to crawl - this will reset your grabmap and all associated settings - fortunately it doesn't reset the category map

  • To check the version of Mondosoft you are using, you can go to any InSite page, view the page source and look for the meta description field

  • In theory it is possible to copy the categories you have defined on one server to another by copying C:\MondoSearch\SearchHost\data\MssCat.cfg, C:\MondoSearch\SearchHost\data\MssCatText.xml and C:\MondoSearch\SearchHost\data\MssCatTextPreview.xml to your new server. However, you have to be absolutely sure that both of your servers are configured identically. In practice, don't do it: it will definitely go wrong

  • I already said that Mondosoft support is very good, and after experiencing some problems during an install, I have to say it again: these guys have an excellent support team
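The version check from the remarks above (view source, then look for the meta description field) can also be done with a few lines of script. This is only a minimal sketch in Python using the standard library: the sample page source and the text inside its description field are made up for illustration, and a real InSite page will look different.

```python
from html.parser import HTMLParser
from typing import Optional


class MetaDescriptionParser(HTMLParser):
    """Collects the content of the first <meta name="description"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.description: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only look at meta tags, and stop after the first match.
        if tag != "meta" or self.description is not None:
            return
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "description":
            self.description = attr_map.get("content", "")


def extract_meta_description(page_source: str) -> Optional[str]:
    """Return the meta description of a page, or None if there is none."""
    parser = MetaDescriptionParser()
    parser.feed(page_source)
    return parser.description


# Hypothetical page source, NOT copied from a real InSite page.
SAMPLE_PAGE = """<html><head>
<title>Example page</title>
<meta name="description" content="Example product, version x.y">
</head><body></body></html>"""

print(extract_meta_description(SAMPLE_PAGE))
```

To use it against a real server, save the source of an InSite page to a file and pass its contents to extract_meta_description.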

About SharePoint Portal Server Search
Some people argue that search isn't the best way to retrieve your documents and that it is better to organize your documents through the use of categories (SPS2001 terminology) or areas (SPS2003 terminology). I still consider search to be an essential part of SharePoint Portal. We have done some projects using SPS search, especially customizing the search or using SPS search to create our own rollup webparts. Let's see what we have found:
  • SharePoint search seems to scale better than in the previous version. The SharePoint capacity planning whitepaper states: "The performance of the indexes degrades when the number of documents in the index exceeds 5 million documents. Have no more than four indexes on each indexer in a farm configuration." With four indexes of up to 5 million documents each, this suggests that in a webfarm setup you can index about 20 million documents per indexer.

  • If you encounter issues with search, you will have a hard time finding the cause of those issues, since search is actually composed of a number of components

  • SharePoint search seems to have some glitches when you have multilingual content, ...

  • The SharePoint search UI is very customizable - check out the MSDN article How to Customize Your Search Using SharePoint Portal Server 2003 and the search.aspx documentation

  • SharePoint is very good at retrieving information for a specific search term, but unfortunately it is not that good at showing the most relevant results first. Read more about it in "SharePoint Search Results - what do you expect" or "Search result relevancy test UltraSeek vs SharePoint" (I never tried Ultraseek so I can't say much about it ...)

About Coveo Enterprise Search
I downloaded Coveo Enterprise Search last week after seeing the posting from Angus Logan. Coveo is an enterprise search engine and you are free to use it if you index fewer than 5000 documents. It definitely is an ideal solution for small MS CMS deployments. I will write about it in one of my next postings. I actually tried it out with an existing CMS site in a Windows 2003 setup within Virtual PC, and it is very easy to set up and very intuitive. I did however experience some timeouts when performing crawls, maybe related to trying it out in Virtual PC.
