Sunday, November 28, 2004

Web services discovery

If you want to use a web service in your .NET applications, you have to know its URL in order to add a web reference to it. (If you encounter a problem, take a look at Troubleshooting add web reference problems.) You can set individual web references to .asmx files, but if your application uses many services this is not very handy. To solve this, Microsoft provides discovery information in two ways: statically with .disco files and dynamically with .vsdisco files. A .disco file contains markup that specifies references to a web service's WSDL file and to other DISCO documents. With dynamic discovery, the discovery document is generated at runtime.

When a .vsdisco file is requested, the .NET Framework analyzes the directory in which the .vsdisco file is located, as well as that directory's subdirectories, and returns markup that contains references to all web services found there (it is, however, possible to exclude some directories from the search).
A .vsdisco file looks like this:
<?xml version="1.0" ?>
<dynamicDiscovery xmlns="urn:schemas-dynamicdiscovery:disco.2000-03-17">
<exclude path="_vti_cnf" />
<exclude path="_vti_pvt" />
<exclude path="_vti_log" />
<exclude path="_vti_script" />
<exclude path="_vti_txt" />
</dynamicDiscovery>

Dynamic discovery is, however, disabled by default. With the .NET Framework 1.0 you had to remove the comments from the following lines in your machine.config. With the .NET Framework 1.1 these lines have been removed from machine.config, so you have to add them yourself.
<!--<add verb="*" path="*.vsdisco"
System.Web.Services, Version=1.0.3300.0, Culture=neutral,
PublicKeyToken=b03f5f7f11d50a3a" validate="false"/>-->

As you can see, dynamic discovery is accomplished in ASP.NET by mapping the file name extension .vsdisco to an HTTP handler that scans the host directory and its subdirectories for ASMX and DISCO files. For more info about HTTP handlers, definitely take a look at Using HTTP Modules and Handlers to Create Pluggable ASP.NET Components.
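
To make the scanning behaviour a bit more concrete, here is a minimal Python sketch of what such a handler does conceptually: read the excluded folders from the .vsdisco markup, walk the directory tree, and collect the .asmx and .disco files. This is an illustration only and not the actual ASP.NET implementation; the function names and the demo file names (Orders.asmx, Stock.asmx, ...) are made up for the example.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Namespace taken from the .vsdisco sample in the post.
NS = "{urn:schemas-dynamicdiscovery:disco.2000-03-17}"

VSDISCO = """<?xml version="1.0" ?>
<dynamicDiscovery xmlns="urn:schemas-dynamicdiscovery:disco.2000-03-17">
<exclude path="_vti_cnf" />
<exclude path="_vti_pvt" />
</dynamicDiscovery>"""


def read_excludes(vsdisco_xml):
    """Return the set of directory names listed in <exclude path="..."/>."""
    root = ET.fromstring(vsdisco_xml)
    return {e.get("path") for e in root.iter(NS + "exclude")}


def discover(root_dir, excludes):
    """Walk root_dir and return the relative paths of .asmx and .disco files,
    skipping every directory whose name appears in excludes."""
    found = []
    for current, dirs, files in os.walk(root_dir):
        # Prune excluded directories in place so os.walk does not descend.
        dirs[:] = [d for d in dirs if d not in excludes]
        for name in files:
            if name.lower().endswith((".asmx", ".disco")):
                rel = os.path.relpath(os.path.join(current, name), root_dir)
                found.append(rel.replace(os.sep, "/"))
    return sorted(found)


def demo():
    """Build a throwaway directory tree (hypothetical names) and scan it."""
    with tempfile.TemporaryDirectory() as tmp:
        for rel in ("Orders.asmx", "sub/Stock.asmx",
                    "_vti_pvt/Hidden.asmx", "readme.txt"):
            path = os.path.join(tmp, *rel.split("/"))
            os.makedirs(os.path.dirname(path), exist_ok=True)
            open(path, "w").close()
        return discover(tmp, read_excludes(VSDISCO))


if __name__ == "__main__":
    print(demo())
```

The service under _vti_pvt is skipped because that folder is excluded, which is the same effect the exclude elements have in a real .vsdisco file.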

Saturday, November 27, 2004


I was just wandering around on the net when I found this site with WiFi spots in Montreal. But since I live in Belgium, it would be nicer to find Belgian WiFi hotspots and... surprise, there actually is one.

Is WiFi finally becoming mainstream? From personal experience I find it still not very easy to use, and I actually got some problems after upgrading Windows XP to SP2. Before SP2, the connection speed switched between 1 Mbps, 10 Mbps, ... up to 54 Mbps. Now it just stays at 1 Mbps and doesn't change... anybody any ideas?

Well, one thing that has changed for the better with SP2 is that you can actually repair your connection now. If you right-clicked your connection pre-SP2 and clicked Repair, it just stalled... now it works.

Office 2003 development

Most people don't consider Office a full-blown development platform on which you can build nice solutions for your customers. With Office 2003 and Visual Studio Tools for Office you can, however, create applications that use all the features of Office 2003, such as charting (Excel), document handling (Word), and e-mail and calendaring (Outlook). In the past, if you had to program in Office, you had to use VBA. With Visual Studio Tools for Office, you can now write .NET applications for Office. Is that nice or what? Unfortunately, Visual Studio Tools for Office is only available for download for MSDN subscribers.

  • OfficeZealot: Huge amount of resources for Office 2003 developers

  • Building a professional stock allocation system using Visual Studio Tools for Office

  • Smart Tag Developer Tools

  • Bring the Power of Visual Studio .NET to Business Solutions Built with Microsoft Office

  • Secure and Deploy Business Solutions with Microsoft Visual Studio Tools for Office

  • Introduction to the Office 2003 Research Services Class Library

    Essential tools for web designers (... or wannabe web designers like me)

    The last few days I had to create a prototype CMS 2002 site for a customer; unfortunately I had to do it alone this time. Normally we use professional web designers for creating the layout of the pages, since the user interface is one of the essential parts of developing a web application (check out Useable Information Technology). Luckily I got some nice tips about which programs you can use to create great websites. If you want to create stylesheets, don't use Visual Studio .NET (it works, but not very nicely); definitely use TopStyle. There is a free version available for download, but the professional version is way better. Another nice tool is ColorImpact, which enables you to compose nice color schemes for your pages. And of course, last but not least, Adobe Photoshop (... but that you all know ...).

    Wednesday, November 24, 2004


    I bought myself a new scanner today, so I scanned some sketches from art class and uploaded them on my other blog.

    Monday, November 22, 2004

    InfoPathHelper : add offline support for InfoPath

    Some weeks ago, I blogged about a TechNet evening in which Jelle Druyts gave a demo about how to enable InfoPath for truly offline scenarios (Musings about InfoPath). Well, he has just released all the code showing how to implement such a scenario; take a look at InfoPathHelper: add offline support to InfoPath!

    Sunday, November 21, 2004

    Migrating to SharePoint 2003

    Both Windows SharePoint Services and SharePoint Portal Server are about a year old by now, and customers seem to be quite eager to migrate their SharePoint 2001 servers (also see this article: SharePoint is the number one product according to Ballmer). Since the next version of SharePoint (beta 1 expected in summer 2005) will probably not support backwards-compatible document libraries, this seems a wise move as well. Another reason is a migration to Windows 2003, since SharePoint 2001 is not supported on a Windows 2003 server...

    Microsoft KB820328: "SharePoint Portal Server 2001 will not function on Windows Server 2003 because of incompatibilities between the versions of the Microsoft Web Storage System and Microsoft Internet Information Services (IIS) 6. There are currently no plans to release an update to the Web Storage System in SharePoint Portal Server 2001 to make it compatible with Windows Server 2003."

    There are some tools available to ease a migration, such as SPIN and SPOUT. SPOUT is a document library export tool for SharePoint. It exports the workspace document library content as an XML manifest file plus the individual content files. The XML manifest file describes the different objects and the properties of the documents. After running, SPOUT will have generated 3 files:
  • Manifest.xml : all metadata

  • Migrate_spout.log : a log file

  • An error file with the extension .err - this file should be empty after running SPOUT

    I did some testing with SPOUT a while ago, just to get an idea of how fast it would be. I ran it in a test environment with Windows 2000 SP4, 512 MB RAM, SharePoint Portal Server 2001 SP2a and .NET Framework 1.1. On average I got export times between 1 and 3 seconds per file, with file size as the most important factor. Even if this ran twice as fast in a production environment, time would still be a limiting factor: exporting about 100,000 files would take about 27 hours. You should also take into account the size of the manifest.xml file, since XML parsing can become quite troublesome when your XML files contain a lot of data. (Tip: use XPath queries, since XML DOM manipulations can be very memory intensive; memory usage can run as high as 10 times the original file size.)

              Number of files   Size manifest.xml   MBs exported   Duration
    Test 1    69                96K                 42.9           41 sec
    Test 2    154               216K                48.6           55 sec
    Test 3    280               396K                56.8           1 min 11 sec
    Test 4    709               982K                167            3 min 6 sec
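
    The 27-hour figure is simple arithmetic, and a small Python sketch makes it easy to redo for your own numbers. It assumes a purely sequential export; the function name export_hours is made up for this example, and the "twice as fast in production" factor is the same guess as in the text above.

```python
def export_hours(file_count, seconds_per_file):
    """Estimated wall-clock hours for a purely sequential export."""
    return file_count * seconds_per_file / 3600.0


# Measured in the test environment: 1 to 3 seconds per file.
measured = (1.0, 3.0)

# Assumption from the text: production might run twice as fast.
production = tuple(s / 2 for s in measured)  # 0.5 to 1.5 seconds per file

low = export_hours(100_000, production[0])
high = export_hours(100_000, production[1])
one_sec = export_hours(100_000, 1.0)  # midpoint of the production range

if __name__ == "__main__":
    print(f"{low:.1f} to {high:.1f} hours, {one_sec:.1f} hours at 1 s/file")
```

    At 1 second per file, 100,000 files come to roughly 27.8 hours, which is where the estimate above comes from.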

    If you want to try out some alternative tools for export and import, you should definitely check out this GotDotNet workspace. Before starting a migration, take a look at the following links:
  • Technet article - migrating from SharePoint 2001

  • Migrating documents into SharePoint - tools and tips to improve performance

  • Overview SharePoint migration

  • Hotfix SPIN

  • Import problem SPIN for subsites - no hotfix available

    Friday, November 19, 2004

    Reflector for .Net + addons

    A tool definitely worthwhile for .NET developers is Reflector for .NET. Reflector is a class browser for .NET components. It supports assembly and namespace views, type and member search, XML documentation, call and callee graphs, IL, Visual Basic, Delphi and C# decompilers, dependency trees, base type and derived type hierarchies, and resource viewers. There are also some nice add-ons for this tool:
  • Reflector.FileDisassembler: you can use it to dump the decompiler output to files in any Reflector-supported language (C#, VB.NET, Delphi). This is extremely useful if you are searching for a specific line of code, since you can then use VS.NET's "Find in Files", or if you want to convert a class from one language to another.

  • Reflector Diff - original release and Reflector Diff 0.6 Beta

    SharePoint : Ghosted vs unghosted pages

    The concept of ghosted pages is something that you have to understand when customizing SharePoint. All items in SharePoint are by default stored in the database, but some aspx pages are stored not in the SharePoint database but on the file system, e.g. default.aspx for each site and also the search.aspx page for SharePoint Portal Server. These pages are called ghosted pages. They are pulled from the cache at runtime, which increases the scalability of the system, since all uncustomized pages are reused across all of the sites and there is no unnecessary data storage or retrieval.

    But these ghosted pages can become unghosted when they are modified with e.g. FrontPage. However, FrontPage is not the only culprit: when you modify one of these pages through web folders with Notepad, the page will also become unghosted. Unghosted pages are stored in the database. Normally, if a page that is used in a site definition is changed, the change applies to all sites created with this site definition, but if your page is unghosted this will not happen. There is also a slight performance difference of about 10% between ghosted and unghosted pages, because unghosted files are read from the database instead of from the cached file system. You can check whether a page has become unghosted with the ghosthunter utility or by checking the vti_hasdefaultcontent field obtained through the Properties property of the SPFile object.

    There have been a lot of postings about ghosted and unghosted pages and the role of FrontPage in this issue. Check them out; they provide more detail about, for example, the difference in parsing:

  • What you don't know about FrontPage can hurt you?

  • Dustin Miller's response to the previous article

  • Ghosted and unghosted pages Part 1 on BlueDogLimited

  • Don't kill the messenger ...
  • Web part page + DVWP != always unghosted : nice tip for using the dataview webpart and not unghosting your page
    Wednesday, November 17, 2004

    SharePoint development part I - Webparts

    When we think about developing on the SharePoint platform, the first thing that comes to mind is webpart development but there are actually more development tasks with SharePoint:

    1. Web part development : one of the SharePoint development topics which gets the most attention

    2. Developing with SharePoint lists : involves creating xml schema definitions, writing SharePoint Object Model code, adding your own UI

    3. Writing custom workflow: SharePoint does not provide workflow out of the box but allows you to add your own workflow through the use of event handlers

    4. Customizing SharePoint UI : ranges from simple things like changing stylesheets, images and logos to creating completely new site templates

    5. Extending and customizing SharePoint search

    In the coming weeks I will write more postings about SharePoint but in this first posting I'm going to focus on webpart development. You should approach webpart development as any other programming task.

    1. Make sure you understand the basics. A good article to start with is A developer's introduction to webparts on MSDN or The definitive hello world webpart from John Durant

    2. Think about the design and what you want to accomplish. Since webparts are basically enhanced ASP.NET server controls which live in the SharePoint context, the number of options is immense. But you also have to think about the enhancements which are provided by the webpart framework. So if you just want to display data, consider using the dataview webpart first instead of immediately starting with a datagrid. Know the potential of connectable webparts. Definitely check out the 3 articles from Patrick Tisseghem:

    3. Developing and debugging: for debugging, definitely check out this posting:
      Debugging Web Parts - a full explanation of the requirements. Something which is also easily forgotten is that you can force a break into the debugger from within your code with the statement System.Diagnostics.Debugger.Break().

    4. Deploying and testing: deploying webparts isn't that simple, but there are some nice tools out there to help you with the deployment, such as InstallAssemblies. I recommend, however, doing all the steps manually a couple of times; this will help you understand how SharePoint works. One of the nicest tools to aid in deployment is WPPackager, which creates MSIs to install the webparts. For testing your webparts you should take a look at
      Testing webparts checklist on MSDN

    More links
  • Tips for building webparts from Daniel McPherson
  • Webcast - The power of the dataviewwebpart

  • SPSFAQ customization section
  • More SharePoint customization tips :here and here
  • Server.Transfer won't work in a WebPart

  • Server configurations which may lead to web part failures
    Tuesday, November 16, 2004

    My boss is blogging and also....

    Danny has a blog (only in Dutch), and so does Jurgen, one of the guys who is doing his work placement at Dolmen.

    Monday, November 15, 2004

    Enterprise Library and Application blocks

    I have been using the .NET application blocks for quite a while now, and it is great to see that they will have a successor in the future; check out the blog of Scott Densmore, Enterprise Library 'The Day After'. For those of you who don't know the application blocks, take a look at them. Application blocks are basically pieces of code which can be freely downloaded from MSDN; they contain samples and source code, so you can easily extend them. They are written with best practices in mind and are nicely documented.

    Overview application blocks
  • Exception Management application block - I have used it in just about every project I have done in the last couple of years. This block allows publishing of errors to the event log, XML files or databases without you having to write all the plumbing code. It is not even that hard to write your own exception publishers

  • Data Access application block: I use it less since I prefer the XS2 SQLdataaccessor from Sunblad

  • Aggregation application block

  • Caching application block

  • Updater application block

  • User interface process application block : looks very nice, definitely going to take a look at it when I find the time

    Sunday, November 14, 2004

    Heather's resume blog idea

    Heather Leigh posted this . Link to her blogposting instead of sending a resume. Pretty innovative ...

    MSN Search beta

    Lots of people have already blogged about it, and I think it looks nice indeed: the new MSN Search arrived on the 11th of November. For Belgium go to . I especially liked the "link:http//" search; it returned 15 results, Google only one.

    If you want to know what other people say about it, take a look at
  • MSN Search - Tisseghem

  • MSN Search Beta First Take

  • Google index double

  • New MSN Search Service

  • MSN Search Beta is now live

  • MSN Search

  • Msn Search Team are blogging

  • Microsoft Crawling Google Results For New Search Engine?
    Saturday, November 13, 2004

    Office Information Bridge Framework (IBF)

    Office Information Bridge Framework (IBF) is a new solution that provides a standardised way for developers to integrate data from enterprise applications (CRM, HR, ERP, ...) into Office. IBF is an example of a service-oriented architecture in which your LOB applications are connected to Office through a web services layer. On the client side, IBF leverages the smart tag and smart document functionality of Office 2003 Professional.

  • Download IBF

  • MSDN technical whitepaper about IBF

  • Technical overview IBF

  • Video about IBF on OfficeZealot

  • How to Build Solutions with the Information Bridge Framework
    Office for Small Business Management

    Office for Small Business Management has been announced. This version will include the familiar Microsoft Office 2003 programs as well as an updated version of Microsoft Office Outlook 2003 with Business Contact Manager and Microsoft Office Small Business Accounting, a new, comprehensive financial management package. So also take a look at . I guess that when Microsoft gets the localization issue right, it will even become popular worldwide. But don't put your bets on it yet: I overheard someone saying that Microsoft seems to think that localization is the difference between English UK and English US... :-)

    Friday, November 12, 2004

    MSCMS - Disabling delete for authors

    The last couple of months I have been doing a lot of Microsoft Content Management Server (MSCMS) development and I think it definitely allows you to do some cool stuff. For those of you who don't know, MSCMS allows users without any html knowledge or special tools to publish content to a corporate website while maintaining a common look and feel and supporting an approval process for all of your published content.

    The product does, however, have some shortcomings, and last week the functional analyst of our project stumbled on one of them. For every posting you can define an approval process with authors, editors and moderators: after an author creates a posting, the editor first approves the layout and then the moderator approves the content (this is a very quick overview). So for just about every change you need to go through this approval process, EXCEPT for the deletion of postings.

    The obvious thing to do to disable the delete for authors was to add an ASP.NET Panel control around the delete section in defaultconsole.ascx and set its Visible property to false when the user has no editor or approver rights. Well, this turns out not to be so simple:

  • Microsoft.ContentManagement.Publishing.CmsContext.UserCanApprove : you can't use this one, since it is a site-wide check; if the user has editor or moderator rights anywhere in your site, this property will return true.

  • Microsoft.ContentManagement.Publishing.CmsContext.Posting.CanApprove : seemed promising, since it checks the rights for this specific posting, but unfortunately it also takes the posting state into account, so when your posting is published it will return false even when the current user has sufficient rights to approve.

  • Microsoft.ContentManagement.Publishing.CmsContext.PostingApprovers() : returns a Microsoft.ContentManagement.Publishing.UserCollection with all the approvers, but it also takes the posting state into account.

    So basically I'm stuck, anybody any ideas....

    GotDotNet CMS code samples and more

  • MCMS Plumtree Integration Pack v.2 CODE
  • Building MCMS 2002 sites without Visual Studio.NET
  • XHTML compliant MCMS placeholder
  • Switch placeholders based on custom property
  • RSS aggregator placeholder control and a little more explanation about it on Stefan's blog.

  • Multi-site Development with Microsoft Content Management Server 2002 on Windows XP Professional

  • Blog posting about CMS revisions
    Thursday, November 11, 2004

    SOA - Part 2

    I had a little spare time so I decided to run through all the blog postings which I had marked for review or follow up in RSS Bandit. I found these interesting postings about SOA
  • David Chappell on why SOA will help business-oriented software reuse

  • What is SOA? and his follow up posting SO != WS

  • It seems that Sam Gentile did a talk about SOA for Boston.NET. Download the slides over here.
    Tuesday, November 09, 2004

    What I'm reading.

    The books I'm currently reading:
  • SharePoint Products and Technologies Resource Kit (ISBN 073561881): an invaluable resource if you are interested in doing SharePoint development

  • XML Web Services and Server Components Development with VB.Net - Study Guide exam 70-310 (ISBN 007222653) : I know, I should probably read the C# version, but hey I found this one at a book sale

  • Developing windows-based applications - Self paced training kit (ISBN 0735615330) : The last couple of years I have mainly done webdevelopment so it is probably time to get up to speed in winform development as well

  • Building Portals, Intranets and Corporate Web Sites using Microsoft Servers: Nice overview of the different server products, not very technical
    Sunday, November 07, 2004

    Service Oriented Architecture (SOA) and other stuff....

    One of the hottest topics lately seems to be Service Oriented Architecture and how to implement it. But it is also a major source of confusion, and I think that for most developers SOA is still a big mystery. The main problem seems to be that we have to translate these high-level concepts into low-level code, because in the end we just like to write code. However, I still think that SOA is an important concept that every developer needs a basic understanding of, so over the next couple of weeks I'm going to try to write some postings about SOA, providing summaries of presentations that I saw and links to interesting sites and blogs.

    Let's start with the basics: what is SOA? At PDC 2003 Pat Helland gave a presentation about SOA. The key message was the following:

  • A service-oriented architecture is one where an application is cut into pieces called services

  • These services are invoked by messages; a message is a container for information which is transmitted from one service to another. Since services only communicate through messages, we create explicit boundaries for all our application building blocks.

  • Services allow for loosely coupled relationships, the only thing that needs to be defined is the contract for the message. Integration is based on message formats and exchange patterns, not on classes and objects.

  • Services are policy based and autonomous

    Definitely check out Pat's blog if you want to read more interesting stuff...

    A couple of months later I saw a presentation from Clemens Vasters; he mainly talked about the same concepts, but things got a little clearer. There are only 2 types of services: message producers and message consumers. The messages flow through "pipelines"; a pipeline is a sequence of services, and each service transforms the message. He also gave some concrete tips:

  • The message should be XML based and not binary

  • Use ASP.Net webservices and extend these through WSE to add routing, security,...

  • Do things you have already been doing : isolate out business and data tiers, use serviced components, use stored procedures for data...

  • Think asynchronous
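
    The pipeline idea (a sequence of services, each one transforming an XML message) can be sketched in a few lines of Python. This is only a toy illustration of the concept, not anything shown in the presentations; the order message, the service names and the approval threshold are all invented for the example.

```python
import xml.etree.ElementTree as ET

# A hypothetical order message; the only thing the services share is this format.
ORDER = ('<order id="42">'
         '<item qty="2" price="10.50"/>'
         '<item qty="1" price="4.00"/>'
         '</order>')


def price_service(message):
    """Consume an <order> message and produce one with a <total> added."""
    order = ET.fromstring(message)
    total = sum(int(i.get("qty")) * float(i.get("price"))
                for i in order.iter("item"))
    ET.SubElement(order, "total").text = f"{total:.2f}"
    return ET.tostring(order, encoding="unicode")


def approval_service(message):
    """Consume the priced order and stamp it with an approval status."""
    order = ET.fromstring(message)
    total = float(order.findtext("total"))
    # The threshold of 100 is an arbitrary value for the example.
    order.set("status", "approved" if total <= 100 else "needs-review")
    return ET.tostring(order, encoding="unicode")


def run_pipeline(message, services):
    """Pass the message through each service in turn."""
    for service in services:
        message = service(message)
    return message


result = run_pipeline(ORDER, [price_service, approval_service])

if __name__ == "__main__":
    print(result)
```

    Note that the two services never share objects or classes: each one only parses the XML text it receives and emits new XML text, so the message format is the whole contract between them.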

    Now if you want to know more about it, check out:
  • Architect exchange

  • MSDN .Net architecture center

  • Rich Turner about "When to use ASMX, ES or Remoting": some nice guidelines about how to actually implement all this stuff.

  • Selling Service Orientation and Indigo

  • MSDN TV: Service Orientation and Today's Technologies

    Dolmen is hiring

    The Belgian company I work for, Dolmen, is looking for new people, check out our job event and the profiles that we are looking for. By the way, also take a look at the blogs of two other Dolmen employees, Ken and Bart

    Is it going too fast?

    Everything just goes a little too fast... people who didn't even start developing so long ago, in the mid-nineties like me, have seen something like 3 different development environments (VS 6, VS.NET 2002/2003 and VS.NET 2005), 3 different server operating systems (NT4, Windows 2000 and Windows 2003), 4 versions of Office (97, 2000, XP and 2003), ... Well, I guess you get the point... What do you think about it?

    Blogger went down

    Yesterday Blogger suddenly crashed and gave the great error listed underneath... I just love a nice stacktrace.... ;-)

    java.lang.RuntimeException: Unable to load blog object for blogID: 7753577
    at com.pyra.blogger.UserBlog.getBlog(
    at org.apache.jsp.blog_pyra._jspService(
    at org.apache.jasper.runtime.HttpJspBase.service(
    at javax.servlet.http.HttpServlet.service(
    at org.apache.jasper.servlet.JspServletWrapper.service(
    at org.apache.jasper.servlet.JspServlet.serviceJspFile(
    at org.apache.jasper.servlet.JspServlet.service(
    at javax.servlet.http.HttpServlet.service(
    at org.apache.catalina.core.ApplicationDispatcher.invoke(
    at org.apache.catalina.core.ApplicationDispatcher.doForward(
    at org.apache.catalina.core.ApplicationDispatcher.forward(
    at com.pyra.blogger.frontend.PyraJspDispatchPipe.invoke(
    at com.pyra.blogger.frontend.IdentityCookiePipe.invoke(
    at javax.servlet.http.HttpServlet.service(
    at javax.servlet.http.HttpServlet.service(
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(
    at org.apache.catalina.core.StandardWrapperValve.invoke(
    at org.apache.catalina.core.StandardPipeline$StandardPipelineValveContext.invokeNext(
    at org.apache.catalina.core.StandardPipeline.invoke(
    at org.apache.catalina.core.ContainerBase.invoke(
    at org.apache.catalina.core.StandardContextValve.invoke(
    at org.apache.catalina.core.StandardPipeline$StandardPipelineValveContext.invokeNext(
    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(
    at org.apache.catalina.core.StandardPipeline$StandardPipelineValveContext.invokeNext(
    at org.apache.catalina.core.StandardPipeline.invoke(
    at org.apache.catalina.core.ContainerBase.invoke(
    at org.apache.catalina.core.StandardContext.invoke(

    .... and so on.

    Friday, November 05, 2004

    How to create a webpart in ASP.NET 2.0

    A couple of days ago I wrote a blog entry referring to a thread about the differences between webparts in SharePoint and ASP.NET 2.0. Well, for those of you who want to dive right in, check out this article about How to create a WebPart in ASP.Net 2.0

    Thursday, November 04, 2004

    More Office 12 details ...

    Microsoft Watch is a great site, and it regularly posts some hot stuff such as this one: Office 12 details begin to trickle out. The first beta will be available in August 2005 and a final release is planned for July 2007. As part of the Office Server lineup we can also expect a new version of both CMS and SharePoint, as well as some new server products: one for Excel and one for InfoPath. These are truly exciting times we are living in ;-)....

    Wednesday, November 03, 2004

    Citrix Metaframe 3.0

    The last couple of days I worked at a customer who was using Citrix MetaFrame Presentation Server 3.0. It was the first time that I saw this product, and I have to admit that I'm impressed. It enables users to start their everyday applications from within a browser environment. It was great to see that the web interface uses ASP.NET, and the nice thing is that you can completely customize it. If you run into problems, definitely check out the Citrix support site.

    After developing the UI we deployed it to a production server on which a previous version of MetaFrame was running as well, so we didn't do an install but copied the directory with the files and created the necessary virtual directories. Everything worked, except that the ICA files were not handled correctly. It turned out that we still had to associate the ICA extension with the ASP.NET DLL in IIS.

    After installing ASP.NET, however, we noticed that some of the users who were still using Citrix 1.5 had problems logging into the web applications. They got a very strange error, "error while encoding a gif file". It seems that some user rights were removed while installing ASP.NET. If you encounter this error, just give the anonymous user access to the nfuseicons directory.

    Integrating SharePoint and Content Management Server

    One of the most frequently asked questions lately seems to be how to integrate SharePoint and Content Management Server. People often start working together on documents in SharePoint and later want an easy way to publish these documents to CMS (that authors can't work together on a CMS posting, because of the locking principle, is definitely a missing feature in CMS). With the release of the SharePoint Connector for Content Management Server (previously codenamed Spark) in February 2004, some of these things were solved.

    In short the connector enables the following :

    • Content from CMS can be displayed in SharePoint webparts

    • Documents in SPS can be exposed to MCMS managed sites

    • Integrating SPS search with the content in CMS

    Miscellaneous links

    Monday, November 01, 2004

    SharePoint in only 28 steps ;-)

    This SharePoint Developer in 28 steps from Gregory is definitely a great posting,... I'm wondering how much time you would need to master all this stuff...

    SharePoint multilanguage features

    When you are living in a country which has 3 official languages like me (yes, we have Dutch, French and German in Belgium), multilanguage definitely becomes an issue when doing a portal implementation. Unfortunately, multilanguage is not a very strong feature in SharePoint. Windows SharePoint Services allows you to choose a different language at the moment of site creation when you have installed language packs; once the site is created it is not possible to switch the language anymore. For SharePoint Portal Server it is even worse: it only supports the language of the SharePoint installation, so if you want to have a French portal you have to install the French version of SharePoint Portal Server. There are, however, some articles which can guide you, but better multilanguage functionality is definitely something I want to see in the next version of SharePoint. One of the things I noticed is that it is very hard to find out how thesaurus files work in SharePoint. The best documentation is found in the SPS 2001 Resource Kit and still seems to be correct. Another interesting document is the one on TechNet which describes the international features of SharePoint 2003.

    I tried out the expansion set feature in SharePoint 2003 to see how it works with files in different languages and different formats. SharePoint uses thesaurus files to enhance its search functionality. The thesaurus allows you to search not only for the search term itself, but also for synonyms and other matching words, like words with the same stem. You can expand the thesaurus by adding tags to the thesaurus file(s). For example, when a user searches for ‘apple*’, you want to automatically search for ‘pomme*’ and ‘appel*’ as well, so that documents containing the French and Dutch translations of the word ‘apple’ will be added to the search results. This is an example of stemming. An expansion set for the word "author" could, for example, make the search also return documents containing "writer".

    There are different thesaurus files for every language. These are XML files, e.g. tsneu.xml is the neutral thesaurus file, tseng.xml is for English and tsnld.xml is for Dutch. The easiest way to test it is to modify the XML files at local_drive\Program Files\SharePoint Portal Server\Data\Applications\Application UID\Config and then restart the Microsoft SharePointPS Search service.
    I added the following lines to the XML files:
    <sub weight="0.5">author</sub>
    <sub weight="0.5">writer</sub>
    Well, it seems that a different thesaurus file is used depending on the file type; this happens because of the way iFilters are implemented.
    Then I tried the search for different file types :

  • A plain text file : uses tsneu.xml and none of the language-specific ones

  • A Word document : works with tsenu.xml (English) and also with tsneu.xml (neutral). So for Office documents the language is actually recognized

  • PDF files don’t seem to use the thesaurus files at all. I tried it with iFilter 5.0 and it didn’t use any of the thesaurus files. It seems that iFilter 6.0 was released last week, so I'll try that one later
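
    To show what an expansion set does to a query, here is a small Python sketch. It assumes that the two sub lines shown earlier sit inside an expansion element; that wrapper, the simplified thesaurus root without a namespace, and the function names are assumptions made for this example. How the real search service applies the weights is not modelled here.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical thesaurus fragment built around the two <sub> lines.
THESAURUS = """<thesaurus>
  <expansion>
    <sub weight="0.5">author</sub>
    <sub weight="0.5">writer</sub>
  </expansion>
</thesaurus>"""


def load_expansion_sets(xml_text):
    """Return a list of {word: weight} dicts, one per <expansion> element."""
    sets = []
    for exp in ET.fromstring(xml_text).iter("expansion"):
        sets.append({sub.text.strip().lower(): float(sub.get("weight", "1.0"))
                     for sub in exp.iter("sub")})
    return sets


def expand(term, sets):
    """Return {word: weight} for a search term.

    The term itself always gets weight 1.0; the other members of any
    expansion set containing the term get the weight listed in the file.
    """
    term = term.lower()
    result = {term: 1.0}
    for expansion in sets:
        if term in expansion:
            for word, weight in expansion.items():
                result.setdefault(word, weight)
    return result


sets = load_expansion_sets(THESAURUS)

if __name__ == "__main__":
    print(expand("author", sets))
    print(expand("apple", sets))
```

    A search for "author" expands to "author" and "writer", while a term that is in no expansion set, such as "apple" here, is left as it is.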

  • Miscellaneous SharePoint links: