Working with Webhits

August 1999

Microsoft Corporation

Introduction

Webhits is an extension to Microsoft® Windows NT® Internet Information Server (IIS) for adding hit-highlighting capabilities to search applications created with Microsoft Site Server 3.0 Commerce Edition. Webhits is contained in Index Server 2.0, which is available with the Web download of Windows NT 4.0, or Internet Information Server 3.0. Webhits is implemented as an Internet Server API (ISAPI) application, called Sswebhit.dll, that is invoked during a search.

To download Windows NT 4.0, visit http://www.microsoft.com/NTServer/all/downloads.asp.

Note   You must have a search catalog already set up through Microsoft Management Console (MMC) or Web Admin before running the Webhits default search page. For information on how to set up a catalog, select the Configuring Search option located at http://localhost/sites/samples.

Webhits comes with a ready-made results page, called Searchright.asp. This results page includes a link to the ISAPI application, Sswebhit.dll, for each document returned. When you perform a query from this default search page, the hit-highlight section code in Searchright.asp performs the following steps:

  1. The query string is directed at a specific document and escaped.

  2. Formatting restrictions in the hit-highlight section define how the query string is to be highlighted in that specific document.

  3. An .stw template file, which displays either summary or full views of the query hits for the specific document, is referenced.

Figure 1 shows the Webhits results page with both summary and full hit-highlight reference displays.

Figure 1. Sample results page

Incorporating Webhits into an Existing Web Site

Using Webhits, you can modify your existing .asp file to add hit highlighting to its search results.

Modifying an Existing Results Page

Site Server 3.0 Commerce Edition uses the same hit-highlight options as Index Server. The following procedure describes how to add the hit-highlight code to your existing .asp file. All sample code can be found in the Searchright.asp file.

To add a hit-highlight section to your existing .asp file

  1. In your existing .asp file, define a hit-highlight section as follows:
    <% if (InStr( RS("DocAddress"), "http:") = 1) then %>
    
    <% end if %>
    
  2. Within this hit-highlight section, define the hit-highlight link using the Webhits parameters, as seen in the following example:
    ' Set query and utility objects, and define query object properties.
    Set Q = Server.CreateObject("SSSEARCH.Query")
    
    ' Set query properties on Q here.
    
    ' Set the recordset
    set RS = Q.CreateRecordSet("sequential")
    
    ' Construct the URL for hit highlighting
    ' Define the file to highlight using the CiWebHitsFile parameter
        WebHitsQuery = "CiWebHitsFile=" & RS("DocAddress") 
    
    ' Define the query using the CiRestriction parameter, which must be escaped
        WebHitsQuery = WebHitsQuery & "&CiRestriction=" & Server.URLEncode( Q.Query )
    
    ' Define the formatting for the query results
        WebHitsQuery = WebHitsQuery & "&CiBeginHilite=" & Server.URLEncode( "<strong class=Hit>" )
        WebHitsQuery = WebHitsQuery & "&CiEndHilite=" & Server.URLEncode( "</strong>" )
        'WebHitsQuery = WebHitsQuery & "&CiQueryFile=" & "/definecolumns.txt"
        'WebHitsQuery = WebHitsQuery & "&CiLocale=" & Q.LocaleID
        QueryForm = Request.ServerVariables("PATH_INFO")
        WebHitsQuery = WebHitsQuery & "&CiUserParam3=" & QueryForm
    
  3. To reference either the summary (Qsumrhit.stw) or full (Qfullhit.stw) hit-highlight template files, add the following line to your hit-highlight section. The reference display then appears to the left of each link on the results page.

    For summary hit-highlight, reference Qsumrhit.stw:

    <a href="oop/qsumrhit.stw?<%= WebHitsQuery %>"><IMG src="images/hilight.gif" align=left alt="Highlight matching terms in document using Summary mode."> Highlight Summary</a>
    

    For full hit-highlight, reference Qfullhit.stw and define the CiHiliteType parameter:

    <a href="oop/qfullhit.stw?<%= WebHitsQuery %>&CiHiliteType=Full"><IMG src="images/hilight.gif" align=left alt="Highlight matching terms in document."> Highlight Full</a>
    

    Note   The code above uses a relative URL. Consequently, you must copy the Oop directory, which contains the .stw files, into the same directory as your search page. Alternatively, you can use an absolute URL by replacing the path in the code above with:

    "http://localhost/sites/samples/knowledge/search/sswebhit/oop/<.stw file name>".
    
  4. Optional: If you want to modify how the hit-highlight link appears on your existing results page, modify the following code:
    <a <% = LinkTarget %> href='<% = Link %>'><img src="<% = Image %>" hspace=2 height=16 width=16 border=0></a> 
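The way the procedure above assembles the hit-highlight link can be sketched outside ASP. The following is a minimal Python illustration, not the Searchright.asp code: the function name `build_webhits_query` and the sample document address are hypothetical, and `urllib.parse.quote` stands in for Server.URLEncode, so details of the encoding (for example, how spaces are written) may differ from what IIS produces.

```python
from urllib.parse import quote


def build_webhits_query(doc_address, restriction, begin_tag, end_tag, hilite_type=None):
    """Assemble a Webhits query string from the parameters used in the procedure."""
    parts = [
        # The document address is appended as-is, mirroring the ASP sample.
        "CiWebHitsFile=" + doc_address,
        # The restriction and the highlight tags are escaped.
        "CiRestriction=" + quote(restriction, safe=""),
        "CiBeginHilite=" + quote(begin_tag, safe=""),
        "CiEndHilite=" + quote(end_tag, safe=""),
    ]
    if hilite_type is not None:
        # Only the full-view link adds CiHiliteType.
        parts.append("CiHiliteType=" + hilite_type)
    return "&".join(parts)


query = build_webhits_query(
    "http://localhost/docs/sample.htm",  # hypothetical document address
    "hit highlighting",
    "<strong class=Hit>",
    "</strong>",
    hilite_type="Full",
)
link = "oop/qfullhit.stw?" + query
print(link)
```

Omitting `hilite_type` and pointing the link at qsumrhit.stw gives the summary form of the same link.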
    

Webhits Parameters

This section describes the Webhits parameters used in the hit-highlight section of your .asp file. You can specify the parameters in any order, and white space is permitted in some positions within the parameter string.

CiBeginHilite, CiEndHilite

Format: CiBeginHilite=BeginTags & CiEndHilite=EndTags

Customizes highlighted words in the query results. If you specify these tags, Site Server 3.0 Commerce Edition ignores all other formatting parameters (CiBold, CiHiliteColor, CiItalic, and so on).

Important   You must match the BeginTags and EndTags with correct HTML formatting. Failure to do so will produce unpredictable results. When you specify these parameters in the query template file (.asp file), you must properly escape the tags. For example:

CiBeginHilite=<%escapeURL <font color="#FF0000"><em>%>&CiEndHilite=<%escapeURL </em></font> %>

The two parameters together in the above example make the highlighted words in the search results appear in red italics.

CiBold

Format: CiBold=value, where value can be any non-null string

Specifies that the highlighted text appear in bold format. Any non-null value will turn bold formatting on. This parameter is optional.

CiCodepage

Format: CiCodepage=charset

Determines the character set in which hit-highlight results appear. This parameter is optional.

CiHiliteColor

Format1: CiHiliteColor=24-bit color mask, where the mask takes the form 0xHHHHHH and each H is a hexadecimal digit.

Format2: CiHiliteColor=color, where color is one of the following:  red, green, blue, yellow, black.

This parameter is optional and specifies the color used to highlight the text matching the CiRestriction parameter. If it is not specified, or if the format used does not match either of the above, Webhits defaults to red.
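The fallback rule described above can be sketched as a small validation function. This is an illustration of the documented rule only, not Webhits code: the function name `resolve_hilite_color` is hypothetical, and exact lowercase matching of the color names is an assumption, because the text does not say how case is handled.

```python
import re

# The five color names listed under Format2.
NAMED_COLORS = {"red", "green", "blue", "yellow", "black"}
# Format1: 0x followed by exactly six hexadecimal digits.
HEX_MASK = re.compile(r"0x[0-9A-Fa-f]{6}")


def resolve_hilite_color(value):
    """Return value if it matches either documented format, otherwise "red"."""
    if value is None:
        return "red"  # parameter not specified
    if HEX_MASK.fullmatch(value) or value in NAMED_COLORS:
        return value
    return "red"  # unrecognized format falls back to the default


print(resolve_hilite_color("0x00FF00"))
print(resolve_hilite_color("purple"))
```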

CiHiliteType

Format: CiHiliteType=[Full|Summary]

This parameter is optional. If not specified, Summary is the default.

Full

Generates the full text of a document, highlighting the words that match the query. This option is mainly for documents that contain mostly text, as it does not do full-fidelity highlighting. Only the text section of the document is extracted and highlighted. In addition, this option tags the highlighted text (hits) with bookmarks, allowing navigation between the hits. The first hit is bookmarked as #CiTag0 and the top of the generated document is tagged as #CiTag-1. To help navigate, double-angle bracket tags (<< and >>) surround each hit. Click the << tag to go to the previous hit, and click the >> tag to go to the next hit.

Summary

Generates small excerpts of a document around the words that match the query.

CiItalic

Format: CiItalic=value, where value can be any non-null string

Specifies italics for the highlighted text. Any non-null value will turn on italics. This parameter is optional.

CiLocale

Format: CiLocale=LocaleString

This parameter is optional and specifies the locale used to interpret the CiRestriction string. The output is also generated using this locale. Valid values for the CiLocale string are found in \Winnt\Help\Ix\Htm\Ixvarloc.htm.

CiMaxLineLength

Format: CiMaxLineLength=Number

This parameter is optional and pre-formats the text with the <pre> and </pre> HTML tags. If a line length exceeds the specified number, it is broken at the next word boundary. This option works best when full-hit highlight is chosen.
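One possible reading of the line-breaking rule above is sketched below. It assumes that "the next word boundary" means the first space at or after the specified length. The function name `break_long_lines` is hypothetical, and the actual Webhits behavior may differ.

```python
def break_long_lines(text, max_len):
    """Break each over-long line at the first space at or after max_len."""
    out = []
    for line in text.splitlines():
        while len(line) > max_len:
            cut = line.find(" ", max_len)
            if cut == -1:
                break  # no later word boundary, keep the remainder whole
            out.append(line[:cut])
            line = line[cut + 1:]
        out.append(line)
    return "\n".join(out)


print(break_long_lines("aaa bbb ccc ddd", 5))
```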

CiRestriction

Format: CiRestriction=Query

Specifies the query for a hit-highlight link. The query string must be in escaped form, which means that spaces and other special characters are replaced by a percent sign followed by their hexadecimal ASCII codes. For example, to search for documents larger than 100 bytes whose author is George, you would send the following query:

@DocAuthor = George AND @size > 100

However, before the query string can be passed to Webhits, it must be escaped, which results in:

@DocAuthor%20%3D%20George%20AND%20@size%20%3E%20100

In an .asp file, the restriction can be escaped by using the Server.URLEncode method. This parameter is required.
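The escaping shown above can be reproduced with a short Python sketch. Here `urllib.parse.quote` is used only to illustrate percent-encoding, and `safe="@"` is chosen so that the result matches the escaped string in this section. Server.URLEncode may encode some characters differently.

```python
from urllib.parse import quote

restriction = "@DocAuthor = George AND @size > 100"
# Percent-encode everything except unreserved characters and "@".
escaped = quote(restriction, safe="@")
print(escaped)  # @DocAuthor%20%3D%20George%20AND%20@size%20%3E%20100
```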

CiUserParamNumber

Format: CiUserParamNumber=value, where value can be any non-null string

Specifies any parameter available for Webhits, where Number is any number from 1 to 10. For example, CiUserParam1, CiUserParam3, CiUserParam5, and so on.

CiWebhitsFile

Format: CiWebhitsFile=URL

Specifies the URL of the document to highlight. This parameter is required.

Note   The Secure Socket Layer (SSL) setting for the Webhits template (.stw file) and the setting for CiWebhitsFile must be the same, or else the CiWebhitsFile must have a setting of zero (0). Otherwise, Webhits fails with an appropriate error message.

If your site requires all documents to be accessed only with SSL, be sure to set the same SSL access on the .stw file. If your site has some documents with SSL access required and others with no SSL access required, set the SSL access on the .stw file to be the same as those that require SSL access. Alternatively, you can create two .stw files (one for each type of document).

Modifying the .stw Files

If you want to modify the Qfullhit.stw or Qsumrhit.stw template file, the following section must remain in the file to enable the hit-highlight option:

<!-- The highlighted summaries are printed here -->
<table><tr><td>
<%begindetail%>
<%enddetail%>
</td></tr></table>

All other code in these two files can be modified using the standard HTML format.

Enabling File Queries for Webhits

Sswebhit.dll only accesses files that are available through "http:" anonymous access or NTLM authentication. To enable file access (for example, access to URLs prefixed by "file:"), you must set up the ISAPI application, called Sswebhit.dll, to run using an account that has access to the files you want highlighted.

The following procedure describes how to create a virtual root and point the Internet Information Server (IIS) to Sswebhit.dll. For more information, see the online Help located at http://localhost/sites/samples/knowledge/search.

WARNING   Use extreme caution when setting this account. Because Sswebhit.dll is an Internet Server API (ISAPI) application, you must consider security issues when enabling hit highlighting for these "file:" addresses. Although your search page generates links only to documents to which a user has access, a user who types a custom "highlighting" URL and substitutes a new vpath can gain access to any file to which the access account has permissions.

To enable file queries

  1. Put your .asp file on a share.

  2. Set up a virtual root through the IIS console to access that share.

  3. If you are using an .stw file, use the IIS Web Admin to set up the ISAPI mapping from this .stw to Sswebhit.dll.

  4. Optional: Now that you have an IIS virtual root pointing to a share, use the IIS console to specify an access account for that share.

© 1999-2000 Microsoft Corporation.  All rights reserved.
