This article may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist. To maintain the flow of the article, we've left these URLs in the text, but disabled the links.


MIND


This article assumes you're familiar with Active Server Pages, SQL Server, and English

Add Natural Language Search Capabilities to Your Site with English Query
Adam Blum

Now that you're building ASP-based sites to drive your Web content from a database, why not add advanced natural language search capabilities?
S o you've used Active Server Pages (ASP) to build yourself a dynamic Web site or application. You've gone beyond that to master Active Data Objects (ADO) and drive your Web content from SQL Server™ or some other database server. Next, you'll probably want to add some way for users to search and report on your database content. Of course, it's easy enough to add a query form that lets users search based on one or two fields. It's much harder to build form-based Web pages to allow searches across multiple tables and fields.
      There are many problems beyond the mundane UI and Web coding aspects, including defining how the various tables and fields are related to each other. And even with the best interface, your pages will be difficult for your casual Web visitors to use and understand. There's an inevitable learning curve for any complex search of structured data. Fortunately, there is a powerful, flexible search specification mechanism that all your users are likely to understand—the English language.
      Since the introduction of Microsoft® English Query 1.0 with SQL Server 6.5 Enterprise Edition last year, such a capability is no longer the stuff of science fiction. English Query provides the ability for users to ask questions about your data in plain English in your existing database-driven Web sites and applications. Putting up your initial natural language search is very easy—a small fraction of the effort of building your overall application. The development process is higher level than traditional programming, and can be mastered by nonprogrammers with some database background (such as a DBA or Web content developer who often works with databases).
      English Query ships with an engine (a COM server) used at runtime to convert a user's English question to a SQL-friendly query. Microsoft even provides sample ASP pages that drive the engine, prompt a user for English questions, execute the engine's returned SQL, and display the database's results to the user. Let's take a look at authoring your own English Query domain and embedding the English Query engine and authored domain into your Web site.
      English Query can be embedded into any application that supports COM, but most commonly it's embedded into a Web site built with ASP scripts. Figure 1 shows a Web page created using the sample ASP scripts that come with English Query. The user enters a question or clicks the Sample Questions button. The Sample Questions button gives the user some predefined questions that demonstrate the kinds of information that are available. Pressing Enter submits the question to the English Query engine, which generates a SQL statement that is submitted to SQL Server via ADO. The returned recordset is then displayed in the lower frame.

Authoring an English Query Domain
      The first step to building an English Query application is to design the semantics used in your particular application. In other words, you need to tell it how plain English entities (nouns), relationships (verbs), and descriptions (adjectives, traits, and subsets) map to tables, fields, and joins in your database. To do this you'll use the English Query authoring tool, which appears in the Microsoft English Query program group after installation. Selecting "Structure loaded from database" from File | New Application will initialize the database structure from your SQL Server schema and fill the database tab with tables and fields. Figure 2 shows the database structure of the SQL Server pubs database sample.
      If any of the tables in the database are missing primary keys, you need to edit the tables and supply one or more fields as the primary key. The underlying database doesn't have to have a primary key, but all of the tables must have them identified in the authoring tool or your application won't build correctly.
      If tables will be related to each other in queries, then joins should be indicated between them. These are usually retrieved from the foreign keys defined in your database. If the necessary foreign keys aren't present (usually they are there to force referential integrity), then you'll edit the joins manually inside the authoring tool.

Creating Entities
      Now you're ready to start adding semantic entities. Right-click on the Semantics tab to insert an entity. Supply some words to describe the entity (such as "author, writer"); the primary entity name should be first. Specify a type if it's a person, place, or time. Now identify what part of the database—the specific table or field—represents the entity. Major entities will generally correspond to entire tables. If it is a major entity, supply the fields that should be used to display the entity. This might be a person's first and last name, or the name and address of a company. For the entity "author" in the pubs database, the fields might look like Figure 3. This is the authoring tool's guess at what the name fields are, which happens to be correct. Accepting these entries will create an author name entity and a relationship indicating that authors have names.

Figure 3: Defining an Entity
Figure 3: Defining an Entity

      Major entities have two kinds of minor entities associated with them: names and traits. Names indicate how the entity is identified in plain English questions and statements. Clicking on the Autoname button for the author major entity will create a minor entity that represents the name of the author, represented by the first and last name fields. You'll want to create such name entities for major entities that are represented by entire tables so that the user has some way of identifying the entity in question.
Figure 4: Using Autotrait
Figure 4: Using Autotrait
      Clicking on Autotrait allows you to create traits for the major entity. You can select the highlighted fields as shown in Figure 4 to create minor entities for all of the semantically meaningful fields and trait relationships between the major entity and the newly created minor entities. You can continue to create more entities (along with names and traits) for all of the major nouns that users could ask questions about. In the pubs example, this would be books and publishers.

Creating Relationships
      Establishing traits for your major entities starts your model out with quite a few relationships. At this point you can ask questions about things that have traits. For example, you can ask "what authors have city Seattle?", "what books have title ‘The Busy Executives Database Guide'?", or "what publishers have country France?" For the really interesting questions that can be posed about your database, you'll need to create relationships between major entities, such as "authors write books" and "publishers publish books." Right-clicking on the Relationships branch and inserting a new relationship will present you with the Relationship Properties dialog box. First, add all of the entities that might participate in the relationship. If the relationship occurs at a time or place, it's helpful to include date or location entities. In this example, you'd supply pubdate as the entity identifying when the relationship occurs. The entities associated with the relationship "authors write books" might appear as shown in Figure 5.

Figure 5: Defining a Relationship
Figure 5: Defining a Relationship

      Next you'll create phrasings for that relationship. They can be verb phrasings ("authors write books"), preposition phrasings ("publishers are in cities"), adjective phrasings ("books are popular"), or subset phrasings ("some books are bestsellers"). Most trait phrasings ("books have royalties") and name phrasings ("author names are the names of authors") are created by Autotrait and Autoname, as I've just shown. Figure 6 shows a verb phrasing defined for "authors write books."
Figure 6: Adding a Phrasing
Figure 6: Adding a Phrasing

      Select the Database tab to specify a table that contains links to all relationships (in this example, the titleauthor table is computed automatically by the tool). You can create relationships for every kind of question that you want the user to be able to ask. For example, if you want the model to support questions about authors being in cities, you'd go back to the "authors have cities" relationship and supply a new preposition phrasing that says that "authors are in cities."

Testing Your Model
      At any time during the development process you can test what you've modeled by invoking Test Application from the Tools menu. For example, with only the "authors write books" relationship entered, English Query should be able to answer the question "who wrote the most books?" Figure 7 shows the Test Application dialog. Enter the query as an English sentence in the question field and click Submit. The generated SQL will be displayed below and, if you have Execute SQL checked, it will actually submit the query to SQL Server and display the answer. If you think this question is useful as a sample or for continued testing, click Add to Question File.

Figure 7: Testing Your Application
Figure 7: Testing Your Application

      Another Tools menu item called Regression Test will execute all queries in the Question File and compare the output against a reference file (which is usually generated the first time you run Regression). This is a good way to make sure you haven't broken your model as it becomes more and more complex over time. The Question File questions can also be used as a set of sample questions on the Web page to give users an idea of the kinds of information that are available.
      Once you've developed and tested the model to your satisfaction inside the English Query authoring tool, you're ready to build the application. Tools | Build Application will create the English Query domain (EQD) file. The EQD file and the English Query engine (the COM server MSEQ.Session) can be deployed inside any application that supports COM.

Adding English Query to Your Web Site
      Microsoft supplies a sample framework in the samples\asp2 subdirectory of the Microsoft English Query directory. If you're running Internet Information Server (IIS) 4.0 and doing the install from your IIS-based server, the simple way to deploy a Web page that allows users to ask your EQD questions is to click on the file setupasp.vbs. This is a Windows® Scripting Host script that copies the ASP files and your EQD file to a directory on the Web server, creates an IIS virtual directory for the English Query pages, and sets options in a file called params.inc to tell the ASP scripts where your database is. If you're running IIS 3.0, then just perform these steps manually (readme.htm explains these steps). Then you're ready to test in your browser by hitting the page http: //localhost/pubs. You can then link this page to other Web pages on your site, including other ASP apps.
      You can go further and use the sample application as a starting point for integrating English Query into other ASP applications. For example, you could have an English Query text box available on your search page to supplement existing reporting mechanisms. A little explanation of the structure of this sample application should make it easier for you. The code fragment in Figure 8 (a simplified version of the ASP sample code) shows how to convert user questions into SQL. You would embed code like this into the ASP page that processes the user's query.
      First, create an English Query object with Server.CreateObject("MSEQ.Session"). To load your domain, call the InitDomain method and the name of the EQD file. You'll then get a response object back by calling the ParseRequest method with the user's question. The reply can be a command response, which is a set of commands that are either SQL commands or direct answers that English Query can supply without needing you to provide the SQL command. The SQL command should be executed against your SQL Server database. You'll generally want to display the result as a table on your Web page.
      The DoSQLCommand function (available in samples\asp\common.inc) is used to execute the SQL command via ADO and display the result in a table. If the command is an answer, it's displayed directly to the user. The response might also be a request for clarification. For example, the question might be "what are all the Honda Civics in Washington" and the clarification might ask whether Washington is a city or a state. The DoClarification call (also available in common.inc) encapsulates the code necessary to prompt the user for clarification by displaying possible values from the UserInputs collection on the response object.

Summary
      English Query provides a powerful searching capability for your SQL Server-based Web site or application. It's easy to implement, especially in an ASP-based application. Adding this to your site should let you move away from having to write a lot of custom reports or complex searching forms.
      Microsoft English Query 1.0 shipped last year with Microsoft SQL Server 6.5 Enterprise Edition, and is also available with beta 2 of SQL Server 7.0. For details on how to get SQL Server 6.5 or participate in the SQL Server 7.0 beta, see http://www.microsoft.com/sql.

From the April 1998 issue of Microsoft Interactive Developer.