Initializing Our ID3 Table

This is the table that we will use for generating the results of the ID3 algorithm. Here we are going to take liberty with our data. After doing a bit of research on our markets, we have decided that each country has its own language and each belongs to a region of the globe. We can use our ID3 algorithm to decide if the country, region, or language is important to new sales for each individual product.

While our database has all of the details of each sale, the products purchased, etc., we want to take a broader look. So instead of just looking at each order, we might get the results that price is important to the quantity of products sold. While this is helpful, this type of 'literal' information can be quickly retrieved with an SQL statement. We want to work a bit smarter, through. So we will add some meta-information (i.e. information about information) about the sale of our product. We might wish to know what language our purchasers speak. Or what region of the globe they live. This will provide our ID3 algorithm with more to work with.

We will update our table by looking at each market our customers are in and associating each with a region and language. So we are adding these additional parameters to our table. We have no way of knowing if they will be important - that is for the algorithm to tell us. As mentioned, we could have just run the ID3 algorithm on the data as-is. But this simply illustrates how you might be able to add simple meta-data to provide additional insight on the underlying data.

Take a look at the following diagram. This is how the core of our application will operate. A lot of this won't make much sense to you right now, but keep this chart in mind as you work through the rest of this chapter.

Andy - can you put the 'flowchart' that starts "Select Product from list box, User pressed cmdAnalyze..." in here. Thank you

Try It Out - Coding the setupID3 Subroutine

1.  Please add the following sub to the frmID3 form. This is a routine that will simply update our ID3 table with the region of the globe and the language spoken in each of the countries where we have customers. We will simply look at each country where we have a customer and add the region of the globe where the country is located and the language spoken by its inhabitants.

Public Sub setupID3()

Dim adoTempCountry As ADODB.Recordset
'holds each unique country

Dim adoTempid3 As ADODB.Recordset
'holds a recordset for ID3

Dim sCountry As String
Dim sSql As String

'-----------------------------------------------------
'-- Get the unique countries and place in temp --
'-----------------------------------------------------
Set adoTempCountry = New ADODB.Recordset
sSql = "SELECT DISTINCT Country FROM Customers"
Set adoTempCountry = adoConnection.Execute(sSql)
'-----------------------------------------------

'-----------------------------------------------------
'-- Clean out all of the data from the ID3 table --
'-----------------------------------------------------
adoConnection.Execute ("DELETE * FROM ID3")

'----------------------------
'-- Now set up the ID3 --
'---------------------------
Set adoTempid3 = New ADODB.Recordset
adoTempid3.CursorType = adOpenKeyset
adoTempid3.LockType = adLockOptimistic
adoTempid3.Open "ID3", adoConnection

'-------------------------------------------------------
'-- Iterate through the country table and add each   --
'-- unique entry to the Country field of ID3.      --
'-------------------------------------------------------

While Not adoTempCountry.EOF

   sCountry = adoTempCountry!country
   
   With adoTempid3
     .AddNew
     !Country = sCountry
   
   Select Case UCase(sCountry)
        Case "ARGENTINA"
          !CountryRegion = "South America"
          !CountryLanguage = "Spanish"
         Case "AUSTRIA"
          !CountryRegion = "Europe"
          !CountryLanguage = "German"
        Case "BELGIUM"
          !CountryRegion = "Europe"
          !CountryLanguage = "French"
        Case "BRAZIL"
          !CountryRegion = "South America"
          !CountryLanguage = "Portuguese"
        Case "CANADA"
          !CountryRegion = "North America"
          !CountryLanguage = "English"
        Case "DENMARK"
          !CountryRegion = "Scandinavia"
          !CountryLanguage = "Danish"
        Case "FINLAND"
          !CountryRegion = "Scandinavia"
          !CountryLanguage = "Finnish"
        Case "FRANCE"
          !CountryRegion = "Europe"
          !CountryLanguage = "French"
        Case "GERMANY"
          !CountryRegion = "Europe"
          !CountryLanguage = "German"
        Case "IRELAND"
          !CountryRegion = "British Isles"
          !CountryLanguage = "English"
        Case "ITALY"
          !CountryRegion = "Europe"
          !CountryLanguage = "Italian"
        Case "MEXICO"
          !CountryRegion = "North America"
          !CountryLanguage = "Spanish"
        Case "NORWAY"
          !CountryRegion = "Scandinavia"
          !CountryLanguage = "Norwegian"
        Case "POLAND"
          !CountryRegion = "Europe"
          !CountryLanguage = "Polish"
        Case "PORTUGAL"
          !CountryRegion = "Mediterranean"
          !CountryLanguage = "Portuguese"
        Case "SPAIN"
          !CountryRegion = "Mediterranean"
          !CountryLanguage = "Spanish"
        Case "SWEDEN"
          !CountryRegion = "Scandinavia"
          !CountryLanguage = "Swedish"
        Case "SWITZERLAND"
          !CountryRegion = "Europe"
          !CountryLanguage = "German"
        Case "UK"
          !CountryRegion = "British Isles"
          !CountryLanguage = "English"
        Case "USA"
          !CountryRegion = "North America"
          !CountryLanguage = "English"
        Case "VENEZUELA"
          !CountryRegion = "South America"
          !CountryLanguage = "Spanish"
        Case Else
          !CountryRegion = "Unknown"
          !CountryLanguage = "Unknown"
     End Select
     .Update
   End With
   adoTempCountry.MoveNext
 Wend
adoTempCountry.Close
adoTempid3.Close

Call gridID3

End Sub

How It Works

We will open two locally scoped recordsets in this routine, adoTempCountry and adoTempid3. First, we wish to find out all of the individual countries where our clients reside. So by selecting a group of records that have a distinct country from the customers table, we know that there will only be one entry per country. Even though we might have 100 clients in Spain, by using the DISTINCT keyword we are assured we will only get one record for Spain. In this example, we are using the Execute method of the connection object to return our recordset and assign it to adoTempCountry.

sSql = "SELECT DISTINCT Country FROM Customers"
Set adoTempCountry = adoConnection.Execute(sSql)

The product manager will probably be doing analysis on many products. So each time a new analysis is done, we will delete all previous entries from the ID3 table so we can start fresh.

adoConnection.Execute ("DELETE * FROM ID3")

Now we are ready to open a recordset for the currently empty ID3 table. Again, we are still using the single form-level adoConnection object here. So we set a reference to our adoTempid3 table, set two properties, and open the table. We can then update the adoTempid3 table with information that will populate the underlying ID3 table.

Set adoTempid3 = New ADODB.Recordset
adoTempid3.CursorType = adOpenKeyset
adoTempid3.LockType = adLockOptimistic
adoTempid3.Open "ID3", adoConnection

Now we can add our data to the ID3 table. We take the first record of the adoTempCountry table that contains a record for each individual country where we have customers. Since we know the ID3 table is empty, we add a record with the .AddNew method and assign the new country to the Country field. Now here is where we add our new perspective on the country. Depending on the country, we add both a CountryRegion and a CountryLanguage to that record. So now each country belongs to a region and language. Why is this important? Well, our ID3 algorithm might determine that all Spanish-speaking people are just craving our new beer. Or that the South American region is not buying as much of our Boysenberry, or that the Mediterranean region is starting to take off. This could provide the product manager with valuable information on how to spend those advertising dollars.

While Not adoTempCountry.EOF

   sCountry = adoTempCountry!Country
   
   With adoTempid3
     .AddNew
     !Country = sCountry
   
   Select Case UCase(sCountry)
        Case "ARGENTINA"
          !CountryRegion = "South America"
          !CountryLanguage = "Spanish"

We can see that a new record is added to the ID3 table for each record in the adoTempCountry recordset. We know that adoTempCountry will contain a unique record for each and every country. So we walk through the adoTempCountry recordset. For each record in the adoTempCountry recordset we add a new record to the adoTempid3 recordset with our additional information. Once we reach the end of the adoTempCountry recordset, we close both of them and then call gridID3, which will actually display the results in a flex grid.

adoTempCountry.Close
adoTempid3.Close

Call gridID3

© 1998 by Wrox Press. All rights reserved.