Hypertext Markup Language (HTML) is a standard for presenting information over an intranet or the Internet. HTML files are text files that contain tags that specify how information in the file is displayed. You can use the HTML table tags (<TD> and </TD>) to embed one or more tables in an HTML file. The Microsoft Jet HTML IISAM driver can access information in HTML files that is formatted as tabular data or data in HTML files that is formatted as list data.
You can directly open, link to, or import data stored in HTML tables. The methods for linking or importing HTML data are similar to those used for other external data sources.
The following table contains a description of the required elements used to create the identifier, source, and options portions of a connection string for an HTML data source.
Element | Description |
Source | HTML Import; (required). |
Database name | The fully qualified path to the HTML file that contains the data you want to import. |
Source table name | The table named by <CAPTION> tags (if they exist) within the HTML file that contains the data you want to import. |
The Microsoft Jet HTML IISAM driver assumes that the first row of table data does not contain field names. You can override this behavior by adding the HDR=YES element to the connection string to indicate that the first row of data does contain field names.
The source table name element of the source argument is the caption of the table within the HTML file, if the table has a caption. If the table does not have a caption, and it is the only table in the file, use the title of the HTML file to refer to the table. If more than one table exists and none of the tables has a caption, you refer to them sequentially as Table1, Table2, and so on. The IISAM driver interprets these references as the first unnamed table in the HTML file, the second unnamed table in the HTML file, and so on.
The Microsoft Jet HTML IISAM driver reads the data in the HTML table and chooses the data type for the data in the table columns by interpreting the data contained in the table cells. For example, if any of the data in a column is text, the IISAM driver interprets the column data type as text, with a field size of 255 characters. If the data in a column is numeric, the IISAM driver interprets the column data type as Long or Double, depending on whether most of the values are integer or floating point. If the data in a column is a combination of numeric and text values, the column data type is interpreted as text.
You can force the Microsoft Jet HTML IISAM driver to interpret column data as a specific data type by creating a Schema.ini file that contains information about the data type for each column of data. The only difference between a Schema.ini file created for an HTML table and one created to import text file data is that the section heading contains the name of the table, not the name of the file. For more information about Schema.ini files, see the “Examples of Schema.ini Files” section earlier in this chapter.
The following code fragment shows how you can link to the table named Sales in an HTML file named Sales_1.html. In this example, strDbPath
is the path to the database:
Sub LinkToHTMLTable() Dim dbs As Database Dim tdf As TableDef Dim strConnect As String Set dbs = OpenDatabase(strDbPath) ' Specify connection string. strConnect = "HTML Import;" & _ "DATABASE=C:\JetBook\Samples\HTML\Sales_1.html;" ' Create new TableDef object. Set tdf = dbs.CreateTableDef("LinkedHTMLTable") ' Link table to HTML data source. With tdf .Connect = strConnect .SourceTableName = "Sales" End With dbs.TableDefs.Append tdf End Sub
The following code fragment shows how to set the connection string to an HTML file on an HTTP server:
strConnect = "HTML Import;DSN=Sales_1 " & _ "Link Specification;HDR=NO;IMEX=2;" & _ "DATABASE=HTTP://WebServer/YTDSales/Sales_1.html;"