September 1999

Importing Delimited Text Files

by Steven W. Disbrow

While ASP is intended for the creation of HTML documents, it can be a very handy tool for solving lots of other programming problems. Consider the task of importing a delimited text file into a SQL Server 6.5 table. While SQL Server does include a tool for this (the bulk copy or bcp utility), it does have at least one major drawback: it assumes that the data you're importing is perfect. So, for example, if the delimited file contains a text string where a number was expected, the bcp utility will fail. In a case like this, a strong impulse might be to use Access as a front end to import the data into SQL Server. This is a good idea, but Access is also picky about fields that contain the wrong kind of data.

Eventually, it will become obvious that a custom program is the best solution. ASP is an excellent choice for solving this sort of problem because it has almost all of the power of Visual Basic, but it's even easier to get a program up and running. And, for a task like this, we don't really need a user-interface, we just want to import some data and get a report on any bad records.

Considerations

Unfortunately, using ASP to change a delimited text file into a database table isn't as easy as one might hope. After all, ASP isn't a database tool, so we need to do some setup work before we can actually import any data. Specifically, we need a way to give our ASP code some information about the table we'll be inserting data into. At a minimum, we need to know the number of fields, the names of those fields, their generic types (for example, character, numeric, date, etc.) and how much data each field can hold (if it's a character field). A good solution here is use a simple text file that describes what we need to know about the target database table. An example of this is shown in Listing A.

Listing A: A simple database table description file

3
name, c, 50
age, n, 0
birthday, d, 0

The first line tells us how many fields there are in this table (in this case, three). Each of the remaining lines describes a single field in the table. Each line contains the name of the field, its type (n = number, c = character, and d = date), and finally, the number of characters the field can hold. (Note that this last bit of information only applies to character fields.) It's important to note that these fields must be listed in the same order as they appear in the delimited text file! So, from Listing A, we see that our table has three fields:

In Listing B, we've got a very simple tab-delimited text file. It contains the names of various Presidents, their ages when they took office, and their dates of birth. (Note that the fields in this file appear in the same order shown in Listing A. Also note that the age field for the second record contains a non-numeric value.)

Listing B: A very simple tab-delimited file

Jimmy Carter	52	10/01/1924
Ronald Reagan	sixty-nine	02/06/1911
George Bush	64	06/12/1924

As with Listing B, most files will be delimited with tabs. However, it could be any character at all. So, it would be nice if our solution could handle any delimiter we happen to come across.

The Solution

Listing C shows one possible solution to this problem. It's a Visual Basic function that will read a delimited text file and insert the data into a specified database table.

Click here for Listing C.

To be as flexible as possible, this function takes several different parameters:

How it works

The source code for this function contains detailed comments on its operation, so let's just get a quick overview of how everything works. Before we even get to the function, we have to set the server.scripttimeout property for the Web server. This property tells the Web server how long (in seconds) that it should let scripts run before giving up on them.

Depending on the amount of data that will be imported, this value should be adjusted up or down accordingly. With that small bit of housekeeping taken care of, we can now look at the actual function.

The first thing the function does is to open the descriptors file that contains the description of the target database table. The information in this file is used to build an array that holds everything we need to know about the database table.

When completed, this array will hold the name, type and size of each file in the table. Next, we open up the error log file and prepare it to receive any error messages we generate. At this point, it's time to open the delimited file and start pulling data out of it. As soon as the file is open, we read a record and begin taking it apart, one field at a time. As we do this, we test the data in each field to see if it matches the criteria set up in the descriptors file.

For example, if the descriptors file marked a field as n (numeric), we use the isNumeric() function to see if the value really is a number. If it is, we add the value to our SQL Insert statement as is. If it isn't, we output an error message and add the value to our SQL Insert statement as NULL. (This is an important point: Even if one or more fields have bad data, we still attempt to add the record! We simply remove the bad data before doing so.)

After all of the fields have been parsed, we take the SQL Insert statement that we've created and use it to insert the data into the target table. If this generates an error, we record the error in the error log.

At this point, the processing for this record is complete. So, we simply move on to the next one. When the source file is empty, we print out some information on the results and then return the number of records inserted back to the caller.

Conclusion

Of course, this technique can be adapted for more than just loading database tables. For example, it would be a fairly simple matter to change this function to generate HTML table rows rather than SQL Insert statements. (This would be an excellent way to plug a less-than-perfect delimited file into a Web site.) But, the key thing to realize here is that ASP is a good tool for more than just HTML generation. In fact, once you use ASP for something else, you'll find yourself using it for all sorts of strange little programming tasks. If you come up with a really interesting one, let us know!

Copyright © 1999, ZD Inc. All rights reserved. ZD Journals and the ZD Journals logo are trademarks of ZD Inc. Reproduction in whole or in part in any form or medium without express written permission of ZD Inc. is prohibited. All other product names and logos are trademarks or registered trademarks of their respective owners.