

MIND


This article assumes you're familiar with JScript and Dynamic HTML.
Download the code (2KB)

Validation Made Simple with JScript Regular Expressions
Heidi Housten

One overlooked feature of JScript is the regular expression interpreter, which makes string pattern matching easy.
One of the new and exciting features of Microsoft® JScript® version 3.0 that receives very little attention is the regular expression object. If you know Perl, you’re already familiar with these very powerful and simple objects, designed to make pattern matching easy. Let’s see how to implement them in JScript.
      Input validation is one of the most common requirements in Web site development. At design time, you don’t know the exact values viewers will enter, but you do know the format they should be in. A regular expression is a way of representing a pattern you are looking for in a string (see Figure 1). For example, let’s say I want users to input 6 digits followed by a decimal point and 4 more digits. In a regular expression, \d represents any digit. A simple regular expression for my desired input would be:
 /\d\d\d\d\d\d\.\d\d\d\d/
Notice the slashes surrounding the expression. This is the proper way to indicate a regular expression pattern, and its assignment to a variable in JScript creates a regular expression object (see Figure 2 and Figure 3). Also notice the backslash before the period. On its own, a period has a special meaning in a regular expression (it matches any single character), so the backslash is needed to make it match a literal decimal point. The characters . ? + * - ( ) [ ] { } $ ^ | / and \ all have special meaning in regular expressions. To use one of these special characters as a match character in a regular expression, you must precede it with a backslash.
      You can specify the number of times you expect something to appear by specifying a number in curly brackets. The pattern shown previously can be simplified like this:
 /\d{6}\.\d{4}/
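As a quick check that the two spellings are interchangeable, here is a minimal sketch that runs both patterns against a few strings. The sample values, the formsAgree helper, and the unescaped variant are hypothetical illustrations and are not part of the article's code download.

```javascript
// Two spellings of the same pattern: six digits, a period, four digits.
var longForm  = /\d\d\d\d\d\d\.\d\d\d\d/;
var shortForm = /\d{6}\.\d{4}/;

// The same pattern with the backslash before the period left out.
// Here the period matches any single character, not just a decimal point.
var unescaped = /\d{6}.\d{4}/;

// Hypothetical sample inputs, chosen only for illustration.
var samples = ["123456.1234", "12345.1234", "123456-1234"];

// Returns true when both spellings give the same answer for every input.
function formsAgree(inputs) {
    for (var i = 0; i < inputs.length; i++) {
        if (longForm.test(inputs[i]) !== shortForm.test(inputs[i])) {
            return false;
        }
    }
    return true;
}
```

With these samples, only the first string matches the escaped patterns, while the unescaped variant also accepts 123456-1234, which is why the backslash matters.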
      To create a regular expression object in JScript, you just assign a slash-delimited, unquoted pattern to a variable like this:
 template = /\d{6}\.\d{4}/
      To test to see if that pattern matches a particular string value, you can use the test method of the object, which returns true if there is a match or false if not. The code to validate the number you’ve been looking at would look like this:


 template = /\d{6}\.\d{4}/
 
 function validate(instr)
 {
     if (!template.test(instr))
     {
         // test returned false, so no match was found
         alert ( "please use the format :\n\n123456.1234" )
     }
 }
It’s much simpler than trying to do some indexOfs and calculating the lengths of each bit, isn’t it? And it only gets better.

Varying Patterns
      Now, let’s say that you know there will be between 4 and 6 digits before the decimal and at least 2 after it. That regular expression would look like this:


 /\d{4,6}\.\d{2,}/
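A minimal sketch of how this range pattern behaves, using a hypothetical hasRange helper and made-up sample values. Note that the pattern only has to be found somewhere inside the string, so a value with extra leading digits still passes.

```javascript
// Between 4 and 6 digits, a period, then at least 2 digits.
var range = /\d{4,6}\.\d{2,}/;

// Hypothetical helper: true when the pattern is found in the string.
function hasRange(instr) {
    return range.test(instr);
}

var fourAndTwo   = hasRange("1234.56");     // true: 4 digits, then 2
var sixAndThree  = hasRange("123456.789");  // true: 6 digits, then 3
var tooFewBefore = hasRange("123.45");      // false: only 3 digits before the period
var tooFewAfter  = hasRange("1234.5");      // false: only 1 digit after the period
var extraLeading = hasRange("1234567.89");  // true: "234567.89" is found inside the string
```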
      You can also allow multiple regular expressions in one test by using the | operator to OR them together. The following will match either a 6.2 or a 2.6 construction.

 /\d{6}\.\d{2}|\d{2}\.\d{6}/
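The OR behavior can be sketched as follows. The matchesEither helper and the sample values are hypothetical and serve only to show which inputs each alternative accepts.

```javascript
// Either 6 digits, a period, 2 digits, or 2 digits, a period, 6 digits.
var either = /\d{6}\.\d{2}|\d{2}\.\d{6}/;

// Hypothetical helper: true when one of the two alternatives is found.
function matchesEither(instr) {
    return either.test(instr);
}

var sixTwo     = matchesEither("123456.12");  // true: left-hand alternative
var twoSix     = matchesEither("12.123456");  // true: right-hand alternative
var threeThree = matchesEither("123.123");    // false: neither alternative fits
```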
      You can also specify a selection of values for a character using [] and - together.

 /[13-5]\d{5}\.\d{2}/
The above will match any 6.2 digit string that starts with one of the digits 1, 3, 4, or 5. The following code illustrates the negation operator, which in this case would match any 6.2 digit string that starts with any digit other than 1, 3, 4, or 5.

 /[^13-5]\d{5}\.\d{2}/
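Here is a minimal sketch comparing a bracketed selection with its negated form, assuming one leading character followed by five more digits so that six digits precede the period. The variable names and sample values are hypothetical.

```javascript
// First character must be 1, 3, 4, or 5.
var startsWithListed = /[13-5]\d{5}\.\d{2}/;

// First character must be anything except 1, 3, 4, or 5.
var startsWithOther = /[^13-5]\d{5}\.\d{2}/;

var listedOnListed = startsWithListed.test("123456.12"); // true: starts with 1
var listedOnOther  = startsWithListed.test("223456.12"); // false: starts with 2
var otherOnOther   = startsWithOther.test("223456.12");  // true: starts with 2
var otherOnListed  = startsWithOther.test("123456.12");  // false: starts with 1

// A negated set accepts any character that is not listed, including
// non-digits, so a leading letter also passes the second pattern.
var otherOnLetter  = startsWithOther.test("x23456.12");  // true
```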

Fond Memories
      One of the best features of regular expressions is that they can remember which parts of the expression were matched. You can surround the parts of the regular expression you want to access later with (). Suppose you want to retrieve the values on the left and right sides of the decimal after test parses a string. That pattern would look like this:

 /(\d{4,6})\.(\d{2,})/
This brings up another related object. When you call the test method of a regular expression, JScript updates the global RegExp object (see Figure 4), which stores information about the most recent regular expression search. You can then use that object to find the matched strings. It can recall up to nine submatches that were marked with parentheses. To retrieve the two values you’ve asked it to remember, use:
 RegExp.$1
 RegExp.$2
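Put together, retrieving the remembered values might look like the sketch below. The splitNumber helper and its return shape are hypothetical. Current JavaScript engines still support RegExp.$1 and RegExp.$2 but treat them as a legacy feature, so take this as an illustration of the technique rather than a recommended pattern for new code.

```javascript
var template = /(\d{4,6})\.(\d{2,})/;

// Hypothetical helper: returns the two remembered parts of the number,
// or null when the string does not contain the pattern.
function splitNumber(instr) {
    if (!template.test(instr)) {
        return null;
    }
    // After a successful test, RegExp.$1 and RegExp.$2 hold the text
    // matched by the first and second parenthesized groups.
    return { left: RegExp.$1, right: RegExp.$2 };
}

var parts = splitNumber("12345.678");  // parts.left is "12345", parts.right is "678"
var none  = splitNumber("no number");  // null
```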
      You can even use a remembered value within the regular expression itself by preceding the remembered match number with a backslash (like \1). You may know that a value repeats in a string, but the value itself could be different in each string. This code
 /(\d{3})-\d{3}-\1/
would return true for both 987-353-987 and 455-319-455, indicating that the third field matched the first.
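A short sketch of the remembered-value test, using the two strings from the text plus one hypothetical counterexample in which the third field differs from the first.

```javascript
// \1 must repeat exactly what the first parenthesized group matched.
var repeated = /(\d{3})-\d{3}-\1/;

var first   = repeated.test("987-353-987");  // true: third field repeats 987
var second  = repeated.test("455-319-455");  // true: third field repeats 455
var differs = repeated.test("987-353-986");  // false: 986 is not 987
```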

Changing Patterns on the Fly
      There is another way to create regular expression objects that can be particularly useful if you want to validate many different input values.

 template = new RegExp("pattern",["options"])
This method of creating a RegExp object is good for changing the pattern. It means you can use one function for validation, and pass both the string and pattern as parameters. A simple function would look like this:

 function validate(inobj,pattern)
 {
 template = new RegExp( pattern )
 
     if (! template.test(inobj.value))
     {
         // Failed
         inobj.focus()
         alert ( "Pattern used : " + template.source)
     }
 }
      Using this function in Dynamic HTML, the button code to validate a standard email address in a text field called tbox would look like this:

 <input type=button
  onclick="validate(tbox,'^([a-z\.]+)@([a-z\.]+)$')"
  value="show me">
This example introduces several new special characters. Let’s look at that regular expression closely. [a-z\.] means match any lowercase letter or a period. The + says to match that pattern one or more times. The ^ matches the beginning of the line or string, and the $ signifies the end of the line. Without these anchors, a pattern can match anywhere inside a string: if there were extra digits at the beginning or end of a value tested against the first sample pattern, the required pattern would still be found and the extra digits would be ignored. The surrounding ^ and $ force a match on the entire string.
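The effect of the anchors can be sketched by testing the first sample pattern with and without them. The variable names and the padded sample value are hypothetical.

```javascript
// Unanchored: the pattern may be found anywhere in the string.
var loose = /\d{6}\.\d{4}/;

// Anchored: the whole string must consist of the pattern and nothing else.
var strict = /^\d{6}\.\d{4}$/;

var looseExact   = loose.test("123456.1234");       // true
var strictExact  = strict.test("123456.1234");      // true
var loosePadded  = loose.test("99123456.123499");   // true: extra digits are ignored
var strictPadded = strict.test("99123456.123499");  // false: extra digits are rejected
```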
      Notice that I don’t use \1 in the second part of the regular expression allowing letters and periods. If I did, it would try to match the exact text found by the first pattern, rather than reusing the pattern itself.
      Remember the ["options"] argument in the RegExp definition? If you put the letter i there, the match becomes case-insensitive, so the case of the letters in the previous regular expression is ignored. Using [a-z\.] with the i option would match all uppercase and lowercase letters and a period. (Note: [a-z\.] will not match letters that aren’t in the standard English alphabet, such as ä, Å, or ñ. To match all the letters in the Western font map, use [a-zà-öø-ÿ\\.]. Although you could use \w instead, it will also match underscores and digits.)
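Here is a minimal sketch of the i option applied to the email pattern, assuming the pattern is written directly as a script string rather than inside an HTML attribute. In a script string each backslash has to be doubled, because the string parser consumes one backslash before the regular expression sees the pattern. The sample addresses and variable names are hypothetical.

```javascript
// The pattern as a script string: note the doubled backslashes.
var pattern = "^([a-z\\.]+)@([a-z\\.]+)$";

var caseSensitive = new RegExp(pattern);
var ignoreCase    = new RegExp(pattern, "i");

var lowerStrict = caseSensitive.test("heidi@example.com");  // true
var mixedStrict = caseSensitive.test("Heidi@Example.com");  // false: capitals not in [a-z]
var mixedLoose  = ignoreCase.test("Heidi@Example.com");     // true: case is ignored

// Why doubling matters: "\d" in a string is just the letter d.
var doubled = new RegExp("\\d{2}");  // the pattern \d{2}: two digits
var single  = new RegExp("\d{2}");   // the pattern d{2}: two letter d's
var doubledOnDigits = doubled.test("42");  // true
var singleOnDigits  = single.test("42");   // false
```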

A Useful Little Tool
      Now let’s build a useful generic validation routine, using a quick order window as a sample. The quick order popup window will allow a viewer to enter their account number, the part number they want, and a contact email or telephone number (see Figure 5). The code is shown in Figure 6, and is available in the code archive at the top of this article if you want to test it yourself.

Figure 5: Order Popup
      The previous validate routine is a good start. What it is missing are appropriate error messages and the facility for using the options parameter for the regexp. It would be easy to do this using arrays, but I’m in the mood for a little object orientation. The VO function shows the constructor. For each field to validate, you hand this constructor the pattern, error message, and the regular expression parameters you want to use for that field. Save the resulting object in the ValObjs array with the same name you’ve given the edit field in the form. Notice that in the email field validation, I use the ignore case option for the regular expression.
      I have created two functions. The first, validate_field, just checks to see that the field value matches the corresponding pattern by passing the name of the field to the function; validate_form loops through all the items in the ValObjs array to be sure that all items there have valid values in their corresponding fields. It stops when an error has been found, so the user has the chance to correct it (see Figure 7). By separating the steps this way, I can choose to validate some fields in their onchange events, and also do a complete check on submission of the form. For testing purposes, I have only put in an alert box, confirming the form input is valid, in the place of executing the form action.
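Since Figure 6 is not reproduced here, the following is only a rough sketch of how the pieces described above could fit together, not the article's actual listing. The names VO, ValObjs, validate_field, and validate_form come from the text. The argument order, field names, patterns, and messages are assumptions, and the functions take plain values instead of form fields so the sketch runs without a browser.

```javascript
// Constructor for one validation object (argument order is an assumption).
function VO(pattern, message, options) {
    this.template = new RegExp(pattern, options || "");
    this.message = message;
}

// One entry per field, stored under the field's name (hypothetical fields).
var ValObjs = {};
ValObjs["account"] = new VO("^\\d{6}$", "Account numbers have 6 digits.");
ValObjs["contact"] = new VO("^[a-z\\.]+@[a-z\\.]+$",
                            "Please enter a valid email address.", "i");

// Checks one field; returns "" when valid, otherwise the error message.
function validate_field(name, value) {
    var vo = ValObjs[name];
    return vo.template.test(value || "") ? "" : vo.message;
}

// Checks every registered field, stopping at the first error found.
function validate_form(values) {
    for (var name in ValObjs) {
        var error = validate_field(name, values[name]);
        if (error) {
            return error;
        }
    }
    return "";
}
```

In a page, validate_field could be called from a field's onchange event and validate_form from the submit handler, with the returned message shown in an alert box.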

Figure 7: Alert Box


       This quick order code is easy to modify for ASP pages if you know your site needs to cater to multiple browsers. Instead of stopping the form validation function after the first error, continue through the loop, printing all the error messages, and change the field validation routine so that it looks at the request object rather than the form object.
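The server-side variation could be sketched like this. A plain object stands in for the ASP request object, and the rules table, field names, patterns, and messages are all hypothetical. The point is only the loop that keeps going after the first failure.

```javascript
// Hypothetical rules table: one pattern and message per submitted field.
var rules = {
    account: { template: /^\d{6}$/, message: "Account numbers have 6 digits." },
    part:    { template: /^\d{4}$/, message: "Part numbers have 4 digits." }
};

// Checks every field and returns all error messages, not just the first.
// The request argument is a plain object standing in for the submitted values.
function collect_errors(request) {
    var errors = [];
    for (var name in rules) {
        var value = request[name] || "";
        if (!rules[name].template.test(value)) {
            errors[errors.length] = rules[name].message;
        }
    }
    return errors;
}
```

An empty array means every field passed. Otherwise the page can print each message in the array before redisplaying the form.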

Conclusion
      Although they look a little complicated at first, the regular expressions in Microsoft JScript 3.0 are actually very easy to use, and they make data validation a breeze. Once you recognize the power of these neglected objects, they will gain a favored position in your repertoire. JScript 3.0 is included in Internet Explorer 4.0 and higher, comes with Internet Information Server 4.0 or better, and is available as an upgrade to older versions of Internet Explorer from the Microsoft Scripting Web site at http://msdn.microsoft.com/scripting.

Also see the sidebar: What About VBScript?

From the October 1998 issue of Microsoft Interactive Developer.