International Installation and Configuration Issues

This section provides comprehensive guidelines for installing Windows NT, SQL Server, and SQL Server clients to support international applications.

Installing and Configuring Windows NT for SQL Server

Microsoft SQL Server 6.5 supports both Windows NT 3.51 and Windows NT 4.0. For convenience, localized editions of Windows NT 4.0 are listed in Appendix C. SQL Server's ANSI code page can be the same or different from the Windows NT installation that SQL Server is running on, so in most cases it does not matter which localized edition of Windows NT is installed. The exception to this is when a DBCS code page is used by a SQL Server installation. In this case, you should install the appropriate Asian edition of Windows NT because SQL Server depends upon several different operating system capabilities to support DBCS code pages.

Developers who want to simulate a different localized edition of Windows NT Server without having to procure additional localized editions will be interested in a Windows NT 4.0 feature called language packs. Installation of language packs add code page conversion and NLSAPI support for additional code pages. For example, a developer who is building applications for the Japanese market may be working with the U.S. version of Windows NT Server 4.0. Installing the Japanese language pack allows the developer to install SQL Server using the Japanese DBCS code page without having to purchase a copy of the Japanese edition of Windows NT Server.

Language packs and instruction for how to install them are located in the langpack subdirectory on the distribution CD-ROM for Windows NT 4.0. Language packs were designed to be used with Internet Explorer and Microsoft exchange and have not been tested with Microsoft SQL Server in a production environment. They are unsupported, and there are known reliability issues.

Once you have installed Windows NT, make sure that you use the appropriate locale setting for users who will be connecting using the default language. This is particularly important when using DBCS code pages. You can set the current locale for Windows NT in Control Panel using International (NT 3.51) or Regional Settings (NT 4.0).

Installing and Configuring SQL Server

There are additional considerations to install and configure SQL Server for international use beyond those discussed for Windows NT. These include code page and sort order choices, langid installation issues, and how to work with SQL Server utilities.

Installing a Code Page and Sort Order

During installation you are required to select a single code page with a single sort order for the entire server. It is critical that you select the appropriate code page and sort order when you first install SQL Server. If you need to change your code page or sort order, you must re-install SQL Server and reload your data using either bcp (bulk copy program) or batch Transact-SQL scripts.

The code page and sort order also determine how data is written to database and transaction log backups. Make sure that the code page and sort order used on your backup SQL Server are the same as the code page being used on the SQL Server you are restoring to. You can determine the current character set and sort order by running the sp_helpsort stored procedure.

Choosing a Code Page

You should use the same code page for the client and the server to avoid unpredictable results. If the code pages differ, extended characters can be corrupted in conversations between the client and the server unless you are using client-side character set conversion. A general rule is that you should choose a code page compatible with the ANSI code page being used by the localized edition(s) of Windows 95 or Windows NT that will be running on client workstations connecting to your SQL Server. For a comprehensive list of code pages supported by SQL Server, see Appendix B. The Appendixes also contain a comprehensive list of localized editions of Windows 95 and Windows NT and which ANSI code page they use by default.

If you plan to use multiple SQL Servers in cooperative distributed operations, such as remote stored procedure calls, distributed transactions, or replication, it is recommended that each SQL Server that participates in the distributed operation runs the same code page.

ANSI Code Pages

The English edition of the SQL Server Setup program refers to ANSI code page 1252 as the ISO Character Set. Code page 1252 is a commonly used ANSI code page because it covers most of the languages spoken in Western Europe. It is therefore a good choice for many Windows 95, Windows NT and Microsoft SQL Server installations (including those in the United States) and is also common in Unix and VMS installations of other third-party relational databases.

The English, French, and German localized editions of SQL Server 6.5 all present code page 1252 as the default code page choice in the SQL Setup program. Support is also provided for the common ANSI code pages used in localized editions of Windows 95 and Windows NT (code pages 1250 through 1257). Additionally, four DBCS code pages for Asian countries are included for Japanese, Chinese, and Korean installations (code pages 932, 936, 949, 950).

OEM Code Pages

Code Page 850

Code page 850 (also called Multilingual) is an OEM code page developed for European versions of MS-DOS. It replaced the upper 128 characters that were used for graphics in Code Page 437 with enough characters and symbols to cover most Western European languages. This code page was common with Microsoft SQL Server 4.x installations and should be used only if your application was written using MS-DOS and requires database storage of extended characters. ANSI code page 1252 is usually a better choice.

Code Page 437

Code page 437 (also called U.S. English) is an OEM code page that is commonly used in MS-DOS applications in the United States. The upper 128 characters in this code page were used for graphics characters. This code page was common with Microsoft SQL Server 1.x installations, and should be used only if your application was written using MS-DOS and requires database storage of extended characters. ANSI code page 1252 is usually a better choice.

Choosing a Sort Order

Although there are from 2 to 17 sort order options for each SQL Server code page, there are a few common types including dictionary sort orders, binary sort order, and a Windows NT provided sort order. You can view the definition file for each sort order by opening the appropriate text file in SQL Server's charsets directory. For a comprehensive list of all the sort order options for each character set, see Appendix B.

Dictionary Sort Orders

A dictionary sort order collates data alphabetically (for example, A, a, B, b, C, c in English). Dictionary sort orders are usually preferred by end users because SQL Server collates results in a culturally correct fashion, the way a phone book or dictionary is sorted. Depending upon the code page, there are several dictionary sort order options including case-sensitive, case-insensitive, uppercase preference, and accent-insensitive.

Case-sensitive

A case-sensitive sort order requires the user to specify search arguments using the exact case in which the data is stored (for example, "John" ¹ "john"). Additionally uppercase and lowercase characters sort separately during an order by operation. It is important to note that the case sensitivity of a sort order also controls how object names are validated. A table named "ORDERS" is different from a table named "orders" when using a case-sensitive sort order.

Case-insensitive

A case-insensitive sort order does not require matching case for search arguments (for example, "John" = "john"), and case is ignored during order by operations. Object names are validated without regard for case, for example, a table named "ORDERS" is the same as a table named "orders."

Uppercase Preference

A sort order that supports uppercase preference sorts uppercase characters before their corresponding lowercase characters. Uppercase preference has no impact on search arguments or name validation.

Accent-insensitive

An accent-insensitive sort order ignores diacritical marks in search arguments or ordering operations (for example, a = à = á = â).

Binary Sort Order

When using a binary sort order, data is collated using the decimal value representing each code point. This is the highest performance option because SQL Server does not have to check sort order rules when searching or collating data. Unfortunately, binary sort orders are not very user friendly (for example, A, B, C, a, b, c).

Windows NT-provided Sort Order

When using a Windows NT-provided sort order, SQL Server retrieves and caches sort keys using NLSAPI locale settings rather than retrieving sort keys from the syscharsets system table. This technique is used for DBCS character sets in SQL Server 6.5.

Custom Sort Orders

SQL Server provides a mechanism that allows Microsoft to incrementally add new sort orders after the product had been released. This feature is exposed in the Setup program as a custom sort order. When selecting a custom sort order, the Setup program prompts the user for a sort order file.

As of July 1997, there are currently no custom sort orders available from Microsoft, so it is unlikely that you will use this option in version 6.5. It is possible to create user-defined sort orders using this capability; however, this is unsupported and can complicate an upgrade to a future version of Microsoft SQL Server.

Additional Sort Orders for Code Page 850

Additional sort orders were provided for code page 850 in earlier versions of Microsoft SQL Server and are available in version 6.5 for backward compatibility. Appendix A of Microsoft SQL Server Setup describes these sort orders in detail. Because code page 850 is no longer recommended for application development it is not detailed here.

A "strict compatibility" sort order is provided for backward compatibility with SQL Server 1.x applications. Several "alternate" sort orders are provided with subtle differences from their dictionary order counterparts. Several "Scandinavian" sort orders are also provided with slight differences from their dictionary order counterparts.

Installing and Configuring langids

Changing the Default langid

It is possible to change the default langid for a SQL Server installation to that of a different localized edition of SQL Server without reinstalling. If you have previously installed additional langids, you can change the default langid to one of those previously installed by reconfiguring the default language server configuration option using the langid of the new default language. To do this, the SQL Server must be running ANSI code page 1252. Attempting to change the default language can yield unexpected results if SQL Server was installed using an OEM code page. This process cannot be performed unattended.

To change the default langid for a SQL Server installation

  1. Obtain copies of the SQL Server localization files Server.loc and Common.loc from the distribution CD-ROM for the appropriate localized edition of SQL Server.

  2. Run the SQL Server Setup program.

  3. Click the Add Language option.

  4. Type in the name of the new default language (for example, English, French, Français, German, or Deutsche).

  5. Type in the sa password.

  6. Click the Default Language option.

  7. Click OK.

  8. Type in the path to the files you obtained in Step 1 and click OK.

This table lists the choices for changing a default langid. Japanese is omitted because the Japanese edition of SQL Server 6.5 uses the Japanese DBCS code page whose extended characters are not compatible with ANSI code page 1252.

Current langid Possible new default langids
English French German
English ü ü
French ü ü
German ü ü

Adding Additional langids

SQL Server can be configured to support multiple langids in addition to the default. There are three methods for doing this depending upon which localized edition of SQL Server was initially installed and the alternate langid(s) you want to support. To determine which alternate langids are installed on your SQL Server, run the sp_helplanguage stored procedure.

Adding Additional Primary langids

Using the SQL Setup Server program to add an alternate langid is the preferred approach because it adds a localized set of system messages and SQL Server-defined locale settings in addition to the default langid. To do this the SQL Server must be running ANSI code page 1252. Attempting to add an alternate langid can yield unexpected results if SQL Server is installed using an OEM code page. This process cannot be performed unattended, and is the same steps for how to change the default langid. The only change is that you do not check the Default Language option.

Once you complete these steps, the Setup program adds an alternate langid to the syslanguages table and new localized system messages to the sysmessages table. This table lists the additional primary langids that can be installed. Japanese is omitted because the Japanese edition of SQL Server 6.5 uses the Japanese DBCS code page whose extended characters are not compatible with ANSI code page 1252.

Current langid Possible additional default langids
English French German
English ü ü
French ü ü
German ü ü

Adding Alternate langids

Instlang.sql is a Unicode Transact-SQL script located in SQL Server's install subdirectory. This script can be used to add SQL Server alternate langids for certain code pages. The script adds any compatible alternate langid for the currently installed code page. Remember that alternate langids do not provide localized system messages. For a comprehensive list of alternate langids supported by each code page, see the Appendix.

This approach is useful for enabling and testing SQL Server-supplied locale settings for alternate languages when there is no localized version of SQL Server available for the language in question. Appendix B lists each alternate langid that is added to the syslanguages table by running this script depending on the code page that is currently installed.

Adding User-defined Alternate langids

It is possible to add user-defined alternate langids by running an undocumented system stored procedure called sp_addlanguage. This approach is not recommended or supported because user-defined alternate languages can compromise an upgrade to a future version of SQL Server.

Working with SQL Server Utilities

This section discusses the available SQL Server utilities.

Localized Editions of SQL Server Setup

Localized editions of the SQL Server Setup program may use different defaults for character set and sort order choices. Setup and a subordinate installation program called Langinst.exe use two special localization files called Server.loc and Common.loc in order to install the default language, SQL Server-defined locale settings, and localized versions of system messages. These files are located on the distribution CD-ROM of each localized edition of SQL Server. Localized editions of the Setup program are otherwise identical.

Working with Transact-SQL Scripts

An important behavior developers must be aware of when using ISQL/w or the SQL Query Tool feature of SQL Enterprise Manager is that these tools perform an implicit conversion from the system's OEM code page to system's ANSI code page when reading script files from the disk for backward compatibility.

Conversely, an implicit conversion from the system's ANSI code page back to the system's OEM code page is made when these tools are used to save a script file to disk. If you use these tools to edit and save a script file that contains extended characters, there is a chance that those characters may be corrupted.

Scripts with extended characters should be edited and saved using a text editor that fully supports the system's ANSI code page to avoid data loss. You should also be wary of editing or saving scripts created on systems that use a different OEM code page than the one on the current system. Future versions of these tools will support a variety of code page choices when saving files to disk.

Working with BCP

Microsoft SQL Server 6.5 supports an interface called the DB-Library bcp (Bulk Copy Program) API that is used for importing and exporting data. This API is used by the bcp command-line utility as well as by the SQL-DMO Transfer object.

A couple of important system settings affect code page conversion when data is read and written using bcp interfaces. These settings are active system-wide and can be changed by using the SQL Server Client Configuration utility. These settings can be modified in an unattended fashion by modifying registry keys in:

HKEY_LOCALMACHINE/
   SOFTWARE/
      MICROSOFT/
         MSSQLSERVER/
            CLIENT/
               DB-LIB

AutoAnsiToOem Registry Key

When the AutoAnsiToOem registry key is set to ON and data is being bulk copied out of SQL Server, the data is encoded using the system's OEM code page. ANSI extended characters are automatically converted to the appropriate OEM counterpart if one exists. If one does not, a null character is used. When the AutoAnsiToOem key is set to OFF, SQL Server encodes the data using the system's ANSI code page.

UseIntlSettings Registry Key

When the UseIntlSettings registry key is set to ON SQL Server calls the NLSAPI using the current Windows locale to convert SQL Server datetime values to and from strings. When the key is set to OFF SQL Server uses the same default datetime styles that are used for the Transact-SQL CONVERT function.

Note that default date order of day, month, and year can change if data is being bulk copied across the network from a different system. To avoid this situation make sure that the same Windows locale settings are being used on the client and server computers if the UseIntlSettings registry key is enabled.

Installing and Configuring SQL Server Clients

International issues can affect the installation and configuration of client workstations running Windows 95 or Windows NT that will be used to run applications that connect to Microsoft SQL Server. This section addresses how to choose the right version of Windows for the client workstation, when you need to use client-side character set conversion, and how to specify a langid at connect time.

Windows NT and Windows 95 Client Installation

Windows 95 supports a single ANSI code page for Win32 applications and a single OEM code page for MS-DOS applications. The particular code pages used depend upon the localized edition of Windows. To ensure compatibility, you should choose an appropriate localized edition of Windows 95 that supports the code page being used by SQL Server. For a comprehensive list of localized editions of Windows 95 and which SQL Server code pages are supported by each, see the Appendixes.

Windows NT code page support depends upon the subsystem being used. The Win32 subsystem supports Unicode with code page translation for backward compatibility to ANSI and OEM code pages. Because SQL Server does not yet support Unicode, Windows NT systems that access data stored in SQL Server will use the Win32 ANSI code page for character encoding. Using the new language pack feature of Windows NT 4.0, developers can configure a Windows NT system to support multiple ANSI code pages. For a comprehensive list of localized editions of Windows NT, see Appendix C.

Client-side Code Page Conversion

Windows clients should use the same code page as Microsoft SQL Server whenever possible. Applications perform better and your environment is easier to support. In some cases you may be forced to configure SQL Server to use a legacy OEM code page, such as support for:

Using an OEM code page on SQL Server presents a problem for Win32 applications, which use ANSI code pages. In this situation you need to investigate using client-side character set conversion to ensure the proper operation of your Win32 applications. For a reference to Microsoft Knowledge Base article Q153449 that describes this scenario in detail, see "Finding More Information," later in this paper.

It is important to stress that client-side character set conversion requires additional processing and memory overhead at the client and need not be enabled if your Windows clients and the SQL Server are using the same ANSI code page. If the SQL Server is running a legacy OEM code page and it is being accessed by Windows applications, you should use either OemToAnsi conversion or an ODBC Translator at your Windows client to perform client-side code page conversion to avoid corruption of your data.

OemToAnsi Conversion

The OemToAnsi Conversion option should be used if the client's OEM code page is the same as SQL Server's code page. For example, a Windows 95 client that is configured to use locale ID 0403 (Catalan - Spain) uses code page 850 as its default OEM code page. If the SQL Server is also running OEM code page 850 you should enable the OemToAnsi option. To determine which OEM code page is being used by your Windows clients, see Appendix A.

The OemToAnsi option is the highest performance client-side character set conversion option. This is because the conversions are performed internally by the Microsoft SQL Server ODBC driver using code pages that are already installed on the operating system.

This option is disabled by default.

To enable the OemToAnsi

To enable the OemtoAnsi option using the ODBC Administrator

Conversion Using an ODBC Translator

ODBC provides a generic mechanism that allows an ODBC driver to translate data from one character set to another. This mechanism calls a DLL called a translation DLL to perform code page conversion. The ODBC driver, not the application, calls the translation DLL to perform translation.

This option should be used if SQL Server is using a legacy OEM code page that is different from the OEM code page being used by your Windows client. For example, a Windows 95 client that is configured to use locale ID 0403 (Catalan - Spain) uses code page 850 as its default OEM code page. If the SQL Server is running OEM code page 437 you should use an ODBC translator instead of the OemToAnsi option because the client's OEM code page is incompatible with SQL Server's.

The ODBC run-time files that ship with Microsoft SQL Server 6.5 include a translation DLL called Mscpxl32.dll, which can be used to handle conversions between Windows clients running ANSI code page 1252 and a SQL Server running OEM code pages 437 or 850. The decision to provide only these two translation options was based upon the installed base of SQL Server 6.5 and the fact that only two OEM code pages are supported. Developers familiar with the ODBC API could easily build additional translation DLLs to support character set conversions other than the ones handled by Microsoft's translation DLL.

To enable this option in an unattended fashion, make sure the OemToAnsi registry entry for the ODBC data source is set to "No." Then, modify the registry for the ODBC data source to use the Mscpxl32.dll translator DLL for code page conversion by adding these registry keys and values.

Key Type Value
OemToAnsi REG_SZ "No"
TranslationDLL REG_SZ Path to MSCPXL32.DLL in the System32 directory including the file name
TranslationName REG_SZ "MS Code Page Translator"
TranslationOption REG_SZ "12520850" if SQL Server is using code page 850, or "12520437" if SQL Server is using code page 437

To enable this option using the ODBC Administrator

  1. Disable the Convert OEM to ANSI Characters option.

  2. Click Select.

  3. In the list box, select the MS Code Page Translator option, then click OK.

  4. If the SQL Server is running OEM code page 850, select the Windows Latin 1(1252)/850 (Multilingual-Latin 1) option, or if the SQL Server is running OEM code page 437, select the Windows Latin 1/437 (English) option. Then click OK.

Requesting a langid at Connect Time

The Microsoft SQL Server ODBC driver supports the LANGUAGE connection option that allows applications to specify a langid at connect time. This is useful if you have configured SQL Server to support multiple langids.

This setting can be configured in an unattended fashion by modifying the value of the Language registry key for an ODBC data source with the name (not the ID) of the langid. The Microsoft SQL Server ODBC driver is hard-coded to accept the values "us_english," "deutsch," or "français."

To configure this setting using the ODBC Administrator

To determine the current language for a user connection, you can query the Transact-SQL @@LANGUAGE global variable.