INF: How to Manipulate DBCS Data in a Text Field

Last reviewed: August 29, 1997
Article ID: Q172308
The information in this article applies to:
  • Microsoft SQL Server, version 6.5

SUMMARY

This article describes how to correctly manipulate double-byte character set (DBCS) data in text fields.

MORE INFORMATION

UPDATETEXT and WRITETEXT are powerful tools that allow users to modify existing text data through a text pointer. While using a text pointer, it is very common for users to accidentally split a double-byte character in half. This may result in "data corruption" from user's perspective. The following scripts demonstrate this common problem.

   USE pubs
   GO

   SET NOCOUNT ON

   DROP TABLE textTable
   GO

   CREATE TABLE textTable
   (
      id SMALLINT
         IDENTITY(1,1)
         PRIMARY KEY CLUSTERED,
      hexValues VARBINARY(20) NOT NULL,
      textField TEXT NOT NULL
   )
   GO

   DECLARE @hexValues VARBINARY(20)
   SELECT @hexValues = 0xa540a541a542
   INSERT textTable VALUES
      (@hexValues,
         CONVERT (VARCHAR(20), @hexValues))
   INSERT textTable VALUES
      (@hexValues,
         CONVERT (VARCHAR(20), @hexValues))
   GO

   SELECT * FROM textTable

   DECLARE @hexValues VARBINARY(20)
   SELECT @hexValues = 0xa543
   DECLARE @string VARCHAR(20)
   SELECT @string = CONVERT(VARCHAR(20), @hexValues)

   DECLARE @textPointer VARBINARY(16)
   SELECT @textPointer = TEXTPTR(textField)
      FROM textTable WHERE id = 2
   UPDATETEXT textTable.textField @textPointer 1 2 @string

   UPDATE textTable
      SET hexValues =
         CONVERT (VARBINARY(20), CONVERT(VARCHAR(20), textField))
      WHERE id = 2

As the execution result indicates, the text data becomes invalid because a double-byte character has been split. It is important for users to check whether the target offset from the start of the existing text is a lead byte or not, so that incorrect results can be avoided. The following sample scripts demonstrate checking the target offset of a text field in a SQL Server that was installed with code page 950. Please note the lead byte ranges may vary in the other DBCS code pages. For more information about lead byte ranges in different code pages, please see "Developing International Software for Windows 95 and Windows NT" (ISBN 1-55615-840-8).

   CREATE PROCEDURE IsThisLeadByteInCP950
      @string VARCHAR(255), @offset SMALLINT
   /* @string: is the target string that will be checked */
   /* @offset: is the 1-based starting position,         */
   /*          specified as the number of bytes (from    */
   /*          the start of the existing text value) to  */
   /*          skip.                                     */
   AS
      SELECT @string = RIGHT(REVERSE(@string), @offset)
      DECLARE @i SMALLINT
      SELECT @i = ASCII(@string)
      IF @i >= 129 and @i <= 254
      /* lead byte range: 0x81 to 0xFE */
         RETURN 1
      ELSE
         RETURN 0
   GO

   DECLARE @hexValues VARBINARY(20)
   SELECT @hexValues = 0xa540a541a542
   DECLARE @string VARCHAR(20)
   SELECT @string = CONVERT(VARCHAR(20), @hexValues)
   DECLARE @i SMALLINT
   EXEC @i = IsThisLeadByteInCP950 @string, 1
   IF (@i = 1)
      PRINT "It is a lead byte in code page 950, command aborted."
   ELSE
      PRINT "It is not a lead byte in code page 950, command continue..."


Additional query words: JIS Wansung Johab Hangul KB BIG-5 932 936 949 950
traditional Chinese Japanese Korean codepage cp
Keywords : SSrvGen kbusage
Version : WINDOWS:6.5
Platform : WINDOWS
Issue type : kbhowto


THE INFORMATION PROVIDED IN THE MICROSOFT KNOWLEDGE BASE IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND. MICROSOFT DISCLAIMS ALL WARRANTIES, EITHER EXPRESS OR IMPLIED, INCLUDING THE WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT SHALL MICROSOFT CORPORATION OR ITS SUPPLIERS BE LIABLE FOR ANY DAMAGES WHATSOEVER INCLUDING DIRECT, INDIRECT, INCIDENTAL, CONSEQUENTIAL, LOSS OF BUSINESS PROFITS OR SPECIAL DAMAGES, EVEN IF MICROSOFT CORPORATION OR ITS SUPPLIERS HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. SOME STATES DO NOT ALLOW THE EXCLUSION OR LIMITATION OF LIABILITY FOR CONSEQUENTIAL OR INCIDENTAL DAMAGES SO THE FOREGOING LIMITATION MAY NOT APPLY.

Last reviewed: August 29, 1997
© 1998 Microsoft Corporation. All rights reserved. Terms of Use.