INF: How to Manipulate DBCS Data in a Text FieldLast reviewed: August 29, 1997Article ID: Q172308 |
The information in this article applies to:
SUMMARYThis article describes how to correctly manipulate double-byte character set (DBCS) data in text fields.
MORE INFORMATIONUPDATETEXT and WRITETEXT are powerful tools that allow users to modify existing text data through a text pointer. While using a text pointer, it is very common for users to accidentally split a double-byte character in half. This may result in "data corruption" from user's perspective. The following scripts demonstrate this common problem.
USE pubs GO SET NOCOUNT ON DROP TABLE textTable GO CREATE TABLE textTable ( id SMALLINT IDENTITY(1,1) PRIMARY KEY CLUSTERED, hexValues VARBINARY(20) NOT NULL, textField TEXT NOT NULL ) GO DECLARE @hexValues VARBINARY(20) SELECT @hexValues = 0xa540a541a542 INSERT textTable VALUES (@hexValues, CONVERT (VARCHAR(20), @hexValues)) INSERT textTable VALUES (@hexValues, CONVERT (VARCHAR(20), @hexValues)) GO SELECT * FROM textTable DECLARE @hexValues VARBINARY(20) SELECT @hexValues = 0xa543 DECLARE @string VARCHAR(20) SELECT @string = CONVERT(VARCHAR(20), @hexValues) DECLARE @textPointer VARBINARY(16) SELECT @textPointer = TEXTPTR(textField) FROM textTable WHERE id = 2 UPDATETEXT textTable.textField @textPointer 1 2 @string UPDATE textTable SET hexValues = CONVERT (VARBINARY(20), CONVERT(VARCHAR(20), textField)) WHERE id = 2As the execution result indicates, the text data becomes invalid because a double-byte character has been split. It is important for users to check whether the target offset from the start of the existing text is a lead byte or not, so that incorrect results can be avoided. The following sample scripts demonstrate checking the target offset of a text field in a SQL Server that was installed with code page 950. Please note the lead byte ranges may vary in the other DBCS code pages. For more information about lead byte ranges in different code pages, please see "Developing International Software for Windows 95 and Windows NT" (ISBN 1-55615-840-8).
CREATE PROCEDURE IsThisLeadByteInCP950 @string VARCHAR(255), @offset SMALLINT /* @string: is the target string that will be checked */ /* @offset: is the 1-based starting position, */ /* specified as the number of bytes (from */ /* the start of the existing text value) to */ /* skip. */ AS SELECT @string = RIGHT(REVERSE(@string), @offset) DECLARE @i SMALLINT SELECT @i = ASCII(@string) IF @i >= 129 and @i <= 254 /* lead byte range: 0x81 to 0xFE */ RETURN 1 ELSE RETURN 0 GO DECLARE @hexValues VARBINARY(20) SELECT @hexValues = 0xa540a541a542 DECLARE @string VARCHAR(20) SELECT @string = CONVERT(VARCHAR(20), @hexValues) DECLARE @i SMALLINT EXEC @i = IsThisLeadByteInCP950 @string, 1 IF (@i = 1) PRINT "It is a lead byte in code page 950, command aborted." ELSE PRINT "It is not a lead byte in code page 950, command continue..." |
Additional query words: JIS Wansung Johab Hangul KB BIG-5 932 936 949 950
© 1998 Microsoft Corporation. All rights reserved. Terms of Use. |