GetStringTypeW

  BOOL GetStringTypeW(dwInfoType, lpSrcStr, cchSrc, lpCharType)    
  DWORD dwInfoType;    
  LPWSTR lpSrcStr;    
  int cchSrc;
  LPWORD lpCharType;    

The GetStringTypeW function returns CTYPE information about a Unicode string.

Parameters

dwInfoType

Specifies the type of character information to retrieve. The types are divided into levels; the Comments section lists the information included in each level. The options are mutually exclusive. The first release supports the following types:

  CT_CTYPE1
  CT_CTYPE2
  CT_CTYPE3

lpSrcStr

The string for which character types are requested. If cchSrc is -1, then lpSrcStr is assumed to be null-terminated.

cchSrc

The number of characters in lpSrcStr. NOTE: this must also be the number of elements in lpCharType.

lpCharType

An array of the same length as lpSrcStr (cchSrc). On output, it contains one word for each Unicode character in lpSrcStr.

Return Value

Success: TRUE

Failure: FALSE

On failure, this function sets the last-error code, which GetLastError returns, to one of the following errors: ERROR_INVALID_PARAMETER.

Comments

The lpSrcStr and lpCharType pointers must NOT be the same. If they are the same, the function fails with the error ERROR_INVALID_PARAMETER.
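
The argument rules described so far can be checked before the call is made. The following is a minimal sketch and not part of the API: effective_count and word_t are hypothetical names, and rejecting null pointers or negative counts other than -1 is an assumption. For the -1 case the sketch counts the characters before the terminator; this description does not say whether the terminator itself is classified.

```c
#include <stddef.h>
#include <wchar.h>

typedef unsigned short word_t;  /* stands in for the 16-bit WORD type */

/* Hypothetical pre-check: returns the number of characters that would be
   classified, or -1 if the arguments break one of the documented rules. */
static int effective_count(const wchar_t *src, int cch_src, const word_t *types)
{
    /* ASSUMPTION: null pointers are treated as invalid parameters. */
    if (src == NULL || types == NULL)
        return -1;

    /* Documented: lpSrcStr and lpCharType may not be the same pointer. */
    if ((const void *)src == (const void *)types)
        return -1;

    /* Documented: cchSrc == -1 means the string is null-terminated. */
    if (cch_src == -1)
        return (int)wcslen(src);

    /* ASSUMPTION: any other negative count is rejected. */
    if (cch_src < 0)
        return -1;

    return cch_src;
}
```

The value returned is also the minimum number of words the lpCharType array must hold, since both buffers share the same count.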

The character type bits are divided into several levels. A single call retrieves the information for one level. Each level is limited to 16 bits of information so that the other mapping routines, which are all limited to 16 bits of representation per character, can also return ctype information.

The various character types supported by this function include the following:

Ctype 1: These are the types needed to support ANSI C and POSIX (LC_CTYPE) character typing functions. A bitwise OR of these values is returned when dwInfoType is set to CT_CTYPE1.

Name        Value    Meaning

C1_UPPER    0x0001   uppercase
C1_LOWER    0x0002   lowercase
C1_DIGIT    0x0004   decimal digits
C1_SPACE    0x0008   space characters
C1_PUNCT    0x0010   punctuation
C1_CNTRL    0x0020   control characters
C1_BLANK    0x0040   blank characters
C1_XDIGIT   0x0080   hex digits
C1_ALPHA    0x0100   any letter
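
Because a CT_CTYPE1 word is a bitwise OR of the values above, each attribute is tested with a bitwise AND. The following sketch decodes one word into a readable list of names. The helper ctype1_to_string is hypothetical, and the constants are repeated from the table only so that the sketch builds without the platform headers.

```c
#include <stdio.h>
#include <string.h>

/* Values copied from the Ctype 1 table. */
#define C1_UPPER  0x0001
#define C1_LOWER  0x0002
#define C1_DIGIT  0x0004
#define C1_SPACE  0x0008
#define C1_PUNCT  0x0010
#define C1_CNTRL  0x0020
#define C1_BLANK  0x0040
#define C1_XDIGIT 0x0080
#define C1_ALPHA  0x0100

static const struct {
    unsigned short bit;
    const char *name;
} c1_names[] = {
    { C1_UPPER,  "C1_UPPER"  }, { C1_LOWER, "C1_LOWER" },
    { C1_DIGIT,  "C1_DIGIT"  }, { C1_SPACE, "C1_SPACE" },
    { C1_PUNCT,  "C1_PUNCT"  }, { C1_CNTRL, "C1_CNTRL" },
    { C1_BLANK,  "C1_BLANK"  }, { C1_XDIGIT, "C1_XDIGIT" },
    { C1_ALPHA,  "C1_ALPHA"  }
};

/* Hypothetical helper: writes the '|'-separated names of the bits set in
   ctype into out (always null-terminated when out_size > 0), returns out. */
static char *ctype1_to_string(unsigned short ctype, char *out, size_t out_size)
{
    size_t i, used = 0;

    if (out_size == 0)
        return out;
    out[0] = '\0';

    for (i = 0; i < sizeof c1_names / sizeof c1_names[0]; i++) {
        int n;

        if (!(ctype & c1_names[i].bit))
            continue;               /* this attribute is not set */

        n = snprintf(out + used, out_size - used, "%s%s",
                     used ? "|" : "", c1_names[i].name);
        if (n < 0 || (size_t)n >= out_size - used)
            break;                  /* output truncated; stop here */
        used += (size_t)n;
    }
    return out;
}
```

A word with several bits set yields every matching name in table order, and a word of zero yields an empty string.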

The following character types are either constant or computable from the basic types and do not need to be supported by this call:

  Alphanumeric (alpha + digits)
  Printable (graphic + blank)
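
The derived types named above can be computed from a CT_CTYPE1 word, as in the sketch below. The function names are hypothetical. This description does not define "graphic"; the sketch assumes it means alpha, digit, or punctuation, following the usual POSIX reading, and that assumption should be checked before the code is relied on.

```c
/* Values copied from the Ctype 1 table. */
#define C1_DIGIT 0x0004
#define C1_PUNCT 0x0010
#define C1_BLANK 0x0040
#define C1_ALPHA 0x0100

/* Alphanumeric = alpha + digits. */
static int c1_is_alnum(unsigned short ctype)
{
    return (ctype & (C1_ALPHA | C1_DIGIT)) != 0;
}

/* ASSUMPTION: graphic = alpha, digit, or punctuation. */
static int c1_is_graphic(unsigned short ctype)
{
    return (ctype & (C1_ALPHA | C1_DIGIT | C1_PUNCT)) != 0;
}

/* Printable = graphic + blank. */
static int c1_is_printable(unsigned short ctype)
{
    return c1_is_graphic(ctype) || (ctype & C1_BLANK) != 0;
}
```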

Ctype 2: These are the types supplied to support proper layout of Unicode text. The directionality attributes are assigned so that the BiDi layout algorithm standardized by Unicode produces the correct results. See The Unicode Standard: Worldwide Character Encoding (Addison-Wesley) for more information on the use of these attributes.

Name                   Value   Meaning

Strong:
C2_LEFTTORIGHT         0x1     left-to-right
C2_RIGHTTOLEFT         0x2     right-to-left
Weak:
C2_EUROPENUMBER        0x3     European number, European digit
C2_EUROPESEPARATOR     0x4     European numeric separator
C2_EUROPETERMINATOR    0x5     European numeric terminator
C2_ARABICNUMBER        0x6     Arabic number
C2_COMMONSEPARATOR     0x7     common numeric separator
Neutral:
C2_BLOCKSEPARATOR      0x8     block separator
C2_SEGMENTSEPARATOR    0x9     segment separator
C2_WHITESPACE          0xA     white space
C2_OTHERNEUTRAL        0xB     other neutrals
Not applicable:
C2_NOTAPPLICABLE       0x0     no implicit directionality, e.g., control codes
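
The Ctype 2 values are consecutive numbers rather than separate bits (0x3, for example, shares bits with 0x1 and 0x2), so a result word appears intended to be compared for equality rather than tested with a bitwise AND. The sketch below maps a word to its group from the table under that reading. The enum and the function c2_group_of are hypothetical names.

```c
/* Values copied from the Ctype 2 table. */
#define C2_NOTAPPLICABLE    0x0
#define C2_LEFTTORIGHT      0x1
#define C2_RIGHTTOLEFT      0x2
#define C2_EUROPENUMBER     0x3
#define C2_COMMONSEPARATOR  0x7
#define C2_BLOCKSEPARATOR   0x8
#define C2_OTHERNEUTRAL     0xB

/* Hypothetical grouping that mirrors the table's section labels. */
enum c2_group {
    GROUP_NOT_APPLICABLE,
    GROUP_STRONG,
    GROUP_WEAK,
    GROUP_NEUTRAL,
    GROUP_UNKNOWN       /* value not listed in the table */
};

static enum c2_group c2_group_of(unsigned short ctype)
{
    if (ctype == C2_NOTAPPLICABLE)
        return GROUP_NOT_APPLICABLE;
    if (ctype == C2_LEFTTORIGHT || ctype == C2_RIGHTTOLEFT)
        return GROUP_STRONG;
    if (ctype >= C2_EUROPENUMBER && ctype <= C2_COMMONSEPARATOR)
        return GROUP_WEAK;
    if (ctype >= C2_BLOCKSEPARATOR && ctype <= C2_OTHERNEUTRAL)
        return GROUP_NEUTRAL;
    return GROUP_UNKNOWN;
}
```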

Ctype 3: This level is not yet fully defined. It is intended as a placeholder for extensions to the POSIX types that are required for general text processing or for the C Runtimes. The types listed here are those supported in product 1. A bitwise OR of these values is returned when dwInfoType is set to CT_CTYPE3.

Name                Value   Meaning

C3_NONSPACING       0x1     nonspacing mark
C3_DIACRITIC        0x2     diacritic nonspacing mark
C3_VOWELMARK        0x4     vowel nonspacing mark
C3_SYMBOL           0x8     symbol
Not applicable:
C3_NOTAPPLICABLE    0x0     not applicable
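
Because lpCharType receives one word per character, a CT_CTYPE3 result can be scanned as an ordinary array. The sketch below counts the characters that are not nonspacing marks. The function count_base_chars is a hypothetical name, and treating C3_DIACRITIC and C3_VOWELMARK as nonspacing even when C3_NONSPACING is not also set is an assumption based on the meanings given in the table.

```c
#include <stddef.h>

/* Values copied from the Ctype 3 table. */
#define C3_NONSPACING 0x1
#define C3_DIACRITIC  0x2
#define C3_VOWELMARK  0x4

/* ASSUMPTION: any of these three bits marks a nonspacing character. */
#define C3_ANY_MARK (C3_NONSPACING | C3_DIACRITIC | C3_VOWELMARK)

/* Hypothetical helper: counts the entries in a CT_CTYPE3 result array
   that carry none of the nonspacing-mark bits. */
static size_t count_base_chars(const unsigned short *types, size_t count)
{
    size_t i, n = 0;

    for (i = 0; i < count; i++) {
        if ((types[i] & C3_ANY_MARK) == 0)
            n++;
    }
    return n;
}
```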