BOOL GetStringTypeW(dwInfoType, lpSrcStr, cchSrc, lpCharType)
DWORD dwInfoType;
LPWSTR lpSrcStr;
int cchSrc;
LPWORD lpCharType;
The GetStringTypeW function returns CTYPE information about a Unicode string.
dwInfoType
Specifies the type of character information to retrieve. The types are divided into levels (see the comments at the end of this call description for the information included in each level). The options are mutually exclusive. For the first release the following types will be supported:
CT_CTYPE1, CT_CTYPE2, CT_CTYPE3
lpSrcStr
The string for which character types are requested. If cchSrc is -1, then lpSrcStr is assumed to be null-terminated.
cchSrc
The number of characters in lpSrcStr. NOTE: lpCharType must have room for the same number of elements.
lpCharType
An array of the same length as lpSrcStr (cchSrc elements) that, on output, contains one word for each Unicode character in lpSrcStr.
Success: TRUE
Failure: FALSE
On failure, this function sets the last-error code (retrieved with GetLastError) to one of the following errors: ERROR_INVALID_PARAMETER.
The lpSrcStr and lpCharType pointers must not be the same; if they are, the function fails with ERROR_INVALID_PARAMETER.
The character type bits are divided into several levels. One level's information can be retrieved by a single call. Each level is limited to 16 bits of information so that the other mapping routines can also return ctype information (i.e., all other mapping routines are limited to 16 bits of representation per character).
The various character types supported by this function include the following:
Ctype 1: These are the types needed to support ANSI C and POSIX (LC_CTYPE) character typing functions. A bitwise OR of these values is returned when dwInfoType is set to CT_CTYPE1.
Name | Value | Meaning |
C1_UPPER | 0x0001 | uppercase |
C1_LOWER | 0x0002 | lowercase |
C1_DIGIT | 0x0004 | decimal digits |
C1_SPACE | 0x0008 | space characters |
C1_PUNCT | 0x0010 | punctuation |
C1_CNTRL | 0x0020 | control characters |
C1_BLANK | 0x0040 | blank characters |
C1_XDIGIT | 0x0080 | hex digits |
C1_ALPHA | 0x0100 | any letter |
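Because a CT_CTYPE1 result is a bitwise OR of the values above, individual properties are tested with a bitwise AND. The following is a minimal sketch of that decoding, not part of the API: the helper name c1_describe is hypothetical, the constants are copied from the table, and unsigned short stands in for the 16-bit WORD so the code compiles without the Windows headers.

```c
#include <stdio.h>

/* Ctype 1 bits, copied from the table above. */
#define C1_UPPER  0x0001
#define C1_LOWER  0x0002
#define C1_DIGIT  0x0004
#define C1_SPACE  0x0008
#define C1_PUNCT  0x0010
#define C1_CNTRL  0x0020
#define C1_BLANK  0x0040
#define C1_XDIGIT 0x0080
#define C1_ALPHA  0x0100

/* Hypothetical helper (not part of the API): writes the names of the bits
   set in `type` into `out`, separated by '|', and returns `out`.
   `type` is one element of the lpCharType array from a CT_CTYPE1 call. */
char *c1_describe(unsigned short type, char *out, size_t outSize)
{
    static const struct { unsigned short bit; const char *name; } table[] = {
        { C1_UPPER, "UPPER" },   { C1_LOWER, "LOWER" }, { C1_DIGIT, "DIGIT" },
        { C1_SPACE, "SPACE" },   { C1_PUNCT, "PUNCT" }, { C1_CNTRL, "CNTRL" },
        { C1_BLANK, "BLANK" },   { C1_XDIGIT, "XDIGIT" }, { C1_ALPHA, "ALPHA" }
    };
    size_t i, used = 0;

    if (outSize == 0)
        return out;
    out[0] = '\0';

    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (type & table[i].bit) {          /* bit is set in the result word */
            int n = snprintf(out + used, outSize - used, "%s%s",
                             used ? "|" : "", table[i].name);
            if (n < 0 || (size_t)n >= outSize - used)
                break;                      /* buffer full: stop appending */
            used += (size_t)n;
        }
    }
    return out;
}
```

For example, passing C1_UPPER | C1_ALPHA with a 64-byte buffer produces the string "UPPER|ALPHA"; a word with no bits set produces an empty string.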
The following character types are either constant or computable from basic types and do not need to be supported by this call:
Alphanumeric (alpha + digits)
Printable (graphic + blank)
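The computable types can be derived from a CT_CTYPE1 word by the caller. The sketch below shows one way to do so. The function names are hypothetical, and because this description does not define "graphic", the code assumes the POSIX meaning (alphanumeric or punctuation); adjust that predicate if a different definition applies.

```c
/* Ctype 1 bits used here, copied from the Ctype 1 table. */
#define C1_DIGIT 0x0004
#define C1_PUNCT 0x0010
#define C1_BLANK 0x0040
#define C1_ALPHA 0x0100

/* Alphanumeric = alpha + digits. */
int c1_is_alnum(unsigned short type)
{
    return (type & (C1_ALPHA | C1_DIGIT)) != 0;
}

/* ASSUMPTION: "graphic" is taken in the POSIX sense (alnum or punct);
   the function description itself does not define it. */
int c1_is_graph(unsigned short type)
{
    return (type & (C1_ALPHA | C1_DIGIT | C1_PUNCT)) != 0;
}

/* Printable = graphic + blank. */
int c1_is_print(unsigned short type)
{
    return c1_is_graph(type) || (type & C1_BLANK) != 0;
}
```

Each function takes one element of the lpCharType array and returns nonzero when the character belongs to the derived class.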
Ctype 2: These are the types supplied to support proper layout of Unicode text. The directionality attributes are assigned so that the BiDi layout algorithm standardized by Unicode produces the correct results. See The Unicode Standard: Worldwide Character Encoding, published by Addison-Wesley, for more information on the use of these attributes.
Name | Value | Meaning |
Strong:
C2_LEFTTORIGHT | 0x1 | left-to-right |
C2_RIGHTTOLEFT | 0x2 | right-to-left |
Weak:
C2_EUROPENUMBER | 0x3 | European number, European digit |
C2_EUROPESEPARATOR | 0x4 | European numeric separator |
C2_EUROPETERMINATOR | 0x5 | European numeric terminator |
C2_ARABICNUMBER | 0x6 | Arabic number |
C2_COMMONSEPARATOR | 0x7 | common numeric separator |
Neutral:
C2_BLOCKSEPARATOR | 0x8 | block separator |
C2_SEGMENTSEPARATOR | 0x9 | segment separator |
C2_WHITESPACE | 0xA | white space |
C2_OTHERNEUTRAL | 0xB | other neutrals |
Not applicable:
C2_NOTAPPLICABLE | 0x0 | no implicit directionality, e.g., control codes |
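Unlike the Ctype 1 values, the Ctype 2 values run consecutively from 0x0 to 0xB, so they appear to form an enumeration, not a set of bit flags: each character carries exactly one of them. A bit test would give wrong answers here (C2_EUROPENUMBER, 0x3, has both the 0x1 and 0x2 bits set but is neither left-to-right nor right-to-left), so the result word should be compared for equality. The sketch below maps a value to its group from the table; the DirStrength type and the c2_strength function are hypothetical names introduced for illustration.

```c
/* Ctype 2 values, copied from the table above. */
#define C2_NOTAPPLICABLE    0x0
#define C2_LEFTTORIGHT      0x1
#define C2_RIGHTTOLEFT      0x2
#define C2_EUROPENUMBER     0x3
#define C2_EUROPESEPARATOR  0x4
#define C2_EUROPETERMINATOR 0x5
#define C2_ARABICNUMBER     0x6
#define C2_COMMONSEPARATOR  0x7
#define C2_BLOCKSEPARATOR   0x8
#define C2_SEGMENTSEPARATOR 0x9
#define C2_WHITESPACE       0xA
#define C2_OTHERNEUTRAL     0xB

/* Hypothetical grouping type mirroring the table's headings. */
typedef enum {
    DIR_NOT_APPLICABLE,
    DIR_STRONG,
    DIR_WEAK,
    DIR_NEUTRAL
} DirStrength;

/* Hypothetical helper: classifies one CT_CTYPE2 result word by equality,
   never by bit masking. Values outside the table map to DIR_NOT_APPLICABLE. */
DirStrength c2_strength(unsigned short type)
{
    switch (type) {
    case C2_LEFTTORIGHT:
    case C2_RIGHTTOLEFT:
        return DIR_STRONG;
    case C2_EUROPENUMBER:
    case C2_EUROPESEPARATOR:
    case C2_EUROPETERMINATOR:
    case C2_ARABICNUMBER:
    case C2_COMMONSEPARATOR:
        return DIR_WEAK;
    case C2_BLOCKSEPARATOR:
    case C2_SEGMENTSEPARATOR:
    case C2_WHITESPACE:
    case C2_OTHERNEUTRAL:
        return DIR_NEUTRAL;
    default:
        return DIR_NOT_APPLICABLE;
    }
}
```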
Ctype 3: Not yet fully defined, this level is intended as a placeholder for extensions to the POSIX types required for general text processing or for the C run-time libraries. The types listed here are those supported in product 1. A bitwise OR of these values is returned when dwInfoType is set to CT_CTYPE3.
Name | Value | Meaning |
C3_NONSPACING | 0x1 | nonspacing mark |
C3_DIACRITIC | 0x2 | diacritic nonspacing mark |
C3_VOWELMARK | 0x4 | vowel nonspacing mark |
C3_SYMBOL | 0x8 | symbol |
Not applicable:
C3_NOTAPPLICABLE | 0x0 | not applicable |
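One possible use of the Ctype 3 bits is to count the characters in a string that occupy their own position, skipping nonspacing marks. The sketch below works on an array already filled by a CT_CTYPE3 call. The function name c3_count_spacing is hypothetical, unsigned short stands in for WORD, and the code assumes that every nonspacing mark, including diacritics and vowel marks, is reported with C3_NONSPACING set.

```c
#include <stddef.h>

/* Ctype 3 bits, copied from the table above. */
#define C3_NOTAPPLICABLE 0x0
#define C3_NONSPACING    0x1
#define C3_DIACRITIC     0x2
#define C3_VOWELMARK     0x4
#define C3_SYMBOL        0x8

/* Hypothetical helper: returns how many entries of a CT_CTYPE3 result
   array do not have the C3_NONSPACING bit set.
   ASSUMPTION: diacritic and vowel marks also carry C3_NONSPACING. */
size_t c3_count_spacing(const unsigned short *types, size_t count)
{
    size_t i, n = 0;

    for (i = 0; i < count; i++) {
        if ((types[i] & C3_NONSPACING) == 0)
            n++;                 /* not a nonspacing mark: counts as a cell */
    }
    return n;
}
```

The types array is the lpCharType buffer and count is the cchSrc value used in the call.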