This function returns character-type information for the characters in the specified source string. For each character in the string, the function sets one or more bits in the corresponding 16-bit element of the output array. Each bit identifies a given character type, such as whether the character is a letter, a digit, or neither.
At a Glance
Header file: | Winnls.h |
Windows CE versions: | 1.0 and later |
Syntax
BOOL GetStringType (DWORD dwInfoType, LPCWSTR lpSrcStr, int cchSrc, LPWORD lpCharType);
Parameters
dwInfoType
[in] Specifies the type of character information to retrieve. The types are divided into levels (see the Remarks section for the information included in each level). This parameter can specify one of the following character type flags:
Value | Description |
CT_CTYPE1 | Retrieve character type information. |
CT_CTYPE2 | Retrieve bidirectional layout information. |
CT_CTYPE3 | Retrieve text processing information. |
lpSrcStr
[in] Pointer to the string for which character types are requested. This must be a Unicode string. If cchSrc is -1, the string is assumed to be null terminated.
cchSrc
[in] Specifies the size, in characters, of the string pointed to by the lpSrcStr parameter. If this count includes a terminating null character, the function returns character type information for the terminating null character. If this value is -1, the string is assumed to be null terminated and the length is calculated automatically.
lpCharType
[out] Pointer to an array of 16-bit values. The array must be large enough to receive one 16-bit value for each character counted by the cchSrc parameter. When the function returns, this array contains one word corresponding to each Unicode character in the source string.
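As a rough illustration of the sizing rule for lpCharType, the sketch below computes how many 16-bit elements to allocate for a given lpSrcStr and cchSrc pair. The helper name char_type_slots is hypothetical and not part of the API. Reserving an extra element for the terminating null when cchSrc is negative is an assumption made to stay on the safe side, not behavior stated on this page.

```c
#include <stddef.h>

/* Hypothetical helper (not part of the API): number of 16-bit elements
   the lpCharType array should hold for a given source string and count. */
size_t char_type_slots(const unsigned short *src, int cchSrc)
{
    size_t n = 0;

    /* An explicit count maps one-to-one onto output elements. */
    if (cchSrc >= 0)
        return (size_t)cchSrc;

    /* Negative count (the documented value is -1): measure the string
       up to its terminating null. */
    while (src[n] != 0)
        n++;

    /* Assumption: keep one spare element in case the terminator is
       reported as well. Over-allocating by one element is harmless. */
    return n + 1;
}
```

A caller would allocate char_type_slots(str, cch) elements of type WORD before invoking the function.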
Return Values
Nonzero indicates success. Zero indicates failure. To get extended error information, call GetLastError. Possible values for GetLastError include ERROR_INVALID_PARAMETER.
Remarks
The GetStringType function is designed for Unicode strings only. In the Microsoft Platform SDK, this function is named GetStringTypeW to distinguish it from the similar ANSI-string function, GetStringTypeA. Because Windows CE supports only Unicode, it needs only the single GetStringType function.
The lpSrcStr and lpCharType pointers must not be the same. If they are the same, the function fails and GetLastError returns ERROR_INVALID_PARAMETER.
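The following sketch shows one way a call might look, with the output array kept separate from the source string as required. The names count_matching and count_digits are hypothetical, the 64-character limit is arbitrary, and the stand-in typedefs exist only so that the bit-scanning helper also compiles off-target. The _WIN32_WCE guard is assumed to be defined by the Windows CE toolchain.

```c
#include <stddef.h>

#ifdef _WIN32_WCE
#include <windows.h>
#else
/* Stand-in types so the helper below also compiles off-target. */
typedef unsigned short WORD;
typedef unsigned short WCHAR;
#endif

/* Hypothetical helper: counts the elements of a filled lpCharType array
   that have every bit of `mask` set. */
size_t count_matching(const WORD *types, size_t n, WORD mask)
{
    size_t i, hits = 0;

    for (i = 0; i < n; i++) {
        if ((types[i] & mask) == mask)
            hits++;
    }
    return hits;
}

#ifdef _WIN32_WCE
/* Hypothetical wrapper: counts decimal digits in the first cch characters
   of text. Returns -1 if cch is out of range or the call fails. */
int count_digits(const WCHAR *text, int cch)
{
    WORD types[64];   /* separate buffer: must not be the same as text */

    if (cch < 0 || cch > 64)
        return -1;
    if (!GetStringType(CT_CTYPE1, text, cch, types))
        return -1;    /* GetLastError() reports the reason */
    return (int)count_matching(types, (size_t)cch, C1_DIGIT);
}
#endif
```

Using a local array for the results makes it impossible to pass the same pointer for both lpSrcStr and lpCharType by accident.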
The character-type bits are divided into several levels. The information for one level can be retrieved by a single call to this function. Each level is limited to 16 bits of information so that the other mapping routines, which are limited to 16 bits of representation per character, can also return character-type information.
The character types supported by this function include the following.
Ctype 1
These types support ANSI C and POSIX (LC_CTYPE) character-typing functions. A combination of these values is returned in the array pointed to by the lpCharType parameter when the dwInfoType parameter is set to CT_CTYPE1.
Name | Value | Description |
C1_UPPER | 0x0001 | Uppercase |
C1_LOWER | 0x0002 | Lowercase |
C1_DIGIT | 0x0004 | Decimal digits |
C1_SPACE | 0x0008 | Space characters |
C1_PUNCT | 0x0010 | Punctuation |
C1_CNTRL | 0x0020 | Control characters |
C1_BLANK | 0x0040 | Blank characters |
C1_XDIGIT | 0x0080 | Hexadecimal digits |
C1_ALPHA | 0x0100 | Any linguistic character: alphabetic, syllabary, or ideographic |
The following character types are either constant or computable from basic types and do not need to be supported by this function.
Type | Description |
Alphanumeric | Alphabetic characters and digits (C1_ALPHA or C1_DIGIT) |
Printable | Graphic characters and blanks (all C1_* types except C1_CNTRL) |
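The two derived types in the table above can be computed from a returned Ctype 1 word with simple bit tests. This is a minimal sketch: the function names are hypothetical, the constants are copied from the Ctype 1 table (on target they come from Winnls.h), and the printable test reflects one reading of the table, namely that at least one C1_* bit is set and C1_CNTRL is not.

```c
/* Ctype 1 bits used here, copied from the table above.
   On target these definitions come from Winnls.h. */
#ifndef C1_ALPHA
#define C1_DIGIT 0x0004
#define C1_CNTRL 0x0020
#define C1_ALPHA 0x0100
#endif

/* Alphanumeric: the character is alphabetic or a decimal digit. */
int is_alphanumeric(unsigned short type)
{
    return (type & (C1_ALPHA | C1_DIGIT)) != 0;
}

/* Printable (one reading of the table): some C1_* bit is set and
   the control-character bit is clear. */
int is_printable(unsigned short type)
{
    return type != 0 && (type & C1_CNTRL) == 0;
}
```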
Ctype 2
These types support proper layout of Unicode text. The direction attributes are assigned so that the bidirectional layout algorithm standardized by Unicode produces accurate results. These types are mutually exclusive: each character is assigned exactly one of the values below when dwInfoType is set to CT_CTYPE2, so the returned word should be compared for equality rather than tested bit by bit. For more information about the use of these attributes, see The Unicode Standard: Worldwide Character Encoding, Volumes 1 and 2, Addison Wesley Publishing Company: 1991, 1992, ISBN 0201567881.
Name | Value | Description |
Strong | ||
C2_LEFTTORIGHT | 0x0001 | Left to right |
C2_RIGHTTOLEFT | 0x0002 | Right to left |
Weak | ||
C2_EUROPENUMBER | 0x0003 | European number, European digit |
C2_EUROPESEPARATOR | 0x0004 | European numeric separator |
C2_EUROPETERMINATOR | 0x0005 | European numeric terminator |
C2_ARABICNUMBER | 0x0006 | Arabic number |
C2_COMMONSEPARATOR | 0x0007 | Common numeric separator |
Neutral | ||
C2_BLOCKSEPARATOR | 0x0008 | Block separator |
C2_SEGMENTSEPARATOR | 0x0009 | Segment separator |
C2_WHITESPACE | 0x000A | White space |
C2_OTHERNEUTRAL | 0x000B | Other neutrals |
Not applicable | ||
C2_NOTAPPLICABLE | 0x0000 | No implicit directionality (for example, control codes) |
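Because the Ctype 2 values are enumerated rather than bit flags, a bitwise test can mislead: C2_EUROPENUMBER (0x0003) has the C2_LEFTTORIGHT bit (0x0001) set. The sketch below groups a returned word into the categories of the table by comparing values instead. The enum and function names are hypothetical, and the constants are copied from the table (on target they come from Winnls.h).

```c
/* Ctype 2 values used here, copied from the table above.
   On target these definitions come from Winnls.h. */
#ifndef C2_LEFTTORIGHT
#define C2_NOTAPPLICABLE   0x0000
#define C2_LEFTTORIGHT     0x0001
#define C2_RIGHTTOLEFT     0x0002
#define C2_COMMONSEPARATOR 0x0007
#define C2_OTHERNEUTRAL    0x000B
#endif

/* Hypothetical grouping that mirrors the table's section rows. */
typedef enum {
    BIDI_NOT_APPLICABLE,
    BIDI_STRONG,
    BIDI_WEAK,
    BIDI_NEUTRAL,
    BIDI_UNKNOWN      /* value not listed in the table */
} BidiGroup;

BidiGroup bidi_group(unsigned short type)
{
    if (type == C2_NOTAPPLICABLE)
        return BIDI_NOT_APPLICABLE;
    if (type <= C2_RIGHTTOLEFT)        /* 0x0001 to 0x0002 */
        return BIDI_STRONG;
    if (type <= C2_COMMONSEPARATOR)    /* 0x0003 to 0x0007 */
        return BIDI_WEAK;
    if (type <= C2_OTHERNEUTRAL)       /* 0x0008 to 0x000B */
        return BIDI_NEUTRAL;
    return BIDI_UNKNOWN;
}
```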
Ctype 3
These types are intended to be placeholders for extensions to the POSIX types required for general text processing or for the standard C library functions. A combination of these values is returned when dwInfoType is set to CT_CTYPE3.
Name | Value | Description |
C3_NONSPACING | 0x0001 | Nonspacing mark |
C3_DIACRITIC | 0x0002 | Diacritic nonspacing mark |
C3_VOWELMARK | 0x0004 | Vowel nonspacing mark |
C3_SYMBOL | 0x0008 | Symbol |
C3_KATAKANA | 0x0010 | Katakana character |
C3_HIRAGANA | 0x0020 | Hiragana character |
C3_HALFWIDTH | 0x0040 | Half-width character |
C3_FULLWIDTH | 0x0080 | Full-width character |
C3_IDEOGRAPH | 0x0100 | Ideographic character |
C3_KASHIDA | 0x0200 | Arabic Kashida character |
C3_LEXICAL | 0x0400 | Punctuation that is counted as part of the word (Kashida, hyphen, feminine/masculine ordinal indicators, equal sign, and so forth) |
C3_ALPHA | 0x8000 | All linguistic characters (alphabetic, syllabary, and ideographic) |
Not applicable | ||
C3_NOTAPPLICABLE | 0x0000 | Not applicable |
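Ctype 3 values are bit flags that can be combined, so a returned word may need to be tested for any one of several bits or for several bits at once. The sketch below illustrates both cases. The function names are hypothetical, and the constants are copied from the table above (on target they come from Winnls.h).

```c
/* Ctype 3 bits used here, copied from the table above.
   On target these definitions come from Winnls.h. */
#ifndef C3_KATAKANA
#define C3_KATAKANA  0x0010
#define C3_HIRAGANA  0x0020
#define C3_HALFWIDTH 0x0040
#endif

/* Any-of test: the character is Katakana or Hiragana. */
int is_kana(unsigned short type)
{
    return (type & (C3_KATAKANA | C3_HIRAGANA)) != 0;
}

/* All-of test: the character is Katakana and also half-width. */
int is_halfwidth_katakana(unsigned short type)
{
    const unsigned short both = C3_KATAKANA | C3_HALFWIDTH;

    return (type & both) == both;
}
```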
See Also