Platform SDK: International Features

LCMapString

The LCMapString function maps one character string to another, performing a specified locale-dependent transformation. The function can also be used to generate a sort key for the input string.

int LCMapString(
  LCID Locale,       // locale identifier
  DWORD dwMapFlags,  // mapping transformation type
  LPCTSTR lpSrcStr,  // source string
  int cchSrc,        // number of characters in source string
  LPTSTR lpDestStr,  // destination buffer
  int cchDest        // size of destination buffer
);

Parameters

Locale
[in] Specifies a locale identifier. The locale provides a context for the string mapping or sort key generation. An application can use the MAKELCID macro to create a locale identifier.
dwMapFlags
[in] Specifies the type of transformation to be used during string mapping or sort key generation. An application can specify more than one of these options on a single transformation, although some combinations are invalid. The following mapping options are defined; restrictions are noted following the table.
Option                      Meaning
LCMAP_BYTEREV               Windows NT/2000: Uses byte reversal. For example, if you pass in 0x3450 0x4822, the result is 0x5034 0x2248.
LCMAP_FULLWIDTH             Uses wide characters (where applicable).
LCMAP_HALFWIDTH             Uses narrow characters (where applicable).
LCMAP_HIRAGANA              Maps Katakana characters to Hiragana.
LCMAP_KATAKANA              Maps Hiragana characters to Katakana.
LCMAP_LINGUISTIC_CASING     Uses linguistic rules for casing, rather than file system rules (the default). Valid with LCMAP_LOWERCASE or LCMAP_UPPERCASE only.
LCMAP_LOWERCASE             Uses lowercase.
LCMAP_SIMPLIFIED_CHINESE    Windows NT 4.0 and later: Maps traditional Chinese characters to simplified Chinese characters.
LCMAP_SORTKEY               Produces a normalized sort key for the source string.
LCMAP_TRADITIONAL_CHINESE   Windows NT 4.0 and later: Maps simplified Chinese characters to traditional Chinese characters.
LCMAP_UPPERCASE             Uses uppercase.
NORM_IGNORECASE             Ignores case.
NORM_IGNOREKANATYPE         Does not differentiate between Hiragana and Katakana characters. Corresponding Hiragana and Katakana characters compare as equal.
NORM_IGNORENONSPACE         Ignores nonspacing characters. This flag also removes Japanese accent characters.
NORM_IGNORESYMBOLS          Ignores symbols.
NORM_IGNOREWIDTH            Does not differentiate between a single-byte character and the same character as a double-byte character.
SORT_STRINGSORT             Treats punctuation the same as symbols.

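The NORM_* flags are normalization options that are used in combination with the LCMAP_SORTKEY flag. The one exception, described under Remarks, is that NORM_IGNORENONSPACE and NORM_IGNORESYMBOLS can be specified with all other option flags cleared in order to strip characters from the input string.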

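If the LCMAP_SORTKEY flag is not specified, the LCMapString function performs string mapping. In this case the following restrictions apply:

  LCMAP_LOWERCASE and LCMAP_UPPERCASE are mutually exclusive.
  LCMAP_HIRAGANA and LCMAP_KATAKANA are mutually exclusive.
  LCMAP_HALFWIDTH and LCMAP_FULLWIDTH are mutually exclusive.
  LCMAP_TRADITIONAL_CHINESE and LCMAP_SIMPLIFIED_CHINESE are mutually exclusive.
  LCMAP_LOWERCASE and LCMAP_UPPERCASE are not valid in combination with any of these flags: LCMAP_HIRAGANA, LCMAP_KATAKANA, LCMAP_HALFWIDTH, LCMAP_FULLWIDTH.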

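When the LCMAP_SORTKEY flag is specified, the LCMapString function generates a sort key. In this case the following restriction applies:

  LCMAP_SORTKEY is mutually exclusive with all other LCMAP_* flags, with the sole exception of LCMAP_BYTEREV.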

lpSrcStr
[in] Pointer to a source string that the function maps or uses for sort key generation.
cchSrc
[in] Specifies the number of TCHARs in the string pointed to by the lpSrcStr parameter.

This count can include the null terminator or omit it. Including the null terminator in the count does not materially affect the mapping behavior, because the null character is considered unsortable and always maps to itself.

A cchSrc value of -1 specifies that the string pointed to by lpSrcStr is null-terminated. In that case, if LCMapString is being used in its string-mapping mode, the function calculates the string's length itself and null-terminates the mapped string stored into *lpDestStr.

lpDestStr
[out] Pointer to a buffer that receives the mapped string or sort key.

If LCMAP_SORTKEY is specified, LCMapString stores a sort key into the buffer. The sort key is stored as an array of byte values in the following format:

[all Unicode sort weights] 0x01 [all Diacritic weights] 0x01 [all Case weights] 0x01 [all Special weights] 0x00 

Note that the sort key is null-terminated. This is true regardless of the value of cchSrc. Also note that, even if some of the sort weights are absent from the sort key, due to the presence of one or more ignore flags in dwMapFlags, the 0x01 separators and the 0x00 terminator are still present.
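The layout described above can be walked with a small helper. The following is a minimal sketch in portable C, not part of the SDK: the SortKeyFields type and the split_sort_key function are hypothetical names, and the sketch assumes that the byte 0x01 never occurs inside a weight field.

```c
#include <assert.h>
#include <stddef.h>

/* Lengths, in bytes, of the four weight fields of a sort key.
   Hypothetical helper type; not declared in any SDK header. */
typedef struct {
    size_t unicode_len;
    size_t diacritic_len;
    size_t case_len;
    size_t special_len;
} SortKeyFields;

/*
 * Walks a null-terminated sort key and records the length of each field.
 * Returns 1 on success, or 0 if the key does not contain exactly three
 * 0x01 separators before the 0x00 terminator.
 * Assumption: 0x01 never appears inside a weight field.
 */
int split_sort_key(const unsigned char *key, SortKeyFields *out)
{
    size_t len[4] = {0, 0, 0, 0};
    int field = 0;

    assert(key != NULL && out != NULL);

    for (; *key != 0x00; ++key) {
        if (*key == 0x01) {
            if (++field > 3)
                return 0;           /* too many separators */
        } else {
            ++len[field];           /* one more byte in the current field */
        }
    }
    if (field != 3)
        return 0;                   /* too few separators */

    out->unicode_len   = len[0];
    out->diacritic_len = len[1];
    out->case_len      = len[2];
    out->special_len   = len[3];
    return 1;
}
```

A field length of zero corresponds to weights that are absent from the key, for example because an ignore flag was set; the separators are still counted.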

cchDest
[in] Specifies the size of the buffer pointed to by lpDestStr: in TCHARs for string mapping, or in bytes for sort key generation.

If the function is being used for string mapping, the size is a character count. If space for a NULL terminator is included in cchSrc, then cchDest must also include space for a NULL terminator.

If the function is being used to generate a sort key, the size is a byte count. This byte count must include space for the sort key 0x00 terminator.

If cchDest is zero, the function's return value is the number of characters, or bytes if LCMAP_SORTKEY is specified, required to hold the mapped string or sort key. In this case, the buffer pointed to by lpDestStr is not used.

Return Values

If the function succeeds, and the value of cchDest is nonzero, the return value is the number of characters, or bytes if LCMAP_SORTKEY is specified, written to the buffer. This count includes room for a NULL terminator.

If the function succeeds, and the value of cchDest is zero, the return value is the size of the buffer in characters, or bytes if LCMAP_SORTKEY is specified, required to receive the translated string or sort key. This size includes room for a NULL terminator.
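The two success cases suggest a query-then-map calling pattern: call once with cchDest set to zero to learn the required size, allocate, and call again. The sketch below shows that pattern in portable C. It does not call the real function; MapFn is a hypothetical function-pointer type with the same parameter order as the ANSI entry point, and demo_upper_map is a made-up stand-in that only follows the contract documented here (ASCII uppercasing, no locale handling). On Windows, an application would pass a thin wrapper around LCMapString instead.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in type with the parameter order of LCMapString. */
typedef int (*MapFn)(unsigned long locale, unsigned long flags,
                     const char *src, int cchSrc,
                     char *dest, int cchDest);

/*
 * Demo mapper (not the real API): uppercases ASCII, treats cchSrc == -1
 * as "source is null-terminated", returns the required size when
 * cchDest == 0, and returns 0 on failure.
 */
int demo_upper_map(unsigned long locale, unsigned long flags,
                   const char *src, int cchSrc, char *dest, int cchDest)
{
    int i, n;

    (void)locale;
    (void)flags;
    if (src == NULL || cchSrc == 0 || cchSrc < -1 || cchDest < 0)
        return 0;
    /* With -1 the count includes the terminator, so the output is terminated. */
    n = (cchSrc == -1) ? (int)strlen(src) + 1 : cchSrc;
    if (cchDest == 0)
        return n;                   /* size query; dest is not used */
    if (dest == NULL || cchDest < n)
        return 0;                   /* buffer too small */
    for (i = 0; i < n; ++i)
        dest[i] = (char)toupper((unsigned char)src[i]);
    return n;
}

/*
 * Two-call pattern: query the size, allocate, then map.
 * Returns a malloc'ed buffer that the caller frees, or NULL on failure.
 * If written is not NULL, it receives the count returned by the second call.
 */
char *map_alloc(MapFn map, unsigned long locale, unsigned long flags,
                const char *src, int *written)
{
    int needed, got;
    char *buf;

    assert(map != NULL && src != NULL);

    needed = map(locale, flags, src, -1, NULL, 0);
    if (needed == 0)
        return NULL;
    buf = malloc((size_t)needed);
    if (buf == NULL)
        return NULL;
    got = map(locale, flags, src, -1, buf, needed);
    if (got == 0) {
        free(buf);
        return NULL;
    }
    if (written != NULL)
        *written = got;
    return buf;
}
```

The sketch allocates one byte per returned unit, which fits the ANSI string-mapping case and the sort key case. With the Unicode version in string-mapping mode, the returned count is in characters, so the allocation would need to be multiplied by the size of a WCHAR.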

If the function fails, the return value is 0. To get extended error information, call GetLastError. GetLastError may return one of the following error codes:

  ERROR_INSUFFICIENT_BUFFER
  ERROR_INVALID_FLAGS
  ERROR_INVALID_PARAMETER

Remarks

The mapped string is null-terminated if the source string is null-terminated.

The ANSI version of this function maps strings to and from Unicode based on the specified LCID's default ANSI code page.

For the ANSI version of this function, the LCMAP_UPPERCASE flag produces the same result as AnsiUpper in the locale. Likewise, the LCMAP_LOWERCASE flag produces the same result as AnsiLower. This function always maps a single character to a single character.

If LCMAP_UPPERCASE or LCMAP_LOWERCASE is set and if LCMAP_SORTKEY is not set, the lpSrcStr and lpDestStr pointers can be the same. Otherwise, the lpSrcStr and lpDestStr pointers must not be the same. If they are the same, the function fails, and GetLastError returns ERROR_INVALID_PARAMETER.

If the LCMAP_HIRAGANA flag is specified to map Katakana characters to Hiragana characters, and LCMAP_FULLWIDTH is not specified, the function only maps full-width characters to Hiragana. In this case, any half-width Katakana characters are placed as-is in the output string, with no mapping to Hiragana. An application must specify LCMAP_FULLWIDTH if it wants half-width Katakana characters mapped to Hiragana.

The output is in WCHAR or CHAR format only when LCMapString is used in its string-mapping mode. In sort key generation mode, specified by LCMAP_SORTKEY, the output is an array of byte values, even if the Unicode version of the function is called. An application can compare sort keys by using a byte-by-byte comparison.
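As an illustration of byte-by-byte comparison, the sketch below sorts records by previously generated sort keys. It is portable C and makes several assumptions: KeyedString and sort_by_key are hypothetical names, the keys are assumed to have been produced earlier with LCMAP_SORTKEY and without LCMAP_BYTEREV, and each key is assumed to be null-terminated as described under lpDestStr.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record pairing a display string with its sort key. */
typedef struct {
    const char *text;           /* original string, kept for display */
    const unsigned char *key;   /* null-terminated sort key, produced earlier */
} KeyedString;

/* qsort comparator: orders records by their sort keys, byte by byte.
   strcmp interprets each byte as unsigned char, which suits raw key bytes. */
static int by_sort_key(const void *pa, const void *pb)
{
    const KeyedString *a = pa;
    const KeyedString *b = pb;
    return strcmp((const char *)a->key, (const char *)b->key);
}

/* Sorts items in place by sort key. */
void sort_by_key(KeyedString *items, size_t count)
{
    assert(items != NULL || count == 0);
    if (count > 1)
        qsort(items, count, sizeof items[0], by_sort_key);
}
```

Generating each key once and comparing keys repeatedly avoids re-running the locale-dependent comparison for every pair during a sort.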

An application can call the function with the NORM_IGNORENONSPACE and NORM_IGNORESYMBOLS flags set, and all other options flags cleared, in order to simply strip characters from the input string. If this is done with an input string that is not null-terminated, it is possible for LCMapString to return an empty string and not return an error.

The LCMapString function ignores the Arabic Kashida. If an application calls the function to create a sort key for a string containing an Arabic Kashida, there will be no sort key value for the Kashida.

The function treats the hyphen and apostrophe differently from other punctuation symbols, so that words like coop and co-op stay together in a list. All punctuation symbols other than the hyphen and apostrophe sort before the alphanumeric characters. An application can change this behavior by setting the SORT_STRINGSORT flag. See CompareString for a more detailed discussion of this issue.

When LCMapString is used to generate a sort key, by setting the LCMAP_SORTKEY flag, the sort key stored into *lpDestStr may contain an odd number of bytes. The LCMAP_BYTEREV option only reverses an even number of bytes. If both options are chosen, the last (odd-positioned) byte in the sort key is not reversed. If the terminating 0x00 byte is an odd-positioned byte, then it remains the last byte in the sort key. If the terminating 0x00 byte is an even-positioned byte, it exchanges positions with the byte that precedes it.
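The pairwise reversal described above can be modeled in a few lines of portable C. This is only a sketch of the documented behavior, not the system's implementation, and byterev_pairs is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Swaps each adjacent pair of bytes in place. If len is odd, the final
 * byte has no partner and is left where it is.
 */
void byterev_pairs(unsigned char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL || len == 0);

    for (i = 0; i + 1 < len; i += 2) {
        unsigned char tmp = buf[i];
        buf[i] = buf[i + 1];
        buf[i + 1] = tmp;
    }
}
```

Applied to a five-byte key, the terminating 0x00 is the fifth (odd-positioned) byte and stays last. Applied to a six-byte key, the terminator is the sixth (even-positioned) byte and trades places with the byte before it, so the reversed key no longer ends in 0x00.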

When the LCMAP_SORTKEY flag is specified, the function transforms the source string into a sort key such that comparing two sort keys with strcmp yields the same order as comparing the two original strings with CompareString. The output is a null-terminated byte string, but its values are not meaningful display values.

Windows 2000: The ANSI version of this function will fail if it is used with a Unicode-only locale. See Language Identifiers.

Requirements

  Windows NT/2000: Requires Windows NT 3.1 or later.
  Windows 95/98: Requires Windows 95 or later.
  Header: Declared in Winnls.h; include Windows.h.
  Library: Use Kernel32.lib.
  Unicode: Implemented as Unicode and ANSI versions on Windows NT/2000.

See Also

National Language Support Overview, National Language Support Functions, AnsiLower, AnsiUpper, CompareString, FoldString, MAKELCID