This function converts a URL to a canonical form, including conversion of unsafe characters into escape sequences.
At a Glance
Header file: | Wininet.h |
Windows CE versions: | 2.0 and later |
Syntax
BOOL WINAPI InternetCanonicalizeUrl(
  LPCTSTR lpszUrl,
  LPTSTR lpszBuffer,
  LPDWORD lpdwBufferLength,
  DWORD dwFlags
);
Parameters
lpszUrl
Long pointer to the null-terminated string that contains the input URL to canonicalize.
lpszBuffer
Long pointer to the buffer that receives the null-terminated string that contains the resulting canonicalized URL.
lpdwBufferLength
Long pointer to a variable that holds the length, in bytes, of the lpszBuffer buffer. If the function succeeds, this parameter receives the length of the string written to lpszBuffer, not including the terminating null. If the function fails, this parameter receives the required length, in bytes, of the lpszBuffer buffer, including the terminating null.
dwFlags
Specifies flags that control canonicalization. Can be a combination of the following values:
Value | Description |
ICU_BROWSER_MODE | Does not encode or decode characters after "#" or "?", and does not remove trailing white space after "?". If this value is not specified, the entire URL is encoded, and trailing white space is removed. |
ICU_DECODE | Converts all %XX sequences to characters, including escape sequences, before the URL is parsed. |
ICU_ENCODE_SPACES_ONLY | Encodes spaces only. |
ICU_NO_ENCODE | Does not convert unsafe characters to escape sequences. |
ICU_NO_META | Does not remove meta sequences (such as "." and "..") from the URL. |
If no flags are specified (dwFlags = 0), the function converts all unsafe characters and meta sequences (such as \., \.., and \...) to escape sequences.
Return Values
TRUE indicates success. FALSE indicates failure. To get extended error information, call GetLastError. Possible error values for GetLastError are described in the following table.
Value | Meaning |
ERROR_BAD_PATHNAME | The URL could not be canonicalized. |
ERROR_INSUFFICIENT_BUFFER | The canonicalized URL is too large to fit in the buffer provided. The *lpdwBufferLength parameter is set to the size, in bytes, of the buffer required to hold the canonicalized URL. |
ERROR_INTERNET_INVALID_URL | The format of the URL is invalid. |
ERROR_INVALID_PARAMETER | Bad string, buffer, buffer size, or flags parameter. |
Windows CE Remarks
On Windows CE, the lpdwBufferLength parameter is a count of characters, not bytes. If the function succeeds, this parameter receives the length, in characters, of the string written to lpszBuffer, excluding the terminating null. If the function fails, this parameter receives the required length, in characters, of the lpszBuffer buffer, including the terminating null.
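The length contract described above suggests a try-then-retry calling pattern. The following is a minimal sketch of that pattern only, written in portable C so that it compiles without Wininet.h. Both demo_canonicalize and canonicalize_alloc are hypothetical names introduced for illustration: demo_canonicalize is a stand-in that copies the URL unchanged while following the same success and failure length rules, and it is not the real function. It works on char strings, so characters and bytes coincide here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in, NOT InternetCanonicalizeUrl: it copies the URL
   unchanged but follows the length contract described above.
   Success: returns 1, *len = characters written, excluding the null.
   Failure: returns 0, *len = characters required, including the null. */
static int demo_canonicalize(const char *url, char *buf,
                             unsigned long *len, unsigned long flags)
{
    unsigned long needed;

    assert(url != NULL && len != NULL);
    (void)flags;                          /* the stand-in ignores flags */
    needed = (unsigned long)strlen(url) + 1;
    if (buf == NULL || *len < needed) {
        *len = needed;                    /* report the required size */
        return 0;
    }
    memcpy(buf, url, needed);
    *len = needed - 1;                    /* report the string length */
    return 1;
}

/* Tries a small stack buffer first, then retries with a heap buffer of the
   reported size. Returns a malloc'ed string the caller must free, or NULL. */
static char *canonicalize_alloc(const char *url, unsigned long flags,
                                unsigned long *out_len)
{
    char small[8];
    unsigned long len = sizeof small;
    char *result;

    assert(out_len != NULL);
    if (demo_canonicalize(url, small, &len, flags)) {
        result = malloc(len + 1);         /* len excludes the null */
        if (result == NULL) return NULL;
        memcpy(result, small, len + 1);
    } else {
        /* With the real API, confirm that GetLastError() returned
           ERROR_INSUFFICIENT_BUFFER before retrying. */
        result = malloc(len);             /* len already includes the null */
        if (result == NULL) return NULL;
        if (!demo_canonicalize(url, result, &len, flags)) {
            free(result);
            return NULL;
        }
    }
    *out_len = len;
    return result;
}
```

In a real program the two demo_canonicalize calls would presumably be replaced by InternetCanonicalizeUrl with the same argument order, and the buffer would be sized in TCHARs rather than chars. Note the asymmetry the sketch has to handle: the success value excludes the terminating null, while the failure value includes it.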
Remarks
InternetCanonicalizeUrl always encodes by default, even if the ICU_DECODE flag has been specified. If ICU_DECODE is used without ICU_NO_ENCODE, the URL is decoded before being parsed, and unsafe characters are then re-encoded after parsing. To decode without re-encoding, use ICU_DECODE | ICU_NO_ENCODE. This function handles arbitrary protocol schemes, but to do so it must make inferences from the unsafe character set.
The application calling InternetCanonicalizeUrl should track whether a particular URL has already been passed through this function. If unsafe characters in a URL have already been converted to escape sequences, calling InternetCanonicalizeUrl again on that URL (with no flags) encodes the escape sequences themselves. For example, a blank space in a URL is converted to the escape sequence "%20". Calling InternetCanonicalizeUrl again on the URL converts "%20" to "%2520", because the "%" sign is an unsafe character that is reserved for escape sequences and is replaced by the function with the escape sequence "%25".
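The double-encoding effect can be reproduced with a deliberately simplified model. The sketch below is not the WinINet implementation: escape_unsafe and second_pass_changes are hypothetical helpers, and the encoder treats only the space and "%" characters as unsafe, which is an assumption made to keep the example short. The real function considers a larger set of characters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified encoder: escapes only ' ' and '%' as %XX.
   Returns 1 on success, 0 if the output buffer is too small. */
static int escape_unsafe(const char *in, char *out, size_t out_size)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t n = 0;

    assert(in != NULL && out != NULL && out_size > 0);
    for (; *in != '\0'; ++in) {
        unsigned char c = (unsigned char)*in;
        if (c == ' ' || c == '%') {
            if (n + 3 >= out_size) return 0;   /* 3 chars plus the null */
            out[n++] = '%';
            out[n++] = hex[c >> 4];
            out[n++] = hex[c & 0x0F];
        } else {
            if (n + 1 >= out_size) return 0;   /* 1 char plus the null */
            out[n++] = (char)c;
        }
    }
    out[n] = '\0';
    return 1;
}

/* Returns 1 if escaping the once-escaped URL a second time changes it,
   which is the double-encoding case described above. Returns 0 if the
   second pass leaves it unchanged or the URL does not fit the buffers. */
static int second_pass_changes(const char *url)
{
    char once[256], twice[256];

    if (!escape_unsafe(url, once, sizeof once)) return 0;
    if (!escape_unsafe(once, twice, sizeof twice)) return 0;
    return strcmp(once, twice) != 0;
}
```

Under this model, passing "http://example.com/a b" (an illustrative URL) through escape_unsafe once gives "http://example.com/a%20b", and passing that result through again gives "http://example.com/a%2520b". An application that cannot tell whether a URL is already escaped may prefer to record that fact alongside the URL rather than guess.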