Web Workshop  |  Networking, Protocols & Data Formats

InternetCanonicalizeUrl Function


Canonicalizes a URL, which includes converting unsafe characters and spaces into escape sequences.

Syntax

BOOL InternetCanonicalizeUrl(
    IN LPCTSTR lpszUrl,
    OUT LPTSTR lpszBuffer,
    IN OUT LPDWORD lpdwBufferLength,
    IN DWORD dwFlags
);

The actual syntax of this function varies between its ANSI and Unicode implementations. For more information, see Win32 Internet Functions Syntax.

Parameters

lpszUrl
Address of the string that contains the URL to canonicalize.
lpszBuffer
Address of the buffer that receives the resulting canonicalized URL.
lpdwBufferLength
Address of an unsigned long integer value that contains the length, in TCHARs, of the lpszBuffer buffer. If the function succeeds, this parameter receives the length of the canonicalized URL copied to lpszBuffer; the length does not include the terminating null character. If the function fails because the buffer is too small (ERROR_INSUFFICIENT_BUFFER), this parameter receives the required length, in bytes, of the lpszBuffer buffer; the required length includes the terminating null character.
dwFlags
Unsigned long integer value that contains the flags that control canonicalization. This can be zero or a combination of the following values:
ICU_BROWSER_MODE
Does not encode or decode characters after "#" or "?", and does not remove trailing white space after "?". If this value is not specified, the entire URL is encoded and trailing white space is removed.
ICU_DECODE
Converts all %XX sequences to characters, including escape sequences, before the URL is parsed.
ICU_ENCODE_PERCENT
Encodes any percent signs encountered. By default, percent signs are not encoded. This value is available in Microsoft® Internet Explorer 5 and later versions of the Win32® Internet functions.
ICU_ENCODE_SPACES_ONLY
Encodes spaces only.
ICU_NO_ENCODE
Does not convert unsafe characters to escape sequences.
ICU_NO_META
Does not remove meta sequences (such as "." and "..") from the URL.

If no flags are specified (dwFlags = 0), the function converts all unsafe characters to escape sequences and removes meta sequences (such as \., \.., and \...) from the URL.

Return Value

Returns TRUE if successful, or FALSE otherwise. To get extended error information, call GetLastError. Possible errors include:

ERROR_BAD_PATHNAME The URL could not be canonicalized. This error value is valid for Internet Explorer 5 and later versions of the Win32 Internet API.
ERROR_INSUFFICIENT_BUFFER The canonicalized URL is too large to fit in the buffer provided. The lpdwBufferLength parameter is set to the size, in bytes, of the buffer required to hold the canonicalized URL.
ERROR_INTERNET_INVALID_URL The format of the URL is invalid.
ERROR_INVALID_PARAMETER There is a bad string, buffer, buffer size, or flags parameter.
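
The interaction between lpdwBufferLength and ERROR_INSUFFICIENT_BUFFER suggests a retry loop: call once and, if the buffer is too small, reallocate to the reported size and call again. The following is a minimal sketch of that pattern under stated assumptions, not code from the Win32 Internet API. The names canon_status, canon_fn, canonicalize_alloc, fake_canon, and wininet_canon are hypothetical and exist only for illustration. The Windows adapter is compiled only when _WIN32 is defined and uses the ANSI entry point InternetCanonicalizeUrlA, so that byte and character counts coincide. fake_canon is a stand-in that copies the URL unchanged so the loop can be exercised on other platforms.

```c
#include <stdlib.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>
#include <wininet.h>   /* link with Wininet.lib */
#endif

/* Hypothetical status codes for this sketch (not part of the Win32 Internet API). */
enum canon_status { CANON_OK = 0, CANON_TOO_SMALL = 1, CANON_FAILED = 2 };

/* Hypothetical callback shape mirroring the documented contract:
 * on entry *len is the buffer capacity; on CANON_TOO_SMALL *len is the
 * required size including the terminating null; on CANON_OK *len is the
 * length written, excluding the terminating null. */
typedef enum canon_status (*canon_fn)(const char *url, char *buf,
                                      unsigned long *len, unsigned long flags);

/* Calls fn, growing the buffer when the reported size is insufficient.
 * Returns a malloc'ed string that the caller must free, or NULL on failure. */
char *canonicalize_alloc(canon_fn fn, const char *url, unsigned long flags)
{
    unsigned long cap = 64;            /* arbitrary starting capacity */
    char *buf = malloc(cap);
    int attempt;

    for (attempt = 0; buf != NULL && attempt < 4; attempt++) {
        unsigned long len = cap;
        enum canon_status st = fn(url, buf, &len, flags);
        char *grown;

        if (st == CANON_OK)
            return buf;
        if (st != CANON_TOO_SMALL || len <= cap)
            break;                     /* unrecoverable, or no progress */

        grown = realloc(buf, len);     /* len is the required size */
        if (grown == NULL)
            break;
        buf = grown;
        cap = len;
    }
    free(buf);
    return NULL;
}

/* Stand-in for platforms without WinINet: copies the URL unchanged. */
enum canon_status fake_canon(const char *url, char *buf,
                             unsigned long *len, unsigned long flags)
{
    unsigned long need = (unsigned long)strlen(url) + 1;
    (void)flags;
    if (need > *len) {
        *len = need;                   /* required size, including null */
        return CANON_TOO_SMALL;
    }
    memcpy(buf, url, need);
    *len = need - 1;                   /* length written, excluding null */
    return CANON_OK;
}

#ifdef _WIN32
/* Adapter around the ANSI entry point, where bytes and characters coincide. */
enum canon_status wininet_canon(const char *url, char *buf,
                                unsigned long *len, unsigned long flags)
{
    DWORD n = (DWORD)*len;
    BOOL ok = InternetCanonicalizeUrlA(url, buf, &n, (DWORD)flags);
    DWORD err = ok ? 0 : GetLastError();

    *len = n;
    if (ok)
        return CANON_OK;
    return err == ERROR_INSUFFICIENT_BUFFER ? CANON_TOO_SMALL : CANON_FAILED;
}
#endif
```

On Windows, a call such as canonicalize_alloc(wininet_canon, url, 0) would use the real function; elsewhere, passing fake_canon exercises only the resizing logic. The attempt limit is a design choice of this sketch that guards against a callback that keeps reporting a larger size.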

Remarks

In Internet Explorer 4.0 and later, InternetCanonicalizeUrl always functions as if the ICU_BROWSER_MODE flag is set. Client applications that need to canonicalize the entire URL should use either CoInternetParseUrl (with the action PARSE_CANONICALIZE and the flag URL_ESCAPE_UNSAFE) or UrlCanonicalize.

InternetCanonicalizeUrl always encodes by default, even if the ICU_DECODE flag has been specified. To decode without re-encoding, use ICU_DECODE | ICU_NO_ENCODE. If the ICU_DECODE flag is used without ICU_NO_ENCODE, the URL is decoded before being parsed; unsafe characters then are re-encoded after parsing. This function will handle arbitrary protocol schemes, but to do so it must make inferences from the unsafe character set.
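
To make concrete what decoding without re-encoding means, the sketch below converts %XX sequences back to characters and leaves everything else untouched, which is the effect the text attributes to ICU_DECODE | ICU_NO_ENCODE. It is an illustration in portable C, not the WinINet implementation: hex_value and percent_decode_in_place are hypothetical helpers, and the choice to copy malformed sequences unchanged is an assumption of this sketch.

```c
#include <ctype.h>
#include <stddef.h>

/* Value of one hexadecimal digit; the caller has already checked isxdigit(). */
static int hex_value(unsigned char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return tolower(c) - 'a' + 10;
}

/* Hypothetical helper: decodes %XX sequences in place and returns the new
 * length. Malformed sequences (such as "%zz" or a trailing '%') are copied
 * unchanged. Note that "%00" decodes to a null byte, which ends the C string
 * early even though the returned length counts the bytes after it. */
size_t percent_decode_in_place(char *s)
{
    char *r = s;   /* read position */
    char *w = s;   /* write position; never overtakes r */

    while (*r != '\0') {
        if (r[0] == '%' &&
            isxdigit((unsigned char)r[1]) &&
            isxdigit((unsigned char)r[2])) {
            *w++ = (char)(hex_value((unsigned char)r[1]) * 16 +
                          hex_value((unsigned char)r[2]));
            r += 3;
        } else {
            *w++ = *r++;
        }
    }
    *w = '\0';
    return (size_t)(w - s);
}
```

Applied to the string "a%20b%2Fc", this helper yields "a b/c". Without ICU_NO_ENCODE, the text above says the function would then re-encode unsafe characters such as the space.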

Applications calling InternetCanonicalizeUrl when using the Internet Explorer 3.0 version of the Win32 Internet API (or when setting the ICU_ENCODE_PERCENT flag for Internet Explorer 5 and later) should track whether a particular URL has already been canonicalized. If unsafe characters in a URL have been converted to escape sequences, calling InternetCanonicalizeUrl again on the URL (with no flags) causes the existing escape sequences to be escaped a second time. For example, a blank space in a URL is converted to the escape sequence %20. Calling InternetCanonicalizeUrl again on the URL converts %20 to %2520, because the % sign is an unsafe character that is reserved for escape sequences and is replaced by the function with the escape sequence %25.
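
The double-encoding effect can be reproduced with a small stand-alone encoder. The sketch below is an illustration in portable C, not the WinINet algorithm: escape_space_and_percent is a hypothetical helper that escapes only the space and percent characters, which is enough to show how %20 becomes %2520 on a second pass.

```c
#include <stddef.h>

/* Hypothetical helper: writes a copy of in to out with ' ' and '%' replaced
 * by %20 and %25. Returns the number of characters written (excluding the
 * terminating null), or -1 if out (capacity cap) is too small. */
int escape_space_and_percent(const char *in, char *out, size_t cap)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t n = 0;

    if (cap == 0)
        return -1;
    for (; *in != '\0'; in++) {
        unsigned char c = (unsigned char)*in;
        if (c == ' ' || c == '%') {
            if (n + 3 >= cap)          /* need 3 chars plus the null */
                return -1;
            out[n++] = '%';
            out[n++] = hex[c >> 4];
            out[n++] = hex[c & 0x0F];
        } else {
            if (n + 1 >= cap)          /* need 1 char plus the null */
                return -1;
            out[n++] = (char)c;
        }
    }
    out[n] = '\0';
    return (int)n;
}
```

Escaping "a b" once gives "a%20b"; escaping that result again gives "a%2520b". An application that records whether a URL has already been escaped can skip the second pass.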

Function Information

Windows NT            Use version 4.0. Implemented as ANSI and Unicode functions.
Windows               Use Windows 95 and later. Implemented as ANSI and Unicode functions.
Header                Wininet.h
Import library        Wininet.lib
Minimum availability  Internet Explorer 3.0 (ANSI only), 5 (ANSI and Unicode)

Windows CE

Windows CE            Use version 2.12 and later. Implemented as ANSI and Unicode functions.
Minimum availability  Internet Explorer 4.0

See Also

Microsoft Win32 Internet Functions Overview, Handling Uniform Resource Locators, Microsoft Win32 Internet Functions Reference, Uniform Resource Locator (URL) Functions




© 1999 Microsoft Corporation. All rights reserved. Terms of use.