This is a snapshot of an early working draft and has therefore been superseded by the HTML standard.

This document will not be further updated.

HTML 5

Call For Comments — 27 October 2007

Certain actions cause the browsing context to navigate. For example, following a hyperlink, form submission, and the window.open() and location.assign() methods can all cause a browsing context to navigate. A user agent may also provide various ways for the user to explicitly cause a browsing context to navigate.

When a browsing context is navigated, the user agent must run the following steps:

  1. Cancel any preexisting attempt to navigate the browsing context.

  2. If the new resource is the same as the current resource, but a fragment identifier has been specified, then scroll for that fragment identifier and abort these steps.

  3. If the new resource is to be handled by displaying some sort of inline content, e.g. an error message because the specified scheme is not one of the supported protocols, or an inline prompt to allow the user to select a registered handler for the given scheme, then display the inline content and abort these steps.

  4. If the new resource is to be handled using a mechanism that does not affect the browsing context, then abort these steps and proceed with that mechanism instead.

  5. If the new resource is to be fetched using HTTP GET or equivalent, and if the browsing context being navigated is a top-level browsing context, then check if there are any application caches that have a manifest with the same scheme/host/port as the URI in question, and that have this URI as one of their entries (excluding entries marked as manifest), and that already contain their manifest, categorised as a manifest. If so, then the user agent must fetch the resource from the most appropriate application cache of those that match.

    Otherwise, start fetching the specified resource in the appropriate manner (e.g. performing an HTTP GET or POST operation, or reading the file from disk, or executing script in the case of a javascript: URI). If this results in a redirect, return to step 2 with the new resource.

    For example, imagine an HTML page with an associated application cache displaying an image and a form, where the image is also used by several other application caches. If the user right-clicks on the image and chooses "View Image", then the user agent could decide to show the image from any of those caches, but it is likely that the most useful cache for the user would be the one that was used for the aforementioned HTML page. On the other hand, if the user submits the form, and the form does a POST submission, then the user agent will not use an application cache at all; the submission will be made to the network.

  6. Wait for one or more bytes to be available or for the user agent to establish that the resource in question is empty. During this time, the user agent may allow the user to cancel this navigation attempt or start other navigation attempts.

  7. If the resource was not fetched from an application cache, and was to be fetched using HTTP GET or equivalent, and its URI matches the opportunistic caching namespace of one or more application caches, then:

    If the file was successfully downloaded
    The user agent must cache the resource in all those application caches, categorised as opportunistically cached entries.
    If the server returned a 4xx or 5xx status code or equivalent, or there were network errors
    If the browsing context being navigated is a top-level browsing context, then the user agent must discard the failed load and instead use the fallback resource specified for the opportunistic caching namespace in question. If multiple application caches match, the user agent must use the fallback of the most appropriate application cache of those that match. For the purposes of session history (and features that depend on session history, e.g. bookmarking) the user agent must use the URI of the resource that was requested (the one that matched the opportunistic caching namespace), not the fallback resource. However, the user agent may indicate to the user that the original page load failed, that the page used was a fallback resource, and what the URI of the fallback resource actually is.

  8. If the document's out-of-band metadata (e.g. HTTP headers), not counting any type information (such as the Content-Type HTTP header), requires some sort of processing that will not affect the browsing context, then perform that processing and abort these steps.

    Such processing might be triggered by, amongst other things, the following:

    • HTTP status codes (e.g. 204 No Content or 205 Reset Content)
    • HTTP Content-Disposition headers
    • Network errors

  9. Let type be the sniffed type of the resource.

  10. If the user agent has been configured to process resources of the given type using some mechanism other than rendering the content in a browsing context, then skip this step. Otherwise, if the type is one of the following types, jump to the appropriate entry in the following list, and process the resource as described there:

    "text/html"
    Follow the steps given in the HTML document section, and abort these steps.
    Any type ending in "+xml"
    "application/xml"
    "text/xml"
    Follow the steps given in the XML document section. If that section determines that the content is not to be displayed as a generic XML document, then proceed to the next step in this overall set of steps. Otherwise, abort these steps.
    "text/plain"
    Follow the steps given in the plain text file section, and abort these steps.
    A supported image type
    Follow the steps given in the image section, and abort these steps.
    A type that will use an external application to render the content in the browsing context
    Follow the steps given in the plugin section, and abort these steps.

  11. Otherwise, the document's type is such that the resource will not affect the browsing context, e.g. because the resource is to be handed to an external application. Process the resource appropriately.
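As a rough illustration of steps 5 and 10 above, the following Python sketch models the application cache eligibility check and the dispatch on the sniffed type. It is a minimal sketch under stated assumptions, not a definitive implementation: the dictionary layout used for a cache, the category label "explicit", the set of supported image types, and all function names are hypothetical and are not defined by this specification. The scheme/host/port comparison is also simplified, in that it does not treat an omitted port as equal to the scheme's default port.

```python
from urllib.parse import urlsplit

XML_TYPES = {"application/xml", "text/xml"}
# Assumption: a stand-in for whatever image types a given user agent supports.
SUPPORTED_IMAGE_TYPES = {"image/png", "image/gif", "image/jpeg"}


def same_scheme_host_port(uri_a, uri_b):
    """Compare the scheme, host and port of two URIs (simplified)."""
    a, b = urlsplit(uri_a), urlsplit(uri_b)
    return (a.scheme, a.hostname, a.port) == (b.scheme, b.hostname, b.port)


def cache_matches(cache, uri, method, top_level):
    """Step 5: could this application cache serve the navigation?

    `cache` is a hypothetical dict of the form
    {"manifest": manifest_uri, "entries": {uri: set_of_categories}}.
    """
    if method != "GET" or not top_level:
        return False
    if not same_scheme_host_port(cache["manifest"], uri):
        return False
    # The cache must already contain its manifest, categorised as a manifest.
    if "manifest" not in cache["entries"].get(cache["manifest"], set()):
        return False
    # The URI must be an entry, excluding entries marked as manifest.
    categories = cache["entries"].get(uri)
    return categories is not None and "manifest" not in categories


def section_for_type(sniffed_type, plugin_types=frozenset()):
    """Step 10: name the page load processing model for a sniffed type."""
    if sniffed_type == "text/html":
        return "html"
    if sniffed_type.endswith("+xml") or sniffed_type in XML_TYPES:
        return "xml"
    if sniffed_type == "text/plain":
        return "text"
    if sniffed_type in SUPPORTED_IMAGE_TYPES:
        return "image"
    if sniffed_type in plugin_types:
        return "plugin"
    # Step 11: the resource will not affect the browsing context.
    return None
```

Because the checks follow the order of the list in step 10, a type such as image/svg+xml is routed to the XML document section before the image entry is considered.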

Some of the sections below, to which the above algorithm defers in certain cases, require the user agent to update the session history with the new page. When a user agent is required to do this, it must follow the set of steps given below that is appropriate for the situation at hand. From the point of view of any script, these steps must occur atomically.

  1. pause for scripts

  2. onbeforeunload

  3. onunload

  4. If the navigation was initiated for entry update of an entry
    1. Replace the entry being updated with a new entry representing the new resource and its Document object and related state. The user agent may propagate state from the old entry to the new entry (e.g. scroll position).

    2. Traverse the history to the new entry.

    Otherwise
    1. Remove all the entries after the current entry in the browsing context's Document object's History object.

      This doesn't necessarily have to affect the user agent's user interface.

    2. Append a new entry at the end of the History object representing the new resource and its Document object and related state.

    3. Traverse the history to the new entry.

    4. If the navigation was initiated with replacement enabled, remove the entry immediately before the new current entry in the session history.
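The entry bookkeeping in step 4 above can be modelled with a small list-based structure. The sketch below is illustrative only: the class and parameter names are hypothetical, entries are opaque values rather than real Document objects, and the script pausing and unload steps (steps 1 to 3) are not modelled.

```python
class SessionHistory:
    """Toy model of a session history (all names here are hypothetical)."""

    def __init__(self):
        self.entries = []   # oldest entry first
        self.current = -1   # index of the current entry, -1 when empty

    def update_with_new_page(self, entry, update_index=None, replace=False):
        if update_index is not None:
            # Entry update: replace the entry in place, then traverse to it.
            self.entries[update_index] = entry
            self.current = update_index
            return
        # 1. Remove all the entries after the current entry.
        del self.entries[self.current + 1:]
        # 2. Append the new entry, and 3. traverse to it.
        self.entries.append(entry)
        self.current = len(self.entries) - 1
        # 4. With replacement enabled, remove the entry immediately
        #    before the new current entry.
        if replace and self.current > 0:
            del self.entries[self.current - 1]
            self.current -= 1
```

For example, after visiting pages a, b and c, going back to a and then navigating to d leaves the history as a, d: the forward entries b and c are discarded.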

4.8.1. Page load processing model for HTML files

When an HTML document is to be loaded in a browsing context, the user agent must create a Document object, mark it as being an HTML document, create an HTML parser, associate it with the document, and begin to use the bytes provided for the document as the input stream for that parser.

The input stream converts bytes into characters for use in the tokeniser. This process relies, in part, on character encoding information found in the real Content-Type metadata of the resource; the "sniffed type" is not used for this purpose.
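To make the distinction concrete, the sketch below reads the charset parameter from the real Content-Type header value, independently of whatever type sniffing concluded. It uses Python's standard email.message module purely as a convenient header parser; the function name is hypothetical, and a real input stream would combine this with the other sources of encoding information.

```python
from email.message import Message


def charset_from_content_type(header_value):
    """Return the charset parameter of a Content-Type header value, or None.

    The value passed in is assumed to be the real header as sent by the
    server, not a type produced by sniffing.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() lower-cases the value and returns None if absent.
    return msg.get_content_charset()
```

For instance, a resource served as text/plain with a charset parameter but sniffed as HTML would still have its encoding taken from that parameter.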

When no more bytes are available, an EOF character is implied, which eventually causes a load event to be fired.

After creating the Document object, but potentially before the page has finished parsing, the user agent must update the session history with the new page.

Application cache selection happens in the HTML parser.

4.8.2. Page load processing model for XML files

When faced with displaying an XML file inline, user agents must first create a Document object, following the requirements of the XML and Namespaces in XML recommendations, RFC 3023, DOM3 Core, and other relevant specifications. [XML] [XMLNS] [RFC3023] [DOM3CORE]

The actual HTTP headers and other metadata, not the headers as mutated or implied by the algorithms given in this specification, are the ones that must be used when determining the character encoding according to the rules given in the above specifications.

If the root element, as parsed according to the XML specifications cited above, is found to be an html element with an attribute manifest, then, as soon as the element is inserted into the DOM, the user agent must run the application cache selection algorithm with the value of that attribute as the manifest URI. Otherwise, as soon as the root element is inserted into the DOM, the user agent must run the application cache selection algorithm with no manifest.

Because the processing of the manifest attribute happens only once the root element is parsed, any URIs referenced by processing instructions before the root element (such as <?xml-stylesheet?> and <?xbl?> PIs) will be fetched from the network and cannot be cached.
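The root element check described above might be sketched as follows, using Python's standard xml.etree.ElementTree parser. This is an approximation under assumptions: the function name is hypothetical, "an html element" is taken to mean an element in the XHTML namespace, and the whole document is parsed up front, whereas the requirement above is to act as soon as the root element is inserted into the DOM.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def manifest_for_xml(source):
    """Return the manifest URI to pass to application cache selection.

    Returns None when the algorithm should be run with no manifest.
    """
    root = ET.fromstring(source)
    if root.tag == "{%s}html" % XHTML_NS:
        # Element.get() returns None when the attribute is absent.
        return root.get("manifest")
    return None
```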

User agents may examine the namespace of the root Element node of this Document object to perform namespace-based dispatch to alternative processing tools, e.g. determining that the content is actually a syndication feed and passing it to a feed handler. If such processing is to take place, abort the steps in this section, and jump to step 11 in the navigate steps above.
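Namespace-based dispatch of this kind could look roughly like the sketch below. The handler table and function names are hypothetical; the Atom namespace is used only as one example of a syndication feed format, and a user agent is free to recognise other namespaces or none at all.

```python
import xml.etree.ElementTree as ET

# Hypothetical table mapping root namespaces to alternative processing tools.
HANDLERS = {"http://www.w3.org/2005/Atom": "feed handler"}


def root_namespace(source):
    """Return the namespace URI of the root element, or None if it has none."""
    tag = ET.fromstring(source).tag
    # ElementTree spells namespaced tags as "{namespace}localname".
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return None


def alternative_handler(source):
    """Return the name of an alternative handler, or None to display inline."""
    return HANDLERS.get(root_namespace(source))
```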

Otherwise, with the newly created Document, the user agent must update the session history with the new page. User agents may do this before the complete document has been parsed (thus achieving incremental rendering).

Error messages from the parse process (e.g. namespace well-formedness errors) may be reported inline by mutating the Document.

4.8.3. Page load processing model for text files

When a plain text document is to be loaded in a browsing context, the user agent should create a Document object, mark it as being an HTML document, create an HTML parser, associate it with the document, act as if the tokeniser had emitted a start tag token with the tag name "pre", set the tokenisation stage's content model flag to PLAINTEXT, and begin to pass the stream of characters in the plain text document to that tokeniser.

The rules for how to convert the bytes of the plain text document into actual characters are defined in RFC 2046, RFC 2646, and subsequent versions thereof. [RFC2046] [RFC2646]

Upon creation of the Document object, the user agent must run the application cache selection algorithm with no manifest.

When no more characters are available, an EOF character is implied, which eventually causes a load event to be fired.

After creating the Document object, but potentially before the page has finished parsing, the user agent must update the session history with the new page.

User agents may add content to the head element of the Document, e.g. linking to a style sheet or an XBL binding, providing script, giving the document a title, etc.

4.8.4. Page load processing model for images

When an image resource is to be loaded in a browsing context, the user agent should create a Document object, mark it as being an HTML document, append an html element to the Document, append a head element and a body element to the html element, append an img element to the body element, and set the src attribute of the img element to the address of the image.

Then, the user agent must act as if it had stopped parsing.

Upon creation of the Document object, the user agent must run the application cache selection algorithm with no manifest.

After creating the Document object, but potentially before the page has finished fully loading, the user agent must update the session history with the new page.

User agents may add content to the head element of the Document, or attributes to the img element, e.g. to link to a style sheet or an XBL binding, to provide a script, to give the document a title, etc.
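The synthetic document described in this section has a very small tree, which the following sketch builds with Python's standard xml.etree.ElementTree. It is only a structural illustration: the function name is hypothetical, and a plain element tree stands in for a real Document object marked as an HTML document.

```python
import xml.etree.ElementTree as ET


def build_image_document(image_uri):
    """Build the html > (head, body > img) tree for an image resource."""
    html = ET.Element("html")
    ET.SubElement(html, "head")
    body = ET.SubElement(html, "body")
    # The src attribute is set to the address of the image.
    ET.SubElement(body, "img", {"src": image_uri})
    return html
```

A user agent that wants a title or a style sheet for such pages would add further children to the head element of this tree.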

4.8.5. Page load processing model for content that uses plugins

When a resource that requires an external application to be rendered is to be loaded in a browsing context, the user agent should create a Document object, mark it as being an HTML document, append an html element to the Document, append a head element and a body element to the html element, append an embed element to the body element, and set the src attribute of the embed element to the address of the resource.

Then, the user agent must act as if it had stopped parsing.

Upon creation of the Document object, the user agent must run the application cache selection algorithm with no manifest.

After creating the Document object, but potentially before the page has finished fully loading, the user agent must update the session history with the new page.

User agents may add content to the head element of the Document, or attributes to the embed element, e.g. to link to a style sheet or an XBL binding, or to give the document a title.

4.8.6. Page load processing model for inline content that doesn't have a DOM

When the user agent is to display a user agent page inline in a browsing context, the user agent should create a Document object, mark it as being an HTML document, and then either associate that Document with a custom rendering that is not rendered using the normal Document rendering rules, or mutate that Document until it represents the content the user agent wants to render.

Once the page has been set up, the user agent must act as if it had stopped parsing.

Upon creation of the Document object, the user agent must run the application cache selection algorithm with no manifest.

After creating the Document object, but potentially before the page has been completely set up, the user agent must update the session history with the new page.

4.8.7. Scrolling to a fragment identifier

When a user agent is supposed to scroll for a fragment identifier, then the user agent must follow these steps:

  1. First, update the session history with the new page, where "the new page" has the same Document as before but with the URI having the newly specified fragment identifier.

  2. Then, change the scrolling position of the document, or perform some other action, such that the indicated part of the document is brought to the user's attention.
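The condition that triggers these steps (step 2 of the navigation algorithm) and the session history entry created in step 1 can be sketched as below. The function names and the dictionary used for an entry are hypothetical, and URIs are compared as plain strings without any normalisation.

```python
from urllib.parse import urldefrag


def is_fragment_navigation(current_uri, new_uri):
    """True if new_uri is the current resource with a fragment specified."""
    current_base, _ = urldefrag(current_uri)
    new_base, _ = urldefrag(new_uri)
    # A "#" in the URI means a fragment identifier was specified,
    # even if that fragment is empty.
    return current_base == new_base and "#" in new_uri


def entry_for_fragment(document, new_uri):
    """Step 1: the new entry keeps the same Document but has the new URI."""
    return {"document": document, "uri": new_uri}
```

How the fragment is then mapped to a part of the document is deliberately left out of this sketch.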

Open issue: this section does not yet define how to get "the indicated part of the document" from a fragment identifier (id="", name="", XPointer, etc.), nor how to handle missing IDs (e.g. the infamous "#top").