2 Common infrastructure

2.1 Terminology

This specification refers to both HTML and XML attributes and IDL attributes, often in the same context. When it is not clear which is being referred to, they are referred to as content attributes for HTML and XML attributes, and IDL attributes for those defined on IDL interfaces. Similarly, the term "properties" is used for both JavaScript object properties and CSS properties. When these are ambiguous they are qualified as object properties and CSS properties respectively.

Generally, when the specification states that a feature applies to the HTML syntax or the XHTML syntax, it also includes the other. When a feature specifically only applies to one of the two languages, it is called out by explicitly stating that it does not apply to the other format, as in "for HTML, ... (this does not apply to XHTML)".

This specification uses the term document to refer to any use of HTML, ranging from short static documents to long essays or reports with rich multimedia, as well as to fully-fledged interactive applications. The term is used to refer both to Document objects and their descendant DOM trees, and to serialized byte streams using the HTML syntax or XHTML syntax, depending on context.

In the context of the DOM structures, the terms HTML document and XML document are used as defined in the DOM specification, and refer specifically to two different modes that Document objects can find themselves in. [DOM] (Such uses are always hyperlinked to their definition.)

In the context of byte streams, the term HTML document refers to resources labeled as text/html, and the term XML document refers to resources labeled with an XML MIME type.

The term XHTML document is used to refer to both Documents in the XML document mode that contains element nodes in the HTML namespace, and byte streams labeled with an XML MIME type that contain elements from the HTML namespace, depending on context.

For simplicity, terms such as shown, displayed, and visible might sometimes be used when referring to the way a document is rendered to the user. These terms are not meant to imply a visual medium; they must be considered to apply to other media in equivalent ways.

When an algorithm B says to return to another algorithm A, it implies that A called B. Upon returning to A, the implementation must continue from where it left off in calling B.

The term "transparent black" refers to the color with red, green, blue, and alpha channels all set to zero.

2.1.1 Resources

The specification uses the term supported when referring to whether a user agent has an implementation capable of decoding the semantics of an external resource. A format or type is said to be supported if the implementation can process an external resource of that format or type without critical aspects of the resource being ignored. Whether a specific resource is supported can depend on what features of the resource's format are in use.

For example, a PNG image would be considered to be in a supported format if its pixel data could be decoded and rendered, even if, unbeknownst to the implementation, the image also contained animation data.

An MPEG-4 video file would not be considered to be in a supported format if the compression format used was not supported, even if the implementation could determine the dimensions of the movie from the file's metadata.

What some specifications, in particular the HTTP specification, refer to as a representation is referred to in this specification as a resource. [HTTP]

The term MIME type is used to refer to what is sometimes called an Internet media type in protocol literature. The term media type in this specification is used to refer to the type of media intended for presentation, as used by the CSS specifications. [RFC2046] [MQ]

A string is a valid MIME type if it matches the media-type rule defined in section 3.7 "Media Types" of RFC 2616. In particular, a valid MIME type may include MIME type parameters. [HTTP]

A string is a valid MIME type with no parameters if it matches the media-type rule defined in section 3.7 "Media Types" of RFC 2616, but does not contain any U+003B SEMICOLON characters (;). In other words, if it consists only of a type and subtype, with no MIME Type parameters. [HTTP]

The term HTML MIME type is used to refer to the MIME type text/html.

A resource's critical subresources are those that the resource needs to have available to be correctly processed. Which resources are considered critical or not is defined by the specification that defines the resource's format.

The term data: URL refers to URLs that use the data: scheme. [RFC2397]

2.1.2 XML

To ease migration from HTML to XHTML, UAs conforming to this specification will place elements in HTML in the http://www.w3.org/1999/xhtml namespace, at least for the purposes of the DOM and CSS. The term "HTML elements", when used in this specification, refers to any element in that namespace, and thus refers to both HTML and XHTML elements.

Except where otherwise stated, all elements defined or mentioned in this specification are in the HTML namespace ("http://www.w3.org/1999/xhtml"), and all attributes defined or mentioned in this specification have no namespace.

The term element type is used to refer to the set of elements that have a given local name and namespace. For example, button elements are elements with the element type button, meaning they have the local name "button" and (implicitly as defined above) the HTML namespace.

Attribute names are said to be XML-compatible if they match the Name production defined in XML, they contain no U+003A COLON characters (:), and their first three characters are not an ASCII case-insensitive match for the string "xml". [XML]

The term XML MIME type is used to refer to the MIME types text/xml, application/xml, and any MIME type whose subtype ends with the four characters "+xml". [RFC3023]

2.1.3 DOM trees

The root element of a Document object is that Document's first element child, if any. If it does not have one then the Document has no root element.

The term root element, when not referring to a Document object's root element, means the furthest ancestor element node of whatever node is being discussed, or the node itself if it has no ancestors. When the node is a part of the document, then the node's root element is indeed the document's root element; however, if the node is not currently part of the document tree, the root element will be an orphaned node.

When an element's root element is the root element of a Document object, it is said to be in a Document. An element is said to have been inserted into a document when its root element changes and is now the document's root element. Analogously, an element is said to have been removed from a document when its root element changes from being the document's root element to being another element.

A node's home subtree is the subtree rooted at that node's root element. When a node is in a Document, its home subtree is that Document's tree.

The Document of a Node (such as an element) is the Document that the Node's ownerDocument IDL attribute returns. When a Node is in a Document then that Document is always the Node's Document, and the Node's ownerDocument IDL attribute thus always returns that Document.

The Document of a content attribute is the Document of the attribute's element.

The term tree order means a pre-order, depth-first traversal of DOM nodes involved (through the parentNode/childNodes relationship).

When it is stated that some element or attribute is ignored, or treated as some other value, or handled as if it was something else, this refers only to the processing of the node after it is in the DOM. A user agent must not mutate the DOM in such situations.

A content attribute is said to change value only if its new value is different than its previous value; setting an attribute to a value it already has does not change it.

The term empty, when used of an attribute value, Text node, or string, means that the length of the text is zero (i.e. not even containing spaces or control characters).

2.1.4 Scripting

The construction "a Foo object", where Foo is actually an interface, is sometimes used instead of the more accurate "an object implementing the interface Foo".

An IDL attribute is said to be getting when its value is being retrieved (e.g. by author script), and is said to be setting when a new value is assigned to it.

If a DOM object is said to be live, then the attributes and methods on that object must operate on the actual underlying data, not a snapshot of the data.

In the contexts of events, the terms fire and dispatch are used as defined in the DOM specification: firing an event means to create and dispatch it, and dispatching an event means to follow the steps that propagate the event through the tree. The term trusted event is used to refer to events whose isTrusted attribute is initialized to true. [DOM]

2.1.5 Plugins

The term plugin refers to a user-agent defined set of content handlers used by the user agent that can take part in the user agent's rendering of a Document object, but that neither act as child browsing contexts of the Document nor introduce any Node objects to the Document's DOM.

Typically such content handlers are provided by third parties, though a user agent can also designate built-in content handlers as plugins.

A user agent must not consider the types text/plain and application/octet-stream as having a registered plugin.

One example of a plugin would be a PDF viewer that is instantiated in a browsing context when the user navigates to a PDF file. This would count as a plugin regardless of whether the party that implemented the PDF viewer component was the same as that which implemented the user agent itself. However, a PDF viewer application that launches separate from the user agent (as opposed to using the same interface) is not a plugin by this definition.

This specification does not define a mechanism for interacting with plugins, as it is expected to be user-agent- and platform-specific. Some UAs might opt to support a plugin mechanism such as the Netscape Plugin API; others might use remote content converters or have built-in support for certain types. Indeed, this specification doesn't require user agents to support plugins at all. [NPAPI]

A plugin can be secured if it honors the semantics of the sandbox attribute.

For example, a secured plugin would prevent its contents from creating pop-up windows when the plugin is instantiated inside a sandboxed iframe.

Browsers should take extreme care when interacting with external content intended for plugins. When third-party software is run with the same privileges as the user agent itself, vulnerabilities in the third-party software become as dangerous as those in the user agent.

Since different users having differents sets of plugins provides a fingerprinting vector that increases the chances of users being uniquely identified, user agents are encouraged to support the exact same set of plugins for each user. (This is a fingerprinting vector.)

2.1.6 Character encodings

A character encoding, or just encoding where that is not ambiguous, is a defined way to convert between byte streams and Unicode strings, as defined in the WHATWG Encoding standard. An encoding has an encoding name and one or more encoding labels, referred to as the encoding's name and labels in the Encoding standard. [ENCODING]

An ASCII-compatible character encoding is a single-byte or variable-length encoding in which the bytes 0x09, 0x0A, 0x0C, 0x0D, 0x20 - 0x22, 0x26, 0x27, 0x2C - 0x3F, 0x41 - 0x5A, and 0x61 - 0x7A, ignoring bytes that are the second and later bytes of multibyte sequences, all correspond to single-byte sequences that map to the same Unicode characters as those bytes in Windows-1252. [ENCODING]

This includes such encodings as Shift_JIS, HZ-GB-2312, and variants of ISO-2022, even though it is possible in these encodings for bytes like 0x70 to be part of longer sequences that are unrelated to their interpretation as ASCII. It excludes UTF-16 variants, as well as obsolete legacy encodings such as UTF-7, GSM03.38, and EBCDIC variants.

The term a UTF-16 encoding refers to any variant of UTF-16: UTF-16LE or UTF-16BE, regardless of the presence or absence of a BOM. [ENCODING]

The term code unit is used as defined in the Web IDL specification: a 16 bit unsigned integer, the smallest atomic component of a DOMString. (This is a narrower definition than the one used in Unicode, and is not the same as a code point.) [WEBIDL]

The term Unicode code point means a Unicode scalar value where possible, and an isolated surrogate code point when not. When a conformance requirement is defined in terms of characters or Unicode code points, a pair of code units consisting of a high surrogate followed by a low surrogate must be treated as the single code point represented by the surrogate pair, but isolated surrogates must each be treated as the single code point with the value of the surrogate. [UNICODE]

In this specification, the term character, when not qualified as Unicode character, is synonymous with the term Unicode code point.

The term Unicode character is used to mean a Unicode scalar value (i.e. any Unicode code point that is not a surrogate code point). [UNICODE]

The code-unit length of a string is the number of code units in that string.

This complexity results from the historical decision to define the DOM API in terms of 16 bit (UTF-16) code units, rather than in terms of Unicode characters.

2.2 Conformance requirements

All diagrams, examples, and notes in this specification are non-normative, as are all sections explicitly marked non-normative. Everything else in this specification is normative.

The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", "MAY", and "OPTIONAL" in the normative parts of this document are to be interpreted as described in RFC2119. The key word "OPTIONALLY" in the normative parts of this document is to be interpreted with the same normative meaning as "MAY" and "OPTIONAL". For readability, these words do not appear in all uppercase letters in this specification. [RFC2119]

Requirements phrased in the imperative as part of algorithms (such as "strip any leading space characters" or "return false and abort these steps") are to be interpreted with the meaning of the key word ("must", "should", "may", etc) used in introducing the algorithm.

For example, were the spec to say:

To eat an orange, the user must:
1. Peel the orange.
2. Separate each slice of the orange.
3. Eat the orange slices.

...it would be equivalent to the following:

To eat an orange:
1. The user must peel the orange.
2. The user must separate each slice of the orange.
3. The user must eat the orange slices.

Here the key word is "must".

The former (imperative) style is generally preferred in this specification for stylistic reasons.

Conformance requirements phrased as algorithms or specific steps may be implemented in any manner, so long as the end result is equivalent. (In particular, the algorithms defined in this specification are intended to be easy to follow, and not intended to be performant.)

2.2.1 Conformance classes

This specification describes the conformance criteria for user agents (relevant to implementors) and documents (relevant to authors and authoring tool implementors).

Conforming documents are those that comply with all the conformance criteria for documents. For readability, some of these conformance requirements are phrased as conformance requirements on authors; such requirements are implicitly requirements on documents: by definition, all documents are assumed to have had an author. (In some cases, that author may itself be a user agent — such user agents are subject to additional rules, as explained below.)

For example, if a requirement states that "authors must not use the foobar element", it would imply that documents are not allowed to contain elements named foobar.

There is no implied relationship between document conformance requirements and implementation conformance requirements. User agents are not free to handle non-conformant documents as they please; the processing model described in this specification applies to implementations regardless of the conformity of the input documents.

User agents fall into several (overlapping) categories with different conformance requirements.

Web browsers and other interactive user agents

Web browsers that support the XHTML syntax must process elements and attributes from the HTML namespace found in XML documents as described in this specification, so that users can interact with them, unless the semantics of those elements have been overridden by other specifications.

A conforming XHTML processor would, upon finding an XHTML script element in an XML document, execute the script contained in that element. However, if the element is found within a transformation expressed in XSLT (assuming the user agent also supports XSLT), then the processor would instead treat the script element as an opaque element that forms part of the transform.

Web browsers that support the HTML syntax must process documents labeled with an HTML MIME type as described in this specification, so that users can interact with them.

User agents that support scripting must also be conforming implementations of the IDL fragments in this specification, as described in the Web IDL specification. [WEBIDL]

Unless explicitly stated, specifications that override the semantics of HTML elements do not override the requirements on DOM objects representing those elements. For example, the script element in the example above would still implement the HTMLScriptElement interface.

Non-interactive presentation user agents

User agents that process HTML and XHTML documents purely to render non-interactive versions of them must comply to the same conformance criteria as Web browsers, except that they are exempt from requirements regarding user interaction.

Typical examples of non-interactive presentation user agents are printers (static UAs) and overhead displays (dynamic UAs). It is expected that most static non-interactive presentation user agents will also opt to lack scripting support.

A non-interactive but dynamic presentation UA would still execute scripts, allowing forms to be dynamically submitted, and so forth. However, since the concept of "focus" is irrelevant when the user cannot interact with the document, the UA would not need to support any of the focus-related DOM APIs.

Visual user agents that support the suggested default rendering

User agents, whether interactive or not, may be designated (possibly as a user option) as supporting the suggested default rendering defined by this specification.

This is not required. In particular, even user agents that do implement the suggested default rendering are encouraged to offer settings that override this default to improve the experience for the user, e.g. changing the color contrast, using different focus styles, or otherwise making the experience more accessible and usable to the user.

User agents that are designated as supporting the suggested default rendering must, while so designated, implement the rules in the rendering section that that section defines as the behavior that user agents are expected to implement.

User agents with no scripting support

Implementations that do not support scripting (or which have their scripting features disabled entirely) are exempt from supporting the events and DOM interfaces mentioned in this specification. For the parts of this specification that are defined in terms of an events model or in terms of the DOM, such user agents must still act as if events and the DOM were supported.

Scripting can form an integral part of an application. Web browsers that do not support scripting, or that have scripting disabled, might be unable to fully convey the author's intent.

Conformance checkers

Conformance checkers must verify that a document conforms to the applicable conformance criteria described in this specification. Automated conformance checkers are exempt from detecting errors that require interpretation of the author's intent (for example, while a document is non-conforming if the content of a blockquote element is not a quote, conformance checkers running without the input of human judgement do not have to check that blockquote elements only contain quoted material).

Conformance checkers must check that the input document conforms when parsed without a browsing context (meaning that no scripts are run, and that the parser's scripting flag is disabled), and should also check that the input document conforms when parsed with a browsing context in which scripts execute, and that the scripts never cause non-conforming states to occur other than transiently during script execution itself. (This is only a "SHOULD" and not a "MUST" requirement because it has been proven to be impossible. [COMPUTABLE])

The term "HTML validator" can be used to refer to a conformance checker that itself conforms to the applicable requirements of this specification.

XML DTDs cannot express all the conformance requirements of this specification. Therefore, a validating XML processor and a DTD cannot constitute a conformance checker. Also, since neither of the two authoring formats defined in this specification are applications of SGML, a validating SGML system cannot constitute a conformance checker either.

To put it another way, there are three types of conformance criteria:

  1. Criteria that can be expressed in a DTD.
  2. Criteria that cannot be expressed by a DTD, but can still be checked by a machine.
  3. Criteria that can only be checked by a human.

A conformance checker must check for the first two. A simple DTD-based validator only checks for the first class of errors and is therefore not a conforming conformance checker according to this specification.

Data mining tools

Applications and tools that process HTML and XHTML documents for reasons other than to either render the documents or check them for conformance should act in accordance with the semantics of the documents that they process.

A tool that generates document outlines but increases the nesting level for each paragraph and does not increase the nesting level for each section would not be conforming.

Authoring tools and markup generators

Authoring tools and markup generators must generate conforming documents. Conformance criteria that apply to authors also apply to authoring tools, where appropriate.

Authoring tools are exempt from the strict requirements of using elements only for their specified purpose, but only to the extent that authoring tools are not yet able to determine author intent. However, authoring tools must not automatically misuse elements or encourage their users to do so.

For example, it is not conforming to use an address element for arbitrary contact information; that element can only be used for marking up contact information for the author of the document or section. However, since an authoring tool is likely unable to determine the difference, an authoring tool is exempt from that requirement. This does not mean, though, that authoring tools can use address elements for any block of italics text (for instance); it just means that the authoring tool doesn't have to verify that when the user uses a tool for inserting contact information for a section, that the user really is doing that and not inserting something else instead.

In terms of conformance checking, an editor has to output documents that conform to the same extent that a conformance checker will verify.

When an authoring tool is used to edit a non-conforming document, it may preserve the conformance errors in sections of the document that were not edited during the editing session (i.e. an editing tool is allowed to round-trip erroneous content). However, an authoring tool must not claim that the output is conformant if errors have been so preserved.

Authoring tools are expected to come in two broad varieties: tools that work from structure or semantic data, and tools that work on a What-You-See-Is-What-You-Get media-specific editing basis (WYSIWYG).

The former is the preferred mechanism for tools that author HTML, since the structure in the source information can be used to make informed choices regarding which HTML elements and attributes are most appropriate.

However, WYSIWYG tools are legitimate. WYSIWYG tools should use elements they know are appropriate, and should not use elements that they do not know to be appropriate. This might in certain extreme cases mean limiting the use of flow elements to just a few elements, like div, b, i, and span and making liberal use of the style attribute.

All authoring tools, whether WYSIWYG or not, should make a best effort attempt at enabling users to create well-structured, semantically rich, media-independent content.

User agents may impose implementation-specific limits on otherwise unconstrained inputs, e.g. to prevent denial of service attacks, to guard against running out of memory, or to work around platform-specific limitations. (This is a fingerprinting vector.)

For compatibility with existing content and prior specifications, this specification describes two authoring formats: one based on XML (referred to as the XHTML syntax), and one using a custom format inspired by SGML (referred to as the HTML syntax). Implementations must support at least one of these two formats, although supporting both is encouraged.

Some conformance requirements are phrased as requirements on elements, attributes, methods or objects. Such requirements fall into two categories: those describing content model restrictions, and those describing implementation behavior. Those in the former category are requirements on documents and authoring tools. Those in the second category are requirements on user agents. Similarly, some conformance requirements are phrased as requirements on authors; such requirements are to be interpreted as conformance requirements on the documents that authors produce. (In other words, this specification does not distinguish between conformance criteria on authors and conformance criteria on documents.)

2.2.2 Dependencies

This specification relies on several other underlying specifications.

Unicode and Encoding

The Unicode character set is used to represent textual data, and the WHATWG Encoding standard defines requirements around character encodings. [UNICODE]

This specification introduces terminology based on the terms defined in those specifications, as described earlier.

The following terms are used as defined in the WHATWG Encoding standard: [ENCODING]

  • Getting an encoding
  • The encoder and decoder algorithms for various encodings, including the UTF-8 encoder and UTF-8 decoder
  • The generic decode algorithm which takes a byte stream and an encoding and returns a character stream
  • The UTF-8 decode algorithm which takes a byte stream and returns a character stream, additionally stripping one leading UTF-8 Byte Order Mark (BOM), if any

The UTF-8 decoder is distinct from the UTF-8 decode algorithm. The latter first strips a Byte Order Mark (BOM), if any, and then invokes the former.

For readability, character encodings are sometimes referenced in this specification with a case that differs from the canonical case given in the WHATWG Encoding standard. (For example, "UTF-16LE" instead of "utf-16le".)


Implementations that support the XHTML syntax must support some version of XML, as well as its corresponding namespaces specification, because that syntax uses an XML serialization with namespaces. [XML] [XMLNS]


The following terms are defined in the WHATWG URL standard: [URL]

  • URL
  • Absolute URL
  • Relative URL
  • Relative schemes
  • The URL parser
  • Parsed URL
  • The scheme component of a parsed URL
  • The scheme data component of a parsed URL
  • The username component of a parsed URL
  • The password component of a parsed URL
  • The host component of a parsed URL
  • The port component of a parsed URL
  • The path component of a parsed URL
  • The query component of a parsed URL
  • The fragment component of a parsed URL
  • Parse errors from the URL parser
  • The URL serializer
  • Default encode set
  • Percent encode
  • UTF-8 percent encode
  • Percent decode
  • Decoder error
  • The domain label to ASCII algorithm
  • The domain label to Unicode algorithm
  • URLUtils interface
  • URLUtilsReadOnly interface
  • href attribute
  • protocol attribute
  • The get the base hook for URLUtils
  • The update steps hook for URLUtils
  • The set the input algorithm for URLUtils
  • The query encoding of an URLUtils object
  • The input of an URLUtils object
  • The url of an URLUtils object

The following terms are defined in the Cookie specification: [COOKIES]

  • cookie-string
  • receives a set-cookie-string

The following terms are defined in the CORS specification: [CORS]

  • cross-origin request
  • cross-origin request status
  • custom request headers
  • simple cross-origin request
  • redirect steps
  • omit credentials flag
  • resource sharing check

The IDL fragments in this specification must be interpreted as required for conforming IDL fragments, as described in the Web IDL specification. [WEBIDL]

The terms supported property indices, determine the value of an indexed property, support named properties, supported property names, unenumerable, determine the value of a named property, platform array objects, and read only (when applied to arrays) are used as defined in the Web IDL specification. The algorithm to convert a DOMString to a sequence of Unicode characters is similarly that defined in the Web IDL specification.

When this specification requires a user agent to create a Date object representing a particular time (which could be the special value Not-a-Number), the milliseconds component of that time, if any, must be truncated to an integer, and the time value of the newly created Date object must represent the resulting truncated time.

For instance, given the time 23045 millionths of a second after 01:00 UTC on January 1st 2000, i.e. the time 2000-01-01T00:00:00.023045Z, then the Date object created representing that time would represent the same time as that created representing the time 2000-01-01T00:00:00.023Z, 45 millionths earlier. If the given time is NaN, then the result is a Date object that represents a time value NaN (indicating that the object does not represent a specific instant of time).


Some parts of the language described by this specification only support JavaScript as the underlying scripting language. [ECMA262]

The term "JavaScript" is used to refer to ECMA262, rather than the official term ECMAScript, since the term JavaScript is more widely known. Similarly, the MIME type used to refer to JavaScript in this specification is text/javascript, since that is the most commonly used type, despite it being an officially obsoleted type according to RFC 4329. [RFC4329]

The term JavaScript global environment refers to the global environment concept defined in the ECMAScript specification.

The ECMAScript SyntaxError exception is also defined in the ECMAScript specification. [ECMA262]

The ArrayBuffer and related object types and underlying concepts from the ECMAScript Specification are used for several features in this specification. [ECMA262]

The following helper IDL is used for referring to ArrayBuffer-related types:

typedef (Int8Array or Uint8Array or Uint8ClampedArray or
         Int16Array or Uint16Array or
         Int32Array or Uint32Array or
         Float32Array or Float64Array) ArrayBufferView;

In particular, the Uint8ClampedArray type is used by some 2D canvas APIs, and the WebSocket API uses ArrayBuffer objects for handling binary frames.


The Document Object Model (DOM) is a representation — a model — of a document and its content. The DOM is not just an API; the conformance criteria of HTML implementations are defined, in this specification, in terms of operations on the DOM. [DOM]

Implementations must support DOM and the events defined in DOM Events, because this specification is defined in terms of the DOM, and some of the features are defined as extensions to the DOM interfaces. [DOM] [DOMEVENTS]

In particular, the following features are defined in the DOM specification: [DOM]

  • Attr interface
  • Comment interface
  • DOMImplementation interface
  • Document interface
  • DocumentFragment interface
  • DocumentType interface
  • DOMException interface
  • ChildNode interface
  • Element interface
  • Node interface
  • NodeList interface
  • ProcessingInstruction interface
  • Text interface
  • HTMLCollection interface
  • item() method
  • The terms collections and represented by the collection
  • DOMTokenList interface
  • DOMSettableTokenList interface
  • createDocument() method
  • createHTMLDocument() method
  • createElement() method
  • createElementNS() method
  • getElementById() method
  • insertBefore() method
  • ownerDocument attribute
  • childNodes attribute
  • localName attribute
  • parentNode attribute
  • namespaceURI attribute
  • tagName attribute
  • id attribute
  • textContent attribute
  • The insert, append, remove, replace, and adopt algorithms for nodes
  • The nodes are inserted and nodes are removed concepts
  • An element's adopting steps
  • The attribute list concept.
  • The data of a text node.
  • Event interface
  • EventTarget interface
  • EventInit dictionary type
  • target attribute
  • isTrusted attribute
  • The type of an event
  • The concept of an event listener and the event listeners associated with an EventTarget
  • The concept of a regular event parent and a cross-boundary event parent
  • The encoding (herein the character encoding) and content type of a Document
  • The distinction between XML documents and HTML documents
  • The terms quirks mode, limited-quirks mode, and no-quirks mode
  • The algorithm to clone a Node, and the concept of cloning steps used by that algorithm
  • The concept of base URL change steps and the definition of what happens when an element is affected by a base URL change
  • The concept of an element's unique identifier (ID)
  • The concept of a DOM range, and the terms start, end, and boundary point as applied to ranges.
  • MutationObserver interface
  • The MutationObserver scripting environment concept
  • The invoke MutationObserver objects algorithm
  • Promise interface
  • The resolver concept
  • The fulfill and reject algorithms

The term throw in this specification is used as defined in the DOM specification. The following DOMException types are defined in the DOM specification: [DOM]

  1. IndexSizeError
  2. HierarchyRequestError
  3. WrongDocumentError
  4. InvalidCharacterError
  5. NoModificationAllowedError
  6. NotFoundError
  7. NotSupportedError
  8. InvalidStateError
  9. SyntaxError
  10. InvalidModificationError
  11. NamespaceError
  12. InvalidAccessError
  13. SecurityError
  14. NetworkError
  15. AbortError
  16. URLMismatchError
  17. QuotaExceededError
  18. TimeoutError
  19. InvalidNodeTypeError
  20. DataCloneError

For example, to throw a TimeoutError exception, a user agent would construct a DOMException object whose type was the string "TimeoutError" (and whose code was the number 23, for legacy reasons) and actually throw that object as an exception.

The following features are defined in the DOM Events specification: [DOMEVENTS]

  • MouseEvent interface
  • MouseEventInit dictionary type
  • The UIEvent interface's detail attribute
  • click event
  • dblclick event
  • mousedown event
  • mouseenter event
  • mouseleave event
  • mousemove event
  • mouseout event
  • mouseover event
  • mouseup event
  • mousewheel event
  • keydown event
  • keyup event
  • keypress event

The following features are defined in the Touch Events specification: [TOUCH]

  • Touch interface
  • Touch point concept

This specification sometimes uses the term name to refer to the event's type; as in, "an event named click" or "if the event name is keypress". The terms "name" and "type" for events are synonymous.

The following features are defined in the DOM Parsing and Serialization specification: [DOMPARSING]

  • innerHTML
  • outerHTML

User agents are also encouraged to implement the features described in the HTML Editing APIs and UndoManager and DOM Transaction specifications. [EDITING] [UNDO]

The following parts of the Fullscreen specification are referenced from this specification, in part to define the rendering of dialog elements, and also to define how the Fullscreen API interacts with the sandboxing features in HTML: [FULLSCREEN]

  • The top layer concept
  • requestFullscreen()
  • The fullscreen enabled flag
  • The fully exit fullscreen algorithm
File API

This specification uses the following features defined in the File API specification: [FILEAPI]

  • Blob
  • File
  • FileList
  • Blob.close()
  • Blob.type
  • The concept of read errors

This specification references the XMLHttpRequest specification to describe how the two specifications interact and to use its ProgressEvent features. The following features and terms are defined in the XMLHttpRequest specification: [XHR]

  • XMLHttpRequest
  • ProgressEvent
  • Fire a progress event named e
Media Queries

Implementations must support the Media Queries language. [MQ]

CSS modules

While support for CSS as a whole is not required of implementations of this specification (though it is encouraged, at least for Web browsers), some features are defined in terms of specific CSS requirements.

In particular, some features require that a string be parsed as a CSS <color> value. When parsing a CSS value, user agents are required by the CSS specifications to apply some error handling rules. These apply to this specification also. [CSSCOLOR] [CSS]

For example, user agents are required to close all open constructs upon finding the end of a style sheet unexpectedly. Thus, when parsing the string "rgb(0,0,0" (with a missing close-parenthesis) for a color value, the close parenthesis is implied by this error handling rule, and a value is obtained (the color 'black'). However, the similar construct "rgb(0,0," (with both a missing parenthesis and a missing "blue" value) cannot be parsed, as closing the open construct does not result in a viable value.

The term CSS element reference identifier is used as defined in the CSS Image Values and Replaced Content specification to define the API that declares identifiers for use with the CSS 'element()' function. [CSSIMAGES]

Similarly, the term provides a paint source is used as defined in the CSS Image Values and Replaced Content specification to define the interaction of certain HTML elements with the CSS 'element()' function. [CSSIMAGES]

The term default object size is also defined in the CSS Image Values and Replaced Content specification. [CSSIMAGES]

Implementations that support scripting must support the CSS Object Model. The following features and terms are defined in the CSSOM specifications: [CSSOM] [CSSOMVIEW]

  • Screen
  • LinkStyle
  • CSSStyleDeclaration
  • cssText attribute of CSSStyleDeclaration
  • StyleSheet
  • The terms create a CSS style sheet, remove a CSS style sheet, and associated CSS style sheet
  • CSS style sheets and their properties: type, location, parent CSS style sheet, owner node, owner CSS rule, media, title, alternate flag, disabled flag, CSS rules, origin-clean flag
  • Alternative style sheet sets and the preferred style sheet set
  • Serializing a CSS value
  • Scroll an element into view
  • Scroll to the beginning of the document
  • The resize event
  • The scroll event

The term environment encoding is defined in the CSS Syntax specifications. [CSSSYNTAX]

The term CSS styling attribute is defined in the CSS Style Attributes specification. [CSSATTR]

The CanvasRenderingContext2D object's use of fonts depends on the features described in the CSS Fonts and Font Load Events specifications, including in particular FontLoader. [CSSFONTS] [CSSFONTLOAD]


The following interface is defined in the SVG specification: [SVG]

  • SVGMatrix

The following interface is defined in the WebGL specification: [WEBGL]

  • WebGLRenderingContext

Implementations may support WebVTT as a text track format for subtitles, captions, chapter titles, metadata, etc, for media resources. [WEBVTT]

The following terms, used in this specification, are defined in the WebVTT specification:

  • WebVTT file
  • WebVTT file using cue text
  • WebVTT file using chapter title text
  • WebVTT file using only nested cues
  • WebVTT parser
  • The rules for updating the display of WebVTT text tracks
  • The rules for interpreting WebVTT cue text
  • The WebVTT text track cue writing direction
The WebSocket protocol

The following terms are defined in the WebSocket protocol specification: [WSP]

  • establish a WebSocket connection
  • the WebSocket connection is established
  • validate the server's response
  • extensions in use
  • subprotocol in use
  • headers to send appropriate cookies
  • cookies set during the server's opening handshake
  • a WebSocket message has been received
  • send a WebSocket Message
  • fail the WebSocket connection
  • close the WebSocket connection
  • start the WebSocket closing handshake
  • the WebSocket closing handshake is started
  • the WebSocket connection is closed (possibly cleanly)
  • the WebSocket connection close code
  • the WebSocket connection close reason

The terms strong native semantics is used as defined in the ARIA specification. The term default implicit ARIA semantics has the same meaning as the term implicit WAI-ARIA semantics as used in the ARIA specification. [ARIA]

The role and aria-* attributes are defined in the ARIA specification. [ARIA]

This specification does not require support of any particular network protocol, style sheet language, scripting language, or any of the DOM specifications beyond those required in the list above. However, the language described by this specification is biased towards CSS as the styling language, JavaScript as the scripting language, and HTTP as the network protocol, and several features assume that those languages and protocols are in use.

A user agent that implements the HTTP protocol must implement the Web Origin Concept specification and the HTTP State Management Mechanism specification (Cookies) as well. [HTTP] [ORIGIN] [COOKIES]

This specification might have certain additional requirements on character encodings, image formats, audio formats, and video formats in the respective sections.

2.2.3 Extensibility

Vendor-specific proprietary user agent extensions to this specification are strongly discouraged. Documents must not use such extensions, as doing so reduces interoperability and fragments the user base, allowing only users of specific user agents to access the content in question.

If such extensions are nonetheless needed, e.g. for experimental purposes, then vendors are strongly urged to use one of the following extension mechanisms:

Attribute names beginning with the two characters "x-" are reserved for user agent use and are guaranteed to never be formally added to the HTML language. For flexibility, attributes names containing underscores (the U+005F LOW LINE character) are also reserved for experimental purposes and are guaranteed to never be formally added to the HTML language.

Pages that use such attributes are by definition non-conforming.

For DOM extensions, e.g. new methods and IDL attributes, the new members should be prefixed by vendor-specific strings to prevent clashes with future versions of this specification.

For events, experimental event types should be prefixed with vendor-specific strings.

For example, if a user agent called "Pleasold" were to add an event to indicate when the user is going up in an elevator, it could use the prefix "pleasold" and thus name the event "pleasoldgoingup", possibly with an event handler attribute named "onpleasoldgoingup".

All extensions must be defined so that the use of extensions neither contradicts nor causes the non-conformance of functionality defined in the specification.

For example, while strongly discouraged from doing so, an implementation "Foo Browser" could add a new IDL attribute "fooTypeTime" to a control's DOM interface that returned the time it took the user to select the current value of a control (say). On the other hand, defining a new control that appears in a form's elements array would be in violation of the above requirement, as it would violate the definition of elements given in this specification.

When adding new reflecting IDL attributes corresponding to content attributes of the form "x-vendor-feature", the IDL attribute should be named "vendorFeature" (i.e. the "x" is dropped from the IDL attribute's name).

When vendor-neutral extensions to this specification are needed, either this specification can be updated accordingly, or an extension specification can be written that overrides the requirements in this specification. When someone applying this specification to their activities decides that they will recognize the requirements of such an extension specification, it becomes an applicable specification for the purposes of conformance requirements in this specification.

Someone could write a specification that defines any arbitrary byte stream as conforming, and then claim that their random junk is conforming. However, that does not mean that their random junk actually is conforming for everyone's purposes: if someone else decides that that specification does not apply to their work, then they can quite legitimately say that the aforementioned random junk is just that, junk, and not conforming at all. As far as conformance goes, what matters in a particular community is what that community agrees is applicable.

User agents must treat elements and attributes that they do not understand as semantically neutral; leaving them in the DOM (for DOM processors), and styling them according to CSS (for CSS processors), but not inferring any meaning from them.

When support for a feature is disabled (e.g. as an emergency measure to mitigate a security problem, or to aid in development, or for performance reasons), user agents must act as if they had no support for the feature whatsoever, and as if the feature was not mentioned in this specification. For example, if a particular feature is accessed via an attribute in a Web IDL interface, the attribute itself would be omitted from the objects that implement that interface — leaving the attribute on the object but making it return null or throw an exception is insufficient.

2.2.4 Interactions with XPath and XSLT

Implementations of XPath 1.0 that operate on HTML documents parsed or created in the manners described in this specification (e.g. as part of the document.evaluate() API) must act as if the following edit was applied to the XPath 1.0 specification.

First, remove this paragraph:

A QName in the node test is expanded into an expanded-name using the namespace declarations from the expression context. This is the same way expansion is done for element type names in start and end-tags except that the default namespace declared with xmlns is not used: if the QName does not have a prefix, then the namespace URI is null (this is the same way attribute names are expanded). It is an error if the QName has a prefix for which there is no namespace declaration in the expression context.

Then, insert in its place the following:

A QName in the node test is expanded into an expanded-name using the namespace declarations from the expression context. If the QName has a prefix, then there must be a namespace declaration for this prefix in the expression context, and the corresponding namespace URI is the one that is associated with this prefix. It is an error if the QName has a prefix for which there is no namespace declaration in the expression context.

If the QName has no prefix and the principal node type of the axis is element, then the default element namespace is used. Otherwise if the QName has no prefix, the namespace URI is null. The default element namespace is a member of the context for the XPath expression. The value of the default element namespace when executing an XPath expression through the DOM3 XPath API is determined in the following way:

  1. If the context node is from an HTML DOM, the default element namespace is "http://www.w3.org/1999/xhtml".
  2. Otherwise, the default element namespace URI is null.

This is equivalent to adding the default element namespace feature of XPath 2.0 to XPath 1.0, and using the HTML namespace as the default element namespace for HTML documents. It is motivated by the desire to have implementations be compatible with legacy HTML content while still supporting the changes that this specification introduces to HTML regarding the namespace used for HTML elements, and by the desire to use XPath 1.0 rather than XPath 2.0.

This change is a willful violation of the XPath 1.0 specification, motivated by desire to have implementations be compatible with legacy content while still supporting the changes that this specification introduces to HTML regarding which namespace is used for HTML elements. [XPATH10]

XSLT 1.0 processors outputting to a DOM when the output method is "html" (either explicitly or via the defaulting rule in XSLT 1.0) are affected as follows:

If the transformation program outputs an element in no namespace, the processor must, prior to constructing the corresponding DOM element node, change the namespace of the element to the HTML namespace, ASCII-lowercase the element's local name, and ASCII-lowercase the names of any non-namespaced attributes on the element.

This requirement is a willful violation of the XSLT 1.0 specification, required because this specification changes the namespaces and case-sensitivity rules of HTML in a manner that would otherwise be incompatible with DOM-based XSLT transformations. (Processors that serialize the output are unaffected.) [XSLT10]

This specification does not specify precisely how XSLT processing interacts with the HTML parser infrastructure (for example, whether an XSLT processor acts as if it puts any elements into a stack of open elements). However, XSLT processors must stop parsing if they successfully complete, and must set the current document readiness first to "interactive" and then to "complete" if they are aborted.

This specification does not specify how XSLT interacts with the navigation algorithm, how it fits in with the event loop, nor how error pages are to be handled (e.g. whether XSLT errors are to replace an incremental XSLT output, or are rendered inline, etc).

There are also additional non-normative comments regarding the interaction of XSLT and HTML in the script element section, and of XSLT, XPath, and HTML in the template element section.

2.3 Case-sensitivity and string comparison

Comparing two strings in a case-sensitive manner means comparing them exactly, code point for code point.

Comparing two strings in an ASCII case-insensitive manner means comparing them exactly, code point for code point, except that the characters in the range U+0041 to U+005A (i.e. LATIN CAPITAL LETTER A to LATIN CAPITAL LETTER Z) and the corresponding characters in the range U+0061 to U+007A (i.e. LATIN SMALL LETTER A to LATIN SMALL LETTER Z) are considered to also match.

Comparing two strings in a compatibility caseless manner means using the Unicode compatibility caseless match operation to compare the two strings, with no language-specific tailoirings. [UNICODE]

Except where otherwise stated, string comparisons must be performed in a case-sensitive manner.

Converting a string to ASCII uppercase means replacing all characters in the range U+0061 to U+007A (i.e. LATIN SMALL LETTER A to LATIN SMALL LETTER Z) with the corresponding characters in the range U+0041 to U+005A (i.e. LATIN CAPITAL LETTER A to LATIN CAPITAL LETTER Z).

Converting a string to ASCII lowercase means replacing all characters in the range U+0041 to U+005A (i.e. LATIN CAPITAL LETTER A to LATIN CAPITAL LETTER Z) with the corresponding characters in the range U+0061 to U+007A (i.e. LATIN SMALL LETTER A to LATIN SMALL LETTER Z).

A string pattern is a prefix match for a string s when pattern is not longer than s and truncating s to pattern's length leaves the two strings as matches of each other.