This is a frozen copy of the spec, to act as a more stable point for people who wish to review the specification.
You can take part in this work: join the working group's technical discussion mailing list, read our FAQ, take part in the forum, ask newbie questions in the help mailing list, read or post on our blog, or talk with us on IRC.
© Copyright 2004-2007 Apple Computer, Inc., Mozilla Foundation, and Opera Software ASA.
You are granted a license to use, reproduce and create derivative works of this document.
This specification introduces features to HTML and the DOM that ease the authoring of Web-based applications. Additions include the context menus, a direct-mode graphics canvas, inline popup windows, and server-sent events.
This is a work in progress, though this copy of the document is a frozen working draft. Comments are very welcome, please send them to whatwg@whatwg.org. Thank you.
Implementors should be aware that this specification is not stable. Implementors who are not taking part in the discussions are likely to find the specification changing out from under them in incompatible ways. Vendors interested in implementing this specification before it eventually reaches the call for implementations should join the WHATWG mailing list and take part in the discussions.
This specification is also being produced by the W3C HTML WG. The two specifications are identical from the table of contents onwards.
This specification is intended to replace (be the new version of) what was previously the HTML4, XHTML 1.x, and DOM2 HTML specifications.
Different parts of this specification are at different levels of maturity.
Some of the more major known issues are marked like
this. There are many other issues that have been raised as well; the
issues given in this document are not the only known issues! There are
also some spec-wide issues that have not yet been addressed:
case-sensitivity is a very poorly handled topic right now, and the firing
of events needs to be unified (right now some bubble, some don't, they all
use different text to fire events, etc). It would also be nice to unify
the rules on downloading content when attributes change (e.g. src attributes) - should they initiate downloads when the
element immediately, is inserted in the document, when active scripts end,
etc. This matters e.g. if an attribute is set twice in a row (does it hit
the network twice).
body element
section element
nav element
article element
blockquote element
aside element
h1, h2, h3, h4,
h5, and h6 elements
header element
footer element
address element
a element
q element
cite element
em element
strong element
small element
m element
dfn element
abbr element
time element
progress element
meter element
code element
var element
samp element
kbd element
sup and sub elements
span element
i element
b element
bdo element
figure element
img element
iframe element
embed element
object element
param element
video element
audio element
source element
canvas element
map element
area element
form element
fieldset element
input element
button element
label element
select element
datalist element
optgroup element
option element
textarea element
output element
details element
datagrid element
datagrid data model
datagrid element
datagrid
command element
menu element
Storage interface
StorageItem interface
sessionStorage attribute
globalStorage attribute
storage event
alternate"
archives"
author"
bookmark"
contact"
external"
feed"
help"
icon"
license"
nofollow"
pingback"
prefetch"
search"
stylesheet"
sidebar"
tag"
contenteditable attribute
This section is non-normative.
The World Wide Web's markup language has always been HTML. HTML was primarily designed as a language for semantically describing scientific documents, although its general design and adaptations over the years has enabled it to be used to describe a number of other types of documents.
The main area that has not been adequately addressed by HTML is a vague subject referred to as Web Applications. This specification attempts to rectify this, while at the same time updating the HTML specifications to address issues raised in the past few years.
This section is non-normative.
This specification is limited to providing a semantic-level markup language and associated semantic-level scripting APIs for authoring accessible pages on the Web ranging from static documents to dynamic applications.
The scope of this specification does not include addressing presentation concerns (although default rendering rules for Web browsers are included at the end of this specification).
The scope of this specification does not include documenting every HTML
or DOM feature supported by Web browsers. Browsers support many features
that are considered to be very bad for accessibility or that are otherwise
inappropriate. For example, the blink element is clearly
presentational and authors wishing to cause text to blink should instead
use CSS.
The scope of this specification is not to describe an entire operating system. In particular, hardware configuration software, image manipulation tools, and applications that users would be expected to use with high-end workstations on a daily basis are out of scope. In terms of applications, this specification is targeted specifically at applications that would be expected to be used by users on an occasional basis, or regularly but from disparate locations, with low CPU requirements. For instance online purchasing systems, searching systems, games (especially multiplayer online games), public telephone books or address books, communications software (e-mail clients, instant messaging clients, discussion software), document editing software, etc.
For sophisticated cross-platform applications, there already exist several proprietary solutions (such as Mozilla's XUL and Macromedia's Flash). These solutions are evolving faster than any standards process could follow, and the requirements are evolving even faster. These systems are also significantly more complicated to specify, and are orders of magnitude more difficult to achieve interoperability with, than the solutions described in this document. Platform-specific solutions for such sophisticated applications (for example the MacOS X Core APIs) are even further ahead.
This section is non-normative.
This specification represents a new version of HTML4 and XHTML1, along with a new version of the associated DOM2 HTML API. Migration from HTML4 or XHTML1 to the format and APIs described in this specification should in most cases be straightforward, as care has been taken to ensure that backwards-compatibility is retained.
This specification will eventually supplant Web Forms 2.0 as well. [WF2]
This section is non-normative.
XHTML2 [XHTML2] defines a new HTML vocabulary with better features for hyperlinks, multimedia content, annotating document edits, rich metadata, declarative interactive forms, and describing the semantics of human literary works such as poems and scientific papers.
However, it lacks elements to express the semantics of many of the non-document types of content often seen on the Web. For instance, forum sites, auction sites, search engines, online shops, and the like, do not fit the document metaphor well, and are not covered by XHTML2.
This specification aims to extend HTML so that it is also suitable in these contexts.
XHTML2 and this specification use different namespaces and therefore can both be implemented in the same XML processor.
This section is non-normative.
This specification is independent of the various proprietary UI languages that various vendors provide. As an open, vender-neutral language, HTML provides for a solution to the same problems without the risk of vendor lock-in.
This section is non-normative.
This specification is divided into the following important sections:
There are also a couple of appendices, defining shims for WYSIWYG editors, rendering rules for Web browsers, and listing areas that are out of scope for this specification.
This specification should be read like all other specifications. First, it should be read cover-to-cover, multiple times. Then, it should be read backwards at least once. Then it should be read by picking random sections from the contents list and following all the cross-references.
All diagrams, examples, and notes in this specification are non-normative, as are all sections explicitly marked non-normative. Everything else in this specification is normative.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in the normative parts of this document are to be interpreted as described in RFC2119. For readability, these words do not appear in all uppercase letters in this specification. [RFC2119]
Requirements phrased in the imperative as part of algorithms (such as "strip any leading space characters" or "return false and abort these steps") are to be interpreted with the meaning of the key word ("must", "should", "may", etc) used in introducing the algorithm.
This specification describes the conformance criteria for user agents (relevant to implementors) and documents (relevant to authors and authoring tool implementors).
There is no implied relationship between document conformance requirements and implementation conformance requirements. User agents are not free to handle non-conformant documents as they please; the processing model described in this specification applies to implementations regardless of the conformity of the input documents.
User agents fall into several (overlapping) categories with different conformance requirements.
Web browsers that support XHTML must process elements and attributes from the HTML namespace found in XML documents as described in this specification, so that users can interact with them, unless the semantics of those elements have been overridden by other specifications.
A conforming XHTML processor would, upon finding an
XHTML script element in an XML
document, execute the script contained in that element. However, if the
element is found within an XSLT transformation sheet (assuming the UA
also supports XSLT), then the processor would instead treat the script element as an opaque element that
forms part of the transform.
Web browsers that support HTML must
process documents labelled as text/html as described in
this specification, so that users can interact with them.
User agents that process HTML and XHTML documents purely to render non-interactive versions of them must comply to the same conformance criteria as Web browsers, except that they are exempt from requirements regarding user interaction.
Typical examples of non-interactive presentation user agents are printers (static UAs) and overhead displays (dynamic UAs). It is expected that most static non-interactive presentation user agents will also opt to lack scripting support.
A non-interactive but dynamic presentation UA would still execute scripts, allowing forms to be dynamically submitted, and so forth. However, since the concept of "focus" is irrelevant when the user cannot interact with the document, the UA would not need to support any of the focus-related DOM APIs.
Implementations that do not support scripting (or which have their scripting features disabled) are exempt from supporting the events and DOM interfaces mentioned in this specification. For the parts of this specification that are defined in terms of an events model or in terms of the DOM, such user agents must still act as if events and the DOM were supported.
Scripting can form an integral part of an application. Web browsers that do not support scripting, or that have scripting disabled, might be unable to fully convey the author's intent.
Conformance checkers must verify that a document conforms to the
applicable conformance criteria described in this specification.
Conformance checkers are exempt from detecting errors that require
interpretation of the author's intent (for example, while a document is
non-conforming if the content of a blockquote element is not a quote,
conformance checkers do not have to check that blockquote elements only contain quoted
material).
Conformance checkers must check that the input document conforms when scripting is disabled, and should also check that the input document conforms when scripting is enabled. (This is only a "SHOULD" and not a "MUST" requirement because it has been proven to be impossible. [HALTINGPROBLEM])
The term "HTML5 validator" can be used to refer to a conformance checker that itself conforms to the applicable requirements of this specification.
XML DTDs cannot express all the conformance requirements of this specification. Therefore, a validating XML processor and a DTD cannot constitute a conformance checker. Also, since neither of the two authoring formats defined in this specification are applications of SGML, a validating SGML system cannot constitute a conformance checker either.
To put it another way, there are three types of conformance criteria:
A conformance checker must check for the first two. A simple DTD-based validator only checks for the first class of errors and is therefore not a conforming conformance checker according to this specification.
Applications and tools that process HTML and XHTML documents for reasons other than to either render the documents or check them for conformance should act in accordance to the semantics of the documents that they process.
A tool that generates document outlines but increases the nesting level for each paragraph and does not increase the nesting level for each section would not be conforming.
Authoring tools and markup generators must generate conforming documents. Conformance criteria that apply to authors also apply to authoring tools, where appropriate.
Authoring tools are exempt from the strict requirements of using elements only for their specified purpose, but only to the extent that authoring tools are not yet able to determine author intent.
For example, it is not conforming to use an address element for arbitrary contact
information; that element can only be used for marking up contact
information for the author of the document or section. However, since an
authoring tools is likely unable to determine the difference, an
authoring tool is exempt from that requirement.
In terms of conformance checking, an editor is therefore required to output documents that conform to the same extent that a conformance checker will verify.
When an authoring tool is used to edit a non-conforming document, it may preserve the conformance errors in sections of the document that were not edited during the editing session (i.e. an editing tool is allowed to round-trip errorneous content). However, an authoring tool must not claim that the output is conformant if errors have been so preserved.
Authoring tools are expected to come in two broad varieties: tools that work from structure or semantic data, and tools that work on a What-You-See-Is-What-You-Get media-specific editing basis (WYSIWYG).
The former is the preferred mechanism for tools that author HTML, since the structure in the source information can be used to make informed choices regarding which HTML elements and attributes are most appropriate.
However, WYSIWYG tools are legitimate, and this specification makes certain concessions to WYSIWYG editors.
All authoring tools, whether WYSIWYG or not, should make a best effort attempt at enabling users to create well-structured, semantically rich, media-independent content.
Some conformance requirements are phrased as requirements on elements, attributes, methods or objects. Such requirements fall into two categories; those describing content model restrictions, and those describing implementation behaviour. The former category of requirements are requirements on documents and authoring tools. The second category are requirements on user agents.
Conformance requirements phrased as algorithms or specific steps may be implemented in any manner, so long as the end result is equivalent. (In particular, the algorithms defined in this specification are intended to be easy to follow, and not intended to be performant.)
User agents may impose implementation-specific limits on otherwise unconstrained inputs, e.g. to prevent denial of service attacks, to guard against running out of memory, or to work around platform-specific limitations.
For compatibility with existing content and prior specifications, this specification describes two authoring formats: one based on XML (referred to as XHTML5), and one using a custom format inspired by SGML (referred to as HTML5). Implementations may support only one of these two formats, although supporting both is encouraged.
Such XML documents may contain a DOCTYPE if desired, but
this is not required to conform to this specification.
According to the XML specification, XML processors are not guaranteed to process the external DTD subset referenced in the DOCTYPE. This means, for example, that using entities for characters in XHTML documents is unsafe (except for <, >, &, " and '). For interoperability, authors are advised to avoid optional features of XML.
The language in this specification assumes that the user agent expands all entity references, and therefore does not include entity reference nodes in the DOM. If user agents do include entity reference nodes in the DOM, then user agents must handle them as if they were fully expanded when implementing this specification. For example, if a requirement talks about an element's child text nodes, then any text nodes that are children of an entity reference that is a child of that element would be used as well.
A lot of arrays/lists/collections in this spec assume zero-based indexes but use the term "indexth" liberally. We should define those to be zero-based and be clearer about this.
Unless other specified, if a DOM attribute that is a floating point
number type (float) is assigned an Infinity or
Not-a-Number value, a NOT_SUPPORTED_ERR
exception must be raised.
Unless other specified, if a DOM attribute that is a signed numeric type
is assigned a negative value, a NOT_SUPPORTED_ERR exception must be raised.
Unless other specified, if a method with an argument that is a floating
point number type (float) is passed an Infinity or
Not-a-Number value, a NOT_SUPPORTED_ERR
exception must be raised.
Unless other specified, if a method is passed fewer arguments than is
defined for that method in its IDL definition, a NOT_SUPPORTED_ERR exception must be raised.
Unless other specified, if a method is passed more arguments than is defined for that method in its IDL definition, the excess arguments must be ignored.
Unless other specified, if a method is expecting, as one of its
arguments, as defined by its IDL definition, an object implementing a
particular interface X, and the argument passed is an
object whose [[Class]] property is neither that interface X, nor the name of an interface Y where
this specification requires that all objects implementing interface Y also implement interface X, nor the
name of an interface that inherits from the expected interface X, then a TYPE_MISMATCH_ERR exception
must be raised.
Anything else? Passing the wrong type of object, maybe? Implied conversions to int/float?
This specification relies on several other underlying specifications.
Implementations that support XHTML5 must support some version of XML, as well as its corresponding namespaces specification, because XHTML5 uses an XML serialisation with namespaces. [XML] [XMLNAMES]
User agents must follow the rules given by XML Base to resolve relative URIs in HTML and XHTML fragments. That is the mechanism used in this specification for resolving relative URIs in DOM trees. [XMLBASE]
It is possible for xml:base attributes to be present even in
HTML fragments, as such attributes can be added dynamically using
script.
Implementations must support some version of DOM Core and DOM Events, because this specification is defined in terms of the DOM, and some of the features are defined as extensions to the DOM Core interfaces. [DOM3CORE] [DOM3EVENTS]
Implementations that use ECMAScript to implement the APIs defined in this specification must implement them in a manner consistent with the ECMAScript Bindings for DOM Specifications specification, as this specification uses that specification's terminology. [EBFD]
This specification does not require support of any particular network transport protocols, image formats, audio formats, video formats, style sheet language, scripting language, or any of the DOM and WebAPI specifications beyond those described above. However, the language described by this specification is biased towards CSS as the styling language, ECMAScript as the scripting language, and HTTP as the network protocol, and several features assume that those languages and protocols are in use.
Some elements are defined in terms of their DOM textContent attribute. This is an
attribute defined on the Node interface in DOM3 Core. [DOM3CORE]
Should textContent be defined differently for dir="" and <bdo>? Should we come up with an alternative to textContent that handles those and other things, like alt=""?
The term activation behavior is used as defined in the DOM3 Events specification. [DOM3EVENTS] At the time of writing, DOM3 Events hadn't yet been updated to define that phrase.
The rules for handling alternative style sheets are defined in the CSS object model specification. [CSSOM]
Certain features are defined in terms of CSS <color> values. When
the CSS value currentColor is specified in this
context, the "computed value of the 'color' property" for the purposes of
determining the computed value of the currentColor
keyword is the computed value of the 'color' property on the element in
question. [CSS3COLOR]
If a canvas gradient's addColorStop() method is called with the
currentColor keyword as the color, then the computed
value of the 'color' property on the canvas element is the one that is used.
This specification refers to both HTML and XML attributes and DOM attributes, often in the same context. When it is not clear which is being referred to, they are referred to as content attributes for HTML and XML attributes, and DOM attributes for those from the DOM. Similarly, the term "properties" is used for both ECMAScript object properties and CSS properties. When these are ambiguous they are qualified as object properties and CSS properties respectively.
To ease migration from HTML to XHTML, UAs conforming
to this specification will place elements in HTML in the
http://www.w3.org/1999/xhtml namespace, at least for the
purposes of the DOM and CSS. The term "elements in the
HTML namespace", or "HTML elements" for
short, when used in this specification, thus refers to both HTML and XHTML
elements.
Unless otherwise stated, all elements defined or mentioned in this
specification are in the http://www.w3.org/1999/xhtml
namespace, and all attributes defined or mentioned in this specification
have no namespace (they are in the per-element partition).
The term HTML documents is sometimes used in contrast with XML documents to mean specifically documents that were parsed using an HTML parser (as opposed to using an XML parser or created purely through the DOM).
Generally, when the specification states that a feature applies to HTML or XHTML, it also includes the other. When a feature specifically only applies to one of the two languages, it is called out by explicitly stating that it does not apply to the other format, as in "for HTML, ... (this does not apply to XHTML)".
This specification uses the term document to refer to any use of HTML, ranging from short static documents to long essays or reports with rich multimedia, as well as to fully-fledged interactive applications.
For readability, the term URI is used to refer to both ASCII URIs and Unicode IRIs, as those terms are defined by RFC 3986 and RFC 3987 respectively. On the rare occasions where IRIs are not allowed but ASCII URIs are, this is called out explicitly. [RFC3986] [RFC3987]
The term root element, when not qualified to explicitly refer to the document's root element, means the furthest ancestor element node of whatever node is being discussed, or the node itself is there is none. When the node is a part of the document, then that is indeed the document's root element. However, if the node is not currently part of the document tree, the root element will be an orphaned node.
An element is said to have been inserted into a document when its root element changes and is now the document's root element.
The term tree order means a pre-order,
depth-first traversal of DOM nodes involved (through the parentNode/childNodes relationship).
When it is stated that some element or attribute is ignored, or treated as some other value, or handled as if it was something else, this refers only to the processing of the node after it is in the DOM. A user agent must not mutate the DOM in such situations.
When an XML name, such as an attribute or element name, is referred to
in the form prefix:localName, as in xml:id or
svg:rect, it refers to a name with the local name localName and the namespace given by the prefix, as defined
by the following table:
xml
http://www.w3.org/XML/1998/namespace
html
http://www.w3.org/1999/xhtml
svg
http://www.w3.org/2000/svg
For simplicity, terms such as shown, displayed, and visible might sometimes be used when referring to the way a document is rendered to the user. These terms are not meant to imply a visual medium; they must be considered to apply to other media in equivalent ways.
Various DOM interfaces are defined in this specification using pseudo-IDL. This looks like OMG IDL but isn't. For instance, method overloading is used, and types from the W3C DOM specifications are used without qualification. Language-specific bindings for these abstract interface definitions must be derived in the way consistent with W3C DOM specifications. Some interface-specific binding information for ECMAScript is included in this specification.
The current situation with IDL blocks is pitiful. IDL is totally inadequate to properly represent what objects have to look like in JS; IDL can't say if a member is enumerable, what the indexing behaviour is, what the stringification behaviour is, what behaviour setting a member whose type is a particular interface should be (e.g. setting of document.location or element.className), what constructor an object implementing an interface should claim to have, how overloads work, etc. I think we should make the IDL blocks non-normative, and/or replace them with something else that is better for JS while still being clear on how it applies to other languages. However, we do need to have something that says what types the methods take as arguments, since we have to raise exceptions if they are wrong.
The construction "a Foo object", where Foo is
actually an interface, is sometimes used instead of the more accurate "an
object implementing the interface Foo".
A DOM attribute is said to be getting when its value is being retrieved (e.g. by author script), and is said to be setting when a new value is assigned to it.
If a DOM object is said to be live, then that means that any attributes returning that object must always return the same object (not a new object each time), and the attributes and methods on that object must operate on the actual underlying data, not a snapshot of the data.
The terms fire and dispatch are used interchangeably in the context of events, as in the DOM Events specifications. [DOM3EVENTS]
The term text node refers to any
Text node, including CDATASection nodes (any
Node with node type 3 or 4).
Some of the algorithms in this specification, for historical reasons, require the user agent to pause until some condition has been met. While a user agent is paused, it must ensure that no scripts execute (e.g. no event handlers, no timers, etc). User agents should remain responsive to user input while paused, however.
This section is non-normative.
This specification defines an abstract language for describing documents and applications, and some APIs for interacting with in-memory representations of resources that use this language.
The in-memory representation is known as "DOM5 HTML", or "the DOM" for short.
There are various concrete syntaxes that can be used to transmit resources that use this abstract language, two of which are defined in this specification.
The first such concrete syntax is "HTML5". This is the format
recommended for most authors. It is compatible with all legacy Web
browsers. If a document is transmitted with the MIME type text/html, then it will be processed as an "HTML5"
document by Web browsers.
The second concrete syntax uses XML, and is known as "XHTML5". When a
document is transmitted with an XML MIME type, such as application/xhtml+xml, then it is processed by an XML
processor by Web browsers, and treated as an "XHTML5" document. Generally
speaking, authors are discouraged from trying to use XML on the Web,
because XML has much stricter syntax rules than the "HTML5" variant
described above, and is relatively newer and therefore less mature.
The "DOM5 HTML", "HTML5", and "XHTML5" representations cannot all
represent the same content. For example, namespaces cannot be represented
using "HTML5", but they are supported in "DOM5 HTML" and "XHTML5".
Similarly, documents that use the noscript feature can be represented using
"HTML5", but cannot be represented with "XHTML5" and "DOM5 HTML". Comments
that contain the string "-->" can be represented
in "DOM5 HTML" but not in "HTML5" and "XHTML5". And so forth.
The Document Object Model (DOM) is a representation — a model — of a document and its content. [DOM3CORE] The DOM is not just an API; the conformance criteria of HTML implementations are defined, in this specification, in terms of operations on the DOM.
This specification defines the language represented in the DOM by
features together called DOM5 HTML. DOM5 HTML consists of DOM Core
Document nodes and DOM Core Element nodes, along
with text nodes and other content.
Elements in the DOM represent things; that is, they have intrinsic meaning, also known as semantics.
For example, a p element
represents a paragraph.
In addition, documents and elements in the DOM host APIs that extend the DOM Core APIs, providing new features to application developers using DOM5 HTML.
Every XML and HTML document in an HTML UA is represented by a
Document object. [DOM3CORE]
Document objects are assumed to be XML documents unless they are flagged as being HTML documents when they are created. Whether a document is
an HTML document or an XML document affects the
behaviour of certain APIs, as well as a few CSS rendering rules. [CSS21]
A Document object created by the createDocument() API on the DOMImplementation
object is initially an XML
document, but can be made into an HTML document by calling document.open() on it.
All Document objects (in user agents implementing this
specification) must also implement the HTMLDocument interface, available using
binding-specific methods. (This is the case whether or not the document in
question is an HTML document
or indeed whether it contains any HTML
elements at all.) Document objects must also implement
the document-level interface of any other namespaces found in the document
that the UA supports. For example, if an HTML implementation also supports
SVG, then the Document object must implement HTMLDocument and SVGDocument.
Because the HTMLDocument interface is now obtained
using binding-specific casting methods instead of simply being the primary
interface of the document object, it is no longer defined as inheriting
from Document.
interface HTMLDocument {
// Resource metadata management
readonly attribute Location location;
readonly attribute DOMString URL;
attribute DOMString domain;
readonly attribute DOMString referrer;
attribute DOMString cookie;
readonly attribute DOMString lastModified;
readonly attribute DOMString compatMode;
// DOM tree accessors
attribute DOMString title;
attribute DOMString dir;
attribute HTMLElement body;
readonly attribute HTMLCollection images;
readonly attribute HTMLCollection links;
readonly attribute HTMLCollection forms;
readonly attribute HTMLCollection anchors;
NodeList getElementsByName(in DOMString elementName);
NodeList getElementsByClassName(in DOMString classNames);
// Dynamic markup insertion
attribute DOMString innerHTML;
HTMLDocument open();
HTMLDocument open(in DOMString type);
HTMLDocument open(in DOMString type, in DOMString replace);
Window open(in DOMString url, in DOMString name, in DOMString features);
Window open(in DOMString url, in DOMString name, in DOMString features, in boolean replace);
void close();
void write(in DOMString text);
void writeln(in DOMString text);
// Interaction
readonly attribute Element activeElement;
readonly attribute boolean hasFocus;
// Commands
readonly attribute HTMLCollection commands;
// Editing
attribute boolean designMode;
boolean execCommand(in DOMString commandId);
boolean execCommand(in DOMString commandId, in boolean doShowUI);
boolean execCommand(in DOMString commandId, in boolean doShowUI, in DOMString value);
Selection getSelection();
};
Since the HTMLDocument
interface holds methods and attributes related to a number of disparate
features, the members of this interface are described in various different
sections.
User agents must raise a security exception
whenever any of the members of an HTMLDocument object are accessed by
scripts whose origin is not the same as the
Document's origin.
The URL attribute
must return the document's address.
The domain
attribute must be initialised to the document's domain, if it
has one, and null otherwise. On getting, the attribute must return its
current value. On setting, if the new value is an allowed value (as
defined below), the attribute's value must be changed to the new value. If
the new value is not an allowed value, then a security exception must be raised instead.
A new value is an allowed value for the document.domain
attribute if it is equal to the attribute's current value, or if the new
value, prefixed by a U+002E FULL STOP ("."), exactly matches the end of
the current value. If the current value is null, new values other than
null will never be allowed.
If the Document object's address is hierarchical and uses a
server-based naming authority, then its domain is the <host>/<ihost> part of that
address. Otherwise, it has no domain.
The domain attribute is used to enable pages on
different hosts of a domain to access each others' DOMs, though this is not yet defined by this
specification.
we should handle IP addresses here
The referrer attribute must
return either the URI of the page which navigated the browsing context
to the current document (if any), or the empty string (if there is no such
originating page, or if the UA has been configured not to report
referrers).
In the case of HTTP, the referrer DOM attribute will match the
Referer (sic) header that was sent when fetching the current
page.
The cookie
attribute must, on getting, return the same string as the value of the
Cookie HTTP header it would include if fetching the
resource indicated by the document's address over HTTP, as
per RFC 2109 section 4.3.4. [RFC2109]
On setting, the cookie attribute must cause the user agent to
act as it would when processing cookies if it had just attempted to fetch
the document's address over HTTP, and had received a response
with a Set-Cookie header whose value was the specified value,
as per RFC 2109 sections 4.3.1, 4.3.2, and 4.3.3. [RFC2109]
Since the cookie attribute is accessible across frames,
the path restrictions on cookies are only a tool to help manage which
cookies are sent to which parts of the site, and are not in any way a
security feature.
The lastModified attribute,
on getting, must return the date and time of the Document's
source file's last modification, in the user's local timezone, in the
following format:
All the numeric components above, other than the year, must be given as two digits in the range U+0030 DIGIT ZERO to U+0039 DIGIT NINE representing the number in base ten, zero-padded if necessary.
The Document's source file's last modification date and
time must be derived from relevant features of the networking protocols
used, e.g. from the value of the HTTP Last-Modified
header of the document, or from metadata in the filesystem for local
files. If the last modification date and time are not known, the attribute
must return the string 01/01/1970 00:00:00.
The compatMode DOM attribute
must return the literal string "CSS1Compat" unless
the document has been set to quirks mode by the HTML parser, in which case it must instead return the
literal string "BackCompat". The document can also
be set to limited quirks mode (also known as "almost
standards" mode). By default, the document is set to no
quirks mode (also known as "standards mode").
As far as parsing goes, the quirks I know of are:
The nodes representing HTML elements in the DOM must implement, and expose to scripts, the interfaces listed for them in the relevant sections of this specification. This includes XHTML elements in XML documents, even when those documents are in another context (e.g. inside an XSLT transform).
The basic interface, from which all the HTML
elements' interfaces inherit, and which must be used by elements that
have no additional requirements, is the HTMLElement interface.
interface HTMLElement : Element { // DOM tree accessors NodeList getElementsByClassName(in DOMString classNames); // dynamic markup insertion attribute DOMString innerHTML; // metadata attributes attribute DOMString id; attribute DOMString title; attribute DOMString lang; attribute DOMString dir; attribute DOMString className; readonly attribute DOMTokenList classList; // interaction attribute boolean irrelevant; attribute long tabIndex; void click(); void focus(); void blur(); void scrollIntoView(); void scrollIntoView(in boolean top); // commands attribute HTMLMenuElement contextMenu; // editing attribute boolean draggable; attribute DOMString contentEditable; // data templates attribute DOMString template; readonly attribute HTMLDataTemplateElement templateElement; attribute DOMString ref; readonly attribute Node refNode; attribute DOMString registrationMark; readonly attribute DocumentFragment originalContent; // event handler DOM attributes attribute EventListener onabort; attribute EventListener onbeforeunload; attribute EventListener onblur; attribute EventListener onchange; attribute EventListener onclick; attribute EventListener oncontextmenu; attribute EventListener ondblclick; attribute EventListener ondrag; attribute EventListener ondragend; attribute EventListener ondragenter; attribute EventListener ondragleave; attribute EventListener ondragover; attribute EventListener ondragstart; attribute EventListener ondrop; attribute EventListener onerror; attribute EventListener onfocus; attribute EventListener onkeydown; attribute EventListener onkeypress; attribute EventListener onkeyup; attribute EventListener onload; attribute EventListener onmessage; attribute EventListener onmousedown; attribute EventListener onmousemove; attribute EventListener onmouseout; attribute EventListener onmouseover; attribute EventListener onmouseup; attribute EventListener onmousewheel; attribute EventListener onresize; attribute EventListener onscroll; attribute EventListener onselect; attribute EventListener onsubmit; attribute EventListener onunload; };
As with the HTMLDocument
interface, the HTMLElement
interface holds methods and attributes related to a number of disparate
features, and the members of this interface are therefore described in
various different sections of this specification.
Some DOM attributes are defined to reflect a particular content attribute. This means that on getting, the DOM attribute returns the current value of the content attribute, and on setting, the DOM attribute changes the value of the content attribute to the given value.
If a reflecting DOM attribute is a DOMString attribute
whose content attribute is defined to contain a URI, then on getting, the
DOM attribute must return the value of the content attribute, resolved to
an absolute URI, and on setting, must set the content attribute to the
specified literal value. If the content attribute is absent, the DOM
attribute must return the default value, if the content attribute has one,
or else the empty string.
If a reflecting DOM attribute is a DOMString whose content
attribute is an enumerated attribute, and the
DOM attribute is limited to only known values,
then, on getting, the DOM attribute must return the value associated with
the state the attribute is in (in its canonical case), or the empty string
if the attribute is in a state that has no associated keyword value; and
on setting, if the new value case-insensitively matches one of the
keywords given for that attribute, then the content attribute must be set
to that value, otherwise, if the new value is the empty string, then the
content attribute must be removed, otherwise, the setter must raise a
SYNTAX_ERR exception.
If a reflecting DOM attribute is a DOMString but doesn't
fall into any of the above categories, then the getting and setting must
be done in a transparent, case-preserving manner.
If a reflecting DOM attribute is a boolean attribute, then the DOM attribute must return true if the attribute is set, and false if it is absent. On setting, the content attribute must be removed if the DOM attribute is set to false, and must be set to have the same value as its name if the DOM attribute is set to true. (This corresponds to the rules for boolean content attributes.)
If a reflecting DOM attribute is a signed integer type
(long) then the content attribute must be parsed according to
the rules for parsing
signed integers first. If that fails, or if the attribute is absent,
the default value must be returned instead, or 0 if there is no default
value. On setting, the given value must be converted to a string
representing the number as a valid integer in base
ten and then that string must be used as the new content attribute value.
If a reflecting DOM attribute is an unsigned integer type
(unsigned long) then the content attribute must be parsed
according to the rules for parsing unsigned integers first. If that
fails, or if the attribute is absent, the default value must be returned
instead, or 0 if there is no default value. On setting, the given value
must be converted to a string representing the number as a valid non-negative integer in base ten and then that
string must be used as the new content attribute value.
If a reflecting DOM attribute is an unsigned integer type
(unsigned long) that is limited to only
positive non-zero numbers, then the behavior is similar to the
previous case, but zero is not allowed. On getting, the content attribute
must first be parsed according to the rules for parsing unsigned
integers, and if that fails, or if the attribute is absent, the
default value must be returned instead, or 1 if there is no default value.
On setting, if the value is zero, the user agent must fire an
INDEX_SIZE_ERR exception. Otherwise, the given value must be
converted to a string representing the number as a valid
non-negative integer in base ten and then that string must be used as
the new content attribute value.
If a reflecting DOM attribute is a floating point number type
(float) and the content attribute is defined to contain a
time offset, then the content attribute must be parsed according to the rules for
parsing time ofsets first. If that fails, or if the attribute is
absent, the default value must be returned instead, or the not-a-number
value (NaN) if there is no default value. On setting, the given value must
be converted to a string using the time offset
serialisation rules, and that string must be used as the new content
attribute value.
If a reflecting DOM attribute is of the type DOMTokenList, then on getting it must
return a DOMTokenList object
whose underlying string is the element's corresponding content attribute.
When the DOMTokenList object
mutates its underlying string, the attribute must itself be immediately
mutated. When the attribute is absent, then the string represented by the
DOMTokenList object is the empty
string; when the object mutates this empty string, the user agent must
first add the corresponding content attribute, and then mutate that
attribute instead. DOMTokenList
attributes are always read-only. The same DOMTokenList object must be returned
every time for each attribute.
If a reflecting DOM attribute has the type HTMLElement, or an interface that descends
from HTMLElement, then, on
getting, it must run the following algorithm (stopping at the first point
where a value is returned):
document.getElementById() method would find if it was
passed as its argument the current value of the corresponding content
attribute.
On setting, if the given element has an id attribute, then the content attribute must be set
to the value of that id
attribute. Otherwise, the DOM attribute must be set to the empty string.
The HTMLCollection, HTMLFormControlsCollection,
and HTMLOptionsCollection interfaces
represent various lists of DOM nodes. Collectively, objects implementing
these interfaces are called collections.
When a collection is created, a filter and a root are associated with the collection.
For example, when the HTMLCollection object for the document.images
attribute is created, it is associated with a filter that selects only
img elements, and rooted at the root of
the document.
The collection then represents a live view of the subtree rooted at the collection's root, containing only nodes that match the given filter. The view is linear. In the absence of specific requirements to the contrary, the nodes within the collection must be sorted in tree order.
The rows list is not in tree order.
An attribute that returns a collection must return the same object every time it is retrieved.
The HTMLCollection interface
represents a generic collection of elements.
interface HTMLCollection {
readonly attribute unsigned long length;
Element item(in unsigned long index);
Element namedItem(in DOMString name);
};
The length attribute must
return the number of nodes represented by the collection.
The item(in