Web Applications 1.0

Early Working Draft, 1 January 2006

You can take part in this work. Join the working group's discussion list.

This version:
http://www.whatwg.org/specs/web-apps/2006-01-01/
Latest version:
http://www.whatwg.org/specs/web-apps/current-work/
Previous versions:
http://www.whatwg.org/specs/web-apps/2005-09-01/ ( diffs )
Editor:
Ian Hickson, Google, Opera Software, ian@hixie.ch

Abstract

This specification introduces features to HTML and the DOM that ease the authoring of Web-based applications. Additions include context menus, a direct-mode graphics canvas, inline popup windows, server-sent events, and more.

Status of this document

This is an archive copy of a working draft of Web Apps 1.0. It will be used as a milestone against which diffs can be generated, so that it is easier to track progress. Comments on this draft are very welcome, but it is suggested that you first check to see if the latest version has changed. If you do have comments, please send them to whatwg@whatwg.org. Thank you.

To find the latest version of this working draft, please follow the "Latest version" link above.

This draft may contain namespaces that use the uuid: URI scheme. These are temporary and will be changed before those parts of the specification are ready to be implemented in shipping products.

Sections marked [TBW] were placeholders for text not yet written. Sections marked [WIP] were very early drafts that needed much more work. Other sections were first drafts that were ready for substantial comments.

Sections marked [SCS] are sections intended to be self-contained (Self Contained Section). Such sections are considered logical units that it would make sense to implement independent of most of the rest of the specification, provided that enough of the infrastructure is already implemented.

1. Introduction

The World Wide Web's markup language has always been HTML. HTML was primarily designed as a language for semantically describing scientific documents, although its general design and adaptations over the years have enabled it to be used to describe a number of other types of documents.

The main area that has not been adequately addressed by HTML is a vague subject referred to as Web Applications. This specification attempts to rectify this, while at the same time updating the HTML specifications to address issues raised in the past few years.

1.1. Scope

This specification is limited to providing a semantic-level markup language and associated semantic-level scripting APIs for authoring accessible pages on the Web ranging from static documents to dynamic applications.

The scope of this specification does not include addressing presentation concerns.

The scope of this specification does not include documenting every HTML or DOM feature supported by Web browsers. Browsers support many features that are considered to be very bad for accessibility or that are otherwise inappropriate. For example, the blink element is clearly presentational and authors wishing to cause text to blink should instead use CSS.

The scope of this specification is not to describe an entire operating system. In particular, office productivity applications, image manipulation, and other applications that users would be expected to use with high-end workstations on a daily basis are out of scope. In terms of applications, this specification is targeted specifically at applications that would be expected to be used by users on an occasional basis, or regularly but from disparate locations. Examples include online purchasing systems, searching systems, games (especially multiplayer online games), public telephone books or address books, and communications software (e-mail clients, instant messaging clients, discussion software).

For sophisticated cross-platform applications, there already exist several proprietary solutions (such as Mozilla's XUL and Macromedia's Flash). These solutions are evolving faster than any standards process could follow, and the requirements are evolving even faster. These systems are also significantly more complicated to specify, and are orders of magnitude more difficult to achieve interoperability with, than the solutions described in this document. Platform-specific solutions for such sophisticated applications (for example the MacOS X Core APIs) are even further ahead.

1.2. Structure of this specification [TBW]

This spec is probably big enough to need a guide as to where to look for various things. Hence once the structure is stable we should probably fill out this section.

1.3. Requirements and ideas

This section will probably be dropped in due course.

HTML, CSS, DOM, and JavaScript provide enough power that Web developers have managed to base entire businesses on them. What is required are extensions to these technologies to provide much-needed features such as:

Some less important features would be good to have as well:

Several of the features in these two lists have been supported in non-standard ways by some user agents for some time.

1.4. Relationship to HTML 4.01, XHTML 1.1, DOM2 HTML

This specification represents a new version of HTML4 and XHTML1, along with a new version of the associated DOM2 HTML API. Migration from HTML4 or XHTML1 to the format and APIs described in this specification should in most cases be straightforward, as care has been taken to ensure that backwards-compatibility is retained.

1.5. Relationship to XHTML2

XHTML2 [XHTML2] defines a new HTML vocabulary with better features for hyperlinks, multimedia content, annotating document edits, rich metadata, declarative interactive forms, and describing the semantics of human literary works such as poems and scientific papers.

However, it lacks elements to express the semantics of many of the non-document types of content often seen on the Web. For instance, forum sites, auction sites, search engines, online shops, and the like, do not fit the document metaphor well, and are not covered by XHTML2.

This specification aims to extend HTML so that it is also suitable in these contexts.

XHTML2 and this specification use different namespaces and therefore can both be implemented in the same XML processor.

1.6. Relationship to Web Forms 2.0

This specification is designed to complement Web Forms 2.0. [WF2] Where Web Forms concentrates on input controls, data validation, and form submission, this specification concentrates on client-side user interface features needed to create modern applications.

Eventually WF2 will simply be folded into this spec.

1.7. Relationship to XUL, Avalon/XAML, and other proprietary UI languages

This specification is independent of the various proprietary UI languages that various vendors provide.

1.8. Conformance requirements

As well as sections marked as non-normative, all diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in the normative parts of this document are to be interpreted as described in [RFC2119]. For readability, these words do not appear in all uppercase letters in this specification.

This specification describes the conformance criteria for user agents (implementations and their implementors) and documents (and their authors).

Conformance requirements phrased as requirements on elements, attributes, methods or objects are conformance requirements on user agents.

User agents fall into several (overlapping) categories with different conformance requirements.

Web browsers and other interactive user agents

Web browsers that support XHTML must process elements and attributes from the XHTML namespace found in XML documents as described in this specification, so that users can interact with them, unless the semantics of those elements have been overridden by other specifications.

A conforming XHTML processor would, upon finding an XHTML script element in an XML document, execute the script contained in that element. However, if the element is found within an XSLT transformation sheet (assuming the UA also supports XSLT), then the processor would instead treat the script element as an opaque element that forms part of the transform.

Web browsers that support HTML must process documents labelled as text/html as described in this specification, so that users can interact with them.

Non-interactive presentation user agents

User agents that process HTML and XHTML documents purely to render non-interactive versions of them must comply with the same conformance criteria as Web browsers, except that they are exempt from requirements regarding user interaction.

Typical examples of non-interactive presentation user agents are printers (static UAs) and overhead displays (dynamic UAs). It is expected that most static non-interactive presentation user agents will also opt to lack scripting support.

A non-interactive but dynamic presentation UA would still execute scripts, allowing forms to be dynamically submitted, and so forth. However, since the concept of "focus" is irrelevant when the user cannot interact with the document, the UA would not need to support any of the focus-related DOM APIs.

User agents with no scripting support

Implementations that do not support scripting (or which have their scripting features disabled) are exempt from supporting the events and DOM interfaces mentioned in this specification. For the parts of this specification that are defined in terms of an events model or in terms of the DOM, such user agents must still act as if events and the DOM were supported.

Scripting can form an integral part of an application. Web browsers that do not support scripting, or that have scripting disabled, might be unable to fully convey the author's intent.

Conformance checkers

Conformance checkers must verify that a document conforms to the applicable conformance criteria described in this specification. Conformance checkers are exempt from detecting errors that require interpretation of the author's intent (for example, while a document is non-conforming if the content of a blockquote element is not a quote, conformance checkers do not have to check that blockquote elements only contain quoted material).

The term "validation" specifically refers to a subset of conformance checking that only verifies that a document complies with the requirements given by an SGML or XML DTD. Conformance checkers that only perform validation are non-conforming, as there are many conformance requirements described in this specification that cannot be checked by SGML or XML DTDs.

To put it another way, there are three types of conformance criteria:

  1. Criteria that can be expressed in a DTD.
  2. Criteria that cannot be expressed by a DTD, but can still be checked by a machine.
  3. Criteria that can only be checked by a human.

A conformance checker must check for the first two. A simple DTD-based validator only checks for the first class of errors and is therefore not a conforming conformance checker according to this specification.

Data mining tools

Applications and tools that process HTML and XHTML documents for reasons other than to either render the documents or check them for conformance should act in accordance with the semantics of the documents that they process.

A tool that generates document outlines but increases the nesting level for each paragraph and does not increase the nesting level for each section would not be conforming.

Authoring tools and markup generators

Authoring tools and markup generators must generate conforming documents. Conformance criteria that apply to authors also apply to authoring tools, where appropriate.

Conformance requirements phrased as algorithms or specific steps may be implemented in any manner, so long as the end result is equivalent. (In particular, the algorithms defined in this specification are intended to be easy to follow, and not intended to be performant.)

There is no implied relationship between document conformance requirements and implementation conformance requirements. User agents are not free to handle non-conformant documents as they please; the processing model described in this specification applies to implementations regardless of the conformity of the input documents.

For compatibility with existing content and prior specifications, this specification describes two authoring formats: one based on XML (referred to as XHTML), and one using a custom format inspired by SGML (referred to as HTML). Implementations may support only one of these two formats, although supporting both is encouraged.

XML documents using elements from the XHTML namespace that use the new features described in this specification and that are served over the wire (e.g. by HTTP) must be sent using an XML MIME type such as application/xml or application/xhtml+xml and must not be served as text/html. [RFC3023]

These XML documents may contain a DOCTYPE if desired, but this is not required to conform to this specification.

HTML documents that use the new features described in this specification and that are served over the wire (e.g. by HTTP) must be sent as text/html and must start with the following DOCTYPE: <!DOCTYPE html>.

1.9. Terminology

This specification refers to both HTML and XML attributes and DOM attributes, often in the same context. When it is not clear which is being referred to, they are referred to as content attributes for HTML and XML attributes, and DOM attributes for those from the DOM. Similarly, the term "properties" is used for both ECMAScript object properties and CSS properties. When these are ambiguous they are qualified as object properties and CSS properties respectively.

To ease migration from HTML to XHTML, UAs conforming to this specification must place elements in HTML in the http://www.w3.org/1999/xhtml namespace, at least for the purposes of the DOM and CSS. The term "elements in the HTML namespace", when used in this specification, thus refers to both HTML and XHTML elements.

Unless otherwise stated, all elements defined or mentioned in this specification are in the http://www.w3.org/1999/xhtml namespace, and all attributes defined or mentioned in this specification have no namespace (they are in the per-element partition).

Generally, when the specification states that a feature applies to HTML or XHTML, it also includes the other. When a feature specifically only applies to one of the two languages, it is called out by explicitly stating that it does not apply to the other format, as in "for HTML, ... (this does not apply to XHTML)".

For readability, the term URI is used to refer to both ASCII URIs and Unicode IRIs, as those terms are defined by [RFC3986] and [RFC3987] respectively. On the rare occasions where IRIs are not allowed but ASCII URIs are, this is called out explicitly.

The term root element, when not qualified to explicitly refer to the document's root element, means the furthest ancestor element node of whatever node is being discussed, or the node itself if there is none. When the node is a part of the document, then that is indeed the document's root element. However, if the node is not currently part of the document tree, the root element will be an orphaned node.
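As a rough illustration of this definition, the following sketch climbs parent links until no ancestor remains. It uses plain objects with a hypothetical `parent` field rather than real DOM nodes; the helper name `rootElementOf` is likewise an assumption made for this example only.

```javascript
// Hypothetical node model: { name, parent }, where parent is another
// element node or null. This is not the DOM API, only an illustration.
function rootElementOf(node) {
  let current = node;
  // Climb until there is no further ancestor element.
  while (current.parent !== null) {
    current = current.parent;
  }
  return current; // the node itself if it has no ancestors
}

// A small in-document tree.
const html = { name: "html", parent: null };
const body = { name: "body", parent: html };
const para = { name: "p", parent: body };

// An orphaned subtree: its root element is the orphaned node itself.
const orphan = { name: "div", parent: null };
const child = { name: "span", parent: orphan };
```

Here `rootElementOf(para)` yields the `html` object, while `rootElementOf(child)` yields `orphan`, matching the orphaned-node case described in the text.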

When it is stated that some element or attribute is ignored, or treated as some other value, or handled as if it was something else, this refers only to the processing of the node after it is in the DOM. A user agent must not mutate the DOM in such situations.

When an XML name, such as an attribute or element name, is referred to in the form prefix:localName, as in xml:id or svg:rect, it refers to a name with the local name localName and the namespace given by the prefix, as defined by the following table:

    Prefix    Namespace
    xml       http://www.w3.org/XML/1998/namespace
    html      http://www.w3.org/1999/xhtml
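The lookup described by the table can be sketched as follows. The helper name `expandName` is hypothetical, and treating an unprefixed name as having no namespace (`null`) is an assumption of this sketch. Only the two prefixes listed in the table are included.

```javascript
// Prefix-to-namespace table, copied from the table above.
const NAMESPACES = {
  xml: "http://www.w3.org/XML/1998/namespace",
  html: "http://www.w3.org/1999/xhtml",
};

// Hypothetical helper: splits "prefix:localName" and looks up the prefix.
function expandName(qualifiedName) {
  const index = qualifiedName.indexOf(":");
  if (index === -1) {
    // Assumption: an unprefixed name is reported with no namespace.
    return { namespace: null, localName: qualifiedName };
  }
  const prefix = qualifiedName.slice(0, index);
  const localName = qualifiedName.slice(index + 1);
  if (!Object.prototype.hasOwnProperty.call(NAMESPACES, prefix)) {
    throw new Error("Prefix not listed in the table: " + prefix);
  }
  return { namespace: NAMESPACES[prefix], localName };
}
```

For example, `expandName("xml:id")` gives the local name `id` in the XML namespace.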

For simplicity, terms such as shown, displayed, and visible might sometimes be used when referring to the way a document is rendered to the user. These terms are not meant to imply a visual medium; they must be considered to apply to other media in equivalent ways.

This specification uses the term HTML documents to generally refer to any use of HTML, ranging from short static documents to long essays or reports with rich multimedia, as well as to fully-fledged interactive applications.

Various DOM interfaces are defined in this specification using pseudo-IDL. This looks like OMG IDL but isn't. For instance, method overloading is used, and types from the W3C DOM specifications are used without qualification. Language-specific bindings for these abstract interface definitions must be derived in a way consistent with W3C DOM specifications. Some interface-specific binding information for ECMAScript is included in this specification.

The construction "a Foo object", where Foo is actually an interface, is sometimes used instead of the more accurate "an object implementing the interface Foo".

The terms fire and dispatch are used interchangeably in the context of events, as in the DOM Events specifications. [DOM3EVENTS]

1.10. Miscellaneous

As the specification evolves, these conformance requirements will most likely be moved to more appropriate places.

When a UA needs to convert a string to a number, algorithms equivalent to those specified in ECMA262 sections 9.3.1 ("ToNumber Applied to the String Type") and 8.5 ("The Number type") should be used (possibly after suitably altering the algorithms to handle numbers of the range that the UA can support). [ECMA262]
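For orientation, ECMAScript exposes this string-to-number conversion through `Number()` called as a function. The sample inputs below are chosen for this sketch and are not taken from the specification.

```javascript
// Number(string) applies the ECMAScript ToNumber conversion to a string.
const examples = ["42", "  3.5  ", "", "0x1A", "12px"];
const converted = examples.map((s) => Number(s));

// Surrounding whitespace is ignored, the empty string converts to 0,
// hexadecimal literals are accepted, and trailing junk yields NaN:
// "42" -> 42, "  3.5  " -> 3.5, "" -> 0, "0x1A" -> 26, "12px" -> NaN
```

Note that this differs from `parseInt()` and `parseFloat()`, which accept a numeric prefix followed by other characters.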

The alt attribute on images must not be shown in a tooltip in visual browsers.

DOM mutation events must not fire for changes caused by the UA parsing the document. (Conceptually, the parser is not mutating the DOM, it is constructing it.) This includes the parsing of any content inserted using document.write() and document.writeln() calls. Other changes, including fragment insertions involving innerHTML and similar attributes, must fire mutation events. [DOM3EVENTS]

The default value of Content-Style-Type and the default value of the type attribute of the style element are both text/css.

The default value of Content-Script-Type and the default value of the type attribute of the script element are both the ECMAScript MIME type.

User agents must follow the rules given by XML Base to resolve relative URIs in HTML and XHTML fragments. [XMLBASE]

It is possible for xml:base attributes to be present even in HTML fragments, as such attributes can be added dynamically using script.

2. Semantics and structure of HTML elements

2.1. Introduction [TBW]

This section is non-normative.

An introduction to marking up a document.

2.2. The DOM

The Document Object Model (DOM) is a representation (a model) of the document and its content. [DOM3CORE] The DOM is not just an API; operations on the in-memory document are defined, in this specification, in terms of the DOM.

2.2.1. DOM feature strings

DOM3 Core defines mechanisms for checking for interface support, and for obtaining implementations of interfaces, using feature strings. [DOM3CORE]

A DOM application can use the hasFeature(feature, version) method of the DOMImplementation interface with parameter values "HTML" and "5.0" (respectively) to determine whether or not this module is supported by the implementation. In addition to the feature string "HTML", the feature string "XHTML" (with version string "5.0") can be used to check if the implementation supports XHTML. User agents should respond with a true value when the hasFeature method is queried with these values. Authors are cautioned, however, that UAs returning true might not be perfectly compliant, and that UAs returning false might well have support for features in this specification; in general, therefore, use of this method is discouraged.

The values "HTML" and "XHTML" (both with version "5.0") should also be supported in the context of the getFeature() and isSupported() methods, as defined by DOM3 Core.

The interfaces defined in this specification are not always supersets of the interfaces defined in DOM2 HTML; some features that were formerly deprecated, poorly supported, rarely used or considered unnecessary have been removed. Therefore it is not guaranteed that an implementation that supports "HTML" "5.0" also supports "HTML" "2.0".

2.2.2. Reflecting content attributes in DOM attributes

Some DOM attributes are defined to reflect a particular content attribute. This means that on getting, the DOM attribute returns the current value of the content attribute, and on setting, the DOM attribute changes the value of the content attribute to the given value.

If a reflecting DOM attribute is a DOMString attribute defined to contain a URI, then on getting, the DOM attribute returns the value of the content attribute, resolved to an absolute URI, and on setting, sets the content attribute to the specified literal value. If the content attribute is absent, the DOM attribute must return the default value, if the content attribute has one, or else the empty string.

If a reflecting DOM attribute is a DOMString attribute that is not defined to contain a URI, then the getting and setting is done in a transparent, case-sensitive manner, except if the content attribute is defined to only allow a specific set of values. In this latter case, the attribute's value is first converted to lowercase before being returned. If the content attribute is absent, the DOM attribute must return the default value, if the content attribute has one, or else the empty string.

If a reflecting DOM attribute is a boolean attribute, then the DOM attribute returns true if the attribute is set, and false if it is absent. On setting, the content attribute is removed if the DOM attribute is set to false, and is set to have the same value as its name if the DOM attribute is set to true.

If a reflecting DOM attribute is a numeric type (long) then the content attribute must be converted to a numeric type first (truncating any fractional part). If that fails, or if the attribute is absent, the default value should be returned instead, or 0 if there is no default value. On setting, the given value is converted to a string representing the number in base ten and then that string should be used as the new content attribute value.
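The boolean and numeric rules above can be sketched with a toy element whose content attributes live in a `Map`. Everything here is a hypothetical model rather than a DOM API: the helper names are invented for the example, and the use of `Number()` for parsing is an assumption, since this section does not name an exact parsing algorithm.

```javascript
// Hypothetical stand-in for an element: content attributes live in a Map.
function makeElement() {
  return { attrs: new Map() };
}

// Boolean reflection: true if the attribute is present, false if absent.
function getBoolean(el, name) {
  return el.attrs.has(name);
}
function setBoolean(el, name, value) {
  if (value) {
    el.attrs.set(name, name); // the value is the attribute's own name
  } else {
    el.attrs.delete(name); // setting to false removes the attribute
  }
}

// Numeric (long) reflection: parse, truncate, fall back to default or 0.
function getLong(el, name, defaultValue = 0) {
  if (!el.attrs.has(name)) return defaultValue;
  const parsed = Number(el.attrs.get(name)); // assumption: ToNumber parsing
  if (!Number.isFinite(parsed)) return defaultValue; // conversion failed
  return Math.trunc(parsed); // drop any fractional part
}
function setLong(el, name, value) {
  el.attrs.set(name, String(value)); // base-ten string
}
```

With this model, setting a boolean attribute named `disabled` to true stores the content attribute value `disabled`, and a content attribute of `3.9` reads back as `3` through `getLong`.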

2.2.3. Event listeners

In the ECMAScript DOM binding, the ECMAScript native Function type must implement the EventListener interface such that invoking the handleEvent() method of that interface on the object from another language binding invokes the function itself, with the event argument as its only argument. In the ECMAScript binding itself, however, the handleEvent() method of the interface is not directly accessible on Function objects. Such functions must be called in the global scope. If the function returns false, the event's preventDefault() method must then be invoked. Exception: for historical reasons, for the HTML mouseover event, the preventDefault() method must be called when the function returns true instead.
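The return-value rule, including the mouseover exception, might be sketched as below. The helper `invokeListener` and the stub event object are hypothetical, and comparing the return value with strict equality is an assumption of this sketch; the text only says "returns false" or "returns true".

```javascript
// Hypothetical helper showing the return-value rule for function listeners.
function invokeListener(listener, event) {
  // The function is called as a plain function with the event as its
  // only argument.
  const result = listener(event);
  // Normally a return value of false cancels the default action; for
  // the historical mouseover exception, true does instead.
  const cancelValue = event.type === "mouseover";
  if (result === cancelValue) {
    event.preventDefault();
  }
}

// Minimal stand-in for an event object (not the DOM Event interface).
function makeEvent(type) {
  return {
    type,
    defaultPrevented: false,
    preventDefault() {
      this.defaultPrevented = true;
    },
  };
}
```

A listener that returns nothing (`undefined`) leaves the default action in place for every event type under this sketch.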

In HTML, event handler attributes (such as onclick) are invoked as if they were functions implementing EventListener, with the argument called event. Such attributes are added as non-capture event listeners of the type given by their name (without the leading on prefix). Only attributes actually defined to exist by specifications implemented by the UA (e.g. HTML, Web Forms 2, Web Apps) are actually registered, however; for example if an author created an onfoo attribute, it would not be fired for foo events.

The scope chain for ECMAScript executed in HTML event handler attributes must link from the activation object for the handler, to its this parameter (the event target), to the element's form element if it is a form control, to the document, to the default view (the Window object).
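To make the lookup order concrete, here is a simplified model of identifier resolution along such a chain. The objects and their properties are invented stand-ins, and the sketch returns `undefined` for an unresolved name, whereas a real engine would raise a ReferenceError.

```javascript
// Resolve an identifier by searching each scope object in order.
function resolveIdentifier(name, scopeChain) {
  for (const scope of scopeChain) {
    if (name in scope) return scope[name];
  }
  return undefined; // simplification: real engines throw ReferenceError
}

// Hypothetical stand-ins for the objects named in the text.
const windowObject = { name: "window", status: "ready" };
const documentObject = { name: "document", title: "Example" };
const form = { name: "form", action: "/submit" };
const target = { name: "input", value: "hello" };
const activation = { event: { type: "click" } };

// Order: activation object, event target, form, document, default view.
const chain = [activation, target, form, documentObject, windowObject];
```

In this model a bare `value` inside the handler resolves on the event target, `action` on the form, and `title` on the document; a property such as `name` that exists on several objects is taken from the first one in the chain.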

This definition is compatible with how most browsers implemented DOM Level 0, but does not exactly describe IE's behaviour. See also ECMA262 Edition 3, sections 10.1.6 and 10.2.3, for more details on activation objects. [ECMA262]

2.2.4. Event firing

Certain operations and methods are defined as firing events on elements. For example, the click() method on the HTMLCommandElement is defined as firing a click event on the element. [DOM3EVENTS]

Firing a click event means that a click event in the http://www.w3.org/2001/xml-events namespace, which bubbles and is cancelable, and which uses the MouseEvent interface, must be dispatched at the given element. The event object must have its screenX, screenY, clientX, clientY, and button attributes set to 0, its ctrlKey, shiftKey, altKey, and metaKey attributes set according to the current state of the key input device, if any (false for any keys that are not available), its detail attribute set to 1, and its relatedTarget attribute set to null. The getModifierState() method on the object must return values appropriately describing the state of the key input device at the time the event is created.

Firing a change event means that a change event in the http://www.w3.org/2001/xml-events namespace, which bubbles but is not cancelable, and which uses the Event interface, must be dispatched at the given element. The event object must have its detail attribute set to 0.

Firing a contextmenu event means that a contextmenu event in the http://www.w3.org/2001/xml-events namespace, which bubbles and is cancelable, and which uses the Event interface, must be dispatched at the given element. The event object must have its detail attribute set to 0.

Firing a show event means that a show event in the http://www.w3.org/2001/xml-events namespace, which does not bubble but is cancelable, and which uses the Event interface, must be dispatched at the given element. The event object must have its detail attribute set to 0.

The default action of these events is to do nothing unless otherwise stated.
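The four definitions above differ only in a handful of flags, which the following sketch collects into a lookup table. The table and the `describeEvent` helper are illustrative names, not part of any API; the values are transcribed from the paragraphs above.

```javascript
// Namespace shared by all four events, as given in the text.
const EVENT_NAMESPACE = "http://www.w3.org/2001/xml-events";

// Summary of the four event definitions above as a lookup table.
const FIRED_EVENTS = {
  click: { bubbles: true, cancelable: true, interfaceName: "MouseEvent", detail: 1 },
  change: { bubbles: true, cancelable: false, interfaceName: "Event", detail: 0 },
  contextmenu: { bubbles: true, cancelable: true, interfaceName: "Event", detail: 0 },
  show: { bubbles: false, cancelable: true, interfaceName: "Event", detail: 0 },
};

// Hypothetical helper: returns the properties an event of this type
// would be fired with.
function describeEvent(type) {
  if (!Object.prototype.hasOwnProperty.call(FIRED_EVENTS, type)) {
    throw new Error("No firing rule listed for: " + type);
  }
  return { type, namespace: EVENT_NAMESPACE, ...FIRED_EVENTS[type] };
}
```

The additional MouseEvent fields required for click (coordinates, modifier keys, relatedTarget) are omitted here for brevity.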

If you dispatch a custom "click" event at an element that would normally have default actions, they should get triggered. We need to go through the entire spec and make sure that any default actions are defined in terms of any event of the right type on that element, not those that are dispatched in expected ways.

2.2.5. The textContent attribute

Some elements are defined in terms of their DOM textContent attribute. This is an attribute defined on the Node interface in DOM3 Core. [DOM3CORE]

Should textContent be defined differently for dir="" and <bdo>? Should we come up with an alternative to textContent that handles those and other things, like alt=""?

2.2.6. Common DOM interfaces [TBW]

Still need to define HTMLCollection.

    interface DOMTokenString {
      bool has(in DOMString token);
      void add(in DOMString token);
      void remove(in DOMString token);
    };

Need to define those members.

2.2.7. The document [TBW]

Every XML and HTML document in an HTML UA must be represented by a Document object. [DOM3CORE]

All Document objects (in user agents implementing this specification) must also implement the HTMLDocument interface, available using binding-specific methods.

Document objects must also implement the document-level interface of any other namespaces found in the document that the UA supports. For example, if an HTML implementation also supports SVG, then the Document object must implement HTMLDocument and SVGDocument.

    interface HTMLDocument : Document {
      attribute DOMString title;
      readonly attribute DOMString referrer;
      readonly attribute DOMString domain;
      readonly attribute DOMString URL;
      attribute HTMLElement body;
      readonly attribute HTMLCollection images;
      readonly attribute HTMLCollection applets;
      readonly attribute HTMLCollection links;
      readonly attribute HTMLCollection forms;
      readonly attribute HTMLCollection anchors;
      attribute DOMString cookie;
      void open();
      void close();
      void write(in DOMString text);
      void writeln(in DOMString text);
      NodeList getElementsByName(in DOMString elementName);
      NodeList getElementsByClassName(in DOMString className1 [, in DOMString className2, ...] );
    };

The Document objects of documents that are being rendered in a browsing context will also implement the DocumentWindow and DocumentStyle interfaces.

Need to define those members; the body attribute will be used to define the body element.

The getElementsByClassName() method takes one or more strings representing classes and must return all the elements in that document that are of all those classes. HTML, XHTML, SVG and MathML elements define which classes they are in by having an attribute in the per-element partition with the name class containing a space-separated list of classes to which the element belongs. Other specifications may also allow elements in their namespaces to be labelled as being in specific classes. UAs must not assume that all attributes of the name class for elements in any namespace work in this way, however, and must not assume that such attributes, when used as global attributes, label other elements as being in specific classes.

There is an open issue on whether we should use multiple arguments or just one argument that needs to be split on spaces.

The space character (U+0020) is not special in the method's arguments. In HTML, XHTML, SVG and MathML it is impossible for an element to belong to a class whose name contains a space character, however, and so typically the method would return no nodes if one of its arguments contained a space.

Similarly, if the method is passed an argument consisting of the empty string, it will typically not return any nodes since in HTML, XHTML, SVG and MathML it is impossible to assign an element to the "" class.

Given the following XHTML fragment:

    <div id="example">
      <p id="p1" class="aaa bbb"/>
      <p id="p2" class="aaa ccc"/>
      <p id="p3" class="bbb ccc"/>
    </div>

A call to document.getElementById('example').getElementsByClassName('aaa') would return a NodeList with the two paragraphs p1 and p2 in it. A call to getElementsByClassName('ccc', 'bbb') would only return one node, however, namely p3.

A call to getElementsByClassName('aaa bbb') would return no nodes; none of the elements above are in the "aaa bbb" class.
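The matching behaviour in these examples can be reproduced with a small model. The sketch below is not the DOM method itself: it works on plain objects with hypothetical `className` and `children` fields, takes the root as an explicit first argument, splits the class attribute on any whitespace (an assumption; the text says "space-separated"), and expects at least one class name.

```javascript
// Hypothetical element model: { id, className, children }.
function classesOf(element) {
  // Split the class attribute on whitespace, dropping empty pieces.
  return (element.className || "").split(/\s+/).filter((c) => c !== "");
}

// Returns descendants of `root` that belong to every requested class.
function getElementsByClassName(root, ...classNames) {
  const matches = [];
  const visit = (element) => {
    for (const child of element.children || []) {
      const classes = classesOf(child);
      if (classNames.every((name) => classes.includes(name))) {
        matches.push(child);
      }
      visit(child); // continue into the subtree, in document order
    }
  };
  visit(root);
  return matches;
}

// The XHTML fragment from the example above, as plain objects.
const example = {
  id: "example",
  children: [
    { id: "p1", className: "aaa bbb", children: [] },
    { id: "p2", className: "aaa ccc", children: [] },
    { id: "p3", className: "bbb ccc", children: [] },
  ],
};
```

Because arguments are compared against individual class tokens, an argument containing a space, or the empty string, can never match in this model, which mirrors the behaviour described above.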

We could also have a getElementBySelector() method, but it seems that it would be best to let the CSSWG define that.

2.2.8. The elements [TBW]

The nodes representing HTML elements in the DOM must implement, and expose to scripts, the interfaces listed for them in the relevant sections of this specification. This includes XHTML elements in XML documents, even when those documents are in another context (e.g. inside an XSLT transform).

The basic interface, from which all the HTML elements' interfaces inherit, and which is used by elements that have no additional requirements, is the HTMLElement interface.

Define HTMLElement here.

In HTML documents, for HTML elements, the DOM APIs must return tag names and attribute names in uppercase, regardless of the case with which they were created. This does not apply to XML documents; in XML documents, the DOM APIs must always return tag names and attribute names in the original case used to create those nodes.

2.3. HTML documents and document fragments

2.3.1. Semantics

Elements, attributes, and attribute values in HTML are defined (by this specification) to have certain meanings (semantics). For example, the ol element represents an ordered list, and the lang attribute represents the language of the content.

Authors must only use elements, attributes, and attribute values for their appropriate semantic purposes.

For example, the following document is non-conforming, despite being syntactically correct:

 <!DOCTYPE html>
 <html lang="en-GB">
  <head>
   <title> Demonstration </title>
  </head>
  <body>
   <table>
    <tr> <td> My favourite animal is the cat. </td> </tr>
    <tr> <td> —<a href="http://example.org/~ernest/"><cite>Ernest</cite></a>, in an essay from 1992 </td> </tr>
   </table>
  </body>
 </html>

...because the data placed in the cells is clearly not tabular data. A corrected version of this document might be:

 <!DOCTYPE html>
 <html lang="en-GB">
  <head>
   <title> Demonstration </title>
  </head>
  <body>
   <blockquote>
    <p> My favourite animal is the cat. </p>
   </blockquote>
   <p> —<a href="http://example.org/~ernest/"><cite>Ernest</cite></a>, in an essay from 1992 </p>
  </body>
 </html>

This next document fragment, intended to represent the heading of a corporate site, is similarly non-conforming because the second line is not intended to be a heading of a subsection, but merely a subheading or subtitle (a subordinate heading for the same section).

 <body> <h1>ABC Company</h1> <h2>Leading the way in widget design since 1432</h2> ... 

The header element should be used in these kinds of situations:

 <body> <header> <h1>ABC Company</h1> <h2>Leading the way in widget design since 1432</h2> </header> ... 

2.3.2. Structure

All the elements in this specification have a defined content model, which describes what nodes are allowed inside the elements, and thus what the structure of an HTML document or fragment must look like. Authors must only put elements inside an element if that element allows them to be there according to its content model.

For the purposes of determining if an element matches its content model or not, CDATA nodes in the DOM must be treated as text nodes, and character entity reference nodes must be treated as if they were expanded in place.

The whitespace characters U+0020 SPACE, U+000A LINE FEED, and U+000D CARRIAGE RETURN are always allowed between elements. User agents must always represent these characters between elements in the source markup as text nodes in the DOM. Empty text nodes and text nodes consisting of just sequences of those characters are considered inter-element whitespace and must be ignored when establishing whether an element matches its content model or not.
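The inter-element whitespace rule above can be sketched as a small predicate. This is an illustration under assumed names: isInterElementWhitespace and significantChildren are hypothetical, and child nodes are modelled as plain objects rather than real DOM nodes.

```javascript
// The three characters this section always allows between elements.
const INTER_ELEMENT_CHARS = new Set(['\u0020', '\u000A', '\u000D']);

// True for empty text and for text made up only of those characters.
function isInterElementWhitespace(text) {
  return [...text].every((ch) => INTER_ELEMENT_CHARS.has(ch));
}

// Drop ignorable text nodes before checking children against a content
// model. Children are modelled as { type: 'text', data } or
// { type: 'element', name }; this shape is an assumption.
function significantChildren(children) {
  return children.filter(
    (child) => child.type !== 'text' || !isInterElementWhitespace(child.data)
  );
}
```

Note that only the three listed characters are covered; a tab (U+0009), for instance, is not treated as inter-element whitespace by this sketch because the paragraph above does not list it.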

Authors must only use elements from the HTML namespace in the contexts where they are allowed, as defined for each element. For XML compound documents, these contexts could be inside elements from other namespaces, if those elements are defined as providing the relevant contexts.

The SVG specification defines the SVG foreignObject element as allowing foreign namespaces to be included, thus allowing compound documents to be created by inserting subdocument content under that element. This specification defines the XHTML html element as being allowed where subdocument fragments are allowed in a compound document. Together, these two definitions mean that placing an XHTML html element as a child of an SVG foreignObject element is conforming.

2.1.3. The DOM

The Document Object Model (DOM) is a representation (a model) of the document and its content. [DOM3CORE] The DOM is not just an API; operations on the in-memory document are defined, in this specification, in terms of the DOM.

HTML elements in the DOM, including XHTML elements in XML documents, even when those documents are in another context (e.g. inside an XSLT transform), must implement, and expose to scripts, the interfaces listed for them in the relevant sections of this specification.

The basic interface, from which all the HTML elements' interfaces inherit, and which is used by elements that have no additional requirements, is the HTMLElement interface (defined below).

To ease migration from HTML to XHTML, UAs must assign the http://www.w3.org/1999/xhtml namespace to elements that are parsed in documents labelled as text/html, at least for the purposes of the DOM and CSS.

In HTML documents, for HTML elements, the DOM APIs must return tag names and attribute names in uppercase, regardless of the case with which they were created. This does not apply to XML documents; in XML documents, the DOM APIs must always return tag names and attribute names in the original case used to create those nodes.

2.1.3.1. DOM feature strings

DOM3 Core defines mechanisms for checking for interface support, and for obtaining implementations of interfaces, using feature strings. [DOM3CORE]

A DOM application can use the hasFeature(feature, version) method of the DOMImplementation interface with parameter values "HTML" and "5.0" (respectively) to determine whether or not this module is supported by the implementation. In addition to the feature string "HTML", the feature string "XHTML" (with version string "5.0") can be used to check if the implementation supports XHTML. User agents should respond with a true value when the hasFeature method is queried with these values.

Authors are cautioned, however, that UAs returning true might not be perfectly compliant, and that UAs returning false might well have support for features in this specification; in general, therefore, use of this method is discouraged.

The values "HTML" and "XHTML" (both with version "5.0") should also be supported in the context of the getFeature() and isSupported() methods, as defined by DOM3 Core.

The interfaces defined in this specification are not always supersets of the interfaces defined in DOM2 HTML; some features that were formerly deprecated, poorly supported, rarely used or considered unnecessary have been removed. Therefore it is not guaranteed that an implementation that supports "HTML" "5.0" also supports "HTML" "2.0".

2.1.3.2. Common DOM interfaces

Still need to define HTMLCollection.

 interface DOMTokenString {
   bool has(in DOMString token);
   void add(in DOMString token);
   void remove(in DOMString token);
 }

Need to define those members.

2.1.3.3. The document

Every XML and HTML document in an HTML UA must be represented by a Document object. [DOM3CORE]

This object must also implement the document-level interface of any other namespaces found in the document that the UA supports. For example, if the implementation supports both HTML and SVG, then the Document object must also implement HTMLDocument and SVGDocument.

The Document object of documents that are being rendered in a browsing context must also implement the DocumentWindow interface.

 interface HTMLDocument : Document {
   attribute DOMString title;
   readonly attribute DOMString referrer;
   readonly attribute DOMString domain;
   readonly attribute DOMString URL;
   attribute HTMLElement body;
   readonly attribute HTMLCollection images;
   readonly attribute HTMLCollection applets;
   readonly attribute HTMLCollection links;
   readonly attribute HTMLCollection forms;
   readonly attribute HTMLCollection anchors;
   attribute DOMString cookie;
   void open();
   void close();
   void write(in DOMString text);
   void writeln(in DOMString text);
   NodeList getElementsByName(in DOMString elementName);
 };

Need to define those members.

2.1.3.4. Reflecting content attributes in DOM attributes

Some DOM attributes are defined to reflect a particular content attribute. This means that on getting, the DOM attribute returns the current value of the content attribute, and on setting, the DOM attribute changes the value of the content attribute to the given value.

If a reflecting DOM attribute is a DOMString attribute defined to contain a URI, then on getting, the DOM attribute returns the value of the content attribute, resolved to an absolute URI, and on setting, sets the content attribute to the specified literal value. If the content attribute is absent, the DOM attribute must return the default value, if the content attribute has one, or else the empty string.

If a reflecting DOM attribute is a DOMString attribute that is not defined to contain a URI, then the getting and setting is done in a transparent, case-sensitive manner, except if the content attribute is defined to only allow a specific set of values. In this latter case, the attribute's value is first converted to lowercase before being returned. If the content attribute is absent, the DOM attribute must return the default value, if the content attribute has one, or else the empty string.

If a reflecting DOM attribute is a boolean attribute, then the DOM attribute returns true if the attribute is set, and false if it is absent. On setting, the content attribute is removed if the DOM attribute is set to false, and is set to have the same value as its name if the DOM attribute is set to true.

If a reflecting DOM attribute is a numeric type (long) then the content attribute must be converted to a numeric type first (truncating any fractional part). If that fails, or if the attribute is absent, the default value should be returned instead, or 0 if there is no default value. On setting, the given value is converted to a string representing the number in base ten and then that string should be used as the new content attribute value.

2.1.3.5. The textContent attribute

Some elements are defined in terms of their DOM textContent attribute. This is an attribute defined on the Node interface in DOM3 Core. [DOM3CORE]

Should textContent be defined differently for dir="" and <bdo>? Should we come up with an alternative to textContent that handles those and other things, like alt=""?
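
The reflection rules given earlier in this section for boolean and numeric (long) DOM attributes can be sketched as follows. This is only an illustration: content attributes are modelled as a Map from name to string, the function names are hypothetical, and the use of parseFloat is an assumption, since the text does not say how the string is to be parsed.

```javascript
// Boolean reflection: present means true, absent means false.
function getBoolean(attrs, name) {
  return attrs.has(name);
}

// Setting true stores the attribute's own name as its value;
// setting false removes the content attribute.
function setBoolean(attrs, name, value) {
  if (value) attrs.set(name, name);
  else attrs.delete(name);
}

// Numeric (long) reflection: parse, truncate any fractional part, and fall
// back to the default (or 0) when the attribute is absent or unparseable.
// parseFloat is an assumed parser, not one named by the specification.
function getLong(attrs, name, defaultValue = 0) {
  if (!attrs.has(name)) return defaultValue;
  const parsed = Number.parseFloat(attrs.get(name));
  return Number.isNaN(parsed) ? defaultValue : Math.trunc(parsed);
}

// Setting stores the number as a base-ten string.
function setLong(attrs, name, value) {
  attrs.set(name, value.toString(10));
}
```

The attribute names used when trying this out (for example disabled or size) are placeholders; any boolean or numeric content attribute would behave the same way in this model.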

2.3.3. Kinds of elements

Each element in HTML falls into zero or more categories that group elements with similar characteristics together. This specification uses the following categories:

Some elements have unique requirements and do not fit into any particular category.

2.3.3.1. Block-level elements

Block-level elements are used for structural grouping of page content.

There are several kinds of block-level elements:

There are also elements that seem to be block-level but aren't, such as body , li , dt , dd , and td . These elements are allowed only in specific places, not simply anywhere that block-level elements are allowed.

Some block-level elements play multiple roles. For instance, the script element is allowed inside head elements and can also be used as inline-level content. Similarly, the ul, ol, dl, table, and blockquote elements play dual roles as both block-level and inline-level elements.

2.3.3.2. Inline-level content

Inline-level content consists of text and various elements to annotate the text, as well as some embedded content (such as images or sound clips).

Inline-level content comes in various types:

Strictly inline-level content
Text, embedded content, and elements that annotate the text without introducing structural grouping. For example: a , i , noscript . Elements used in contexts allowing only strictly inline-level content must not contain anything other than strictly inline-level content.
Structured inline-level elements
Block-level elements that can also be used as inline-level content. For example: ol , blockquote , table .

Unless an element's content model explicitly states that it must contain significant inline content, simply having no text nodes and no elements satisfies an element whose content model is some kind of inline content.

Some elements are defined to have as a content model significant inline content . This means that at least one descendant of the element must be significant text or embedded content .

Significant text , for the purposes of determining the presence of significant inline content , consists of any character other than those falling in the Unicode categories Zs, Zl, Zp, Cc, and Cf. [UNICODE]
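
The definition of significant text can be sketched with a Unicode-aware regular expression. This is a minimal illustration; hasSignificantText is a hypothetical name, and the check relies on whatever version of the Unicode data the JavaScript engine ships with.

```javascript
// Matches one character outside the Unicode general categories
// Zs, Zl, Zp, Cc and Cf (requires the "u" flag for \p{...} escapes).
const SIGNIFICANT_CHAR = /[^\p{Zs}\p{Zl}\p{Zp}\p{Cc}\p{Cf}]/u;

// True if the text contains at least one significant character.
function hasSignificantText(text) {
  return SIGNIFICANT_CHAR.test(text);
}
```

Under this sketch a string consisting only of a no-break space (U+00A0, category Zs) is not significant, while any string containing a letter or digit is.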

The following three paragraphs are non-conforming because their content model is not satisfied (they all count as empty).

 <p></p>
 <p><em>&#x00A0;</em></p>
 <p> <ol> <li></li> </ol> </p>
2.3.3.3. Determining if a particular element contains block-level elements or inline-level content

Some elements are defined to have content models that allow either block-level elements or inline-level content , but not both. For example, the aside and li elements.

To establish whether such an element is being used as a block-level container or as an inline-level container, for example in order to determine if a document conforms to these requirements, user agents must look at the element's child nodes. If any of the child nodes are not allowed in block-level contexts, then the element is being used for inline-level content . If all the child nodes are allowed in a block-level context, then the element is being used for block-level elements .

For instance, in the following (non-conforming) fragment, the li element is being used as an inline-level element container, because the style element is not allowed in a block-level context. (It doesn't matter, for the purposes of determining whether it is an inline-level or block-level context, that the style element is not allowed in inline-level contexts either.)

 <ol>
  <li>
   <p> Hello World </p>
   <style> /* This example is illegal. */ </style>
  </li>
 </ol>

In the following fragment, the aside element is being used as a block-level container, because even though all the elements it contains could be considered inline-level elements, there are no nodes that can only be considered inline-level.

 <aside>
  <ol> <li> ... </li> </ol>
  <ul> <li> ... </li> </ul>
 </aside>

On the other hand, in the following similar fragment, the aside element is an inline-level container, because the text ("Foo") can only be considered inline-level.

 <aside> <ol> <li> ... </li> </ol> Foo </aside> 
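
The decision illustrated by the three fragments above can be sketched as a single check over the child nodes. This is an illustration only: containerKind and the allowedInBlock flag are hypothetical, and inter-element whitespace is assumed to have been removed from the child list already.

```javascript
// Each child is modelled as { name, allowedInBlock }, where allowedInBlock
// says whether that kind of node may appear in a block-level context.
function containerKind(children) {
  const allBlockCapable = children.every((child) => child.allowedInBlock);
  return allBlockCapable ? 'block-level' : 'inline-level';
}

// Node kinds used in the fragments above.
const p = { name: 'p', allowedInBlock: true };
const style = { name: 'style', allowedInBlock: false };
const ol = { name: 'ol', allowedInBlock: true };
const ul = { name: 'ul', allowedInBlock: true };
const text = { name: '#text', allowedInBlock: false };

const liWithStyle = containerKind([p, style]);  // first fragment
const asideWithLists = containerKind([ol, ul]); // second fragment
const asideWithText = containerKind([ol, text]); // third fragment
```

With these inputs the li is classified as an inline-level container, the first aside as a block-level container, and the second aside as an inline-level container, matching the prose. Whether the content is then conforming is a separate check.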
2.3.3.4. Interactive elements

Certain elements in HTML can be activated, for instance a elements, button elements, or input elements when their type attribute is set to radio . Activation of those elements can happen in various (UA-defined) ways, for instance via the mouse or keyboard.

When activation is performed via some method other than clicking the pointing device, the default action of the event that triggers the activation must, instead of activating the element directly, be to fire a click event on the same element, with the mouse-specific fields (button, screenX, etc) set to zero, and the key fields set according to the current state of the key input device, if any (false for any keys that are not available). [DOM3EVENTS]

The default action of this click event, or of the real click event if the element was activated by clicking a pointing device, must be to dispatch yet another event, namely DOMActivate. It is the default action of that event that then performs the actual action.
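
The chain of default actions described above can be sketched with a toy dispatcher. Everything here is an assumption made for illustration: the dispatch, activate and makeElement helpers are hypothetical, the triggering event is arbitrarily named keydown, and the idea that cancelling an event stops the chain follows general DOM Events practice rather than anything stated in this section.

```javascript
// Dispatch a minimal event object to the element's listeners, then run the
// default action unless a listener called preventDefault().
function dispatch(element, type, defaultAction) {
  const event = {
    type,
    defaultPrevented: false,
    preventDefault() { this.defaultPrevented = true; },
  };
  element.log.push(type);
  for (const listener of element.listeners[type] || []) listener(event);
  if (!event.defaultPrevented) defaultAction();
}

// Pointer activation starts at the real click event. Any other method starts
// at a triggering event (named 'keydown' here purely as an assumption) whose
// default action is to fire click.
function activate(element, viaPointer) {
  const performAction = () => element.log.push('action');
  const fireDOMActivate = () => dispatch(element, 'DOMActivate', performAction);
  const fireClick = () => dispatch(element, 'click', fireDOMActivate);
  if (viaPointer) fireClick();
  else dispatch(element, 'keydown', fireClick);
}

// Elements are modelled as a listener table plus a log of what happened.
function makeElement(listeners = {}) {
  return { listeners, log: [] };
}
```

In this model a keyboard activation logs the triggering event, then click, then DOMActivate, then the action itself, whereas a pointer activation starts directly at click.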

For certain form controls, this process is complicated further by changes that must happen around the click event . [WF2]

Most interactive elements have content models that disallow nesting interactive elements.

Need to define how default actions actually work. For instance, if you click an element inside a link, the event is triggered on that element, but then we'd like a click to be sent on the link itself. So how does that happen? Does the link have a bubbling listener that triggers that second click event? What if there are multiple nested links, which one should we send that event to?

2.3.4. Global attributes [WIP]

User agents must support the following common attributes on all elements in the HTML namespace (including elements that are not defined to exist by this specification).

id

The element's unique identifier. The value must be unique in the document and must contain at least one character.

If the value is not the empty string, user agents must associate the element with the given value (exactly) for the purposes of ID matching (e.g. for selectors in CSS or for the getElementById() method in the DOM).

Identifiers are opaque strings. Particular meanings should not be derived from the value of the id attribute.

When an element has an ID set through multiple methods (for example, if it has both id and xml:id attributes simultaneously [XMLID] ), then the element has multiple identifiers. User agents must use all of an HTML element's identifiers (including those that are in error according to their relevant specification) for the purposes of ID matching.

title

Advisory information for the element, such as would be appropriate for a tooltip. On a link, this could be the title or a description of the target resource; on an image, it could be the caption or a description of the image; on a paragraph, it could be a footnote or commentary on the text; on a citation, it could be further information about the source; and so forth. The value is text.

If this attribute is omitted from an element, then it implies that the title attribute of the nearest ancestor with a title attribute set is also relevant to this element. Setting the attribute overrides this, explicitly stating that the advisory information of any ancestors is not relevant to this element. Setting the attribute to the empty string indicates that the element has no advisory information.

Some elements, such as link, style, abbr, and dfn, define their own title attributes instead of using the global title attribute.
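
The inheritance rule for the global title attribute can be sketched as a walk up the ancestor chain. This is a minimal illustration: advisoryInformation is a hypothetical name, elements are modelled as plain objects with an optional title string and a parent reference, and elements that define their own title attributes are not covered.

```javascript
// Returns the advisory text relevant to the element, or null if none applies.
function advisoryInformation(element) {
  for (let node = element; node; node = node.parent) {
    if (typeof node.title === 'string') {
      // An explicit empty string means "no advisory information" and also
      // stops any ancestor's title from applying.
      return node.title === '' ? null : node.title;
    }
  }
  return null;
}

// A small hypothetical tree: section > para > (emphasis, link).
const section = { title: 'Chapter notes', parent: null };
const para = { parent: section };
const emphasis = { title: '', parent: para };
const link = { title: 'Home page', parent: para };
```

Here para inherits the section's advisory text, emphasis explicitly has none, and link overrides the inherited value with its own.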

lang (HTML only) and xml:lang (XML only)

The primary language for the element's contents and for any of the element's attributes that contain text. The value must be a valid RFC 3066 language code, or the empty string. [RFC3066]

If this attribute is omitted from an element, then it implies that the language of this element is the same as the language of the parent element. Setting the attribute to the empty string indicates that the primary language is unknown.

The lang attribute only applies to HTML documents. Authors must not use the lang attribute in XML documents. Authors must instead use the xml:lang attribute, defined in XML. [XML]

To determine the language of a node, user agents must look at the nearest ancestor element (including the element itself if the node is an element) that has a lang or xml:lang attribute set. That specifies the language of the node.

If both the xml:lang attribute and the lang attribute are set, user agents must use the xml:lang attribute, and the lang attribute must be ignored for the purposes of determining the element's language.

If no explicit language is given for the root element , then language information from a higher-level protocol (such as HTTP), if any, must be used as the final fallback language. In the absence of any language information, the default value is unknown (the empty string).

User agents may use the element's language to determine proper processing or rendering (e.g. in the selection of appropriate fonts or pronunciations, or for dictionary selection).
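
The language lookup described above can be sketched as follows. This is an illustration under stated assumptions: languageOf is a hypothetical name, nodes are plain objects with an optional attrs object and a parent reference, and the protocol-level language is passed in as a plain string.

```javascript
// Returns the language of a node, or the empty string if it is unknown.
function languageOf(node, protocolLanguage = '') {
  for (let el = node; el; el = el.parent) {
    const attrs = el.attrs || {};
    // xml:lang takes precedence when both attributes are set.
    if ('xml:lang' in attrs) return attrs['xml:lang'];
    if ('lang' in attrs) return attrs.lang;
  }
  // No explicit language found: fall back to a higher-level protocol
  // (such as HTTP), or to the empty string, meaning unknown.
  return protocolLanguage;
}

// A small hypothetical tree.
const root = { attrs: { lang: 'en-GB' }, parent: null };
const paragraph = { attrs: {}, parent: root };
const both = { attrs: { lang: 'fr', 'xml:lang': 'de' }, parent: paragraph };
const unknown = { attrs: { lang: '' }, parent: root };
```

In this tree paragraph takes en-GB from the root, both resolves to de because xml:lang wins, and unknown resolves to the empty string because an explicit empty value marks the language as unknown rather than inheriting.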

dir

The element's text directionality. The attribute, if specified, must have either the literal value ltr or the literal value rtl .

If the attribute has the literal value ltr , the element's directionality is left-to-right. If the attribute has the literal value rtl , the element's directionality is right-to-left. If the attribute is omitted or has another value, then the directionality is unchanged.

The processing of this attribute depends on the presentation layer. For example, CSS 2.1 defines a mapping from this attribute to the CSS 'direction' and 'unicode-bidi' properties, and defines rendering in terms of those properties.

class

The element's classes. The value must be a list of zero or more words (consisting of one or more non-space characters) separated by one or more spaces.

User agents must assign all the given classes to the element, for the purposes of class matching (e.g. for selectors in CSS or for the getElementsByClassName() method in the DOM).

Unless defined by one of the URIs given in the profile attribute, classes are opaque strings. Particular meanings must not be derived from undefined values in the class attribute.

Authors should bear in mind that using the class attribute does not convey any additional meaning to the element (unless using classes defined by a profile ). There is no semantic difference between an element with a class attribute and one without . Authors that use classes that are not defined in a profile should make sure, therefore, that their documents make as much sense once all class attributes have been removed as they do with the attributes present.

contextmenu

The element's context menu. The value must be the ID of a menu element in the DOM. If the node that would be obtained by invoking the getElementById() method using the attribute's value as the only argument is null or not a menu element, then the element has no assigned context menu. Otherwise, the element's assigned context menu is the element so identified.
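
The lookup of an element's assigned context menu can be sketched like this. It is only an illustration: assignedContextMenu is a hypothetical name, a Map from ID to element stands in for getElementById(), and elements are modelled as plain objects with tagName and attrs.

```javascript
// Returns the assigned context menu element, or null if there is none.
function assignedContextMenu(element, elementsById) {
  const id = element.attrs.contextmenu;
  if (id === undefined) return null;
  // Stand-in for getElementById(): null when no element has that ID.
  const target = elementsById.get(id) || null;
  // Only a menu element can be an assigned context menu.
  return target !== null && target.tagName === 'menu' ? target : null;
}

// Hypothetical document contents.
const editMenu = { tagName: 'menu', attrs: { id: 'edit' } };
const sidebar = { tagName: 'div', attrs: { id: 'sidebar' } };
const byId = new Map([['edit', editMenu], ['sidebar', sidebar]]);
```

An element whose contextmenu attribute names edit gets editMenu; one that names sidebar (not a menu) or an ID that does not exist gets no context menu.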

Event handler attributes aren't handled yet.

The following DOM interface, common to elements in the HTML namespace, provides scripts with convenient access to the content attributes listed above:

 interface HTMLElement : Element {
   attribute DOMString id;
   attribute DOMString title;
   attribute DOMString lang;
   attribute DOMString dir;
   attribute DOMString className;
   NodeList getElementsByClassName(in DOMString className1 [, in DOMString className2, ...] );
 };

The id attribute must reflect the content id attribute.

The title attribute must reflect the content title attribute.

The lang attribute must reflect the content lang attribute.

The dir attribute must reflect the content dir attribute.

The className attribute must reflect the content class attribute.

should also introduce a DOMTokenString accessor for the class attribute

The getElementsByClassName() method must return the nodes that the HTMLDocument getElementsByClassName() method would return, excluding any elements that are not descendants of the HTMLElement on which the method was invoked.

2.3.5. The html element

Contexts in which this element may be used:
As the root element of a document.
Wherever a subdocument fragment is allowed in a compound document.
Content model:
A head element followed by a body element.
Element-specific attributes:
None.
DOM interface:
No difference from HTMLElement .

The html element represents the root of an HTML document.

2.4. Document metadata

Document metadata is represented by metadata elements in the document's head element.

2.4.1. The head element

Contexts in which this element may be used:
As the first element in an html element.
Content model:
In any order, exactly one title element, optionally one base element (HTML only), and zero or more other metadata elements (in particular, link , meta , style , and script ).
Element-specific attributes:
profile (optional)
DOM interface:
 interface  HTMLHeadElement  :   HTMLElement   { attribute DOMString   profile   ; }; 

The head element collects the document's metadata.

The profile attribute must, if specified, contain a list of zero or more URIs (or IRIs) representing definitions of classes, metadata names, and link relations. These URIs are opaque strings, like namespaces; user agents are not expected to determine any useful information from the resources that they reference.

Each time a class, metadata, or link relationship name that is not defined by this specification is found in a document, the UA must check whether any of the URIs in the profile attribute are known (to the UA) to define that name. The class, metadata, or link relationship shall then be interpreted using the semantics given by the first URI that is known to define the name. If the name is not defined by this specification and none of the specified URIs defines the name either, then the class, metadata, or link relationship is meaningless and the UA must not assign special meaning to that name.

If two profiles define the same name, then the semantic is given by the first URI specified in the profile attribute. There is no way to use the names from both profiles in one document.

User agents must ignore all the URIs given in the profile attribute that follow a URI that the UA does not recognise. (Otherwise, if a name is defined in two profiles, UAs would assign meanings to the document differently based on which profiles they supported.)
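
The profile lookup rules above can be sketched as a short loop. This is an illustration with several assumptions: resolveName is a hypothetical name, the URIs in the profile attribute are assumed to be separated by spaces, the example.org URIs are placeholders, and the caller is assumed to have already checked that the name is not defined by this specification.

```javascript
// knownProfiles maps each profile URI the UA recognises to the Set of
// names that profile defines. Returns the URI whose semantics apply to
// the name, or null if the name is meaningless.
function resolveName(name, profileAttr, knownProfiles) {
  const uris = profileAttr.split(' ').filter((uri) => uri.length > 0);
  for (const uri of uris) {
    // Every URI that follows an unrecognised URI is ignored.
    if (!knownProfiles.has(uri)) break;
    // The first recognised profile that defines the name wins.
    if (knownProfiles.get(uri).has(name)) return uri;
  }
  return null;
}

// Hypothetical profiles known to this UA.
const PROFILE_A = 'http://example.org/profile-a';
const PROFILE_B = 'http://example.org/profile-b';
const knownProfiles = new Map([
  [PROFILE_A, new Set(['vcard'])],
  [PROFILE_B, new Set(['vcard', 'event'])],
]);
```

Stopping at the first unrecognised URI is what keeps two UAs with different profile support from assigning different meanings to the same name: a UA that knows only the later profile assigns no meaning at all rather than a conflicting one.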

If a profile's definition introduces new definitions over time, documents that use multiple profiles can change defined meaning over time. So as to avoid this problem, authors are encouraged to avoid using multiple profiles.

The profile DOM attribute must reflect the profile content attribute on getting and setting.

2.4.2. The title element

Metadata element .

Contexts in which this element may be used:
In a head element containing no other title elements.
Content model:
Text (for details, see prose).
Element-specific attributes:
None.
DOM interface:
No difference from HTMLElement .

The title element represents the document's title or name. Authors should use titles that identify their documents even when they are used out of context, for example in a user's history or bookmarks, or in search results. The document's title is often different from its first header, since the first header does not have to stand alone when taken out of context.

Here are some examples of appropriate titles, contrasted with the top-level headers that might be used on those same pages.

 <title>Introduction to The Mating Rituals of Bees</title> ... <h1>Introduction</h1> <p>This companion guide to the highly successful <cite>Introduction to Medieval Bee-Keeping</cite> book is... 

The next page might be a part of the same site. Note how the title describes the subject matter unambiguously, while the first header assumes the reader knows what the context is and therefore won't wonder if the dances are Salsa or Waltz:

 <title>Da