The document.write() family of methods and
the innerHTML
family of DOM attributes enable script authors to dynamically insert
markup into the document.
bz argues that innerHTML should be called something else on XML documents and XML elements. Is the sanity worth the migration pain?
Because these APIs interact with the parser, their behaviour varies depending on whether they are used with HTML documents (and the HTML parser) or XHTML in XML documents (and the XML parser). The following table cross-references the various versions of these APIs.
| | document.write() | innerHTML |
|---|---|---|
| For documents that are HTML documents | document.write() in HTML | innerHTML in HTML |
| For documents that are XML documents | document.write() in XML | innerHTML in XML |
Regardless of the parsing mode, the document.writeln(...) method
must call the document.write() method with the same
argument(s), and then call the document.write() method with, as its
argument, a string consisting of a single line feed character (U+000A).
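The relationship between the two methods can be sketched as follows. This is only an illustration of the requirement above, not how any user agent implements it; the `writeln` helper and the `fakeDoc` recording object are hypothetical stand-ins introduced so the sketch runs without a browser.

```javascript
// Hypothetical helper: `doc` is any object exposing a write(...strings) method.
function writeln(doc, ...args) {
  doc.write(...args); // first call: the same argument(s)
  doc.write("\n");    // second call: a single line feed character (U+000A)
}

// Minimal stand-in for a document; it only records what it was asked to write.
const fakeDoc = {
  calls: [],
  write(...strings) {
    this.calls.push(strings.join(""));
  },
};

writeln(fakeDoc, "<p>", "hello", "</p>");
```

After the call, `fakeDoc.calls` holds two entries: the concatenated arguments, then the line feed.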
The open()
method comes in several variants with different numbers of arguments.
When called with two or fewer arguments, the method must act as follows:
Let type be the value of the first argument, if
there is one, or "text/html" otherwise.
Let replace be true if there is a second argument and it has the value "replace", and false otherwise.
If the document has an active parser
that isn't a script-created parser, and
the insertion point associated with that
parser's input stream is not undefined (that is,
it does point to somewhere in the input stream), then the
method does nothing. Abort these steps and return the
Document object on which the method was invoked.
This basically causes document.open() to be ignored when it's called
in an inline script found during the parsing of data sent over the
network, while still letting it have an effect when called
asynchronously or on a document that is itself being spoon-fed using
these APIs.
onbeforeunload, onunload
If the document has an active parser, then stop that parser, and throw away any pending content in the input stream.
What about if it doesn't, because it's a text/plain, Atom, PDF, XHTML, or image document, or something similar?
Remove all child nodes of the document.
Create a new HTML parser and associate it with
the document. This is a script-created
parser (meaning that it can be closed by the document.open() and
document.close() methods, and that the
tokeniser will wait for an explicit call to document.close()
before emitting an end-of-file token).
If type does not have the value
"text/html", then act as if the
tokeniser had emitted a pre element
start tag, then set the HTML parser's tokenisation stage's content model flag to PLAINTEXT.
If replace is false, then:
Document's
History object
Document
Document object, as well as the state of the document at
the start of these steps. (This allows the user to step backwards in
the session history to see the page before it was blown away by the
document.open() call.)
Finally, set the insertion point to point at just before the end of the input stream (which at this point will be empty).
Return the Document on which the method was invoked.
We shouldn't hard-code text/plain there. We
should do it some other way, e.g. hand off to the section on
content-sniffing and handling of incoming data streams, the part that
defines how this all works when stuff comes over the network.
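The argument handling for the two-or-fewer-argument form of open() can be sketched as below. This is a minimal model of only the type and replace steps, under the assumption that arguments are compared as plain strings; the function name `openArguments` and the `plaintext` field are hypothetical and do not appear in the DOM.

```javascript
// Hypothetical model of how open() interprets its first two arguments.
function openArguments(...args) {
  // type: the first argument if there is one, "text/html" otherwise.
  const type = args.length >= 1 ? String(args[0]) : "text/html";
  // replace: true only if a second argument exists and equals "replace".
  const replace = args.length >= 2 && args[1] === "replace";
  // Any type other than "text/html" puts the new parser into PLAINTEXT
  // after an implied pre start tag.
  const plaintext = type !== "text/html";
  return { type, replace, plaintext };
}

const defaults = openArguments();
const plain = openArguments("text/plain", "replace");
```

Here `defaults` describes an ordinary HTML document with a session history entry preserved, while `plain` describes a plain-text document that replaces the current entry.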
When called with three or more arguments, the open() method on the
HTMLDocument object must call the
open() method on the
Window interface of the object returned
by the defaultView attribute
of the DocumentView interface of the HTMLDocument object, with the same
arguments as the original call to the open() method, and return whatever that method
returned. If the defaultView
attribute of the DocumentView interface of the HTMLDocument object is null, then the
method must raise an INVALID_ACCESS_ERR exception.
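The dispatch on argument count might be modelled as follows. This is a sketch under stated assumptions: `openDocument` is a hypothetical name for the two-or-fewer-argument steps, the exception is approximated with a plain `Error` whose `name` is set, and the document and window are plain objects rather than real DOM objects.

```javascript
// Hypothetical model of open() choosing between the document and window forms.
function open(doc, ...args) {
  if (args.length >= 3) {
    if (doc.defaultView === null) {
      // Stand-in for raising an INVALID_ACCESS_ERR exception.
      const err = new Error("document has no default view");
      err.name = "INVALID_ACCESS_ERR";
      throw err;
    }
    // Forward the same arguments to the Window's open() and return its result.
    return doc.defaultView.open(...args);
  }
  // Two or fewer arguments: run the document-opening steps (hypothetical hook).
  return doc.openDocument(...args);
}

// Plain-object stand-ins for a window and two documents.
const fakeWindow = { open: (...a) => ({ openedWith: a }) };
const attachedDoc = { defaultView: fakeWindow, openDocument: () => "document" };
const detachedDoc = { defaultView: null, openDocument: () => "document" };

const viaWindow = open(attachedDoc, "page.html", "_blank", "width=200");
const viaDocument = open(attachedDoc, "text/html");
```

Calling `open(detachedDoc, "a", "b", "c")` would throw, since there is no view to forward the call to.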
The close()
method must do nothing if there is no script-created parser associated with the
document. If there is such a parser, then, when the method is called, the
user agent must insert an explicit "EOF"
character at the insertion point of the
parser's input stream.
In HTML, the document.write(...)
method must act as follows:
If the insertion point is undefined, the
open() method
must be called (with no arguments) on the document object. The insertion point will point at just before the end
of the (empty) input stream.
The string consisting of the concatenation of all the arguments to the method must be inserted into the input stream just before the insertion point.
If there is a script that will execute as soon as the parser resumes, then the method must now return without further processing of the input stream.
Otherwise, the tokeniser must process the characters that were
inserted, one at a time, processing resulting tokens as they are
emitted, and stopping when the tokeniser reaches the insertion point or
when the processing of the tokeniser is aborted by the tree construction
stage (this can happen if a script
start tag token is emitted by the tokeniser).
If the document.write() method was called
from script executing inline (i.e. executing because the parser parsed a
set of script tags), then this is a
reentrant invocation of the parser.
Finally, the method must return.
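The way written text lands relative to the insertion point can be illustrated with a toy input stream. This sketch models only the first two steps above (implicit open, then insertion just before the insertion point); it does not tokenise anything, and the `InputStream` class with its `chars` and `insertionPoint` fields is a hypothetical construction for illustration.

```javascript
// Hypothetical model of the parser's input stream and its insertion point.
class InputStream {
  constructor() {
    this.chars = "";
    this.insertionPoint = undefined; // undefined until open() defines it
  }

  open() {
    this.chars = "";
    this.insertionPoint = 0; // just before the end of the (empty) stream
  }

  write(...args) {
    // An undefined insertion point means open() is called first.
    if (this.insertionPoint === undefined) this.open();
    const text = args.join(""); // concatenation of all the arguments
    // Insert just before the insertion point; the point itself ends up
    // after the new text, so later writes follow earlier ones.
    this.chars =
      this.chars.slice(0, this.insertionPoint) +
      text +
      this.chars.slice(this.insertionPoint);
    this.insertionPoint += text.length;
  }
}

const stream = new InputStream();
stream.write("<p>", "one");
stream.write("</p>");
```

Because each write advances the insertion point past the text it inserted, successive calls appear in the stream in call order, ahead of any content that was already pending after the insertion point.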
In HTML, the innerHTML DOM attribute of all
HTMLElement and HTMLDocument nodes returns a serialisation
of the node's children using the HTML syntax.
On setting, it replaces the node's children with new nodes that result
from parsing the given value. The formal definitions follow.
On getting, the innerHTML DOM attribute must return the
result of running the HTML fragment serialisation
algorithm on the node.
On setting, if the node is a document, the innerHTML DOM
attribute must run the following algorithm:
If the document has an active parser, then stop that parser, and throw away any pending content in the input stream.
What about if it doesn't, because it's a text/plain, Atom, PDF, XHTML, or image document, or something similar?
Remove the child nodes of the Document whose innerHTML
attribute is being set.
Create a new HTML parser, in its initial state,
and associate it with the Document node.
Place into the input stream for the HTML parser just created the string being assigned
into the innerHTML attribute.
Start the parser and let it run until it has consumed all the
characters just inserted into the input stream. (The
Document node will have been populated with elements and a
load event will have
fired on its body
element.)
Otherwise, if the node is an element, then setting the innerHTML DOM
attribute must cause the following algorithm to run instead:
Invoke the HTML fragment parsing
algorithm, with the element whose innerHTML attribute is being set as the
context and the string being assigned into the innerHTML
attribute as the input. Let new
children be the result of this algorithm.
Remove the children of the element whose innerHTML
attribute is being set.
Let target document be the ownerDocument of the Element node whose
innerHTML attribute is being set.
Set the ownerDocument of all the nodes in new children to the target document.
Append all the new children nodes to the node
whose innerHTML attribute is being set,
preserving their order.
script elements inserted
using innerHTML do not execute when they are
inserted.
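The ordering of the element-setting steps can be sketched with plain objects. This is a simplified model, not a DOM implementation: nodes are flat objects with `children` and `ownerDocument` fields, and `parseFragment` is a hypothetical callback standing in for the HTML fragment parsing algorithm.

```javascript
// Hypothetical model of setting innerHTML on an element.
function setInnerHTML(element, value, parseFragment) {
  // Step 1: parse first, with the element as context. If parsing fails,
  // the existing children have not been touched yet.
  const newChildren = parseFragment(element, value);
  // Step 2: remove the element's current children.
  element.children.length = 0;
  // Step 3: the target document is the element's ownerDocument.
  const targetDocument = element.ownerDocument;
  for (const child of newChildren) {
    // Step 4: adopt each new node (this flat model has no descendants).
    child.ownerDocument = targetDocument;
    // Step 5: append in order.
    element.children.push(child);
  }
}

// Toy "parser" for the sketch: each comma-separated piece becomes a text node.
const toyParser = (context, input) =>
  input.split(",").map((data) => ({ type: "text", data, ownerDocument: null }));

const ownerDoc = { name: "target document" };
const div = { ownerDocument: ownerDoc, children: [{ type: "text", data: "old" }] };
setInnerHTML(div, "a,b", toyParser);
```

After the call, `div.children` contains the two new nodes in order, each owned by `ownerDoc`, and the old child is gone.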
In an XML context, the document.write() method
must raise an INVALID_ACCESS_ERR exception.
The innerHTML attribute, on the other hand, is
usable in an XML context.
In an XML context, the innerHTML DOM attribute on HTMLElements and HTMLDocuments, on getting, must return a
string in the form of an internal general parsed
entity that is XML namespace-well-formed, the string being an
isomorphic serialisation of all of that node's child nodes, in document
order. User agents may adjust prefixes and namespace declarations in the
serialisation (and indeed might be forced to do so in some cases to obtain
namespace-well-formed XML). [XML] [XMLNS]
If any of the following cases are found in the DOM being serialised, the
user agent must raise an INVALID_STATE_ERR exception:
DocumentType node that has an external subset public
identifier or an external subset system identifier that contains both a
U+0022 QUOTATION MARK ('"') and a U+0027 APOSTROPHE ("'").
Text node whose data contains characters that are not
matched by the XML Char production. [XML]
CDATASection node whose data contains the string "]]>".
Comment node whose data contains two adjacent U+002D
HYPHEN-MINUS (-) characters or ends with such a character.
ProcessingInstruction node whose target name is the
string "xml" (case insensitively).
ProcessingInstruction node whose target name contains a
U+003A COLON (":").
ProcessingInstruction node whose data contains the
string "?>".
These are the only ways to make a DOM unserialisable. The DOM
enforces all the other XML constraints; for example, trying to set an
attribute with a name that contains an equals sign (=) will raise an
INVALID_CHARACTER_ERR exception.
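The cases listed above can be expressed as a per-node check. The sketch below assumes a hypothetical plain-object node shape (`type`, `data`, `target`, `publicId`, `systemId`) rather than real DOM nodes, and covers only the cases listed in this section; the character class follows the Char production of XML 1.0.

```javascript
// Characters matched by the XML 1.0 Char production.
const XML_CHARS =
  /^[\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\u{10000}-\u{10FFFF}]*$/u;

// Returns a short reason string if the node cannot be serialised,
// or null if none of the listed cases applies. Node shape is hypothetical.
function unserialisableReason(node) {
  switch (node.type) {
    case "doctype": {
      const hasBothQuotes = (id) =>
        typeof id === "string" && id.includes('"') && id.includes("'");
      return hasBothQuotes(node.publicId) || hasBothQuotes(node.systemId)
        ? "identifier contains both quote characters"
        : null;
    }
    case "text":
      return XML_CHARS.test(node.data) ? null : "character outside XML Char";
    case "cdata":
      return node.data.includes("]]>") ? "CDATA contains ]]>" : null;
    case "comment":
      return node.data.includes("--") || node.data.endsWith("-")
        ? "comment contains -- or ends with -"
        : null;
    case "pi":
      if (node.target.toLowerCase() === "xml") return "target is xml";
      if (node.target.includes(":")) return "target contains a colon";
      return node.data.includes("?>") ? "data contains ?>" : null;
    default:
      return null;
  }
}
```

A getter built on this sketch would raise INVALID_STATE_ERR as soon as any node in the subtree yields a non-null reason.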
On setting, in an XML context, the innerHTML DOM attribute on HTMLElements and HTMLDocuments must run the following
algorithm:
The user agent must create a new XML parser.
If the innerHTML attribute is being set on an
element, the user agent must feed the parser just created
the string corresponding to the start tag of that element, declaring all
the namespace prefixes that are in scope on that element in the DOM, as
well as declaring the default namespace (if any) that is in scope on
that element in the DOM.
The user agent must feed the parser just created the
string being assigned into the innerHTML attribute.
If the innerHTML attribute is being set on an
element, the user agent must feed the parser the string
corresponding to the end tag of that element.
If the parser found a well-formedness error, the attribute's setter
must raise a SYNTAX_ERR exception and abort these steps.
The user agent must remove the child nodes of the node whose innerHTML
attribute is being set.
If the attribute is being set on a Document node, let
new children be the children of the document,
preserving their order. Otherwise, the attribute is being set on an
Element node; let new children be the
children of the document's root element, preserving their order.
If the attribute is being set on a Document node, let
target document be that Document node.
Otherwise, the attribute is being set on an Element node;
let target document be the ownerDocument of that Element.
Set the ownerDocument of all the nodes in new children to the target document.
Append all the new children nodes to the node
whose innerHTML attribute is being set,
preserving their order.
script elements inserted
using innerHTML do not execute when they are
inserted.
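The string that the XML parser receives in the steps above can be sketched as a simple concatenation. This is an illustration only: the element is described by a hypothetical plain object (`qualifiedName`, `defaultNamespace`, `prefixes`) rather than a DOM node, and `escapeAttribute` is a helper introduced here so that namespace URIs are safe inside attribute values.

```javascript
// Escape a namespace URI for use inside a double-quoted attribute value.
function escapeAttribute(value) {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical model: builds the text fed to the newly created XML parser.
// Pass no element when innerHTML is being set on a Document.
function buildParserInput(value, element) {
  if (!element) return value;
  let startTag = "<" + element.qualifiedName;
  // Declare the default namespace in scope, if any.
  if (element.defaultNamespace) {
    startTag += ` xmlns="${escapeAttribute(element.defaultNamespace)}"`;
  }
  // Declare every namespace prefix in scope on the element.
  for (const [prefix, uri] of Object.entries(element.prefixes)) {
    startTag += ` xmlns:${prefix}="${escapeAttribute(uri)}"`;
  }
  startTag += ">";
  return startTag + value + "</" + element.qualifiedName + ">";
}

const wrapped = buildParserInput("<svg:circle/>", {
  qualifiedName: "p",
  defaultNamespace: "http://www.w3.org/1999/xhtml",
  prefixes: { svg: "http://www.w3.org/2000/svg" },
});
```

Wrapping the value in the element's own start and end tags is what lets prefixes used in the assigned string resolve, and it is also why the new children are then taken from the parsed document's root element rather than from the document itself.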