4.10.18.3 Association of controls and forms

A form-associated element can have a relationship with a form element, which is called the element's form owner. If a form-associated element is not associated with a form element, its form owner is said to be null.

A form-associated element is, by default, associated with its nearest ancestor form element (as described below), but, if it is reassociateable, may have a form attribute specified to override this.

This feature allows authors to work around the lack of support for nested form elements.

If a reassociateable form-associated element has a form attribute specified, then that attribute's value must be the ID of a form element in the element's owner Document.

The rules in this section are complicated by the fact that although conforming documents will never contain nested form elements, it is quite possible (e.g. using a script that performs DOM manipulation) to generate documents that have such nested elements. They are also complicated by rules in the HTML parser that, for historical reasons, can result in a form-associated element being associated with a form element that is not its ancestor.

When a form-associated element is created, its form owner must be initialized to null (no owner).

When a form-associated element is to be associated with a form, its form owner must be set to that form.

When a form-associated element or one of its ancestors is inserted into a Document, then the user agent must reset the form owner of that form-associated element. The HTML parser overrides this requirement when inserting form controls.

When an element changes its parent node resulting in a form-associated element and its form owner (if any) no longer being in the same home subtree, then the user agent must reset the form owner of that form-associated element.

When a reassociateable form-associated element's form attribute is set, changed, or removed, then the user agent must reset the form owner of that element.

When a reassociateable form-associated element has a form attribute and the ID of any of the elements in the Document changes, then the user agent must reset the form owner of that form-associated element.

When a reassociateable form-associated element has a form attribute and an element with an ID is inserted into or removed from the Document, then the user agent must reset the form owner of that form-associated element.

When the user agent is to reset the form owner of a form-associated element, it must run the following steps:

  1. If the element's form owner is not null, and either the element is not reassociateable or its form content attribute is not present, and the element's form owner is its nearest form element ancestor after the change to the ancestor chain, then do nothing, and abort these steps.

  2. Let the element's form owner be null.

  3. If the element is reassociateable, has a form content attribute, and is itself in a Document, then run these substeps:

    1. If the first element in the Document to have an ID that is case-sensitively equal to the element's form content attribute's value is a form element, then associate the form-associated element with that form element.

    2. Abort the "reset the form owner" steps.

  4. Otherwise, if the form-associated element in question has an ancestor form element, then associate the form-associated element with the nearest such ancestor form element.

  5. Otherwise, the element is left unassociated.

In the following non-conforming snippet:

...
 <form id="a">
  <div id="b"></div>
 </form>
 <script>
  document.getElementById('b').innerHTML =
     '<table><tr><td><form id="c"><input id="d"></table>' +
     '<input id="e">';
 </script>
...

The form owner of "d" would be the inner nested form "c", while the form owner of "e" would be the outer form "a".

This happens as follows: First, the "e" node gets associated with "c" in the HTML parser. Then, the innerHTML algorithm moves the nodes from the temporary document to the "b" element. At this point, the nodes see their ancestor chain change, and thus all the "magic" associations done by the parser are reset to normal ancestor associations.

This example is a non-conforming document, though, as it is a violation of the content models to nest form elements.

element . form

Returns the element's form owner.

Returns null if there isn't one.

Reassociateable form-associated elements have a form IDL attribute, which, on getting, must return the element's form owner, or null if there isn't one.

4.10.19 Attributes common to form controls

4.10.19.1 Naming form controls: the name attribute

The name content attribute gives the name of the form control, as used in form submission and in the form element's elements object. If the attribute is specified, its value must not be the empty string.

Any non-empty value for name is allowed, but the names "_charset_" and "isindex" are special:

isindex

This value, if used as the name of a Text control that is the first control in a form that is submitted using the application/x-www-form-urlencoded mechanism, causes the submission to only include the value of this control, with no name.

_charset_

This value, if used as the name of a Hidden control with no value attribute, is automatically given a value during submission consisting of the submission character encoding.

The name IDL attribute must reflect the name content attribute.

4.10.19.2 Submitting element directionality: the dirname attribute

The dirname attribute on a form control element enables the submission of the directionality of the element, and gives the name of the field that contains this value during form submission. If such an attribute is specified, its value must not be the empty string.

In this example, a form contains a text field and a submission button:

<form action="addcomment.cgi" method=post>
 <p><label>Comment: <input type=text name="comment" dirname="comment.dir" required></label></p>
 <p><button name="mode" type=submit value="add">Post Comment</button></p>
</form>

When the user submits the form, the user agent includes three fields, one called "comment", one called "comment.dir", and one called "mode"; so if the user types "Hello", the submission body might be something like:

comment=Hello&comment.dir=ltr&mode=add

If the user manually switches to a right-to-left writing direction and enters "مرحبا", the submission body might be something like:

comment=%D9%85%D8%B1%D8%AD%D8%A8%D8%A7&comment.dir=rtl&mode=add
4.10.19.3 Limiting user input length: the maxlength attribute

A form control maxlength attribute, controlled by a dirty value flag, declares a limit on the number of characters a user can input.

If an element has its form control maxlength attribute specified, the attribute's value must be a valid non-negative integer. If the attribute is specified and applying the rules for parsing non-negative integers to its value results in a number, then that number is the element's maximum allowed value length. If the attribute is omitted or parsing its value results in an error, then there is no maximum allowed value length.

Constraint validation: If an element has a maximum allowed value length, its dirty value flag is true, its value was last changed by a user edit (as opposed to a change made by a script), and the code-unit length of the element's value is greater than the element's maximum allowed value length, then the element is suffering from being too long.

User agents may prevent the user from causing the element's value to be set to a value whose code-unit length is greater than the element's maximum allowed value length.

In the case of textarea elements, this is the value, not the raw value, so the textarea wrapping transformation is applied before the maximum allowed value length is checked.

4.10.19.4 Setting minimum input length requirements: the minlength attribute

A form control minlength attribute, controlled by a dirty value flag, declares a lower bound on the number of characters a user can input.

The minlength attribute does not imply the required attribute. If the form control has no minlength attribute, then the value can still be omitted; the minlength attribute only kicks in once the user has entered a value at all. If the empty string is not allowed, then the required attribute also needs to be set.

If an element has its form control minlength attribute specified, the attribute's value must be a valid non-negative integer. If the attribute is specified and applying the rules for parsing non-negative integers to its value results in a number, then that number is the element's minimum allowed value length. If the attribute is omitted or parsing its value results in an error, then there is no minimum allowed value length.

If an element has both a maximum allowed value length and a minimum allowed value length, the minimum allowed value length must be smaller than or equal to the maximum allowed value length.

Constraint validation: If an element has a minimum allowed value length, its value is not the empty string, and the code-unit length of the element's value is less than the element's minimum allowed value length, then the element is suffering from being too short.

In this example, there are four text fields. The first is required, and has to be at least 5 characters long. The other three are optional, but if the user fills one in, the user has to enter at least 10 characters.

<form action="/events/menu.cgi" method="post">
 <p><label>Name of Event: <input required minlength=5 maxlength=50 name=event></label></p>
 <p><label>Describe what you would like for breakfast, if anything:
    <textarea name="breakfast" minlength="10"></textarea></label></p>
 <p><label>Describe what you would like for lunch, if anything:
    <textarea name="lunch" minlength="10"></textarea></label></p>
 <p><label>Describe what you would like for dinner, if anything:
    <textarea name="dinner" minlength="10"></textarea></label></p>
 <p><input type=submit value="Submit Request"></p>
</form>
4.10.19.5 Enabling and disabling form controls: the disabled attribute

The disabled content attribute is a boolean attribute.

A form control is disabled if its disabled attribute is set, or if it is a descendant of a fieldset element whose disabled attribute is set and is not a descendant of that fieldset element's first legend element child, if any.

A form control that is disabled must prevent any click events that are queued on the user interaction task source from being dispatched on the element.

Constraint validation: If an element is disabled, it is barred from constraint validation.

The disabled IDL attribute must reflect the disabled content attribute.

4.10.19.6 Form submission

Attributes for form submission can be specified both on form elements and on submit buttons (elements that represent buttons that submit forms, e.g. an input element whose type attribute is in the Submit Button state).

The attributes for form submission that may be specified on form elements are action, enctype, method, novalidate, and target.

The corresponding attributes for form submission that may be specified on submit buttons are formaction, formenctype, formmethod, formnovalidate, and formtarget. When omitted, they default to the values given on the corresponding attributes on the form element.


The action and formaction content attributes, if specified, must have a value that is a valid non-empty URL potentially surrounded by spaces.

The action of an element is the value of the element's formaction attribute, if the element is a submit button and has such an attribute, or the value of its form owner's action attribute, if it has one, or else the empty string.


The method and formmethod content attributes are enumerated attributes with the following keywords and states:

The invalid value default for these attributes is the GET state. The missing value default for the method attribute is also the GET state. (There is no missing value default for the formmethod attribute.)

The method of an element is one of those states. If the element is a submit button and has a formmethod attribute, then the element's method is that attribute's state; otherwise, it is the form owner's method attribute's state.

Here the method attribute is used to explicitly specify the default value, "get", so that the search query is submitted in the URL:

<form method="get" action="/search.cgi">
 <p><label>Search terms: <input type=search name=q></label></p>
 <p><input type=submit></p>
</form>

On the other hand, here the method attribute is used to specify the value "post", so that the user's message is submitted in the HTTP request's body:

<form method="post" action="/post-message.cgi">
 <p><label>Message: <input type=text name=m></label></p>
 <p><input type=submit value="Submit message"></p>
</form>

In this example, a form is used with a dialog. The method attribute's "dialog" keyword is used to have the dialog automatically close when the form is submitted.

<dialog id="ship">
 <form method=dialog>
  <p>A ship has arrived in the harbour.</p>
  <button type=submit value="board">Board the ship</button>
  <button type=submit value="call">Call to the captain</button>
 </form>
</dialog>
<script>
 var ship = document.getElementById('ship');
 ship.showModal();
 ship.onclose = function (event) {
   if (ship.returnValue == 'board') {
     // ...
   } else {
     // ...
   }
 };
</script>

The enctype and formenctype content attributes are enumerated attributes with the following keywords and states:

The invalid value default for these attributes is the application/x-www-form-urlencoded state. The missing value default for the enctype attribute is also the application/x-www-form-urlencoded state. (There is no missing value default for the formenctype attribute.)

The enctype of an element is one of those three states. If the element is a submit button and has a formenctype attribute, then the element's enctype is that attribute's state; otherwise, it is the form owner's enctype attribute's state.


The target and formtarget content attributes, if specified, must have values that are valid browsing context names or keywords.

The target of an element is the value of the element's formtarget attribute, if the element is a submit button and has such an attribute; or the value of its form owner's target attribute, if it has such an attribute; or, if the Document contains a base element with a target attribute, then the value of the target attribute of the first such base element; or, if there is no such element, the empty string.


The novalidate and formnovalidate content attributes are boolean attributes. If present, they indicate that the form is not to be validated during submission.

The no-validate state of an element is true if the element is a submit button and the element's formnovalidate attribute is present, or if the element's form owner's novalidate attribute is present, and false otherwise.

This attribute is useful to include "save" buttons on forms that have validation constraints, to allow users to save their progress even though they haven't fully entered the data in the form. The following example shows a simple form that has two required fields. There are three buttons: one to submit the form, which requires both fields to be filled in; one to save the form so that the user can come back and fill it in later; and one to cancel the form altogether.

<form action="editor.cgi" method="post">
 <p><label>Name: <input required name=fn></label></p>
 <p><label>Essay: <textarea required name=essay></textarea></label></p>
 <p><input type=submit name=submit value="Submit essay"></p>
 <p><input type=submit formnovalidate name=save value="Save essay"></p>
 <p><input type=submit formnovalidate name=cancel value="Cancel"></p>
</form>

The action IDL attribute must reflect the content attribute of the same name, except that on getting, when the content attribute is missing or its value is the empty string, the document's address must be returned instead. The target IDL attribute must reflect the content attribute of the same name. The method and enctype IDL attributes must reflect the respective content attributes of the same name, limited to only known values. The encoding IDL attribute must reflect the enctype content attribute, limited to only known values. The noValidate IDL attribute must reflect the novalidate content attribute. The formAction IDL attribute must reflect the formaction content attribute, except that on getting, when the content attribute is missing or its value is the empty string, the document's address must be returned instead. The formEnctype IDL attribute must reflect the formenctype content attribute, limited to only known values. The formMethod IDL attribute must reflect the formmethod content attribute, limited to only known values. The formNoValidate IDL attribute must reflect the formnovalidate content attribute. The formTarget IDL attribute must reflect the formtarget content attribute.

4.10.19.6.1 Autofocusing a form control: the autofocus attribute

The autofocus content attribute allows the author to indicate that a control is to be focused as soon as the page is loaded or as soon as the dialog within which it finds itself is shown, allowing the user to just start typing without having to manually focus the main control.

The autofocus attribute is a boolean attribute.

An element's nearest ancestor autofocus scoping root element is the element itself if the element is a dialog element, or else is the element's nearest ancestor dialog element, if any, or else is the element's root element.

There must not be two elements with the same nearest ancestor autofocus scoping root element that both have the autofocus attribute specified.

When an element with the autofocus attribute specified is inserted into a document, user agents should run the following steps:

  1. Let target be the element's Document.

  2. If target has no browsing context, abort these steps.

  3. If target's browsing context has no top-level browsing context (e.g. it is a nested browsing context with no parent browsing context), abort these steps.

  4. If target's active sandboxing flag set has the sandboxed automatic features browsing context flag, abort these steps.

  5. If target's origin is not the same as the origin of the Document of the currently focused element in target's top-level browsing context, abort these steps.

  6. If target's origin is not the same as the origin of the active document of target's top-level browsing context, abort these steps.

  7. If the user agent has already reached the last step of this list of steps in response to an element being inserted into a Document whose top-level browsing context's active document is the same as target's top-level browsing context's active document, abort these steps.

  8. If the user has indicated (for example, by starting to type in a form control) that he does not wish focus to be changed, then optionally abort these steps.

  9. Queue a task that runs the focusing steps for the element. User agents may also change the scrolling position of the document, or perform some other action that brings the element to the user's attention. The task source for this task is the user interaction task source.

This handles the automatic focusing during document load. The show() and showModal() methods of dialog elements also processes the autofocus attribute.

Focusing the control does not imply that the user agent must focus the browser window if it has lost focus.

The autofocus IDL attribute must reflect the content attribute of the same name.

In the following snippet, the text control would be focused when the document was loaded.

<input maxlength="256" name="q" value="" autofocus>
<input type="submit" value="Search">
4.10.19.7 Input modalities: the inputmode attribute

The inputmode content attribute is an enumerated attribute that specifies what kind of input mechanism would be most helpful for users entering content into the form control.

User agents must recognise all the keywords and corresponding states given below, but need not support all of the corresponding states. If a keyword's state is not supported, the user agent must act as if the keyword instead mapped to the given state's fallback state, as defined below. This fallback behaviour is transitive.

For example, if a user agent with a QWERTY keyboard layout does not support text prediction and automatic capitalization, then it could treat the latin-prose keyword in the same way as the verbatim keyword, following the chain Latin Prose &srarr; Latin Text &srarr; Latin Verbatim.

The possible keywords and states for the attributes are listed in the following table. The keywords are listed in the first column. Each maps to the state given in the cell in the second column of that keyword's row, and that state has the fallback state given in the cell in the third column of that row.

Keyword State Fallback state Description
verbatim Latin Verbatim Default Alphanumeric Latin-script input of non-prose content, e.g. usernames, passwords, product codes.
latin Latin Text Latin Verbatim Latin-script input in the user's preferred language(s), with some typing aids enabled (e.g. text prediction). Intended for human-to-computer communications, e.g. free-form text search fields.
latin-name Latin Name Latin Text Latin-script input in the user's preferred language(s), with typing aids intended for entering human names enabled (e.g. text prediction from the user's contact list and automatic capitalisation at every word). Intended for situations such as customer name fields.
latin-prose Latin Prose Latin Text Latin-script input in the user's preferred language(s), with aggressive typing aids intended for human-to-human communications enabled (e.g. text prediction and automatic capitalisation at the start of sentences). Intended for situations such as e-mails and instant messaging.
full-width-latin Full-width Latin Latin Prose Latin-script input in the user's secondary language(s), using full-width characters, with aggressive typing aids intended for human-to-human communications enabled (e.g. text prediction and automatic capitalisation at the start of sentences). Intended for latin text embedded inside CJK text.
kana Kana Default Kana or romaji input, typically hiragana input, using full-width characters, with support for converting to kanji. Intended for Japanese text input.
kana-name Kana Name Kana Kana or romaji input, typically hiragana input, using full-width characters, with support for converting to kanji, and with typing aids intended for entering human names enabled (e.g. text prediction from the user's contact list). Intended for situations such as customer name fields.
katakana Katakana Kana Katakana input, using full-width characters, with support for converting to kanji. Intended for Japanese text input.
numeric Numeric Default Numeric input, including keys for the digits 0 to 9, the user's preferred thousands separator character, and the character for indicating negative numbers. Intended for numeric codes, e.g. credit card numbers. (For numbers, prefer "<input type=number>".)
tel Telephone Numeric Telephone number input, including keys for the digits 0 to 9, the "#" character, and the "*" character. In some locales, this can also include alphabetic mnemonic labels (e.g. in the US, the key labeled "2" is historically also labeled with the letters A, B, and C). Rarely necessary; use "<input type=tel>" instead.
email E-mail Default Text input in the user's locale, with keys for aiding in the input of e-mail addresses, such as that for the "@" character and the "." character. Rarely necessary; use "<input type=email>" instead.
url URL Default Text input in the user's locale, with keys for aiding in the input of Web addresses, such as that for the "/" and "." characters and for quick input of strings commonly found in domain names such as "www." or ".co.uk". Rarely necessary; use "<input type=url>" instead.

The last three keywords listed above are only provided for completeness, and are rarely necessary, as dedicated input controls exist for their usual use cases (as described in the table above).

User agents must all support the Default input mode state, which corresponds to the user agent's default input modality. This specification does not define how the user agent's default modality is to operate. The missing value default is the default input mode state.

User agents should use the input modality corresponding to the state of the inputmode attribute when exposing a user interface for editing the value of a form control to which the attribute applies. An input modality corresponding to a state is one designed to fit the description of the state in the table above. This value can change dynamically; user agents should update their interface as the attribute changes state, unless that would go against the user's wishes.

4.10.19.8 Autofill
4.10.19.8.1 Autofilling form controls: the autocomplete attribute

User agents sometimes have features for helping users fill forms in, for example prefilling the user's address based on earlier user input. The autocomplete content attribute can be used to hint to the user agent how to, or indeed whether to, provide such a feature.

The attribute, if present, must have a value that is a set of space-separated tokens consisting of either a single token that is an ASCII case-insensitive match for the string "off", or a single token that is an ASCII case-insensitive match for the string "on", or the following, in the order given below:

  1. Optionally, a token whose first eight characters are an ASCII case-insensitive match for the string "section-", meaning that the field belongs to the named group.

    For example, if there are two shipping addresses in the form, then they could be marked up as:

    <fieldset>
     <legend>Ship the blue gift to...</legend>
     <p> <label> Address:     <input name=ba autocomplete="section-blue shipping street-address"> </label>
     <p> <label> City:        <input name=bc autocomplete="section-blue shipping region"> </label>
     <p> <label> Postal Code: <input name=bp autocomplete="section-blue shipping postal-code"> </label>
    </fieldset>
    <fieldset>
     <legend>Ship the red gift to...</legend>
     <p> <label> Address:     <input name=ra autocomplete="section-red shipping street-address"> </label>
     <p> <label> City:        <input name=rc autocomplete="section-red shipping region"> </label>
     <p> <label> Postal Code: <input name=rp autocomplete="section-red shipping country"> </label>
    </fieldset>
  2. Optionally, a token that is an ASCII case-insensitive match for one of the following strings:

  3. Either of the following two options:

The "off" keyword indicates either that the control's input data is particularly sensitive (for example the activation code for a nuclear weapon); or that it is a value that will never be reused (for example a one-time-key for a bank login) and the user will therefore have to explicitly enter the data each time, instead of being able to rely on the UA to prefill the value for him; or that the document provides its own autocomplete mechanism and does not want the user agent to provide autocompletion values.

The "on" keyword indicates that the user agent is allowed to provide the user with autocompletion values, but does not provide any further information about what kind of data the user might be expected to enter. User agents would have to use heuristics to decide what autocompletion values to suggest.

The autofill fields names listed above indicate that the user agent is allowed to provide the user with autocompletion values, and specifies what kind of value is expected. The keywords relate to each other as described in the table below. Each field name listed on a row of this table corresponds to the meaning given in the cell for that row in the column labeled "Meaning". Some fields correspond to subparts of other fields; for example, a credit card expiry date can be expressed as one field giving both the month and year of expiry ("cc-exp"), or as two fields, one giving the month ("cc-exp-month") and one the year ("cc-exp-year"). In such cases, the names of the broader fields cover multiple rows, in which the narrower fields are defined.

Generally, authors are encouraged to use the broader fields rather than the narrower fields, as the narrower fields tend to expose Western biases. For example, while it is common in some Western cultures to have a given name and a family name, in that order (and thus often referred to as a first name and a surname), many cultures put the family name first and the given name second, and many others simply have one name (a mononym). Having a single field is therefore more flexible.

Some fields are only appropriate for certain form controls. An autofill field name is inappropriate for a control if the control does not belong to the group listed for that autofill field in the fifth column of the first row describing that autofill field in the table below. What controls fall into each group is described below the table.

Field name Meaning Canonical Format Canonical Format Example Control group
"name" Full name Free-form text, no newlines Sir Timothy John Berners-Lee, OM, KBE, FRS, FREng, FRSA Text
"honorific-prefix" Prefix or title (e.g. "Mr.", "Ms.", "Dr.", "Mlle") Free-form text, no newlines Sir Text
"given-name" Given name (in some Western cultures, also known as the first name) Free-form text, no newlines Timothy Text
"additional-name" Additional names (in some Western cultures, also known as middle names, forenames other than the first name) Free-form text, no newlines John Text
"family-name" Family name (in some Western cultures, also known as the last name or surname) Free-form text, no newlines Berners-Lee Text
"honorific-suffix" Suffix (e.g. "Jr.", "B.Sc.", "MBASW", "II") Free-form text, no newlines OM, KBE, FRS, FREng, FRSA Text
"nickname" Nickname, screen name, handle: a typically short name used instead of the full name Free-form text, no newlines Tim Text
"organization-title" Job title (e.g. "Software Engineer", "Senior Vice President", "Deputy Managing Director") Free-form text, no newlines Professor Text
"username" A username Free-form text, no newlines timbl Text
"new-password" A new password (e.g. when creating an account or changing a password) Free-form text, no newlines GUMFXbadyrS3 Password
"current-password" The current password for the account identified by the username field (e.g. when logging in) Free-form text, no newlines qwerty Password
"organization" Company name corresponding to the person, address, or contact information in the other fields associated with this field Free-form text, no newlines World Wide Web Consortium Text
"street-address" Street address (multiple lines, newlines preserved) Free-form text 32 Vassar Street
MIT Room 32-G524
Multiline
"address-line1" Street address (one line per field) Free-form text, no newlines 32 Vassar Street Text
"address-line2" Free-form text, no newlines MIT Room 32-G524 Text
"address-line3" Free-form text, no newlines Text
"locality" City, town, village, post town, or other locality within which the relevant street address is found Free-form text, no newlines Cambridge Text
"region" Province such as a state, county, or canton within which the locality is found Free-form text, no newlines MA Text
"country" Country code Valid ISO 3166-1-alpha-2 country code [ISO3166] US Text
"country-name" Country name Free-form text, no newlines; derived from country in some cases US Text
"postal-code" Postal code, post code, ZIP code, CEDEX code (if CEDEX, append "CEDEX", and the arrondissement if relevant, to the locality field) Free-form text, no newlines 02139 Text
"cc-name" Full name as given on the payment instrument Free-form text, no newlines Tim Berners-Lee Text
"cc-given-name" Given name as given on the payment instrument (in some Western cultures, also known as the first name) Free-form text, no newlines Tim Text
"cc-additional-name" Additional names given on the payment instrument (in some Western cultures, also known as middle names, forenames other than the first name) Free-form text, no newlines Text
"cc-family-name" Family name given on the payment instrument (in some Western cultures, also known as the last name or surname) Free-form text, no newlines Berners-Lee Text
"cc-number" Code identifying the payment instrument (e.g. the credit card number, bank account number) ASCII digits 4114360123456785 Text
"cc-exp" Expiration date of the payment instrument Valid month string 2014-12 Month
"cc-exp-month" Month component of the expiration date of the payment instrument Valid integer in the range 1..12 12 Numeric
"cc-exp-year" Year component of the expiration date of the payment instrument Valid integer greater than zero 2014 Numeric
"cc-csc" Security code for the payment instrument (also known as the card security code (CSC), card validation code (CVC), card verification value (CVV), signature panel code (SPC), credit card ID (CCID), etc) ASCII digits 419 Text
"cc-type" Type of payment instrument Free-form text, no newlines Visa Text
"language" Preferred language Valid BCP 47 language tag [BCP47] en Text
"bday" Birthday Valid date string 1955-06-08 Date
"bday-day" Day component of birthday Valid integer in the range 1..31 8 Numeric
"bday-month" Month component of birthday Valid integer in the range 1..12 6 Numeric
"bday-year" Year component of birthday Valid integer greater than zero 1955 Numeric
"sex" Gender identity (e.g. Female, Fa'afafine) Free-form text, no newlines Male Text
"url" Home page or other Web page corresponding to the company, person, address, or contact information in the other fields associated with this field Valid URL http://www.w3.org/People/Berners-Lee/ URL
"photo" Photograph, icon, or other image corresponding to the company, person, address, or contact information in the other fields associated with this field Valid URL http://www.w3.org/Press/Stock/Berners-Lee/2001-europaeum-eighth.jpg URL
"tel" Full telephone number, including country code ASCII digits and U+0020 SPACE characters, prefixed by a U+002B PLUS SIGN character (+) +1 617 253 5702 Tel
"tel-country-code" Country code component of the telephone number ASCII digits prefixed by a U+002B PLUS SIGN character (+) +1 Text
"tel-national" Telephone number without the county code component, with a country-internal prefix applied if applicable ASCII digits and U+0020 SPACE characters 617 253 5702 Text
"tel-area-code" Area code component of the telephone number, with a country-internal prefix applied if applicable ASCII digits 617 Text
"tel-local" Telephone number without the country code and area code components ASCII digits 2535702 Text
"tel-local-prefix" First part of the component of the telephone number that follows the area code, when that component is split into two components ASCII digits 253 Text
"tel-local-suffix" Second part of the component of the telephone number that follows the area code, when that component is split into two components ASCII digits 5702 Text
"tel-extension" Telephone number internal extension code ASCII digits 1000 Text
"email" E-mail address Valid e-mail address timbl@w3.org E-mail
"impp" URL representing an instant messaging protocol endpoint (for example, "aim:goim?screenname=example" or "xmpp:fred@example.net") Valid URL irc://example.org/timbl,isuser URL

The groups correspond to controls as follows:

Text
input elements whose type attribute is in the Text state
input elements whose type attribute is in the Search state
textarea elements
select elements
Multiline
textarea elements
select elements
Password
input elements whose type attribute is in the Text state
input elements whose type attribute is in the Search state
input elements whose type attribute is in the Password state
textarea elements
select elements
URL
input elements whose type attribute is in the Text state
input elements whose type attribute is in the Search state
input elements whose type attribute is in the URL state
textarea elements
select elements
E-mail
input elements whose type attribute is in the Text state
input elements whose type attribute is in the Search state
input elements whose type attribute is in the E-mail state
textarea elements
select elements
Tel
input elements whose type attribute is in the Text state
input elements whose type attribute is in the Search state
input elements whose type attribute is in the Telephone state
textarea elements
select elements
Numeric
input elements whose type attribute is in the Text state
input elements whose type attribute is in the Search state
input elements whose type attribute is in the Number state
textarea elements
select elements
Month
input elements whose type attribute is in the Text state
input elements whose type attribute is in the Search state
input elements whose type attribute is in the Month state
textarea elements
select elements
Date
input elements whose type attribute is in the Text state
input elements whose type attribute is in the Search state
input elements whose type attribute is in the Date state
textarea elements
select elements

If the autocomplete attribute is omitted, the default value corresponding to the state of the element's form owner's autocomplete attribute is used instead (either "on" or "off"). If there is no form owner, then the value "on" is used.

4.10.19.8.2 Processing model

Each input, select, and textarea element has an autofill hint set, an autofill scope, an autofill field name, and an IDL-exposed autofill value.

The autofill field name specifies the specific kind of data expected in the field, e.g. "street-address" or "cc-exp".

The autofill hint set identifies what address or contact information type the user agent is to look at, e.g. "shipping fax" or "billing".

The autofill scope identifies the group of fields that are to be filled with the information from the same source, and consists of the autofill hint set with, if applicable, the "section-*" prefix, e.g. "billing", "section-parent shipping", or "section-child home shipping".

These values are defined as the result of running the following algorithm:

  1. If the element has no autocomplete attribute, then jump to the step labeled default.

  2. Let tokens be the result of splitting the attribute's value on spaces.

  3. If tokens is empty, then jump to the step labeled default.

  4. Let index be the index of the last token in tokens.

  5. If the indexth token in tokens is not an ASCII case-insensitive match for one of the tokens given in the first column of the following table, or if the number of tokens in tokens is greater than the maximum number given in the cell in the second column of that token's row, then jump to the step labeled default. Otherwise, let field be the string given in the cell of the first column of the matching row, and let category be the value of the cell in the third column of that same row.

    Token Maximum number of tokens Category
    "off" 1 Off
    "on" 1 Automatic
    "name" 3 Normal
    "honorific-prefix" 3 Normal
    "given-name" 3 Normal
    "additional-name" 3 Normal
    "family-name" 3 Normal
    "honorific-suffix" 3 Normal
    "nickname" 3 Normal
    "organization-title" 3 Normal
    "username" 3 Normal
    "new-password" 3 Normal
    "current-password" 3 Normal
    "organization" 3 Normal
    "street-address" 3 Normal
    "address-line1" 3 Normal
    "address-line2" 3 Normal
    "address-line3" 3 Normal
    "locality" 3 Normal
    "region" 3 Normal
    "country" 3 Normal
    "country-name" 3 Normal
    "postal-code" 3 Normal
    "cc-name" 3 Normal
    "cc-given-name" 3 Normal
    "cc-additional-name" 3 Normal
    "cc-family-name" 3 Normal
    "cc-number" 3 Normal
    "cc-exp" 3 Normal
    "cc-exp-month" 3 Normal
    "cc-exp-year" 3 Normal
    "cc-csc" 3 Normal
    "cc-type" 3 Normal
    "language" 3 Normal
    "bday" 3 Normal
    "bday-day" 3 Normal
    "bday-month" 3 Normal
    "bday-year" 3 Normal
    "sex" 3 Normal
    "url" 3 Normal
    "photo" 3 Normal
    "tel" 4 Contact
    "tel-country-code" 4 Contact
    "tel-national" 4 Contact
    "tel-area-code" 4 Contact
    "tel-local" 4 Contact
    "tel-local-prefix" 4 Contact
    "tel-local-suffix" 4 Contact
    "tel-extension" 4 Contact
    "email" 4 Contact
    "impp" 4 Contact
  6. If category is Off, let the element's autofill field name be the string "off", let its autofill hint set be empty, and let its IDL-exposed autofill value be the string "off". Then, abort these steps.

  7. If category is Automatic, let the element's autofill field name be the string "on", let its autofill hint set be empty, and let its IDL-exposed autofill value be the string "on". Then, abort these steps.

  8. Let scope tokens be an empty list.

  9. Let hint tokens be an empty set.

  10. Let IDL value have the same value as field.

  11. If the indexth token in tokens is the first entry, then skip to the step labeled done.

  12. Decrement index by one.

  13. If category is Contact and the indexth token in tokens is an ASCII case-insensitive match for one of the strings in the following list, then run the substeps that follow:

    The substeps are:

    1. Let contact be the matching string from the list above.

    2. Insert contact at the start of scope tokens.

    3. Add contact to hint tokens.

    4. Let IDL value be the concatenation of contact, a U+0020 SPACE character, and the previous value of IDL value (which at this point will always be field).

    5. If the indexth entry in tokens is the first entry, then skip to the step labeled done.

    6. Decrement index by one.

  14. If the indexth token in tokens is an ASCII case-insensitive match for one of the strings in the following list, then run the substeps that follow:

    The substeps are:

    1. Let mode be the matching string from the list above.

    2. Insert mode at the start of scope tokens.

    3. Add mode to hint tokens.

    4. Let IDL value be the concatenation of mode, a U+0020 SPACE character, and the previous value of IDL value (which at this point will either be field or the concatenation of contact, a space, and field).

    5. If the indexth entry in tokens is the first entry, then skip to the step labeled done.

    6. Decrement index by one.

  15. If the indexth entry in tokens is not the first entry, then jump to the step labeled default.

  16. If the first eight characters of the indexth token in tokens are not an ASCII case-insensitive match for the string "section-", then jump to the step labeled default.

  17. Let section be the indexth token in tokens, converted to ASCII lowercase.

  18. Insert section at the start of scope tokens.

  19. Let IDL value be the concatenation of section, a U+0020 SPACE character, and the previous value of IDL value.

  20. Done: Let the element's autofill hint set be hint tokens.

  21. Let the element's autofill scope be scope tokens.

  22. Let the element's autofill field name be field.

  23. Let the element's IDL-exposed autofill value be IDL value.

  24. Abort these steps.

  25. Default: Let the element's IDL-exposed autofill value be the empty string, and its autofill hint set and autofill scope be empty.

  26. Let form be the element's form owner, if any, or null otherwise.

  27. If form is not null and form's autocomplete attribute is in the off state, then let the element's autofill field name be "off".

    Otherwise, let the element's autofill field name be "on".


For the purposes of autofill, a control's data depends on the kind of control:

An input element with its type attribute in the E-mail state and with the multiple attribute specified
The element's values.
Any other input element
A textarea element
The element's value.
A select element with its multiple attribute specified
The option elements in the select element's list of options that have their selectedness set to true.
Any other select element
The option element in the select element's list of options that has its selectedness set to true.

When an element's autofill field name is "off", the user agent should not remember the control's data, and should not offer past values to the user.

In addition, when an element's autofill field name is "off", values are reset when traversing the history.

Banks frequently do not want UAs to prefill login information:

<p><label>Account: <input type="text" name="ac" autocomplete="off"></label></p>
<p><label>PIN: <input type="password" name="pin" autocomplete="off"></label></p>

When an element's autofill field name is not "off", the user agent may store the control's data, and may offer previously stored values to the user.

For example, suppose a user visits a page with this control:

<select name="country">
 <option>Afghanistan
 <option>Albania
 <option>Algeria
 <option>Andorra
 <option>Angola
 <option>Antigua and Barbuda
 <option>Argentina
 <option>Armenia
 <!-- ... -->
 <option>Yemen
 <option>Zambia
 <option>Zimbabwe
</select>

This might render as follows:

A drop-down control with a long alphabetical list of countries.

Suppose that on the first visit to this page, the user selects "Zambia". On the second visit, the user agent could duplicate the entry for Zambia at the top of the list, so that the interface instead looks like this:

The same drop-down control with the alphabetical list of countries, but with Zambia as an entry at the top.

When the autofill field name is "on", the user agent should attempt to use heuristics to determine the most appropriate values to offer the user, e.g. based on the element's name value, the position of the element in the document's DOM, what other fields exist in the form, and so forth.

When the autofill field name is one of the names of the autofill fields described above, the user agent should provide suggestions that match the meaning of the field name as given in the table earlier in this section. The autofill hint set should be used to select amongst multiple possible suggestions.

For example, if a user once entered one address into fields that used the "shipping" keyword, and another address into fields that used the "billing" keyword, then in subsequent forms only the first address would be suggested for form controls whose autofill hint set contains the keyword "shipping". Both addresses might be suggested, however, for address-related form controls whose autofill hint set does not contain either keyword.

When the user agent autofills form controls, elements with the same form owner and the same autofill scope must use data relating to the same person, address, payment instrument, and contact details. When a user agent autofills "country" and "country-name" fields with the same form owner and autofill scope, and the user agent has a value for the country" field(s), then the "country-name" field(s) must be filled using a human-readable name for the same country. When a user agent fills in multiple fields at once, all fields with the same autofill field name, form owner and autofill scope must be filled with the same value.

Suppose a user agent knows of two phone numbers, +1 555 123 1234 and +1 555 666 7777. It would not be conforming for the user agent to fill a field with autocomplete="shipping tel-local-prefix" with the value "123" and another field in the same form with autocomplete="shipping tel-local-suffix" with the value "7777". The only valid prefilled values given the aforementioned information would be "123" and "1234", or "666" and "7777", respectively.

Similarly, if a form for some reason contained both a "cc-exp" field and a "cc-exp-month" field, and the user agent prefilled the form, then the month component of the former would have to match the latter.

The autocompletion mechanism must be implemented by the user agent acting as if the user had modified the control's data, and must be done at a time where the element is mutable (e.g. just after the element has been inserted into the document, or when the user agent stops parsing). User agents must only prefill controls using values that the user could have entered.

For example, if a select element only has option elements with values "Steve" and "Rebecca", "Jay", and "Bob", and has an autofill field name "given-name", but the user agent's only idea for what to prefill the field with is "Evan", then the user agent cannot prefill the field. It would not be conforming to somehow set the select element to the value "Evan", since the user could not have done so themselves.

A user agent prefilling a form control's value must not cause that control to suffer from a type mismatch, suffer from being too long, suffer from being too short, suffer from an underflow, suffer from an overflow, or suffer from a step mismatch. Except when autofilling for requestAutocomplete(), a user agent prefilling a form control's value must not cause that control to suffer from a pattern mismatch either. Where possible given the control's constraints, user agents must use the format given as canonical in the aforementioned table. Where it's not possible for the canonical format to be used, user agents should use heuristics to attempt to convert values so that they can be used.

For example, if the user agent knows that the user's middle name is "Ines", and attempts to prefill a form control that looks like this:

<input name=middle-initial maxlength=1 autocomplete="additional-name">

...then the user agent could convert "Ines" to "I" and prefill it that way.

A more elaborate example would be with month values. If the user agent knows that the user's birthday is the 27th of July 2012, then it might try to prefill all of the following controls with slightly different values, all driven from this information:

<input name=b type=month autocomplete="bday">
2012-07 The day is dropped since the Month state only accepts a month/year combination.
<select name=c autocomplete="bday">
 <option>Jan
 <option>Feb
 ...
 <option>Jul
 <option>Aug
 ...
</select>
July The user agent picks the month from the listed options, either by noticing there are twelve options and picking the 7th, or by recognising that one of the strings (three characters "Jul" followed by a newline and a space) is a close match for the name of the month (July) in one of the user agent's supported languages, or through some other similar mechanism.
<input name=a type=number min=1 max=12 autocomplete="bday-month">
7 User agent converts "July" to a month number in the range 1..12, like the field.
<input name=a type=number min=0 max=11 autocomplete="bday-month">
6 User agent converts "July" to a month number in the range 0..11, like the field.
<input name=a type=number min=1 max=11 autocomplete="bday-month">
User agent doesn't fill in the field, since it can't make a good guess as to what the form expects.

A user agent may allow the user to override an element's autofill field name, e.g. to change it from "off" to "on" to allow values to be remembered and prefilled despite the page author's objections, or to always "off", never remembering values. However, user agents should not allow users to trivially override the autofill field name from "off" to "on" or other values, as there are significant security implications for the user if all values are always remembered, regardless of the site's preferences.

The autocomplete IDL attribute, on getting, must return the element's IDL-exposed autofill value, and on setting, must reflect the content attribute of the same name.

4.10.19.8.3 User interface for bulk autofill

When the requestAutocomplete() method on a form element is invoked, the user agent must run the following steps:

  1. Let form be the element on which the method was invoked.

  2. If any of the following conditions are met, then queue a task to fail the autofill request on form with the reason "disabled", and abort these steps:

    • the algorithm is not allowed to show a popup

    • form is not in a Document

    • form's Document is not fully active

    • form's autocomplete attribute is in the off state

    • the user has disabled this feature for this form's Document's origin

    • the user agent does not support this form's fields

    • the form was obtained via unencrypted channels and the user agent does not support autofill in such situations

    • another instance of this algorithm is already being run for form

    User agents are encouraged to log the precise cause in a developer console, to aid debugging.

  3. Let pending autofills be an initially empty list of submittable elements, each annotated with a string known as the original autocomplete value.

  4. For each element that matches the following criteria, add the element to pending autofills, with the original autocomplete value annotation being the value of the element's autocomplete attribute:

  5. Return, but continue running these steps asynchronously.

  6. Provide an interface for the user to efficiently fill in some or all of the fields listed in pending autofills. Await the user's input.

  7. Queue a task to run the following steps:

    1. If any of the following conditions are met, then fail the autofill request on form with the reason "disabled", and abort these steps:

      Again, user agents are encouraged to log the precise cause in a developer console, to aid debugging.

    2. If the user canceled the operation, fail the autofill request on form with the reason "cancel", and abort these steps.

    3. For each element in pending autofills, run the following steps:

      1. Let candidate be the element in question.

      2. Let old autocomplete value be the original autocomplete value annotation associated with candidate in pending autofills.

      3. If all of the following conditions are met, then autofill candidate:

    4. Statically validate the constraints of form. If the result was negative, then fail the autofill request on form with the reason "invalid", and abort these steps.

      Statically validating the constraints of a form involves firing invalid events to each control that does not satisfy its contraints.

    5. Fire a simple event that bubbles named autocomplete at form.

When the user agent is required to fail the autofill request on a form element target with a reason reason, the user agent must dispatch an event that uses the AutocompleteErrorEvent interface, with the event type autocompleteerror, which bubbles, is not cancelable, has no default action, has its reason attribute set to reason, and which is trusted, at target.

The task source for the tasks mentioned in this section is the DOM manipulation task source.

4.10.19.8.4 The AutocompleteErrorEvent interface
enum AutocompleteErrorReason { "" /* empty string */, "cancel", "disabled", "invalid" };

[Constructor(DOMString type, optional AutocompleteErrorEventInit eventInitDict)]
interface AutocompleteErrorEvent : Event {
  readonly attribute AutocompleteErrorReason reason;
};

dictionary AutocompleteErrorEventInit : EventInit {
  AutocompleteErrorReason reason;
};
event . reason

For the autocompleteerror event, returns the general reason for the failure of the requestAutocomplete() method, from the list below.

The defined reason codes are:

"" (the empty string)

Reason is unknown.

"cancel"

The user canceled the autofill interface.

"disabled"

The autofill interface is disabled for this form.

There are many reasons why this might be the case; the precise reason is not given, to protect the user's privacy. Amongst these reasons are such factors as:

"invalid"

The fields have been prefilled, but at least one of the controls in the form does not satisfy its constraints.

The reason attribute must return the value it was initialized to. When the object is created, this attribute must be initialized to the empty string. It represents the context information for the event.

4.10.20 APIs for the text field selections

The input and textarea elements define the following members in their DOM interfaces for handling their selection:

  void select();
           attribute unsigned long selectionStart;
           attribute unsigned long selectionEnd;
           attribute DOMString selectionDirection;
  void setRangeText(DOMString replacement);
  void setRangeText(DOMString replacement, unsigned long start, unsigned long end, optional SelectionMode selectionMode = "preserve");
  void setSelectionRange(unsigned long start, unsigned long end, optional DOMString direction = "preserve");

The setRangeText method uses the following enumeration:

enum SelectionMode {
  "select",
  "start",
  "end",
  "preserve", // default
};

These methods and attributes expose and control the selection of input and textarea text fields.

element . select()

Selects everything in the text field.

element . selectionStart [ = value ]

Returns the offset to the start of the selection.

Can be set, to change the start of the selection.

element . selectionEnd [ = value ]

Returns the offset to the end of the selection.

Can be set, to change the end of the selection.

element . selectionDirection [ = value ]

Returns the current direction of the selection.

Can be set, to change the direction of the selection.

The possible values are "forward", "backward", and "none".

element . setSelectionRange(start, end [, direction] )

Changes the selection to cover the given substring in the given direction. If the direction is omitted, it will be reset to be the platform default (none or forward).

element . setRangeText(replacement [, start, end [, selectionMode ] ] )

Replaces a range of text with the new text. If the start and end arguments are not provided, the range is assumed to be the selection.

The final argument determines how the selection should be set after the text has been replaced. The possible values are:

"select"
Selects the newly inserted text.
"start"
Moves the selection to just before the inserted text.
"end"
Moves the selection to just after the selected text.
"preserve"
Attempts to preserve the selection. This is the default.

For input elements, calling these methods while they don't apply, and getting or setting these attributes while they don't apply, must throw an InvalidStateError exception. Otherwise, they must act as described below.

For input elements, these methods and attributes must operate on the element's value. For textarea elements, these methods and attributes must operate on the element's raw value.

Where possible, user interface features for changing the text selection in input and textarea elements must be implemented in terms of the DOM API described in this section, so that, e.g., all the same events fire.

The selections of input and textarea elements have a direction, which is either forward, backward, or none. This direction is set when the user manipulates the selection. The exact meaning of the selection direction depends on the platform.

On Windows, the direction indicates the position of the caret relative to the selection: a forward selection has the caret at the end of the selection and a backward selection has the caret at the start of the selection. Windows has no none direction. On Mac, the direction indicates which end of the selection is affected when the user adjusts the size of the selection using the arrow keys with the Shift modifier: the forward direction means the end of the selection is modified, and the backwards direction means the start of the selection is modified. The none direction is the default on Mac, it indicates that no particular direction has yet been selected. The user sets the direction implicitly when first adjusting the selection, based on which directional arrow key was used.

The select() method must cause the contents of the text field to be fully selected, with the selection direction being none, if the platform support selections with the direction none, or otherwise forward. The user agent must then queue a task to fire a simple event that bubbles named select at the element, using the user interaction task source as the task source.

The selectionStart attribute must, on getting, return the offset (in logical order) to the character that immediately follows the start of the selection. If there is no selection, then it must return the offset (in logical order) to the character that immediately follows the text entry cursor.

On setting, it must act as if the setSelectionRange() method had been called, with the new value as the first argument; the current value of the selectionEnd attribute as the second argument, unless the current value of the selectionEnd is less than the new value, in which case the second argument must also be the new value; and the current value of the selectionDirection as the third argument.

The selectionEnd attribute must, on getting, return the offset (in logical order) to the character that immediately follows the end of the selection. If there is no selection, then it must return the offset (in logical order) to the character that immediately follows the text entry cursor.

On setting, it must act as if the setSelectionRange() method had been called, with the current value of the selectionStart attribute as the first argument, the new value as the second argument, and the current value of the selectionDirection as the third argument.

The selectionDirection attribute must, on getting, return the string corresponding to the current selection direction: if the direction is forward, "forward"; if the direction is backward, "backward"; and otherwise, "none".

On setting, it must act as if the setSelectionRange() method had been called, with the current value of the selectionStart attribute as the first argument, the current value of the selectionEnd attribute as the second argument, and the new value as the third argument.

The setSelectionRange(start, end, direction) method must set the selection of the text field to the sequence of characters starting with the character at the startth position (in logical order) and ending with the character at the (end-1)th position. Arguments greater than the length of the value of the text field must be treated as pointing at the end of the text field. If end is less than or equal to start then the start of the selection and the end of the selection must both be placed immediately before the character with offset end. In UAs where there is no concept of an empty selection, this must set the cursor to be just before the character with offset end. The direction of the selection must be set to backward if direction is a case-sensitive match for the string "backward", forward if direction is a case-sensitive match for the string "forward" or if the platform does not support selections with the direction none, and none otherwise (including if the argument is omitted). The user agent must then queue a task to fire a simple event that bubbles named select at the element, using the user interaction task source as the task source.

The setRangeText(replacement, start, end, selectMode) method must run the following steps:

  1. If the method has only one argument, then let start and end have the values of the selectionStart attribute and the selectionEnd attribute respectively.

    Otherwise, let start, end have the values of the second and third arguments respectively.

  2. If start is greater than end, then throw an IndexSizeError exception and abort these steps.

  3. If start is greater than the length of the value of the text field, then set it to the length of the value of the text field.

  4. If end is greater than the length of the value of the text field, then set it to the length of the value of the text field.

  5. Let selection start be the current value of the selectionStart attribute.

  6. Let selection end be the current value of the selectionEnd attribute.

  7. If start is less than end, delete the sequence of characters starting with the character at the startth position (in logical order) and ending with the character at the (end-1)th position.

  8. Insert the value of the first argument into the text of the value of the text field, immediately before the startth character.

  9. Let new length be the length of the value of the first argument.

  10. Let new end be the sum of start and new length.

  11. Run the appropriate set of substeps from the following list:

    If the fourth argument's value is "select"

    Let selection start be start.

    Let selection end be new end.

    If the fourth argument's value is "start"

    Let selection start and selection end be start.

    If the fourth argument's value is "end"

    Let selection start and selection end be new end.

    If the fourth argument's value is "preserve" (the default)
    1. Let old length be end minus start.

    2. Let delta be new length minus old length.

    3. If selection start is greater than end, then increment it by delta. (If delta is negative, i.e. the new text is shorter than the old text, then this will decrease the value of selection start.)

      Otherwise: if selection start is greater than start, then set it to start. (This snaps the start of the selection to the start of the new text if it was in the middle of the text that it replaced.)

    4. If selection end is greater than end, then increment it by delta in the same way.

      Otherwise: if selection end is greater than start, then set it to new end. (This snaps the end of the selection to the end of the new text if it was in the middle of the text that it replaced.)

  12. Set the selection of the text field to the sequence of characters starting with the character at the selection startth position (in logical order) and ending with the character at the (selection end-1)th position. In UAs where there is no concept of an empty selection, this must set the cursor to be just before the character with offset end. The direction of the selection must be set to forward if the platform does not support selections with the direction none, and none otherwise.

  13. Queue a task to fire a simple event that bubbles named select at the element, using the user interaction task source as the task source.

All elements to which this API applies have either a selection or a text entry cursor position at all times (even for elements that are not being rendered). User agents should follow platform conventions to determine their initial state.

Characters with no visible rendering, such as U+200D ZERO WIDTH JOINER, still count as characters. Thus, for instance, the selection can include just an invisible character, and the text insertion cursor can be placed to one side or another of such a character.

To obtain the currently selected text, the following JavaScript suffices:

var selectionText = control.value.substring(control.selectionStart, control.selectionEnd);

...where control is the input or textarea element.

To add some text at the start of a text control, while maintaining the text selection, the three attributes must be preserved:

var oldStart = control.selectionStart;
var oldEnd = control.selectionEnd;
var oldDirection = control.selectionDirection;
var prefix = "http://";
control.value = prefix + control.value;
control.setSelectionRange(oldStart + prefix.length, oldEnd + prefix.length, oldDirection);

...where control is the input or textarea element.

4.10.21 Constraints

4.10.21.1 Definitions

A submittable element is a candidate for constraint validation except when a condition has barred the element from constraint validation. (For example, an element is barred from constraint validation if it is an object element.)

An element can have a custom validity error message defined. Initially, an element must have its custom validity error message set to the empty string. When its value is not the empty string, the element is suffering from a custom error. It can be set using the setCustomValidity() method. The user agent should use the custom validity error message when alerting the user to the problem with the control.

An element can be constrained in various ways. The following is the list of validity states that a form control can be in, making the control invalid for the purposes of constraint validation. (The definitions below are non-normative; other parts of this specification define more precisely when each state applies or does not.)

Suffering from being missing

When a control has no value but has a required attribute (input required, textarea required); or, in the case of an element in a radio button group, any of the other elements in the group has a required attribute; or, for select elements, none of the option elements have their selectedness set (select required).

Suffering from a type mismatch

When a control that allows arbitrary user input has a value that is not in the correct syntax (E-mail, URL).

Suffering from a pattern mismatch

When a control has a value that doesn't satisfy the pattern attribute.

Suffering from being too long

When a control has a value that is too long for the form control maxlength attribute (input maxlength, textarea maxlength).

Suffering from being too short

When a control has a value that is too short for the form control minlength attribute (input minlength, textarea minlength).

Suffering from an underflow

When a control has a value that is too low for the min attribute.

Suffering from an overflow

When a control has a value that is too high for the max attribute.

Suffering from a step mismatch

When a control has a value that doesn't fit the rules given by the step attribute.

Suffering from bad input

When a control has incomplete input and the user agent does not think the user ought to be able to submit the form in its current state.

Suffering from a custom error

When a control's custom validity error message (as set by the element's setCustomValidity() method) is not the empty string.

An element can still suffer from these states even when the element is disabled; thus these states can be represented in the DOM even if validating the form during submission wouldn't indicate a problem to the user.

An element satisfies its constraints if it is not suffering from any of the above validity states.

4.10.21.2 Constraint validation

When the user agent is required to statically validate the constraints of form element form, it must run the following steps, which return either a positive result (all the controls in the form are valid) or a negative result (there are invalid controls) along with a (possibly empty) list of elements that are invalid and for which no script has claimed responsibility:

  1. Let controls be a list of all the submittable elements whose form owner is form, in tree order.

  2. Let invalid controls be an initially empty list of elements.

  3. For each element field in controls, in tree order, run the following substeps:

    1. If field is not a candidate for constraint validation, then move on to the next element.

    2. Otherwise, if field satisfies its constraints, then move on to the next element.

    3. Otherwise, add field to invalid controls.

  4. If invalid controls is empty, then return a positive result and abort these steps.

  5. Let unhandled invalid controls be an initially empty list of elements.

  6. For each element field in invalid controls, if any, in tree order, run the following substeps:

    1. Fire a simple event named invalid that is cancelable at field.

    2. If the event was not canceled, then add field to unhandled invalid controls.

  7. Return a negative result with the list of elements in the unhandled invalid controls list.

If a user agent is to interactively validate the constraints of form element form, then the user agent must run the following steps:

  1. Statically validate the constraints of form, and let unhandled invalid controls be the list of elements returned if the result was negative.

  2. If the result was positive, then return that result and abort these steps.

  3. Report the problems with the constraints of at least one of the elements given in unhandled invalid controls to the user. User agents may focus one of those elements in the process, by running the focusing steps for that element, and may change the scrolling position of the document, or perform some other action that brings the element to the user's attention. User agents may report more than one constraint violation. User agents may coalesce related constraint violation reports if appropriate (e.g. if multiple radio buttons in a group are marked as required, only one error need be reported). If one of the controls is not being rendered (e.g. it has the hidden attribute set) then user agents may report a script error.

  4. Return a negative result.

4.10.21.3 The constraint validation API
element . willValidate

Returns true if the element will be validated when the form is submitted; false otherwise.

element . setCustomValidity(message)

Sets a custom error, so that the element would fail to validate. The given message is the message to be shown to the user when reporting the problem to the user.

If the argument is the empty string, clears the custom error.

element . validity . valueMissing

Returns true if the element has no value but is a required field; false otherwise.

element . validity . typeMismatch

Returns true if the element's value is not in the correct syntax; false otherwise.

element . validity . patternMismatch

Returns true if the element's value doesn't match the provided pattern; false otherwise.

element . validity . tooLong

Returns true if the element's value is longer than the provided maximum length; false otherwise.

element . validity . tooShort

Returns true if the element's value, if it is not the empty string, is shorter than the provided minimum length; false otherwise.

element . validity . rangeUnderflow

Returns true if the element's value is lower than the provided minimum; false otherwise.

element . validity . rangeOverflow

Returns true if the element's value is higher than the provided maximum; false otherwise.

element . validity . stepMismatch

Returns true if the element's value doesn't fit the rules given by the step attribute; false otherwise.

element . validity . badInput

Returns true if the user has provided input in the user interface that the user agent is unable to convert to a value; false otherwise.

element . validity . customError

Returns true if the element has a custom error; false otherwise.

element . validity . valid

Returns true if the element's value has no validity problems; false otherwise.

valid = element . checkValidity()

Returns true if the element's value has no validity problems; false otherwise. Fires an invalid event at the element in the latter case.

valid = element . reportValidity()

Returns true if the element's value has no validity problems; otherwise, returns false, fires an invalid event at the element, and (if the event isn't canceled) reports the problem to the user.

element . validationMessage

Returns the error message that would be shown to the user if the element was to be checked for validity.

The willValidate attribute must return true if an element is a candidate for constraint validation, and false otherwise (i.e. false if any conditions are barring it from constraint validation).

The setCustomValidity(message), when invoked, must set the custom validity error message to the value of the given message argument.

In the following example, a script checks the value of a form control each time it is edited, and whenever it is not a valid value, uses the setCustomValidity() method to set an appropriate message.

<label>Feeling: <input name=f type="text" oninput="check(this)"></label>
<script>
 function check(input) {
   if (input.value == "good" ||
       input.value == "fine" ||
       input.value == "tired") {
     input.setCustomValidity('"' + input.value + '" is not a feeling.');
   } else {
     // input is fine -- reset the error message
     input.setCustomValidity('');
   }
 }
</script>

The validity attribute must return a ValidityState object that represents the validity states of the element. This object is live, and the same object must be returned each time the element's validity attribute is retrieved.

interface ValidityState {
  readonly attribute boolean valueMissing;
  readonly attribute boolean typeMismatch;
  readonly attribute boolean patternMismatch;
  readonly attribute boolean tooLong;
  readonly attribute boolean tooShort;
  readonly attribute boolean rangeUnderflow;
  readonly attribute boolean rangeOverflow;
  readonly attribute boolean stepMismatch;
  readonly attribute boolean badInput;
  readonly attribute boolean customError;
  readonly attribute boolean valid;
};

A ValidityState object has the following attributes. On getting, they must return true if the corresponding condition given in the following list is true, and false otherwise.

valueMissing

The control is suffering from being missing.

typeMismatch

The control is suffering from a type mismatch.

patternMismatch

The control is suffering from a pattern mismatch.

tooLong

The control is suffering from being too long.

tooShort

The control is suffering from being too short.

rangeUnderflow

The control is suffering from an underflow.

rangeOverflow

The control is suffering from an overflow.

stepMismatch

The control is suffering from a step mismatch.

badInput

The control is suffering from bad input.

customError

The control is suffering from a custom error.

valid

None of the other conditions are true.

When the checkValidity() method is invoked, if the element is a candidate for constraint validation and does not satisfy its constraints, the user agent must fire a simple event named invalid that is cancelable (but in this case has no default action) at the element and return false. Otherwise, it must only return true without doing anything else.

When the reportValidity() method is invoked, if the element is a candidate for constraint validation and does not satisfy its constraints, the user agent must: fire a simple event named invalid that is cancelable at the element, and if that event is not canceled, report the problems with the constraints of that element to the user; then, return false. Otherwise, it must only return true without doing anything else. When reporting the problem with the constraints to the user, the user agent may run the focusing steps for that element, and may change the scrolling position of the document, or perform some other action that brings the element to the user's attention. User agents may report more than one constraint violation, if the element suffers from multiple problems at once. If the element is not being rendered, then the user agent may, instead of notifying the user, report a script error.

The validationMessage attribute must return the empty string if the element is not a candidate for constraint validation or if it is one but it satisfies its constraints; otherwise, it must return a suitably localized message that the user agent would show the user if this were the only form control with a validity constraint problem. If the user agent would not actually show a textual message in such a situation (e.g. it would show a graphical cue instead), then the attribute must return a suitably localized message that expresses (one or more of) the validity constraint(s) that the control does not satisfy. If the element is a candidate for constraint validation and is suffering from a custom error, then the custom validity error message should be present in the return value.

4.10.21.4 Security

Servers should not rely on client-side validation. Client-side validation can be intentionally bypassed by hostile users, and unintentionally bypassed by users of older user agents or automated tools that do not implement these features. The constraint validation features are only intended to improve the user experience, not to provide any kind of security mechanism.

4.10.22 Form submission

4.10.22.1 Introduction

This section is non-normative.

When a form is submitted, the data in the form is converted into the structure specified by the enctype, and then sent to the destination specified by the action using the given method.

For example, take the following form:

<form action="/find.cgi" method=get>
 <input type=text name=t>
 <input type=search name=q>
 <input type=submit>
</form>

If the user types in "cats" in the first field and "fur" in the second, and then hits the submit button, then the user agent will load /find.cgi?t=cats&q=fur.

On the other hand, consider this form:

<form action="/find.cgi" method=post enctype="multipart/form-data">
 <input type=text name=t>
 <input type=search name=q>
 <input type=submit>
</form>

Given the same user input, the result on submission is quite different: the user agent instead does an HTTP POST to the given URL, with as the entity body something like the following text:

------kYFrd4jNJEgCervE
Content-Disposition: form-data; name="t"

cats
------kYFrd4jNJEgCervE
Content-Disposition: form-data; name="q"

fur
------kYFrd4jNJEgCervE--
4.10.22.2 Implicit submission

A form element's default button is the first submit button in tree order whose form owner is that form element.

If the user agent supports letting the user submit a form implicitly (for example, on some platforms hitting the "enter" key while a text field is focused implicitly submits the form), then doing so for a form whose default button has a defined activation behavior must cause the user agent to run synthetic click activation steps on that default button.

Consequently, if the default button is disabled, the form is not submitted when such an implicit submission mechanism is used. (A button has no activation behavior when disabled.)

There are pages on the Web that are only usable if there is a way to implicitly submit forms, so user agents are strongly encouraged to support this.

If the form has no submit button, then the implicit submission mechanism must do nothing if the form has more than one field that blocks implicit submission, and must submit the form element from the form element itself otherwise.

For the purpose of the previous paragraph, an element is a field that blocks implicit submission of a form element if it is an input element whose form owner is that form element and whose type attribute is in one of the following states: Text, Search, URL, Telephone, E-mail, Password, Date and Time, Date, Month, Week, Time, Local Date and Time, Number

4.10.22.3 Form submission algorithm

When a form element form is submitted from an element submitter (typically a button), optionally with a submitted from submit() method flag set, the user agent must run the following steps:

  1. Let form document be the form's Document.

  2. If form document has no associated browsing context or its active sandboxing flag set has its sandboxed forms browsing context flag set, then abort these steps without doing anything.

  3. Let form browsing context be the browsing context of form document.

  4. If the submitted from submit() method flag is not set, and the submitter element's no-validate state is false, then interactively validate the constraints of form and examine the result: if the result is negative (the constraint validation concluded that there were invalid fields and probably informed the user of this) then fire a simple event named invalid at the form element and then abort these steps.

  5. If the submitted from submit() method flag is not set, then fire a simple event that bubbles and is cancelable named submit, at form. If the event's default action is prevented (i.e. if the event is canceled) then abort these steps. Otherwise, continue (effectively the default action is to perform the submission).

  6. Let form data set be the result of constructing the form data set for form in the context of submitter.

  7. Let action be the submitter element's action.

  8. If action is the empty string, let action be the document's address of the form document.

  9. Resolve the URL action, relative to the submitter element. If this fails, abort these steps.

  10. Let action be the resulting absolute URL.

  11. Let action components be the resulting parsed URL.

  12. Let scheme be the scheme of the resulting parsed URL.

  13. Let enctype be the submitter element's enctype.

  14. Let method be the submitter element's method.

  15. Let target be the submitter element's target.

  16. If the user indicated a specific browsing context to use when submitting the form, then let target browsing context be that browsing context. Otherwise, apply the rules for choosing a browsing context given a browsing context name using target as the name and form browsing context as the context in which the algorithm is executed, and let target browsing context be the resulting browsing context.

  17. If target browsing context was created in the previous step, or, alternatively, if the form document has not yet completely loaded and the submitted from submit() method flag is set, then let replace be true. Otherwise, let it be false.

  18. If the value of method is dialog then jump to the submit dialog steps.

    Otherwise, select the appropriate row in the table below based on the value of scheme as given by the first cell of each row. Then, select the appropriate cell on that row based on the value of method as given in the first cell of each column. Then, jump to the steps named in that cell and defined below the table.

    GET POST
    http Mutate action URL Submit as entity body
    https Mutate action URL Submit as entity body
    ftp Get action URL Get action URL
    javascript Get action URL Get action URL
    data Get action URL Post to data:
    mailto Mail with headers Mail as body

    If scheme is not one of those listed in this table, then the behavior is not defined by this specification. User agents should, in the absence of another specification defining this, act in a manner analogous to that defined in this specification for similar schemes.

    Each form element has a planned navigation, which is either null or a task; when the form is first created, its planned navigation must be set to null. In the behaviours described below, when the user agent is required to plan to navigate to a particular resource destination, it must run the following steps:

    1. If the form has a non-null planned navigation, remove it from its task queue.

    2. Let the form's planned navigation be a new task that consists of running the following steps:

      1. Let the form's planned navigation be null.

      2. Navigate target browsing context to the particular resource destination. If replace is true, then target browsing context must be navigated with replacement enabled.

      For the purposes of this task, target browsing context and replace are the variables that were set up when the overall form submission algorithm was run, with their values as they stood when this planned navigation was queued.

    3. Queue the task that is the form's new planned navigation.

      The task source for this task is the DOM manipulation task source.

    The behaviors are as follows:

    Mutate action URL

    Let query be the result of encoding the form data set using the application/x-www-form-urlencoded encoding algorithm, interpreted as a US-ASCII string.

    Set parsed action's query component to query.

    Let destination be a new URL formed by applying the URL serializer algorithm to parsed action.

    Plan to navigate to destination.

    Submit as entity body

    Let entity body be the result of encoding the form data set using the appropriate form encoding algorithm.

    Let MIME type be determined as follows:

    If enctype is application/x-www-form-urlencoded
    Let MIME type be "application/x-www-form-urlencoded".
    If enctype is multipart/form-data
    Let MIME type be the concatenation of the string "multipart/form-data;", a U+0020 SPACE character, the string "boundary=", and the multipart/form-data boundary string generated by the multipart/form-data encoding algorithm.
    If enctype is text/plain
    Let MIME type be "text/plain".

    Otherwise, plan to navigate to action using the HTTP method given by method and with entity body as the entity body, of type MIME type.

    Get action URL

    Plan to navigate to action.

    The form data set is discarded.

    Post to data:

    Let data be the result of encoding the form data set using the appropriate form encoding algorithm.

    If action contains the string "%%%%" (four U+0025 PERCENT SIGN characters), then percent encode all bytes in data that, if interpreted as US-ASCII, are not characters in the URL default encode set, and then, treating the result as a US-ASCII string, UTF-8 percent encode all the U+0025 PERCENT SIGN characters in the resulting string and replace the first occurrence of "%%%%" in action with the resulting doubly-escaped string. [URL]

    Otherwise, if action contains the string "%%" (two U+0025 PERCENT SIGN characters in a row, but not four), then UTF-8 percent encode all characters in data that, if interpreted as US-ASCII, are not characters in the URL default encode set, and then, treating the result as a US-ASCII string, replace the first occurrence of "%%" in action with the resulting escaped string. [URL]

    Plan to navigate to the potentially modified action (which will be a data: URL).

    Mail with headers

    Let headers be the resulting encoding the form data set using the application/x-www-form-urlencoded encoding algorithm, interpreted as a US-ASCII string.

    Replace occurrences of U+002B PLUS SIGN characters (+) in headers with the string "%20".

    Let destination consist of all the characters from the first character in action to the character immediately before the first U+003F QUESTION MARK character (?), if any, or the end of the string if there are none.

    Append a single U+003F QUESTION MARK character (?) to destination.

    Append headers to destination.

    Plan to navigate to destination.

    Mail as body

    Let body be the resulting of encoding the form data set using the appropriate form encoding algorithm and then percent encoding all the bytes in the resulting byte string that, when interpreted as US-ASCII, are not characters in the URL default encode set. [URL]

    Let destination have the same value as action.

    If destination does not contain a U+003F QUESTION MARK character (?), append a single U+003F QUESTION MARK character (?) to destination. Otherwise, append a single U+0026 AMPERSAND character (&).

    Append the string "body=" to destination.

    Append body, interpreted as a US-ASCII string, to destination.

    Plan to navigate to destination.

    Submit dialog

    Let subject be the nearest ancestor dialog element of form, if any.

    If there isn't one, or if it does not have an open attribute, do nothing. Otherwise, proceed as follows:

    If submitter is an input element whose type attribute is in the Image Button state, then let result be the string formed by concatenating the selected coordinate's x-component, expressed as a base-ten number using ASCII digits, a U+002C COMMA character (,), and the selected coordinate's y-component, expressed in the same way as the x-component.

    Otherwise, if submitter has a value, then let result be that value.

    Otherwise, there is no result.

    Then, close the dialog subject. If there is a result, let that be the return value.

    The appropriate form encoding algorithm is determined as follows:

    If enctype is application/x-www-form-urlencoded
    Use the application/x-www-form-urlencoded encoding algorithm.
    If enctype is multipart/form-data
    Use the multipart/form-data encoding algorithm.
    If enctype is text/plain
    Use the text/plain encoding algorithm.
4.10.22.4 Constructing the form data set

The algorithm to construct the form data set for a form form optionally in the context of a submitter submitter is as follows. If not specified otherwise, submitter is null.

  1. Let controls be a list of all the submittable elements whose form owner is form, in tree order.

  2. Let the form data set be a list of name-value-type tuples, initially empty.

  3. Loop: For each element field in controls, in tree order, run the following substeps:

    1. If any of the following conditions are met, then skip these substeps for this element:

      • The field element has a datalist element ancestor.
      • The field element is disabled.
      • The field element is a button but it is not submitter.
      • The field element is an input element whose type attribute is in the Checkbox state and whose checkedness is false.
      • The field element is an input element whose type attribute is in the Radio Button state and whose checkedness is false.
      • The field element is not an input element whose type attribute is in the Image Button state, and either the field element does not have a name attribute specified, or its name attribute's value is the empty string.
      • The field element is an object element that is not using a plugin.

      Otherwise, process field as follows:

    2. Let type be the value of the type IDL attribute of field.

    3. If the field element is an input element whose type attribute is in the Image Button state, then run these further nested substeps:

      1. If the field element has a name attribute specified and its value is not the empty string, let name be that value followed by a single U+002E FULL STOP character (.). Otherwise, let name be the empty string.

      2. Let namex be the string consisting of the concatenation of name and a single U+0078 LATIN SMALL LETTER X character (x).

      3. Let namey be the string consisting of the concatenation of name and a single U+0079 LATIN SMALL LETTER Y character (y).

      4. The field element is submitter, and before this algorithm was invoked the user indicated a coordinate. Let x be the x-component of the coordinate selected by the user, and let y be the y-component of the coordinate selected by the user.

      5. Append an entry to the form data set with the name namex, the value x, and the type type.

      6. Append an entry to the form data set with the name namey and the value y, and the type type.

      7. Skip the remaining substeps for this element: if there are any more elements in controls, return to the top of the loop step, otherwise, jump to the end step below.

    4. Let name be the value of the field element's name attribute.

    5. If the field element is a select element, then for each option element in the select element's list of options whose selectedness is true and that is not disabled, append an entry to the form data set with the name as the name, the value of the option element as the value, and type as the type.

    6. Otherwise, if the field element is an input element whose type attribute is in the Checkbox state or the Radio Button state, then run these further nested substeps:

      1. If the field element has a value attribute specified, then let value be the value of that attribute; otherwise, let value be the string "on".

      2. Append an entry to the form data set with name as the name, value as the value, and type as the type.

    7. Otherwise, if the field element is an input element whose type attribute is in the File Upload state, then for each file selected in the input element, append an entry to the form data set with the name as the name, the file (consisting of the name, the type, and the body) as the value, and type as the type. If there are no selected files, then append an entry to the form data set with the name as the name, the empty string as the value, and application/octet-stream as the type.

    8. Otherwise, if the field element is an object element: try to obtain a form submission value from the plugin, and if that is successful, append an entry to the form data set with name as the name, the returned form submission value as the value, and the string "object" as the type.

    9. Otherwise, append an entry to the form data set with name as the name, the value of the field element as the value, and type as the type.

    10. If the element has a dirname attribute, and that attribute's value is not the empty string, then run these substeps:

      1. Let dirname be the value of the element's dirname attribute.

      2. Let dir be the string "ltr" if the directionality of the element is 'ltr', and "rtl" otherwise (i.e. when the directionality of the element is 'rtl').

      3. Append an entry to the form data set with dirname as the name, dir as the value, and the string "direction" as the type.

      An element can only have a dirname attribute if it is a textarea element or an input element whose type attribute is in either the Text state or the Search state.

  4. End: For the name of each entry in the form data set, and for the value of each entry in the form data set whose type is not "file" or "textarea", replace every occurrence of a U+000D CARRIAGE RETURN (CR) character not followed by a U+000A LINE FEED (LF) character, and every occurrence of a U+000A LINE FEED (LF) character not preceded by a U+000D CARRIAGE RETURN (CR) character, by a two-character string consisting of a U+000D CARRIAGE RETURN U+000A LINE FEED (CRLF) character pair.

    In the case of the value of textarea elements, this newline normalization is already performed during the conversion of the control's raw value into the control's value (which also performs any necessary line wrapping). In the case of input elements type attributes in the File Upload state, the value is not normalized.

  5. Return the form data set.

4.10.22.5 Selecting a form submission encoding

If the user agent is to pick an encoding for a form, optionally with an allow non-ASCII-compatible encodings flag set, it must run the following substeps:

  1. Let input be the value of the form element's accept-charset attribute.

  2. Let candidate encoding labels be the result of splitting input on spaces.

  3. Let candidate encodings be an empty list of character encodings.

  4. For each token in candidate encoding labels in turn (in the order in which they were found in input), get an encoding for the token and, if this does not result in failure, append the encoding to candidate encodings.

  5. If the allow non-ASCII-compatible encodings flag is not set, remove any encodings that are not ASCII-compatible character encodings from candidate encodings.

  6. If candidate encodings is empty, return UTF-8 and abort these steps.

  7. Each character encoding in candidate encodings can represent a finite number of characters. (For example, UTF-8 can represent all 1.1 million or so Unicode code points, while Windows-1252 can only represent 256.)

    For each encoding in candidate encodings, determine how many of the characters in the names and values of the entries in the form data set the encoding can represent (without ignoring duplicates). Let max be the highest such count. (For UTF-8, max would equal the number of characters in the names and values of the entries in the form data set.)

    Return the first encoding in candidate encodings that can encode max characters in the names and values of the entries in the form data set.

4.10.22.6 URL-encoded form data

This form data set encoding is in many ways an aberrant monstrosity, the result of many years of implementation accidents and compromises leading to a set of requirements necessary for interoperability, but in no way representing good design practices. In particular, readers are cautioned to pay close attention to the twisted details involving repeated (and in some cases nested) conversions between character encodings and byte sequences.

The application/x-www-form-urlencoded encoding algorithm is as follows:

  1. Let result be the empty string.

  2. If the form element has an accept-charset attribute, let the selected character encoding be the result of picking an encoding for the form.

    Otherwise, if the form element has no accept-charset attribute, but the document's character encoding is an ASCII-compatible character encoding, then that is the selected character encoding.

    Otherwise, let the selected character encoding be UTF-8.

  3. Let charset be the name of the selected character encoding.

  4. For each entry in the form data set, perform these substeps:

    1. If the entry's name is "_charset_" and its type is "hidden", replace its value with charset.

    2. If the entry's type is "file", replace its value with the file's name only.

    3. For each character in the entry's name and value that cannot be expressed using the selected character encoding, replace the character by a string consisting of a U+0026 AMPERSAND character (&), a U+0023 NUMBER SIGN character (#), one or more ASCII digits representing the Unicode code point of the character in base ten, and finally a U+003B SEMICOLON character (;).

    4. Encode the entry's name and value using the encoder for the selected character encoding. The entry's name and value are now byte strings.

    5. For each byte in the entry's name and value, apply the appropriate subsubsteps from the following list:

      If the byte is 0x20 (U+0020 SPACE if interpreted as ASCII)
      Replace the byte with a single 0x2B byte (U+002B PLUS SIGN character (+) if interpreted as ASCII).
      If the byte is in the range 0x2A, 0x2D, 0x2E, 0x30 to 0x39, 0x41 to 0x5A, 0x5F, 0x61 to 0x7A

      Leave the byte as is.

      Otherwise
      1. Let s be a string consisting of a U+0025 PERCENT SIGN character (%) followed by uppercase ASCII hex digits representing the hexadecimal value of the byte in question (zero-padded if necessary).

      2. Encode the string s as US-ASCII, so that it is now a byte string.

      3. Replace the byte in question in the name or value being processed by the bytes in s, preserving their relative order.

    6. Interpret the entry's name and value as Unicode strings encoded in US-ASCII. (All of the bytes in the string will be in the range 0x00 to 0x7F; the high bit will be zero throughout.) The entry's name and value are now Unicode strings again.

    7. If the entry's name is "isindex", its type is "text", and this is the first entry in the form data set, then append the value to result and skip the rest of the substeps for this entry, moving on to the next entry, if any, or the next step in the overall algorithm otherwise.

    8. If this is not the first entry, append a single U+0026 AMPERSAND character (&) to result.

    9. Append the entry's name to result.

    10. Append a single U+003D EQUALS SIGN character (=) to result.

    11. Append the entry's value to result.

  5. Encode result as US-ASCII and return the resulting byte stream.

To decode application/x-www-form-urlencoded payloads, the following algorithm should be used. This algorithm uses as inputs the payload itself, payload, consisting of a Unicode string using only characters in the range U+0000 to U+007F; a default character encoding encoding; and optionally an isindex flag indicating that the payload is to be processed as if it had been generated for a form containing an isindex control. The output of this algorithm is a sorted list of name-value pairs. If the isindex flag is set and the first control really was an isindex control, then the first name-value pair will have as its name the empty string.

Which default character encoding to use can only be determined on a case-by-case basis, but generally the best character encoding to use as a default is the one that was used to encode the page on which the form used to create the payload was itself found. In the absence of a better default, UTF-8 is suggested.

The isindex flag is for legacy use only. Forms in conforming HTML documents will not generate payloads that need to be decoded with this flag set.

  1. Let strings be the result of strictly splitting the string payload on U+0026 AMPERSAND characters (&).

  2. If the isindex flag is set and the first string in strings does not contain a U+003D EQUALS SIGN character (=), insert a U+003D EQUALS SIGN character (=) at the start of the first string in strings.

  3. Let pairs be an empty list of name-value pairs.

  4. For each string string in strings, run these substeps:

    1. If string contains a U+003D EQUALS SIGN character (=), then let name be the substring of string from the start of string up to but excluding its first U+003D EQUALS SIGN character (=), and let value be the substring from the first character, if any, after the first U+003D EQUALS SIGN character (=) up to the end of string. If the first U+003D EQUALS SIGN character (=) is the first character, then name will be the empty string. If it is the last character, then value will be the empty string.

      Otherwise, string contains no U+003D EQUALS SIGN characters (=). Let name have the value of string and let value be the empty string.

    2. Replace any U+002B PLUS SIGN characters (+) in name and value with U+0020 SPACE characters.

    3. Replace any escape in name and value with the character represented by the escape. This replacement must not be recursive.

      An escape is a U+0025 PERCENT SIGN character (%) followed by two ASCII hex digits.

      The character represented by an escape is the Unicode character whose code point is equal to the value of the two characters after the U+0025 PERCENT SIGN character (%), interpreted as a hexadecimal number (in the range 0..255).

      So for instance the string "A%2BC" would become "A+C". Similarly, the string "100%25AA%21" becomes the string "100%AA!".

    4. Convert the name and value strings to their byte representation in ISO-8859-1 (i.e. convert the Unicode string to a byte string, mapping code points to byte values directly).

    5. Add a pair consisting of name and value to pairs.

  5. If any of the name-value pairs in pairs have a name component consisting of the string "_charset_" encoded in US-ASCII, and the value component of the first such pair, when decoded as US-ASCII, is the name of a supported character encoding, then let encoding be that character encoding (replacing the default passed to the algorithm).

  6. Convert the name and value components of each name-value pair in pairs to Unicode by interpreting the bytes according to the encoding encoding.

  7. Return pairs.

Parameters on the application/x-www-form-urlencoded MIME type are ignored. In particular, this MIME type does not support the charset parameter.

4.10.22.7 Multipart form data

The multipart/form-data encoding algorithm is as follows:

  1. Let result be the empty string.

  2. If the algorithm was invoked with an explicit character encoding, let the selected character encoding be that encoding. (This algorithm is used by other specifications, which provide an explicit character encoding to avoid the dependency on the form element described in the next paragraph.)

    Otherwise, if the form element has an accept-charset attribute, let the selected character encoding be the result of picking an encoding for the form.

    Otherwise, if the form element has no accept-charset attribute, but the document's character encoding is an ASCII-compatible character encoding, then that is the selected character encoding.

    Otherwise, let the selected character encoding be UTF-8.

  3. Let charset be the name of the selected character encoding.

  4. For each entry in the form data set, perform these substeps:

    1. If the entry's name is "_charset_" and its type is "hidden", replace its value with charset.

    2. For each character in the entry's name and value that cannot be expressed using the selected character encoding, replace the character by a string consisting of a U+0026 AMPERSAND character (&), a U+0023 NUMBER SIGN character (#), one or more ASCII digits representing the Unicode code point of the character in base ten, and finally a U+003B SEMICOLON character (;).

  5. Encode the (now mutated) form data set using the rules described by RFC 2388, Returning Values from Forms: multipart/form-data, and return the resulting byte stream. [RFC2388]

    Each entry in the form data set is a field, the name of the entry is the field name and the value of the entry is the field value.

    The order of parts must be the same as the order of fields in the form data set. Multiple entries with the same name must be treated as distinct fields.

    In particular, this means that multiple files submitted as part of a single <input type=file multiple> element will result in each file having its own field; the "sets of files" feature ("multipart/mixed") of RFC 2388 is not used.

    The parts of the generated multipart/form-data resource that correspond to non-file fields must not have a Content-Type header specified. Their names and values must be encoded using the character encoding selected above (field names in particular do not get converted to a 7-bit safe encoding as suggested in RFC 2388).

    File names included in the generated multipart/form-data resource (as part of file fields) must use the character encoding selected above, though the precise name may be approximated if necessary (e.g. newlines could be removed from file names, quotes could be changed to "%22", and characters not expressible in the selected character encoding could be replaced by other characters). User agents must not use the RFC 2231 encoding suggested by RFC 2388.

    The boundary used by the user agent in generating the return value of this algorithm is the multipart/form-data boundary string. (This value is used to generate the MIME type of the form submission payload generated by this algorithm.)

For details on how to interpret multipart/form-data payloads, see RFC 2388. [RFC2388]

4.10.22.8 Plain text form data

The text/plain encoding algorithm is as follows:

  1. Let result be the empty string.

  2. If the form element has an accept-charset attribute, let the selected character encoding be the result of picking an encoding for the form, with the allow non-ASCII-compatible encodings flag unset.

    Otherwise, if the form element has no accept-charset attribute, then that is the selected character encoding.

  3. Let charset be the name of the selected character encoding.

  4. If the entry's name is "_charset_" and its type is "hidden", replace its value with charset.

  5. If the entry's type is "file", replace its value with the file's name only.

  6. For each entry in the form data set, perform these substeps:

    1. Append the entry's name to result.

    2. Append a single U+003D EQUALS SIGN character (=) to result.

    3. Append the entry's value to result.

    4. Append a U+000D CARRIAGE RETURN (CR) U+000A LINE FEED (LF) character pair to result.

  7. Encode result using the encoder for the selected character encoding and return the resulting byte stream.

Payloads using the text/plain format are intended to be human readable. They are not reliably interpretable by computer, as the format is ambiguous (for example, there is no way to distinguish a literal newline in a value from the newline at the end of the value).

4.10.23 Resetting a form

When a form element form is reset, the user agent must fire a simple event named reset, that bubbles and is cancelable, at form, and then, if that event is not canceled, must invoke the reset algorithm of each resettable element whose form owner is form.

Each resettable element defines its own reset algorithm. Changes made to form controls as part of these algorithms do not count as changes caused by the user (and thus, e.g., do not cause input events to fire).