It is imperative that the rules in this section be followed exactly. When two user agents use different heuristics for content type detection, security problems can occur. For example, if a server believes a contributed file to be an image (and thus benign), but a Web browser believes the content to be HTML (and thus capable of executing script), the end user can be exposed to malicious content, making the user vulnerable to cookie theft attacks and other cross-site scripting attacks.
The sniffed type of a resource must be found as follows:
If the resource was fetched over an HTTP protocol, and there is no HTTP Content-Encoding header, but there is an HTTP Content-Type header and it has a value whose bytes exactly match one of the following three lines:
| Bytes in Hexadecimal | Textual representation |
|---|---|
| 74 65 78 74 2f 70 6c 61 69 6e | text/plain |
| 74 65 78 74 2f 70 6c 61 69 6e 3b 20 63 68 61 72 73 65 74 3d 49 53 4f 2d 38 38 35 39 2d 31 | text/plain; charset=ISO-8859-1 |
| 74 65 78 74 2f 70 6c 61 69 6e 3b 20 63 68 61 72 73 65 74 3d 69 73 6f 2d 38 38 35 39 2d 31 | text/plain; charset=iso-8859-1 |
...then jump to the text or binary section below.
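As an illustration only, the exact-match test above could be sketched as follows. The function name, the `fetched_over_http` flag, and the shape of `headers` (lowercase header names mapped to raw byte values) are assumptions made for this example and are not defined by this specification.

```python
# The three Content-Type values, compared byte for byte (no case folding,
# no whitespace trimming), that trigger the text or binary check.
TEXT_OR_BINARY_TRIGGERS = frozenset({
    b"text/plain",
    b"text/plain; charset=ISO-8859-1",
    b"text/plain; charset=iso-8859-1",
})


def needs_text_or_binary_check(fetched_over_http, headers):
    """Return True if the resource must go through the text or binary section.

    `headers` is a hypothetical dict of lowercase header names to raw bytes.
    """
    if not fetched_over_http:
        return False
    if "content-encoding" in headers:
        return False
    return headers.get("content-type") in TEXT_OR_BINARY_TRIGGERS
```

Because the comparison is on raw bytes, a value such as `Text/Plain` or `text/plain; charset=UTF-8` does not match and falls through to the remaining steps.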
Let official type be the type given by the Content-Type metadata for the resource (in lowercase, ignoring any parameters). If there is no such type, jump to the unknown type step below.
If official type is "unknown/unknown" or "application/unknown", jump to the unknown type step below.
If official type ends in "+xml", or if it is either "text/xml" or "application/xml", then the sniffed type of the resource is official type; return that and abort these steps.
If official type is an image type supported by the user agent (e.g. "image/png", "image/gif", "image/jpeg", etc), then jump to the images section below.
If official type is "text/html", then jump to the feed or HTML section below.
Otherwise, the sniffed type of the resource is official type.
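The dispatch on official type described above might be sketched as below. This is a non-normative sketch: the function name, the section labels it returns, and the contents of `SUPPORTED_IMAGE_TYPES` are assumptions for this example, and `official_type` is expected to be already lowercased with parameters removed (or `None` when there is no type).

```python
# Hypothetical set of image types a particular user agent supports.
SUPPORTED_IMAGE_TYPES = frozenset({"image/png", "image/gif", "image/jpeg"})


def dispatch_on_official_type(official_type):
    """Return (sniffed_type, next_section); exactly one of the two is None."""
    if official_type is None:
        return (None, "unknown type")
    if official_type in ("unknown/unknown", "application/unknown"):
        return (None, "unknown type")
    if official_type.endswith("+xml") or official_type in ("text/xml", "application/xml"):
        return (official_type, None)
    if official_type in SUPPORTED_IMAGE_TYPES:
        return (None, "images")
    if official_type == "text/html":
        return (None, "feed or HTML")
    return (official_type, None)
```

Note that the "+xml" test comes before the image test, so a type such as "image/svg+xml" is returned as is and never reaches the images section.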
The user agent may wait for 512 or more bytes of the resource to be available.
Let n be the smaller of either 512 or the number of bytes already available.
If n is 4 or more, and the first bytes of the file match one of the following byte sets:
| Bytes in Hexadecimal | Description |
|---|---|
| FE FF | UTF-16BE BOM |
| FF FE | UTF-16LE BOM or UTF-32LE BOM |
| 00 00 FE FF | UTF-32BE BOM |
| EF BB BF | UTF-8 BOM |
...then the sniffed type of the resource is "text/plain".
Should we remove UTF-32 from the above?
Otherwise, if any of the first n bytes of the resource are in one of the following byte ranges:
...then the sniffed type of the resource is "application/octet-stream".
maybe we should invoke the "Content-Type sniffing: image" section now, falling back on "application/octet-stream".
Otherwise, the sniffed type of the resource is "text/plain".
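A minimal sketch of the text or binary steps follows. Since the byte ranges that mark a resource as binary are not reproduced in the text above, the sketch takes them as a caller-supplied predicate, `is_binary_byte`; that parameter and the function name are assumptions for this example.

```python
# Byte order marks from the table above, as raw prefixes.
BOMS = (
    b"\xfe\xff",
    b"\xff\xfe",
    b"\x00\x00\xfe\xff",
    b"\xef\xbb\xbf",
)


def sniff_text_or_binary(available, is_binary_byte):
    """Classify a resource as "text/plain" or "application/octet-stream".

    `available` holds the bytes received so far; `is_binary_byte` is a
    hypothetical predicate standing in for the byte ranges of this section.
    """
    head = available[:512]
    n = len(head)  # the smaller of 512 and the number of bytes available
    if n >= 4 and head.startswith(BOMS):
        return "text/plain"
    if any(is_binary_byte(byte) for byte in head):
        return "application/octet-stream"
    return "text/plain"
```

The BOM test is only attempted when at least four bytes are available, so a two-byte resource consisting of just FE FF is judged by the byte-range test alone.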
The user agent may wait for 512 or more bytes of the resource to be available.
Let stream length be the smaller of either 512 or the number of bytes already available.
For each row in the table below:
Let indexpattern be an index into the mask and pattern byte strings of the row.
Let indexstream be an index into the byte stream being examined.
Loop: If indexstream points beyond the end of the byte stream, then this row doesn't match, skip this row.
Examine the indexstreamth byte of the byte stream as follows:
If the indexpatternth byte of the pattern is an ordinary hexadecimal byte and not "WS": if the "and" operator, applied to the indexstreamth byte of the stream and the indexpatternth byte of the mask, yields a value different from the indexpatternth byte of the pattern, then skip this row.
Otherwise, increment indexpattern to the next byte in the mask and pattern and indexstream to the next byte in the byte stream.
If the indexpatternth byte of the pattern is "WS": "WS" means "whitespace", and allows insignificant whitespace to be skipped when sniffing for a type signature.
If the indexstreamth byte of the stream is one of 0x09 (ASCII TAB), 0x0A (ASCII LF), 0x0B (ASCII VT), 0x0C (ASCII FF), 0x0D (ASCII CR), or 0x20 (ASCII space), then increment only the indexstream to the next byte in the byte stream.
Otherwise, increment only the indexpattern to the next byte in the mask and pattern.
If indexpattern does not point beyond the end of the mask and pattern byte strings, then jump back to the loop step in this algorithm.
Otherwise, the sniffed type of the resource is the type given in the cell of the third column in that row; abort these steps.
As a last-ditch effort, jump to the text or binary section.
| Mask (hexadecimal) | Pattern (hexadecimal) | Sniffed type | Comment |
|---|---|---|---|
| FF FF DF DF DF DF DF DF DF FF DF DF DF DF | 3C 21 44 4F 43 54 59 50 45 20 48 54 4D 4C | text/html | The string "<!DOCTYPE HTML" in US-ASCII or compatible encodings, case-insensitively. |
| FF FF DF DF DF DF | WS 3C 48 54 4D 4C | text/html | The string "<HTML" in US-ASCII or compatible encodings, case-insensitively, possibly with leading spaces. |
| FF FF DF DF DF DF | WS 3C 48 45 41 44 | text/html | The string "<HEAD" in US-ASCII or compatible encodings, case-insensitively, possibly with leading spaces. |
| FF FF DF DF DF DF DF DF | WS 3C 53 43 52 49 50 54 | text/html | The string "<SCRIPT" in US-ASCII or compatible encodings, case-insensitively, possibly with leading spaces. |
| FF FF FF FF FF | 25 50 44 46 2D | application/pdf | The string "%PDF-", the PDF signature. |
| FF FF FF FF FF FF FF FF FF FF FF | 25 21 50 53 2D 41 64 6F 62 65 2D | application/postscript | The string "%!PS-Adobe-", the PostScript signature. |
| FF FF FF FF FF FF | 47 49 46 38 37 61 | image/gif | The string "GIF87a", a GIF signature. |
| FF FF FF FF FF FF | 47 49 46 38 39 61 | image/gif | The string "GIF89a", a GIF signature. |
| FF FF FF FF FF FF FF FF | 89 50 4E 47 0D 0A 1A 0A | image/png | The PNG signature. |
| FF FF FF | FF D8 FF | image/jpeg | A JPEG SOI marker followed by the first byte of another marker. |
| FF FF | 42 4D | image/bmp | The string "BM", a BMP signature. |
User agents may support further types if desired, by implicitly adding to the above table. However, user agents should not use any other patterns for types already mentioned in the table above, as this could then be used for privilege escalation (where, e.g., a server uses the above table to determine that content is not HTML and thus safe from XSS attacks, but then a user agent detects it as HTML anyway and allows script to execute).
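The mask and pattern matching for the unknown type section could be sketched along the following lines. This is an illustration under stated assumptions, not a reference implementation: the names `WS`, `ROWS`, `row_matches` and `sniff_unknown_type` are invented for this example, "WS" is represented as `None` in the pattern, and only a few rows of the table are included.

```python
WHITESPACE = frozenset(b"\t\n\x0b\x0c\r ")  # 0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x20
WS = None  # hypothetical marker for a "WS" cell in a pattern

# (mask, pattern, sniffed type); a subset of the table, for illustration.
ROWS = [
    (bytes.fromhex("FF FF DF DF DF DF DF DF DF FF DF DF DF DF"),
     list(b"<!DOCTYPE HTML"), "text/html"),
    (bytes.fromhex("FF FF DF DF DF DF"), [WS, *b"<HTML"], "text/html"),
    (bytes.fromhex("FF FF FF FF FF"), list(b"%PDF-"), "application/pdf"),
    (bytes.fromhex("FF FF FF FF FF FF FF FF"),
     list(b"\x89PNG\r\n\x1a\n"), "image/png"),
]


def row_matches(stream, mask, pattern):
    """Apply one row of the table to the byte stream."""
    index_pattern = 0
    index_stream = 0
    while index_pattern < len(pattern):
        if index_stream >= len(stream):
            return False  # ran out of bytes: this row does not match
        byte = stream[index_stream]
        if pattern[index_pattern] is WS:
            if byte in WHITESPACE:
                index_stream += 1   # skip insignificant whitespace
            else:
                index_pattern += 1  # whitespace run is over
        elif (byte & mask[index_pattern]) != pattern[index_pattern]:
            return False
        else:
            index_pattern += 1
            index_stream += 1
    return True


def sniff_unknown_type(available):
    """Return the sniffed type, or None to fall back to text or binary."""
    stream = available[:512]
    for mask, pattern, sniffed_type in ROWS:
        if row_matches(stream, mask, pattern):
            return sniffed_type
    return None
```

The 0xDF mask clears bit 5 of each byte, which folds ASCII lowercase letters onto uppercase; that is how `<html` and `<HTML` both match the same pattern row.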
If the first bytes of the file match one of the byte sequences in the first columns of the following table, then the sniffed type of the resource is the type given in the corresponding cell in the second column on the same row:
| Bytes in Hexadecimal | Sniffed type | Comment |
|---|---|---|
| 47 49 46 38 37 61 | image/gif | The string "GIF87a", a GIF signature. |
| 47 49 46 38 39 61 | image/gif | The string "GIF89a", a GIF signature. |
| 89 50 4E 47 0D 0A 1A 0A | image/png | The PNG signature. |
| FF D8 FF | image/jpeg | A JPEG SOI marker followed by the first byte of another marker. |
| 42 4D | image/bmp | The string "BM", a BMP signature. |
User agents must ignore any rows for image types that they do not support.
Otherwise, the sniffed type of the resource is the same as its official type.
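For illustration, the image check above amounts to a prefix comparison restricted to the image types the user agent supports. In the sketch below, the function name and the `supported` parameter are assumptions for this example.

```python
# Signatures from the table above, as (byte prefix, sniffed type) pairs.
IMAGE_SIGNATURES = (
    (bytes.fromhex("47 49 46 38 37 61"), "image/gif"),        # "GIF87a"
    (bytes.fromhex("47 49 46 38 39 61"), "image/gif"),        # "GIF89a"
    (bytes.fromhex("89 50 4E 47 0D 0A 1A 0A"), "image/png"),  # PNG signature
    (bytes.fromhex("FF D8 FF"), "image/jpeg"),                # JPEG SOI + marker
    (bytes.fromhex("42 4D"), "image/bmp"),                    # "BM"
)


def sniff_image(data, official_type, supported):
    """Return the sniffed image type, or the official type if no row matches.

    `supported` is a hypothetical set of image types the user agent handles;
    rows for unsupported types are ignored.
    """
    for signature, image_type in IMAGE_SIGNATURES:
        if image_type in supported and data.startswith(signature):
            return image_type
    return official_type
```

For example, a resource labelled "image/png" whose first bytes are "GIF89a" is sniffed as "image/gif", provided the user agent supports GIF.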
The user agent may wait for 512 or more bytes of the resource to be available.
Let s be the stream of bytes, and let s[i] represent the byte in s with position i, treating s as zero-indexed (so the first byte is at i=0).
If at any point this algorithm requires the user agent to determine the value of a byte in s which is not yet available, or which is past the first 512 bytes of the resource, or which is beyond the end of the resource, the user agent must stop this algorithm, and assume that the sniffed type of the resource is "text/html".
User agents are allowed, by the first step of this algorithm, to wait until the first 512 bytes of the resource are available.
Initialise pos to 0.
Examine s[pos].
<")
If the bytes with positions pos to pos+2 in s are exactly equal to 0x21, 0x2D, 0x2D respectively (ASCII for "!--"), then:
-->"), then increase pos by 3 and jump back to the previous step (step 5) in the overall algorithm in this section.
If s[pos] is 0x21 (ASCII "!"):
If s[pos] is 0x3F (ASCII "?"):
Otherwise, if the bytes in s starting at pos match any of the sequences of bytes in the first column of the following table, then the user agent must follow the steps given in the corresponding cell in the second column of the same row.
| Bytes in Hexadecimal | Requirement | Comment |
|---|---|---|
| 72 73 73 | The sniffed type of the resource is "application/rss+xml"; abort these steps | The three ASCII characters "rss" |
| 66 65 65 64 | The sniffed type of the resource is "application/atom+xml"; abort these steps | The four ASCII characters "feed" |
| 72 64 66 3A 52 44 46 | Continue to the next step in this algorithm | The ASCII characters "rdf:RDF" |
If none of the byte sequences above match the bytes in s starting at pos, then the sniffed type of the resource is "text/html". Abort these steps.
If, before the next ">", you find two xmlns* attributes with http://www.w3.org/1999/02/22-rdf-syntax-ns# and http://purl.org/rss/1.0/ as the namespaces, then the sniffed type of the resource is "application/rss+xml"; abort these steps. (Maybe we only need to check for http://purl.org/rss/1.0/, actually.)
Otherwise, the sniffed type of the resource is "text/html".
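The tag-name table in the feed or HTML algorithm could be sketched as follows. This covers only the table lookup at a given pos, not the comment, "!" and "?" skipping steps or the namespace check; the function name and the use of `None` to signal "continue to the next step" are assumptions for this example.

```python
def classify_tag_name(s, pos):
    """Match the bytes of s starting at pos against the tag-name table.

    Returns a sniffed type, or None when the row says to continue to the
    next step (the "rdf:RDF" case).
    """
    window = s[:512]  # bytes past the first 512 are never examined
    if window.startswith(b"rss", pos):
        return "application/rss+xml"
    if window.startswith(b"feed", pos):
        return "application/atom+xml"
    if window.startswith(b"rdf:RDF", pos):
        return None
    # No row matched, or the needed bytes are unavailable.
    return "text/html"
```

A truncated name such as "rs" at the end of the available bytes yields "text/html", which agrees with the rule that needing an unavailable byte makes the sniffed type "text/html".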
For efficiency reasons, implementations may wish to implement this algorithm and the algorithm for detecting the character encoding of HTML documents in parallel.
What explicit Content-Type metadata is associated with the resource (the resource's type information) depends on the protocol that was used to fetch the resource.
For HTTP resources, only the Content-Type HTTP header contributes any data; the explicit type of the resource is then the value of that header, interpreted as described by the HTTP specifications. If the Content-Type HTTP header is present but it cannot be interpreted as described by the HTTP specifications (e.g. because its value doesn't contain a U+002F SOLIDUS ('/') character), then the resource has no type information. [HTTP]
For resources fetched from the filesystem, user agents should use platform-specific conventions, e.g. operating system extension/type mappings.
Extensions must not be used for determining resource types for resources fetched over HTTP.
For resources fetched over most other protocols, e.g. FTP, there is no type information.
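As a rough sketch of the rules above, the type information might be obtained as shown below. The function name and its parameters are assumptions for this example; the check for a "/" character is only a crude stand-in for full interpretation according to the HTTP specifications, and Python's standard `mimetypes` module stands in for platform-specific extension mappings.

```python
import mimetypes


def explicit_type(protocol, content_type_header=None, path=None):
    """Return the lowercased type without parameters, or None if there is none."""
    if protocol in ("http", "https"):
        # Only the Content-Type header contributes; extensions are never used.
        if content_type_header is None:
            return None
        media_type = content_type_header.split(";", 1)[0].strip().lower()
        return media_type if "/" in media_type else None
    if protocol == "file" and path is not None:
        # Platform convention: map the file extension to a type.
        guessed_type, _encoding = mimetypes.guess_type(path)
        return guessed_type
    # Most other protocols (e.g. FTP) carry no type information.
    return None
```

A return value of `None` corresponds to "no type information", in which case the sniffing algorithm proceeds to the unknown type step.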
The algorithm for extracting an encoding from a Content-Type, given a string s, is as follows. It either returns an encoding or nothing.
Skip characters in s up to and including the first U+003B SEMICOLON (;) character.
Skip any U+0009, U+000A, U+000B, U+000C, U+000D, or U+0020 characters (i.e. spaces) that immediately follow the semicolon.
If the next seven characters are not 'charset', return nothing.
Skip any U+0009, U+000A, U+000B, U+000C, U+000D, or U+0020 characters that immediately follow the word 'charset' (there might not be any).
If the next character is not a U+003D EQUALS SIGN ('='), return nothing.
Skip any U+0009, U+000A, U+000B, U+000C, U+000D, or U+0020 characters that immediately follow the equals sign (there might not be any).
Process the next character as follows:
Return the string between the two quotation marks.
Return the string between the two apostrophes.
Return nothing.
Return the string from this character to the first U+0009, U+000A, U+000B, U+000C, U+000D, or U+0020 character or the end of s, whichever comes first.
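The extraction steps above might be sketched as below. The conditions selecting each of the four outcomes of the last step are not spelled out in the text, so the sketch assumes them: a quotation mark with a later matching quotation mark, an apostrophe with a later matching apostrophe, an unmatched quotation mark or apostrophe (nothing is returned), and any other character. The function name is also an assumption, and "nothing" is represented as `None`.

```python
SPACE_CHARACTERS = "\t\n\x0b\x0c\r "  # U+0009, U+000A, U+000B, U+000C, U+000D, U+0020


def extract_encoding(s):
    """Return the encoding named in a Content-Type string, or None."""

    def skip_spaces(pos):
        while pos < len(s) and s[pos] in SPACE_CHARACTERS:
            pos += 1
        return pos

    semicolon = s.find(";")
    if semicolon == -1:
        return None  # nothing is left after skipping, so 'charset' cannot follow
    pos = skip_spaces(semicolon + 1)
    if s[pos:pos + 7] != "charset":  # compared literally, as the text states
        return None
    pos = skip_spaces(pos + 7)
    if s[pos:pos + 1] != "=":
        return None
    rest = s[skip_spaces(pos + 1):]

    # Assumed case analysis for "process the next character" (see lead-in).
    if rest[:1] in ('"', "'"):
        closing = rest.find(rest[0], 1)
        if closing == -1:
            return None  # unmatched quotation mark or apostrophe
        return rest[1:closing]
    for index, character in enumerate(rest):
        if character in SPACE_CHARACTERS:
            return rest[:index]
    return rest
```

For example, `extract_encoding('text/html; charset="ISO-8859-1"')` yields `ISO-8859-1`, while a string with no semicolon yields nothing.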