4.2.2. Conventional Representation of Newlines
Connected: An Internet Encyclopedia
RFC 1866
SGML specifies that a text entity is a sequence of records, each
beginning with a record start character (LF, code position 10) and
ending with a record end character (CR, code position 13) (section
7.6.1, "Record Boundaries" in [SGML]).
[MIME] specifies that a body of type `text/*' is a sequence of lines,
each terminated by CRLF, that is, octets 13, 10.
In practice, HTML documents are frequently represented and
transmitted using an end of line convention that depends on the
conventions of the source of the document; frequently, that
representation consists of CR only, LF only, or a CR LF sequence.
Hence the decoding of the octets will often result in a text entity
with some missing record start and record end characters.
Since there is no ambiguity, HTML user agents are encouraged to infer
the missing record start and end characters.
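
As a rough illustration of that inference, the following Python sketch
splits decoded text into records, accepting a CR LF sequence, a lone CR,
or a lone LF as a record boundary. The function name `split_records` is
hypothetical and is not defined by this specification; it is only one
way a user agent might recover the record structure.

```python
import re
from typing import List

# CR LF is listed first so the pair counts as one boundary, not two.
_EOL = re.compile(r"\r\n|\r|\n")

def split_records(text: str) -> List[str]:
    """Split decoded text into records at CR LF, lone CR, or lone LF."""
    records = _EOL.split(text)
    # A trailing end of line closes the last record; it does not
    # open a new, empty one.
    if records and records[-1] == "":
        records.pop()
    return records
```

A regular expression is used here rather than `str.splitlines()` because
the latter also splits on characters such as form feed and vertical tab,
which are outside the three conventions discussed above.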
An HTML user agent should treat end of line in any of its variations
as a word space in all contexts except preformatted text. Within
preformatted text, an HTML user agent should treat any of the three
common representations of end-of-line as starting a new line.
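
The rule above can be sketched in Python as follows. The function
`render_text` and its `preformatted` flag are hypothetical names chosen
for illustration. The sketch only maps ends of line and leaves all other
whitespace handling aside, so it is not a complete rendering algorithm.

```python
import re

# CR LF is listed first so the pair is handled as a single end of line.
_EOL = re.compile(r"\r\n|\r|\n")

def render_text(text: str, preformatted: bool = False) -> str:
    """Map every end-of-line variant to a space or to a line break."""
    if preformatted:
        # Within preformatted text, each variant starts a new line.
        return _EOL.sub("\n", text)
    # In all other contexts, an end of line acts as a word space.
    return _EOL.sub(" ", text)
```

For example, `render_text("one\r\ntwo\rthree")` yields a single line of
space-separated words, while the same input with `preformatted=True`
yields three lines.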