blank.gif (43 bytes)

Church Of The
Swimming Elephant

Search:
3.7.1 Canonicalization and Text Defaults Connected: An Internet Encyclopedia
3.7.1 Canonicalization and Text Defaults

Up: Connected: An Internet Encyclopedia
Up: Requests For Comments
Up: RFC 2068
Up: 3 Protocol Parameters
Up: 3.7 Media Types
Prev: 3.7 Media Types
Next: 3.7.2 Multipart Types

3.7.1 Canonicalization and Text Defaults

3.7.1 Canonicalization and Text Defaults

Internet media types are registered with a canonical form. In general, an entity-body transferred via HTTP messages MUST be represented in the appropriate canonical form prior to its transmission; the exception is "text" types, as defined in the next paragraph.

When in canonical form, media subtypes of the "text" type use CRLF as the text line break. HTTP relaxes this requirement and allows the transport of text media with plain CR or LF alone representing a line break when it is done consistently for an entire entity-body. HTTP applications MUST accept CRLF, bare CR, and bare LF as being representative of a line break in text media received via HTTP. In addition, if the text is represented in a character set that does not use octets 13 and 10 for CR and LF respectively, as is the case for some multi-byte character sets, HTTP allows the use of whatever octet sequences are defined by that character set to represent the equivalent of CR and LF for line breaks. This flexibility regarding line breaks applies only to text media in the entity-body; a bare CR or LF MUST NOT be substituted for CRLF within any of the HTTP control structures (such as header fields and multipart boundaries).

If an entity-body is encoded with a Content-Encoding, the underlying data MUST be in a form defined above prior to being encoded. The "charset" parameter is used with some media types to define the character set (section 3.4) of the data. When no explicit charset parameter is provided by the sender, media subtypes of the "text" type are defined to have a default charset value of "ISO-8859-1" when received via HTTP. Data in character sets other than "ISO-8859-1" or its subsets MUST be labeled with an appropriate charset value.

Some HTTP/1.0 software has interpreted a Content-Type header without charset parameter incorrectly to mean "recipient should guess." Senders wishing to defeat this behavior MAY include a charset parameter even when the charset is ISO-8859-1 and SHOULD do so when it is known that it will not confuse the recipient.

Unfortunately, some older HTTP/1.0 clients did not deal properly with an explicit charset parameter. HTTP/1.1 recipients MUST respect the charset label provided by the sender; and those user agents that have a provision to "guess" a charset MUST use the charset from the content-type field if they support that charset, rather than the recipient's preference, when initially displaying a document.


Next: 3.7.2 Multipart Types

Connected: An Internet Encyclopedia
3.7.1 Canonicalization and Text Defaults

Cotse.Net

Protect yourself from cyberstalkers, identity thieves, and those who would snoop on you.
Stop spam from invading your inbox without losing the mail you want. We give you more control over your e-mail than any other service.
Block popups, ads, and malicious scripts while you surf the net through our anonymous proxies.
Participate in Usenet, host your web files, easily send anonymous messages, and more, much more.
All private, all encrypted, all secure, all in an easy to use service, and all for only $5.95 a month!

Service Details

 
.
www.cotse.com
Have you gone to church today?
.
All pages ©1999, 2000, 2001, 2002, 2003 Church of the Swimming Elephant unless otherwise stated
Church of the Swimming Elephant©1999, 2000, 2001, 2002, 2003 Cotse.com.
Cotse.com is a wholly owned subsidiary of Packetderm, LLC.

Packetderm, LLC
210 Park Ave #308
Worcester, MA 01609