RFC 1522
1. Introduction
RFC 1521 describes a mechanism for denoting textual body parts which
are coded in various character sets, as well as methods for encoding
such body parts as sequences of printable ASCII characters. This
memo describes similar techniques to allow the encoding of non-ASCII
text in various portions of an RFC 822 [2] message header, in a manner
which is unlikely to confuse existing message handling software.
Like the encoding techniques described in RFC 1521, the techniques
outlined here were designed to allow the use of non-ASCII characters
in message headers in a way which is unlikely to be disturbed by the
quirks of existing Internet mail handling programs. In particular,
some mail relaying programs are known to (a) delete some message
header fields while retaining others, (b) rearrange the order of
addresses in To or Cc fields, (c) rearrange the (vertical) order of
header fields, and/or (d) "wrap" message headers at different places
than those in the original message. In addition, some mail reading
programs are known to have difficulty correctly parsing message
headers which, while legal according to RFC 822, make use of
backslash-quoting to "hide" special characters such as "<", ",", or
":", or which exploit other infrequently-used features of that
specification.
While it is unfortunate that these programs do not correctly
interpret RFC 822 headers, to "break" these programs would cause
severe operational problems for the Internet mail system. The
extensions described in this memo therefore do not rely on little-
used features of RFC 822.
Instead, certain sequences of "ordinary" printable ASCII characters
(known as "encoded-words") are reserved for use as encoded data. The
syntax of encoded-words is such that they are unlikely to
"accidentally" appear as normal text in message headers.
Furthermore, the characters used in encoded-words are restricted to
those which do not have special meanings in the context in which the
encoded-word appears.
Generally, an "encoded-word" is a sequence of printable ASCII
characters that begins with "=?", ends with "?=", and has two "?"s in
between. It specifies a character set and an encoding method, and
also includes the original text encoded as graphic ASCII characters,
according to the rules for that encoding method.
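
The general shape described above can be sketched as a small recognizer. This is only an illustration: the pattern is deliberately loose, the exact grammar is the subject of the next section, and the names ENCODED_WORD and split_encoded_word are hypothetical, not part of this memo.

```python
import re

# Loose pattern for the general shape only (an assumption, not the formal
# grammar): "=?" charset "?" encoding "?" encoded-text "?="
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([^?\s]+)\?([^?\s]+)\?=")

def split_encoded_word(token):
    """Return (charset, encoding, encoded_text), or None if token does not
    have the overall shape of an encoded-word."""
    match = ENCODED_WORD.fullmatch(token)
    if match is None:
        return None
    return match.groups()

print(split_encoded_word("=?ISO-8859-1?Q?Andr=E9?="))
print(split_encoded_word("ordinary text"))
```

The first call yields the three fields between the delimiters; the second returns None because ordinary text lacks the "=?" and "?=" markers.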
A mail composer that implements this specification will provide a
means of inputting non-ASCII text in header fields, but will
translate these fields (or appropriate portions of these fields) into
encoded-words before inserting them into the message header.
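
As a rough sketch of the composer side, the example below uses Python's standard email.header module, which implements a later revision of this scheme (RFC 2047) rather than this memo itself. The helper name encode_header_value and the default charset are assumptions made for illustration.

```python
from email.header import Header

def encode_header_value(text, charset="iso-8859-1"):
    """Translate header text into one or more encoded-words.

    The library chooses the encoding method for the given charset; the
    result contains only printable ASCII characters.
    """
    return Header(text, charset).encode()

encoded = encode_header_value("Andr\u00e9")
print(encoded)  # an ASCII-only token of the form =?charset?encoding?text?=
```

A composer would insert the returned string into the message header in place of the raw non-ASCII text.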
A mail reader that implements this specification will recognize
encoded-words when they appear in certain portions of the message
header. Instead of displaying the encoded-word "as is", it will
reverse the encoding and display the original text in the designated
character set.
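
The reader side can be sketched in the same way. The example again relies on Python's email.header module (which follows the later RFC 2047 revision); display_text is a hypothetical helper, and falling back to ASCII with replacement characters for unknown charsets is an assumption of this sketch, not a rule from this memo.

```python
from email.header import decode_header

def display_text(header_value):
    """Reverse any encoded-words in header_value and return readable text."""
    parts = []
    for chunk, charset in decode_header(header_value):
        if isinstance(chunk, bytes):
            try:
                chunk = chunk.decode(charset or "ascii", errors="replace")
            except LookupError:
                # Unknown charset name: fall back rather than fail (assumption).
                chunk = chunk.decode("ascii", errors="replace")
        parts.append(chunk)
    return "".join(parts)

print(display_text("=?ISO-8859-1?Q?Andr=E9?="))
print(display_text("plain text"))
```

Text that contains no encoded-words passes through unchanged, so the same function can be applied to every header field a reader chooses to decode.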
NOTES
This memo relies heavily on notation and terms defined in STD 11, RFC
822 and RFC 1521. In particular, the syntax for the ABNF used in
this memo is defined in STD 11, RFC 822, as well as many of the
terms used in the grammar for the header extensions defined here.
Successful implementation of this protocol extension requires
careful attention to the details of both STD 11, RFC 822 and RFC
1521.
When the term "ASCII" appears in this memo, it refers to the
"7-Bit American Standard Code for Information Interchange", ANSI
X3.4-1986. The MIME charset name for this character set is
"US-ASCII". When not specifically referring to the MIME charset name,
this document uses the term "ASCII", both for brevity and for
consistency with STD 11, RFC 822. However, implementors are
warned that the character set name must be spelled "US-ASCII" in
MIME message and body part headers.