3.2.1. Data Characters

Any sequence of characters that does not constitute markup (see 9.6 "Delimiter Recognition" of [SGML]) is mapped directly to a string of data characters. Some markup also maps to data character strings: numeric character references map to single-character strings via the document character set, and each reference to one of the general entities defined in the HTML DTD maps to a single-character string.

For example,

    abc&lt;def    => "abc","<","def"
    abc&#60;def   => "abc","<","def"

The terminating semicolon on entity or numeric character references is only necessary when the character following the reference would otherwise be recognized as part of the name (see 9.4.5 "Reference End" in [SGML]).

    abc &lt def     => "abc ","<"," def"
    abc &#60 def    => "abc ","<"," def"

An ampersand is only recognized as markup when it is followed by a letter or a `#' and a digit:

    abc & lt def     => "abc & lt def"
    abc &# 60 def    => "abc &# 60 def"
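
The recognition rules above can be sketched in Python. This is a minimal illustration, not a conforming SGML parser: the names `ENTITIES`, `REFERENCE`, and `data_strings` are hypothetical, only the three entities used in this section are known to it, unknown entity names are simply left as literal text, and numeric references are resolved with `chr()` on the assumption that the document character set agrees with Unicode for the code positions involved.

```python
import re

# Assumption: only the entities discussed in this section are known here;
# the HTML DTD defines more.
ENTITIES = {"amp": "&", "lt": "<", "gt": ">"}

# "&" is markup only when followed by a letter (entity reference) or by
# "#" and a digit (numeric character reference). The trailing ";" is optional.
REFERENCE = re.compile(r"&(?:#([0-9]+)|([A-Za-z][A-Za-z0-9.-]*));?")

def data_strings(text):
    """Split text into data character strings, resolving references."""
    out = []
    pos = 0
    for match in REFERENCE.finditer(text):
        number, name = match.group(1), match.group(2)
        if name is not None and name not in ENTITIES:
            # Unknown name: this sketch leaves it as literal text.
            continue
        if match.start() > pos:
            out.append(text[pos:match.start()])
        if number is not None:
            out.append(chr(int(number)))  # assumed character set mapping
        else:
            out.append(ENTITIES[name])
        pos = match.end()
    if pos < len(text):
        out.append(text[pos:])
    return out

print(data_strings("abc&lt;def"))     # terminated entity reference
print(data_strings("abc &#60 def"))   # semicolon omitted before a space
print(data_strings("abc & lt def"))   # "&" followed by a space is data
```

Under these assumptions, `abc&ltdef` is left untouched because the name would be read as `ltdef`, which is the situation in which the terminating semicolon is needed.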

A useful technique for translating plain text to HTML is to replace each '<', '&', and '>' by an entity reference or numeric character reference as follows:

                     ENTITY      NUMERIC
           CHARACTER REFERENCE   CHAR REF     CHARACTER DESCRIPTION
           --------- ----------  -----------  ---------------------
             &       &amp;       &#38;        Ampersand
             <       &lt;        &#60;        Less than
             >       &gt;        &#62;        Greater than

    NOTE - There are SGML mechanisms, CDATA and RCDATA declared content, that allow most `<', `>', and `&' characters to be entered without the use of entity references. Because these mechanisms tend to be used and implemented inconsistently, and because they conflict with techniques for reducing HTML to 7-bit ASCII for transport, they are deprecated in this version of HTML. See "Example and Listing: XMP, LISTING".
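
The plain-text translation technique described above, replacing each '<', '&', and '>' by a reference, might look like the following in Python. The function name `escape_text` and its `numeric` parameter are hypothetical; the sketch maps each character independently, so an ampersand introduced by one replacement is never escaped a second time.

```python
import html

# Replacement tables taken from the three rows listed in this section.
NAMED = {"&": "&amp;", "<": "&lt;", ">": "&gt;"}
NUMERIC = {"&": "&#38;", "<": "&#60;", ">": "&#62;"}

def escape_text(text, numeric=False):
    """Replace '&', '<', and '>' with entity or numeric character references."""
    table = NUMERIC if numeric else NAMED
    # str.translate maps every character in a single pass.
    return text.translate(str.maketrans(table))

sample = "if a < b && b > c"
print(escape_text(sample))
print(escape_text(sample, numeric=True))

# The standard library's html.escape with quote=False performs the same
# three named replacements.
assert escape_text(sample) == html.escape(sample, quote=False)
```

A single-pass mapping is used here rather than chained `str.replace` calls, which would have to handle '&' first to avoid escaping the ampersands produced by the other two replacements.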
