5. The News Propagation Algorithm
Connected: An Internet Encyclopedia
5. The News Propagation Algorithm
Up:
Connected: An Internet Encyclopedia
Up:
Requests For Comments
Up:
RFC 850
Prev: 4.3 Batching
5. The News Propagation Algorithm
5. The News Propagation Algorithm
This section describes the overall scheme of USENET and
the algorithm followed by sites in propagating news to the
entire network. Since all sites are affected by
incorrectly formatted articles and by propagation errors,
it is important for the method to be standardized.
USENET is a directed graph. Each node in the graph is a
host computer, each arc in the graph is a transmission
path from one host to another host. Each arc is labelled
with a newsgroup pattern, specifying which newsgroup
classes are forwarded along that link. Most arcs are
bidirectional, that is, if site A sends a class of
newsgroups to site B, then site B usually sends the same
class of newsgroups to site A. This bidirectionality is
not, however, required.
USENET is made up of many subnetworks. Each subnet has a
name, such as "net" or "btl". The special subnet
"net" is defined to be USENET, although the union of all
subnets may be a superset of USENET (because of sites that
get local newsgroup classes but do not get net.all). Each
subnet is a connected graph, that is, a path exists from
every node to every other node in the subnet. In
addition, the entire graph is (theoretically) connected.
(In practice, some political considerations have caused
some sites to be unable to post articles reaching the rest
of the network.)
An article is posted on one machine to a list of
newsgroups. That machine accepts it locally, then
forwards it to all its neighbors that are interested in at
least one of the newsgroups of the message. (Site A deems
site B to be "interested" in a newsgroup if the
newsgroup matches the pattern on the arc from A to B.
This pattern is stored in a file on the A machine.) The
sites receiving the incoming article examine it to make
sure they really want the article, accept it locally, and
then in turn forward the article to all their interest
neighbors. This process continues until the entire
network has seen the article.
An important part of the algorithm is the prevention of
loops. The above process would cause a message to loop
along a cycle forever. In particular, when site A sends
an article to site B, site B will send it back to site A,
which will send it to site B, and so on. One solution to
this is the history mechanism. Each site keeps track of
all articles it has seen (by their message ID) and
whenever an article comes in that it has already seen, the
incoming article is discarded immediately. This solution
is sufficient to prevent loops, but additional
optimizations can be made to avoid sending articles to
sites that will simply throw them away.
One optimization is that an article should never be sent
to a machine listed in the Path line of the header. When
a machine name is in the Path line, the message is known
to have passed through the machine. Another optimization
is that, if the article originated on site A, then site A
has already seen the article. (Origination can be
determined by the Posting-Version line.)
Thus, if an article is posted to newsgroup "net.misc",
it will match the pattern "net.all" (where "all" is a
metasymbol that matches any string), and will be forwarded
to all sites that subscribe to net.all (as determined by
what their neighbors send them). These sites make up the
"net" subnetwork. An article posted to "btl.general"
will reach all sites receiving "btl.all", but will not
reach sites that do not get "btl.all". In effect, the
articles reaches the "btl" subnetwork. An article
posted to newsgroups "net.micro,btl.general" will reach
all sites subscribing to either of the two classes.
Connected: An Internet Encyclopedia
5. The News Propagation Algorithm
|