Network Working Group E. Thomas
Request for Comments: 1429 Swedish University Network
February 1993
Listserv Distribute Protocol
Status of this Memo
This memo provides information for the Internet community. It does
not specify an Internet standard. Distribution of this memo is
unlimited.
Abstract
This memo specifies a subset of the distribution protocol used by the
BITNET LISTSERV to deliver mail messages to large numbers of
recipients. This protocol, known as DISTRIBUTE, optimizes the
distribution by sending a single copy of the message over heavily
loaded links, insofar as topological information is available to
guide such decisions, and reduces the average turnaround time for
large mailing lists to 5-15 minutes. This memo
describes a simple interface allowing non-BITNET mailing list
exploders (or other bulk-delivery scripts) to take advantage of this
service by letting the BITNET distribution network take care of the
delivery.
Introduction
Running a mailing list of 1,000 subscribers or more with plain
"sendmail" while keeping turnaround time to a reasonable level is no
easy task. Due mostly to its limited bandwidth in the mid-80's,
BITNET has developed an efficient bulk delivery protocol for its
mailing lists. Originally introduced in 1986, this protocol was
refined little by little and now carries 2-6 million mail messages a
day. In fact, this distribution mechanism implements a general-
purpose delivery service which can be used by any user of BITNET or
the Internet. Thus, a simple solution to the "sendmail" turnaround
problem is to wrap the message and recipient list in a DISTRIBUTE
envelope and pass it to a BITNET server for delivery. This may not
be the best possible solution, but it has the advantage of being easy
to implement.
In this document we will use the term "production" to refer to the
normal operation of the mailing list (or bulk delivery application)
you want to pipe through the DISTRIBUTE service. That is, the
"production" options are those you should specify once everything is
tested and you are confident that the setup is working to your
satisfaction. In contrast, "test" and "debug" options can be used to
experiment with the protocol but should not be used for normal
operation because of the additional bandwidth and CPU time required
to generate the various informational reports.
Finally, it should be noted that the DISTRIBUTE protocol was
developed to address a number of issues, some of them relevant only
to BITNET, and has evolved since 1986 while keeping a compatible
syntax. For the sake of brevity, this RFC describes only a small
subset of the available options and syntax. This is why the syntax
may appear unnecessarily complicated or even illogical.
1. Selecting an entry point into the DISTRIBUTE backbone
The first thing you have to do is to find a suitable site to submit
your distributions to. For testing, and for testing ONLY, you can
use:
LISTSERV@SEARN.SUNET.SE
For production use, however, you should select a DISTRIBUTE site in
your topological vicinity: it would make no sense to pass your
distributions from California to a server in Sweden if most of your
recipients are in the US. If your organization is connected to BITNET
and your BITNET system is part of the DISTRIBUTE backbone, this ought
to be your best bet. Otherwise you will want to contact someone
knowledgeable about BITNET (or the author of this RFC if you have no
BITNET users). Make sure to run through the following checklist
before sending any production traffic to the site in question:
a. Do you have good connectivity to the host in question? Does the
host, in general, have decent BITNET connectivity? There are still
a few sites that insist on using 9.6k leased lines for BITNET in
spite of having T1 IP access. You will want to avoid them.
b. Send mail to the server with "show version" in the message body
(not in the subject field, which is ignored). Is the server running
version 1.7f or higher? If so, it should not have given you the
following warning,
>>> This server is configured to use PUNCH format for mail <<<
which means that messages with lines longer than 80 characters
cannot be handled properly. If the software version is less than
1.7f, the warning will not be present; instead, check the first
(bottom) "Received:" field. If it does not say "LMail", do not use
this server as it probably cannot handle messages with long lines.
Finally, make sure that the "Master nodes file" is not older
than 2 months: there are a handful of sites which never update
their tables due to staffing problems. They cannot be prevented
from running LISTSERV, but you will certainly want to avoid them.
c. How big is your workload? If you are planning to use the service
for more than 10,000 daily recipients, you should get permission
from the LISTSERV administrator, both as a matter of courtesy and
to hear about any restrictions or regularly scheduled downtime they
might have. For instance, some universities might not allow large
distributions during prime time, or they may have several
DISTRIBUTE machines and will want to make sure you use the "right"
one. Send mail to "owner-listserv" at the host in question and
give an estimate of the number of daily messages and recipients you
would like to submit. If your message bounces back with "No such
local user" or the like, it means the server did not pass the above
test (b) and you don't want to use it anyway.
An index of sites/hosts which have the required configuration, good
connectivity, keep their tables up to date and have generally agreed
to provide this service to anyone in their topological area will be
published separately in the future.
2. Physical delivery of the DISTRIBUTE request
The distribution request is delivered via SMTP to the e-mail address
obtained in step 1 (for instance, LISTSERV@SEARN.SUNET.SE). In fact,
as long as you can somehow get mail to the server's host, you can use
the service; SMTP is just the most convenient way of doing so.
2.1. Contents of MAIL FROM: field
You should set the MAIL FROM: field to the address of the person who
maintains your mailing list or, generally speaking, to the address of
a human being who can take action in case the message fails to reach
the DISTRIBUTE server's host. Such failures are very rare.
2.2. Contents of RCPT TO: field
The RCPT TO: field points to the server's address (for instance,
LISTSERV@SEARN.SUNET.SE).
2.3. Contents of the RFC822 header
After the DATA instruction, you must supply a valid RFC822 header
with a "From:" field pointing to the mailbox that should receive
notification of delivery problems, bounced mail, and so on. This can
be the same as the MAIL FROM: field, an address of the type "owner-
xxxx@yourhost", etc. DO NOT PUT THE LIST SUBMISSION ADDRESS THERE,
or you will get mailing loops.
For testing, the "From:" field should point to your own mailbox, so
that you get the responses from the server.
As long as RFC822 syntax is respected, the only field that matters is
the "From:" field (or "Sender:", "Resent-From:", etc.). In practice
this means you can just pipe the distribution request into "mail
listserv@whatever" and let your mail program build all the headers.
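As an illustration of sections 2.1 to 2.3, the following Python sketch
wraps an already prepared DISTRIBUTE request in a mail message and
hands it to an SMTP relay. The function names, the relay host
"localhost" and the example addresses are assumptions made for this
sketch; they are not part of the protocol.

```python
import smtplib
from email.message import EmailMessage


def build_submission(request_body, server_addr, errors_to):
    """Wrap a DISTRIBUTE request in an RFC822 message for the server.

    errors_to goes into "From:" and should be a mailbox that handles
    delivery problems, never the list submission address.
    """
    msg = EmailMessage()
    msg["From"] = errors_to
    msg["To"] = server_addr
    # Force plain 7-bit text so the mail library does not re-encode the
    # job cards (this raises UnicodeEncodeError on non-ASCII input).
    msg.set_content(request_body, charset="us-ascii", cte="7bit")
    return msg


def submit(msg, maintainer, server_addr, relay="localhost"):
    """Send the message: maintainer is MAIL FROM, the server is RCPT TO."""
    with smtplib.SMTP(relay) as smtp:
        smtp.send_message(msg, from_addr=maintainer, to_addrs=[server_addr])
```

For testing, errors_to would be your own mailbox so that the server's
responses come back to you.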
3. Format of the DISTRIBUTE request
The body of the message delivered to LISTSERV defines the recipients
of the distribution and the text (header + body) of the RFC822
message you want to have delivered. The request starts with a "job
card", followed by a DISTRIBUTE command, a list of recipients, and
finally the message header and body.
3.1. Syntax of the JOB card
The purpose of the JOB card is to make sure that any spurious text
inserted by mail gateways or the like is flushed and not erroneously
interpreted as a command. It can optionally be used to associate a
"job name" with the request, in case you want to use tools to assist
you in processing the notifications you get from the DISTRIBUTE
servers when running in test mode. The syntax is as follows:
//jobname JOB ECHO=NO
"jobname" can be anything as long as it does not contain blanks, and
can be omitted. LISTSERV generally ignores case when parsing
commands, so you can use "job" or "Job" if you prefer. The ECHO=NO
keyword is required for production use, to suppress the "resource
usage summary" you would otherwise get upon completion of your
delivery. You may want to omit it when testing.
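A small Python helper can generate the JOB card from the rules above.
This is only a sketch: the function name and its "production" flag are
hypothetical, and the form used when the job name is omitted ("// JOB")
is an assumption modelled on the "// EOJ" card.

```python
def job_card(jobname="", production=True):
    """Return the JOB card, adding ECHO=NO for production use."""
    if any(ch.isspace() for ch in jobname):
        raise ValueError("jobname must not contain blanks")
    card = "//" + jobname + " JOB"
    if production:
        # Suppresses the resource usage summary sent on completion.
        card += " ECHO=NO"
    return card
```

Calling job_card("Test", production=False) yields the card used in a
test run, where the resource usage summary is wanted.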
3.2. Syntax of the DISTRIBUTE command
Below the JOB card, you must supply the following line:
DISTRIBUTE MAIL
For production mode, do not specify anything else on that line. When
testing, you should add ACK=MAIL in order to get an acknowledgement
confirming the delivery. There are two other useful options:
DEBUG=YES, which instructs the server to produce a report showing how
the various recipients will be routed, but without actually
delivering the message; and TRACE=YES, which does the same but does
deliver the message. Before making a "live" test with your actual
recipients list, you should use the DEBUG=YES option once to make
sure you got all the parameters and syntax right, and get a rough
idea of the efficiency of the distribution (see the section on
performance).
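The options described in this section can be assembled mechanically,
as in the Python sketch below. The function and parameter names are
hypothetical. This memo does not say what happens when DEBUG=YES and
TRACE=YES are given together, so the sketch refuses that combination
as a precaution; that refusal is an assumption, not a protocol rule.

```python
def distribute_command(ack=False, debug=False, trace=False):
    """Return the DISTRIBUTE line; all options off means production mode."""
    if debug and trace:
        raise ValueError("choose either DEBUG=YES or TRACE=YES")
    parts = ["DISTRIBUTE MAIL"]
    if ack:
        parts.append("ACK=MAIL")    # request a delivery acknowledgement
    if debug:
        parts.append("DEBUG=YES")   # routing report, nothing delivered
    if trace:
        parts.append("TRACE=YES")   # routing report, message delivered
    return " ".join(parts)
```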
3.3. Giving the list of recipients
The list of recipients follows the DISTRIBUTE line and is specified
as follows:
//To DD *
user1@host1 BSMTP
user2@host2 BSMTP
/*
The two lines starting with a "/" have to be copied as-is. Each of
the lines in between contains the address of one of the recipients,
followed by a blank and by the word "BSMTP", which indicates that you
do not want the header rewritten. There are four restrictions:
a. The address must be a plain "local-part@hostname" - no name string,
no angle brackets, no source route, etc. Bear in mind that the
DISTRIBUTE server is not in the same domain as you: all the
addresses should be fully qualified.
b. If the local-part is quoted, it must be quoted from the first word
on. Technically, RFC822 allows: Joe."Now@Home".Smith@xyz.edu, but
for performance reasons this form is not supported. Just quote the
first word to tell LISTSERV to run the address through the full
parser: you would write "Joe"."Now@Home".Smith@xyz.edu instead.
c. The local-part of the address may not start with an (unquoted)
asterisk. You can bypass this restriction by quoting the local
part and using a %-hack through the server's host:
"***JACK***%jack-ws.xyz.edu"@server-host.
d. Blanks are not allowed anywhere in the address.
You can use the pseudo-domain ".BITNET" for BITNET recipients: it is
always supported within DISTRIBUTE requests.
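The four restrictions lend themselves to a check before submission. The
Python sketch below is a conservative approximation, not a full RFC822
parser: the function names are hypothetical, the "hostname contains a
dot" test is only a heuristic for "fully qualified", and angle brackets
are rejected even inside a quoted local-part.

```python
def recipient_problem(address):
    """Return a description of the first rule the address breaks, or None."""
    if any(ch.isspace() for ch in address):
        return "blanks are not allowed"                            # rule d
    local, at, host = address.rpartition("@")
    if not at or not local or not host:
        return "not of the form local-part@hostname"               # rule a
    if local.startswith("@") or "<" in address or ">" in address:
        return "source routes and angle brackets are not allowed"  # rule a
    if "." not in host:
        return "hostname does not look fully qualified"            # rule a
    if local.startswith("*"):
        return "local-part starts with an unquoted asterisk"       # rule c
    if '"' in local and not local.startswith('"'):
        return "local-part must be quoted from the first word on"  # rule b
    return None


def to_section(addresses):
    """Build the //To block, refusing any address that fails the checks."""
    lines = ["//To DD *"]
    for address in addresses:
        problem = recipient_problem(address)
        if problem:
            raise ValueError(f"{address!r}: {problem}")
        lines.append(address + " BSMTP")
    lines.append("/*")
    return "\n".join(lines)
```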
3.4. Specifying the message text
After the last recipient and the closing "/*", add the following
line,
//Data DD *,EOF
followed by the RFC822 message (header + body) that you want
delivered. The EOF option indicates that the message header and body
will extend until the end of the message you are sending to the
DISTRIBUTE server. If you are worried about extraneous data being
appended by a gateway, remove the EOF option, add a closing "/*" line
after the end of the message, followed by a "// EOJ" card to flush
any remaining text. This, however, will fail if the message itself
contains a "/*" line; you would have to insert a space before any
such line.
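Both variants can be produced by one routine, sketched here in Python.
The function name is hypothetical. Two details are assumptions: that
removing the EOF option leaves a card of the form "//Data DD *", and
that any line beginning with "/*" (not only a line consisting of "/*"
alone) should be protected with a leading space.

```python
def data_section(message, use_eof=True):
    """Return the //Data card followed by the RFC822 message text."""
    lines = message.splitlines()
    if use_eof:
        # The message runs to the end of the mail sent to the server.
        return "\n".join(["//Data DD *,EOF"] + lines)
    # Explicit terminator: protect lines that would end the data early,
    # then close the data and flush whatever a gateway may append.
    escaped = [" " + line if line.startswith("/*") else line for line in lines]
    return "\n".join(["//Data DD *"] + escaped + ["/*", "// EOJ"])
```

Note that the second variant alters the delivered text by one space on
each protected line.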
4. Examples
Here is an (intentionally short) example to clarify the syntax:
----- cut here -----
//Test JOB
Distribute mail Ack=mail Debug=yes
//To DD *
joe@ws-4.xyz.edu BSMTP
jack@abc.com BSMTP
jim@tamvm1.bitnet BSMTP
jill@alpha.cc.buffalo.edu BSMTP
james@library.rice.edu BSMTP
/*
//Data DD *,EOF
Date: Tue, 19 Jan 1993 10:57:29 -0500
From: Robert H. Smith <RHS@eta.abc.com>
Subject: Re: Problem with V5.41
To: somelist@some.host.edu
I agree with Jack, V5.41 is not a stable release. I had to fall back
to V5.40 within 5 minutes of installation...
Bob Smith
----- cut here -----
Note: some of the hostnames are genuine, but the usernames are all
fictitious.
You would get the following reply:
--------------------------------------------------------------------
Job "Test" started on 20 Feb 1993 01:09:40
> Distribute mail ack=mail debug=yes
Debug trace information:
ABC.COM goes to SEARN (213) - single recipient
ALPHA.CC.BUFFALO.EDU goes to UBVM (027) - single recipient
LIBRARY.RICE.EDU goes to RICEVM1 (022) - single recipient
TAMVM1 goes to TAIVM1 (247) - single recipient
WS-4.XYZ.EDU goes to SEARN (213) - single recipient
Path information:
TAIVM1 : UGA RICEVM1 TAIVM1
UBVM : UGA UBVM
RICEVM1 : UGA RICEVM1
(Debug) Mail forwarded to LISTSERV@UGA for 3 recipients.
(Debug) Mail posted via BSMTP to jack@ABC.COM.
(Debug) Mail posted via BSMTP to joe@WS-4.XYZ.EDU.
Job "Test" ended on 20 Feb 1993 01:09:40
Summary of resource utilization
-------------------------------
CPU time: 0.086 sec Device I/O: 6
Overhead CPU: 0.045 sec Paging I/O: 5
CPU model: 9221 DASD model: 3380
--------------------------------------------------------------------
To actually perform the distribution and get an acknowledgement, you
would change the first two lines as follows:
----- cut here -----
//Test JOB Echo=NO
Distribute mail Ack=mail
--------------------
And you would get the following reply:
--------------------------------------------------------------------
Mail forwarded to LISTSERV@UGA for 3 recipients.
Mail posted via BSMTP to jack@ABC.COM.
Mail posted via BSMTP to joe@WS-4.XYZ.EDU.
--------------------------------------------------------------------
Finally, by removing the "Ack=mail" keyword you would perform a
"silent" distribution without any acknowledgement, suitable for
production mode.
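The three variants shown in this section (debug run, acknowledged
delivery, silent production delivery) differ only in the first two
lines of the request. The Python sketch below assembles a complete
request for each; the function name and the mode names "debug", "ack"
and "production" are hypothetical labels chosen for this illustration,
and no address checking is performed.

```python
def build_request(recipients, message, jobname="", mode="production"):
    """Assemble a DISTRIBUTE request body.

    mode is "debug" (routing report only, nothing delivered), "ack"
    (deliver and acknowledge) or "production" (silent delivery).
    """
    if mode not in ("debug", "ack", "production"):
        raise ValueError("unknown mode: " + mode)
    job = "//" + jobname + " JOB"
    command = "DISTRIBUTE MAIL"
    if mode == "debug":
        command += " ACK=MAIL DEBUG=YES"
    else:
        job += " ECHO=NO"            # no resource usage summary
        if mode == "ack":
            command += " ACK=MAIL"
    lines = [job, command, "//To DD *"]
    lines += [address + " BSMTP" for address in recipients]
    lines += ["/*", "//Data DD *,EOF"]
    lines += message.splitlines()
    return "\n".join(lines) + "\n"
```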
5. Performance
The efficiency of the distribution depends mostly on the quality and
accuracy of the topological information available to the DISTRIBUTE
server (and, in some extreme cases, on system load). For BITNET
recipients, the typical turnaround time for reasonably well connected
systems is 5-15 minutes. Internet recipients fall in two categories:
those which can be routed to a machine within or close to the
recipient's organization (average turnaround time 5-20 minutes), and
those for which no topological information is available at all. In
that case the delivery can take much longer, but usually remains
faster than with a vanilla sendmail setup. At present,
topological information is available for most top-level domains
outside the US and for many sub-domains of EDU and GOV.
You can measure the efficiency of the distribution using the
DEBUG=YES option as explained above. Recipients which get forwarded
to another server usually get delivered within 5-20 minutes (except
to poorly connected sites or countries, for which not much can be
done). Recipients which are handled locally are passed to a local
SMTP agent whose efficiency depends very much on the amount of
"burst" queries the local name server can handle in quick succession.
A number of projects are currently underway to investigate the
feasibility of improving the quality of the topological information
available to the DISTRIBUTE servers for the Internet.
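To put a number on the efficiency reported by a DEBUG=YES run, the
reply can be scanned for the "Mail forwarded" and "Mail posted" lines.
The Python sketch below does this. It assumes the report wording seen
in the example of section 4, which this memo does not define formally,
so the patterns may need adjusting; the function name is hypothetical.

```python
import re

# Patterns inferred from the sample reply in section 4 (an assumption).
FORWARDED = re.compile(r"Mail forwarded to (\S+) for (\d+) recipients?\.")
LOCAL = re.compile(r"Mail posted via BSMTP to (\S+)\.\s*$", re.MULTILINE)


def summarize_report(report):
    """Return (forwarded counts per server, local addresses, forwarded share)."""
    forwarded = {}
    for server, count in FORWARDED.findall(report):
        forwarded[server] = forwarded.get(server, 0) + int(count)
    local = LOCAL.findall(report)
    n_forwarded = sum(forwarded.values())
    total = n_forwarded + len(local)
    share = n_forwarded / total if total else 0.0
    return forwarded, local, share
```

A low forwarded share suggests that most recipients are being handed
to the local SMTP agent, where the name server's capacity dominates.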
Security Considerations
Security issues are not discussed in this memo.
Author's Address
Eric Thomas
Swedish University Network
Dr.Kristinas vaeg 37B
100 44 Stockholm, Sweden
E-mail: ERIC@SEARN.SUNET.SE