DSPAM v3.4.2 README
DSPAM v3.0
Copyright (c) 2003 Network Dweebs Corporation
http://www.nuclearelephant.com/projects/dspam/
LICENSE
This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
TABLE OF CONTENTS
General DSPAM Information
1.0 About DSPAM
1.1 Installation
1.2 Testing
1.3 Troubleshooting
1.4 DSPAM Tools
1.5 Agent Commandline Arguments
Advanced DSPAM functionality
2.0 Linking with libdspam
2.1 Configuring groups
2.2 External Inoculation Theory
2.3 Client/Server Mode
2.4 LMTP
Miscellaneous
3.0 Bugs, Ports, and the like
3.1 Known Bugs
3.2 Adding the dspam logo button to your website
3.3 CVS Access
1.0 ABOUT DSPAM
DSPAM is an open-source, freely available anti-spam solution designed to combat
unsolicited commercial email using an advanced implementation of statistical
analysis coupled with deobfuscation techniques and other related approaches.
DSPAM is capable of learning each user's individual mail behavior based on
what they tell the filter spam is and isn't. This allows DSPAM to provide
highly-accurate, personalized filtering for each user on even a large
system. This provides an administratively maintenance-free system capable of
learning each user's email behaviors with very few false positives.
DSPAM is one of the more popular and successful attempts at truly
accurate spam filtering, and is rapidly gaining a large support forum.
Contributions to the project are welcome via the dspam-dev mailing list.
DSPAM can be implemented in one of two ways:
1. The DSPAM mailer-agent provides server-side spam filtering, a quarantine
box, and a mechanism for forwarding spams into the system to be automatically
analyzed. Advanced features, such as opt-in/opt-out filtering, inoculation,
and shared groups are supported.
2. Developers may link their projects to the dspam core engine (libdspam) in
accordance with the GPL license agreement. This enables developers to
incorporate libdspam as a "drop-in" for instant spam filtering within their
applications - such as mail clients, other anti-spam tools, and so on.
Many of the foundational principles incorporated into this agent were
contributed by Paul Graham's white paper on combatting SPAM, which can be
found at http://paulgraham.com/spam.html. Many new approaches have been
layered on top of the original core, some of which may be explained in
white papers on the DSPAM home page.
The DSPAM Solution is split up into the following pieces:
DSPAM CORE ENGINE
The DSPAM core engine, also known as libdspam, provides all major spam
filtering functions. The engine is linked to other dspam components (or
shells) to provide functionality. The core engine is capable of being linked
in with any other application as a "drop-in" to provide spam filtering to
mail clients, other anti-spam tools, and other such projects that
would benefit from its use. Both static and shared versions are built by
libtool into .libs.
libdspam provides a storage driver abstraction layer, enabling developers to
easily change how information is stored on the system (for example Berkeley
DB, MySQL, Oracle, etc.) with enough flexibility to write a storage
driver utilizing stone tablets and chisels.
DSPAM AGENT
The DSPAM agent is a shell for libdspam providing a direct interface to
mail servers or other tools for server-side spam filtering. The agent
is normally integrated into one of two places:
1. The agent can masquerade as a mail server's local delivery agent. DSPAM then
processes email piped to it from the mail server and then either delivers it
using the real delivery agent (procmail, mail.local, or a proxy to pass it
along to another server), or will quarantine it if the message is spam (DSPAM
can optionally tag and deliver spams as well).
2. As a POP3 proxy, DSPAM can be configured to process email when users
check their mail, tagging spam accordingly. This allows DSPAM to front-end
any mail system without the need for integration.
The agent is also responsible for correcting misclassifications (missed spams
or false positives). This is critical to the learning operations of DSPAM.
The MTA (sendmail, exim, qmail, etc) or the POP3 proxy calls DSPAM with
parameters identifying the destination user and other operational parameters.
DSPAM performs its internal calculations and will then perform the appropriate
action based on the result.
When an email is delivered to the end-user, the agent appends a serial number
to each email. This serial number references temporary information stored on
the server which contains the original training data for the message, and is
used to re-learn the original message in the event DSPAM made a mistake. This
allows DSPAM to accurately learn without having to provide the full headers
of the message - making life much easier for end-users.
CGI CLIENT
The CGI client is an end-user tool enabling a mail user to view their
quarantine box, reverse the occasional false positive, view their historical
activity, and most importantly to delete their spams permanently.
The CGI client works in conjunction with the DSPAM agent. It is possible to
eliminate the quarantine box in lieu of an alternative solution, such as
client-filtering/forwarding, but many users will appreciate not having to
download all of their spam, and being able to view usage graphs and other
useful information.
TOOLS
Some basic tools have been provided to manage dictionaries, automate
corpus feeding, and create seeded [composite] dictionaries.
1.1 INSTALLATION
UPGRADING
------------------------------------------------------
IMPORTANT UPGRADE STEPS FOR USERS UPGRADING FROM
------------------------------------------------------
Version 3.0 incorporates many changes to the user interface, but preserves
the general data structure so that users don't need to re-train in order
to upgrade.
Step 1: Shut down the existing DSPAM installation.
The existing DSPAM installation should be shut down prior to any upgrade
changes. The easiest way to do this is to turn off the MTA and web
server serving the DSPAM CGI. No mail should be processed while the
changes are being made.
Step 2: Data Storage Changes
If you are using a SQL-Based driver, a few minor changes will need to
be made. These changes are only necessary to SQL-Based drivers; no
changes need to be made if you are using the BerkeleyDB storage drivers.
The following SQL code should upgrade both MySQL and Oracle databases to
the v3.0 format. Be sure to log into the schema used by DSPAM to apply
these changes.
alter table dspam_stats add spam_learned int;
alter table dspam_stats add innocent_learned int;
alter table dspam_stats add spam_classified int;
alter table dspam_stats add innocent_classified int;
update dspam_stats set spam_learned = total_spam;
update dspam_stats set innocent_learned = total_innocent;
update dspam_stats set spam_classified = 0, innocent_classified = 0;
alter table dspam_stats drop column total_spam;
alter table dspam_stats drop column total_innocent;
alter table dspam_stats add spam_misclassified int;
alter table dspam_stats add innocent_misclassified int;
update dspam_stats set spam_misclassified = spam_misses;
update dspam_stats set innocent_misclassified = false_positives;
alter table dspam_stats drop column spam_misses;
alter table dspam_stats drop column false_positives;
Step 3: Compile DSPAM v3.0
DSPAM v3.0 has moved many features out to the commandline. Other
configure-time arguments have also been changed or removed. The following
is a list of configure-time changes that have been made:
--with-userdir-* changed 'userdir' to 'dspam-home'
--with-local-delivery-agent changed to --with-delivery-agent
--enable/disable-chained-tokens removed from configure
--enable/disable-bnr removed from configure
--enable/disable-whitelist removed from configure
--enable/disable-toe removed from configure
--enable/disable-tum removed from configure
--enable/disable-spam-delivery removed from configure
--enable/disable-deliver-fp removed from configure
Once you have configured DSPAM, run: make && make install
to build and install the software.
NOTE: The default DSPAM home has been changed from /etc/mail/dspam to
/var/dspam. If you would like to use the old path, specify it using
--with-dspam-home=/etc/mail/dspam.
Step 4: Upgrade the CGI
The DSPAM CGI has been updated, and should be upgraded. Many calling
arguments have been changed.
Step 5: Reconfigure your MTA
DSPAM's commandline arguments have been rewritten. You'll want to
consult the 'AGENT COMMANDLINE ARGUMENTS' section for a full explanation
of all the new commandline components. Some of the basic changes are:
--addspam, --falsepositive, --corpus, and --inoculate have been
replaced with two flags to specify a pre-classification and a
classification source:
--addspam becomes:
--class=spam --source=error
--falsepositive becomes:
--class=innocent --source=error
--corpus becomes:
--class=innocent --source=corpus
--class=spam --source=corpus
--inoculate becomes:
--class=spam --source=inoculation
It is also necessary to specify the training mode, delivery
preferences, and feature selection on the commandline. For
example:
--mode=teft --deliver=innocent --feature=chained,bnr
Or if you prefer delivery of both innocent and spam messages, you should
use --deliver=innocent,spam. Keep in mind that these preferences will
be applied to whatever operation you are calling. For example, if you
are retraining a false positive or forwarding in a spam, the --deliver
argument will specify whether or not you want to deliver it, so you have
the luxury of defining different commandline arguments between your MTA,
aliases, and CGI.
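The flag replacements listed in Step 5 can be sketched as a small shell
helper for rewriting old MTA or alias entries. This is only an illustration:
translate_flag is a hypothetical name and is not shipped with DSPAM. Because
--corpus maps to either class, the helper takes the class as an optional
second argument and assumes innocent when none is given.

```shell
# Hypothetical helper (not part of DSPAM): translate a pre-3.0 flag
# into its v3.0 equivalent, following the table in Step 5.
# $1 = old flag, $2 = class to use for --corpus (assumed default: innocent)
translate_flag() {
    case "$1" in
        --addspam)       echo "--class=spam --source=error" ;;
        --falsepositive) echo "--class=innocent --source=error" ;;
        --corpus)        echo "--class=${2:-innocent} --source=corpus" ;;
        --inoculate)     echo "--class=spam --source=inoculation" ;;
        *)               echo "$1" ;;  # any other argument passes through unchanged
    esac
}

translate_flag --addspam
translate_flag --corpus spam
```

Arguments that did not change, such as --user, are passed through untouched,
so the helper can be applied to each argument of an old commandline in turn.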
Step 6: The directory structure has been changed so that all user directories
go into $DSPAM_HOME/data. You'll want to either delete all user
directories (the .stats files and such will be rebuilt, but the
quarantine boxes won't), or move them into $DSPAM_HOME/data.
Step 7: Turn your MTA and CGIs back on, and TEST EVERYTHING.
FRESH INSTALLATION
First you will need to download a few prerequisite tools.
Depending on which storage driver you want to use, you will need:
libdb4_drv: Berkeley DB-4.
libdb3_drv: Berkeley DB-3.
mysql_drv: MySQL client libraries (and a server to connect to)
ora_drv: Oracle Call Interface (and a server to connect to)
pgsql_drv: PostgreSQL client libraries (and a server to connect to)
MySQL is the recommended storage driver, even for small implementations, as
it is more stable and tested than the other drivers. If you are incapable
of running a stateful server, the libdb drivers should suffice, but please
be aware that libdb can occasionally result in some problems including
data corruption and lock contention. As a result, you'll want to maintain
a backup of your dictionary in the event such problems arise.
In general, MySQL is a faster solution with a smaller storage footprint,
and is well suited for both small and large-scale implementations.
You can download Berkeley DB from http://www.sleepycat.com.
You can download MySQL from http://www.mysql.com.
You can download PostgreSQL from http://www.postgresql.com
You can obtain more information about Oracle at http://www.oracle.com.
Be sure the necessary libraries are available to root, the MTA user, and
the CGI user. The easiest way to do this is to copy them to /usr/lib or
/lib.
Documentation for the setup of your selected storage driver can be found
in the tools.[storage driver]/ directory.
NOTE: Some operating system distributions include their own version of
libdb3_drv and libdb4_drv. A majority of these packaged versions
work correctly with DSPAM; however, a few do not. If you experience
problems with one of the libdb storage drivers, consider downloading
and compiling the official source tree from http://www.sleepycat.com.
1. CONFIGURATION
./configure [options]
DSPAM supports the configuration options below.
PATH SWITCHES
--with-dspam-home=DIR
Specify an alternative storage directory for dspam user information. The
default is /var/dspam.
--prefix=DIR
Specify an alternative root prefix for installation. The default is
/usr/local. This does not affect DSPAM_HOME.
FILESYSTEM SCALE
The default filesystem scale is "small-scale", and writes each user to
its own directory in the top-level DSPAM_HOME/data directory.
The following two switches allow the scale to be changed to be more
suitable for larger installations.
--enable-large-scale
Switch for large-scale implementation. User data will be stored as
$DSPAM_HOME/data/u/s/user instead of $DSPAM_HOME/data/user
--enable-domain-scale
Switch for domain-scale implementation. When used, username@domain should
be passed in as the user id and user data will be stored as
$DSPAM_HOME/data/domain.com/user and $DSPAM_HOME/opt-in/domain/user.dspam
instead of $DSPAM_HOME/data/user
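The three filesystem scales can be compared with a short shell sketch that
prints where a user's data would live. user_data_dir is a hypothetical
helper written for this README, not a DSPAM tool, and it assumes that the
two extra levels in the large-scale layout are the first and second
characters of the user name, as the u/s/user example suggests.

```shell
# Hypothetical helper (not part of DSPAM): print the data directory
# for a user under each filesystem scale.
# $1 = scale (small|large|domain), $2 = DSPAM home, $3 = user id
user_data_dir() {
    scale=$1
    home=$2
    user=$3
    case "$scale" in
        large)
            # Assumption: the path levels are the first two characters of the name.
            first=$(printf '%s' "$user" | cut -c1)
            second=$(printf '%s' "$user" | cut -c2)
            echo "$home/data/$first/$second/$user" ;;
        domain)
            # The user id is passed in as username@domain.
            echo "$home/data/${user#*@}/${user%%@*}" ;;
        *)
            echo "$home/data/$user" ;;
    esac
}

user_data_dir small /var/dspam user
user_data_dir large /var/dspam user
user_data_dir domain /var/dspam user@domain.com
```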
INTEGRATION SWITCHES
--with-delivery-agent=PROG
The delivery agent is the tool called to deliver messages.
Use this to specify an alternative delivery agent, other than the one
specific to your operating system. If you are building on an unsupported
platform, you will need to specify this. You may use quotes if you wish to
include additional commandline flags. DSPAM will automatically relay the
commandline parameters it was initially given, with the exception of any
DSPAM-specific parameters (such as --user, --process, etc.). This does
not necessarily need to be a local agent, but can be configured to call
a proxy pass-through.
Currently, DSPAM has a default delivery agent selected for Linux,
FreeBSD, Solaris, and Cygwin platforms.
NOTE: When specifying a series of arguments, you will need to use quotes
around PROG. You may also use the $u identifier to specify that you
wish DSPAM to place the destination user's ID in the corresponding space
in the arguments list. For example:
Where $u will be replaced by the destination user prior to calling the LDA.
This could potentially cause problems, however, if your MTA requires the
user argument list to come last, which is why DSPAM, by default, will allow
you to set this in the MTA configuration.
NOTE: be sure to escape the $ in $u. Only do this when specifying $u on the
commandline. This will prevent $u from being overwritten with the shell's
environment variable 'u'. You may alternatively use %u.
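The quoting rules for $u can be tried out in a shell without building DSPAM.
The sketch below is an assumption-laden illustration: expand_agent is a
hypothetical function that only imitates the substitution described above
using sed, and the procmail commandline is an example value rather than a
recommended setting.

```shell
# Hypothetical imitation of the $u / %u substitution (not DSPAM code).
# $1 = delivery agent string as passed to --with-delivery-agent
# $2 = destination user
expand_agent() {
    printf '%s\n' "$1" | sed -e 's|\$u|'"$2"'|g' -e 's|%u|'"$2"'|g'
}

# Single quotes stop the shell from expanding $u itself.
expand_agent '/usr/bin/procmail -d $u' alice

# Inside double quotes the $ must be escaped.
expand_agent "/usr/bin/procmail -d \$u" alice

# %u needs no escaping at all.
expand_agent "/usr/bin/procmail -d %u" alice
```

All three calls print the same commandline. An unescaped $u inside double
quotes would instead be replaced by the shell variable 'u', usually an empty
string, before DSPAM's configure script ever saw it.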
--with-quarantine-agent=PROG
By default, DSPAM automatically quarantines spams in its internal
user quarantine box. If you wish to override this default behavior,
however, you may do so by specifying your own quarantine agent. The same
notes from the --with-delivery-agent option apply here. The quarantine agent
will be called whenever a message is believed to be spam, with the message
provided as stdin into the tool.
--enable-broken-mta
You should enable this if your MTA is broken and passes messages into DSPAM
with CTRL-M's (^M) in them.
--enable-broken-return-codes
Causes DSPAM to return an exit code of 99 if a message is spam, 0 if
innocent, and any other code if an error has occurred. The default is to
return 0 whenever the operation is successful, regardless of outcome. Only
use this if you know what you're doing!
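A wrapper script that calls a DSPAM binary built with
--enable-broken-return-codes could interpret the exit status along these
lines. This is a minimal sketch: classify_exit is a hypothetical name, and
the calling script is assumed to pass in the status it captured from DSPAM.

```shell
# Hypothetical helper (not part of DSPAM): label an exit status from a
# build using --enable-broken-return-codes.
# 99 = spam, 0 = innocent, anything else = error.
classify_exit() {
    case "$1" in
        0)  echo innocent ;;
        99) echo spam ;;
        *)  echo error ;;
    esac
}

# In a real wrapper the argument would be "$?" taken right after the
# DSPAM call; fixed values are used here so the sketch runs on its own.
classify_exit 99
classify_exit 0
classify_exit 1
```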
--with-storage-driver=DRIVER
Specify an alternative storage driver. A storage driver is a driver
written specifically for DSPAM to store tokens, signature data, and
perform other proprietary operations. The default driver is libdb4_drv,
which incorporates Berkeley DB v4. The following drivers have been provided:
libdb4_drv: Berkeley DB4 Library
libdb3_drv: Berkeley DB3 Library
mysql_drv: MySQL Drivers
ora_drv: Oracle Drivers (BETA)
pgsql_drv: PostgreSQL Drivers (BETA)
You may also need to use some of the driver-specific configure flags
(discussed later).
--enable-client-compression
Enables data source client compression for storage drivers where it is
available (presently only mysql_drv). This causes data between the
data source and its clients to be compressed. You should use this option
if your data source is on a separate machine from the DSPAM agent(s) as it
conserves bandwidth, but at the expense of a few CPU cycles.
--disable-trusted-user-security
Administrators who wish to disable trusted user security may do so by
using this configure flag. This will cause DSPAM to treat each user as
if they were "trusted" which could allow them to potentially execute
arbitrary commands on the server via DSPAM. Because of this, administrators
should only use this option on either a closed server, or configure their
DSPAM binary to be executable only by users who can be trusted. This
option SHOULD NOT be used as a solution to your MTA dropping privileges
prior to calling DSPAM. Instead, see the TRUSTED SECURITY section of this
document.
--enable-homedir-dotfiles
When enabled, instead of checking for $DSPAM_HOME/$USER/opt-in/
$USER[.nodspam|.dspam], DSPAM will check for a .nodspam|.dspam file in the
user's home directory. These two dotfiles are used for opt-out or opt-in
filtering.
--enable-opt-in
Causes DSPAM to filter mail only for users with a .dspam dotfile. The
default is opt-out, which requires a .nodspam file to exist to bypass
filtering.
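The opt-in and opt-out rules reduce to a single file test per user. The
sketch below assumes a build with --enable-homedir-dotfiles, so the dotfiles
are looked up in the user's home directory; should_filter is a hypothetical
function used for illustration and is not how DSPAM itself is implemented.

```shell
# Hypothetical check (not DSPAM code) mirroring the dotfile rules.
# $1 = mode (opt-in|opt-out), $2 = user's home directory
# Returns 0 when mail for this user should be filtered.
should_filter() {
    if [ "$1" = "opt-in" ]; then
        # Opt-in: filter only when the user has a .dspam file.
        [ -e "$2/.dspam" ]
    else
        # Opt-out (the default): filter unless the user has a .nodspam file.
        [ ! -e "$2/.nodspam" ]
    fi
}

if should_filter opt-out "$HOME"; then
    echo "mail for this user would be filtered"
else
    echo "mail for this user would bypass filtering"
fi
```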
DEBUGGING SWITCHES
--enable-debug
Turns on support for debugging output to DSPAM_HOME/dspam.debug and
DSPAM_HOME/dspam.messages (see description of --with-dspam-home option for
details about DSPAM_HOME). This option allows you to turn on debugging
messages for specific users by dropping a DSPAM_HOME/userpath/user.debug
file or for all users by dropping a DSPAM_HOME/.debug file. Enabling
debug only enables support for this feature; dotfiles must still be
dropped in order to turn messages on.
--enable-verbose-debug
Turns on extremely verbose debugging output to DSPAM_HOME/dspam.debug
and DSPAM_HOME/dspam.messages (see description of --with-dspam-home option
for details about DSPAM_HOME). Dotfiles must still be dropped in order to
activate messages, just like with --enable-debug.
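Dropping and removing the system-wide debug dotfile can be wrapped in two
one-line shell functions. The function names are hypothetical, and the
sketch assumes DSPAM was built with --enable-debug or
--enable-verbose-debug; without either switch the dotfile has no effect.

```shell
# Hypothetical helpers (not part of DSPAM).
# $1 = DSPAM home, for example /var/dspam

# Turn debugging messages on for all users by creating DSPAM_HOME/.debug.
enable_debug_all() {
    : > "$1/.debug"
}

# Turn them off again by removing the dotfile.
disable_debug_all() {
    rm -f "$1/.debug"
}

# Example (left commented out because it writes to the DSPAM home):
# enable_debug_all /var/dspam
```

The per-user DSPAM_HOME/userpath/user.debug file works the same way, with
userpath depending on the filesystem scale chosen at configure time.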
TRAINING SET IDENTIFICATION SWITCHES
The default behavior for DSPAM is to store all original training data
on the server-side as temporary information, and embed a serial number
into the body of each message referencing the data. This is used for
misclassifications and providing a true 1:1 retraining. Some
implementations may call for a slightly different approach to training set
identification.
--enable-signature-attachments
Instead of storing the DSPAM signatures on the server (which could take
up considerable disk space), this option will cause DSPAM to rewrite
each message to include a dspam.dat attachment, which contains all of
the tokens used to calculate the original message. When the spam or
false positive is processed back into the system, this signature will
be read. This may increase bandwidth by between 2k and 32k per
message on average, depending on the original message's size.
NOTE: This option doesn't work correctly with mail clients that quote an
embedded, forwarded message (such as some or all versions of elm) and
should only be used on networks where all clients can properly understand
an embedded multipart message (Outlook, Ximian Evolution, etc.), and
forward the attachment as an attachment instead of quoted text. In other
words, this breaks a lot of stuff if you're not on a standardized GUI-based
client network. Server-side signatures are still the most reliable method
and work for all known clients.
This also puts a "paper clip" on every message you receive.
--enable-signature-headers
This option will cause the DSPAM signature to be written to the message
header instead of body.
IMPORTANT: This option requires that all users either bounce their messages
into DSPAM, forward as an attachment, or implement some macro that will
retain the X-DSPAM-Signature header, which will NORMALLY BE DROPPED by
standard forwarding.
--enable-webmail
The webmail switch is designed for systems where the original message
remains server side and can therefore be presented in pristine format for
retraining. This option will cause DSPAM to cease all writing of
signatures and DSPAM headers to the message, and deliver the message in as
pristine a format as possible. This mode REQUIRES that the original message
in its pristine format (as of delivery) be presented for retraining, as in
the case of webmail or other applications where the message is actually
kept server-side during reading, and is preserved. DO NOT use this switch
unless the original message can be presented for retraining with the
ORIGINAL HEADERS and NO MODIFICATIONS.
--with-signature-life=DAYS
Specifies the default length (in days) a signature should remain stored on
the server. The default is 14 days. This value should accurately represent
the maximum amount of time a user would need to identify and forward
a missed spam, or mark a false positive. Consider vacations. This can
be changed in calls to dspam_clean on the commandline.
FEATURE ACTIVATION
--enable-neural-networking (EXPERIMENTAL)
Enables neural networking support (see the section NEURAL NETWORKING). This
feature is presently supported only by the mysql_drv and pgsql_drv
storage drivers, and is still considered experimental.
--enable-source-address-tracking
Logs the source address of spams and innocent messages via syslog.
You can create a file DSPAM_HOME/mta.whitelist which can contain a list of
local MTA IPs, which will cause DSPAM to skip to the next 'Received' header.
Each IP should be on a new line.
Also writes SBL blacklist files for use with the Streamlined Blackhole
Server (http://www.nuclearelephant.com/projects/sbl/).
--enable-spam-subject
Prepends [SPAM] to the subject header of any messages suspected to be spam.
This is sometimes more useful than the X-DSPAM-Result field, because not
all mail clients support mail rules with custom headers.
--disable-user-logging
Disables the writing of per-user .log files. Users will not be able to
view graphs or history with this feature disabled.
--disable-system-logging
Disables the writing of the system.log. Admins will not be able to view
graphs or other related information with this feature disabled.
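The mta.whitelist behavior described under --enable-source-address-tracking can be pictured with the following Python sketch. It assumes the relay addresses have already been extracted from the 'Received' headers, most recent hop first; the function names are hypothetical and DSPAM's own header parsing may differ.

```python
def load_whitelist(text):
    """Parse mta.whitelist contents: one IP per line, blank lines ignored."""
    return {line.strip() for line in text.splitlines() if line.strip()}

def source_address(received_ips, whitelist):
    """Return the first relay IP that is not a whitelisted local MTA.

    received_ips is ordered as the 'Received' headers appear in the
    message (most recent hop first). Returns None if every hop is local.
    """
    for ip in received_ips:
        if ip not in whitelist:
            return ip
    return None

whitelist = load_whitelist("10.0.0.1\n10.0.0.2\n")
hops = ["10.0.0.1", "10.0.0.2", "203.0.113.7"]
print(source_address(hops, whitelist))
```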
ALGORITHM ACTIVATION
The default algorithms enabled are quite sufficient, and represent the most
well-tested algorithms in DSPAM. It is not necessary to change any of
these options unless you are interested in altering DSPAM's default behavior.
--disable-traditional-bayesian
Disables the traditional Bayesian algorithm (it is enabled by default).
--disable-alternative-bayesian
Disables Brian Burton's alternative Bayesian algorithm. The differences are:
- 27 Samples are used instead of 15
- Tokens appearing more than once may take up to 2 slots in the
calculation. This is ideal when there is very limited data
(it is enabled by default)
--enable-robinson
Enables Robinson's geometric mean test. The differences are:
- A window-size of 25 is used instead of 15
- The combination algorithm is different. See:
http://radio.weblogs.com/0101454/stories/2002/09/16/spamDetection.html
for more information.
This algorithm is obsolete, and not recommended for production builds.
--enable-chi-square
Enables Fisher-Robinson's Inverse Chi-Square
Defaults in libdspam.c:
- Exclusionary radius of 0.45
- Ham/Spam Cutoff of 0.5
- Strength: 0.1
- Assumed probability: 0.5
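To give a feel for how these four defaults might fit together, here is a Python sketch of the generic Fisher-Robinson inverse chi-square combination as it is commonly published. It is not taken from libdspam.c. In particular, treating the exclusionary radius as "ignore tokens within 0.45 of the neutral value 0.5" is an assumption, and all function names are hypothetical.

```python
import math

def robinson_f(p, n, strength=0.1, assumed=0.5):
    """Robinson's adjusted probability for a token seen n times."""
    return (strength * assumed + n * p) / (strength + n)

def chi2q(x2, df):
    """Survival function of the chi-square distribution for even df."""
    m = x2 / 2.0
    term = total = math.exp(-m)
    for i in range(1, df // 2):
        term *= m / i
        total += term
    return min(total, 1.0)

def chi_square_score(probs, radius=0.45):
    """Combine token probabilities; near 0.0 is innocent, near 1.0 is spam."""
    # Assumption: the exclusionary radius drops tokens whose probability
    # lies within `radius` of the neutral value 0.5. Values are clamped
    # away from 0 and 1 so the logarithms below stay finite.
    used = [min(max(p, 1e-6), 1 - 1e-6) for p in probs if abs(p - 0.5) >= radius]
    if not used:
        return 0.5
    df = 2 * len(used)
    spamminess = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in used), df)
    hamminess = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in used), df)
    return (1.0 + spamminess - hamminess) / 2.0

def is_spam(probs, cutoff=0.5):
    """Apply the ham/spam cutoff to the combined score."""
    return chi_square_score(probs) > cutoff

print(is_spam([0.99, 0.99, 0.98]))
print(is_spam([0.01, 0.02]))
```

With a strength of 0.1, robinson_f stays close to the observed probability as soon as a token has been seen even once, and falls back to the assumed probability of 0.5 only for unseen tokens.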
NOTE: You may have multiple algorithms enabled simultaneously; if any of
the enabled algorithms believes the message is spam, it will be marked
accordingly. Naturally, you also inherit the false positives generated by
every enabled algorithm, so it is recommended to either stick with a
single algorithm, or use only Bayesian or only Robinson-type algorithms.
Bayesian+Alt-Bayesian seems to be the most effective combination (not
using Robinson's at all).
For this reason, if you plan on enabling any algorithms which are
disabled by default, it is strongly recommended that you also:
--disable-traditional-bayesian --disable-alternative-bayesian
Generally, the alternative-Bayesian algorithm appears to catch some spams
that the traditional Bayesian algorithm does not; however, it also misses
far more spams than the traditional algorithm. Therefore, an
implementation using both Bayesian algorithms appears to be the most
effective in catching spam.
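The interaction between simultaneously enabled algorithms, as described in the NOTE above, amounts to a logical OR over their individual verdicts. A minimal Python sketch, with made-up verdicts standing in for real algorithm output:

```python
def combined_verdict(verdicts):
    """A message is marked spam if any enabled algorithm says so."""
    return any(verdicts.values())

# Hypothetical per-algorithm results for a single innocent message.
verdicts = {"bayesian": False, "alt-bayesian": False, "chi-square": True}
print(combined_verdict(verdicts))
```

Because a single algorithm is enough to mark the message, each additional algorithm can only add false positives, never remove them.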
--disable-bias
When bias is disabled, dspam no longer biases the statistics in favor of
innocent mail, but weighs spam and innocent tokens equally in the
calculation. This may provide more effective spam filtering, but has been
shown to increase the number of false positives.
--enable-robinson-pvalues
Enables Robinson's technique for combining p-values. This is an alternative
approach to generating word probabilities described here:
http://www.linuxjournal.com/article.php?sid=6467
Robinson's p-values are presently used in Chi-Square calculations, but
enabling them with this flag will use them for *all* calculations, effectively
replacing (or rather building upon) Graham's tokenization approach. This
flag may also be used without enabling Chi-Square.
--disable-test-conditional
Disables test-conditional training. Test-conditional training is a more
aggressive approach to training than traditional training, and provides more
inoculous results rapidly.
Enabled by default, this mode of training will automatically re-train the
user's dictionary on spam or false positive until the training condition is
met (e.g. until the user's dictionary no longer results in
misclassification of the message being retrained). This retraining runs for
at most 5 iterations, and is only invoked when:
- The user has 1000 innocent messages in their corpus, and is reporting
a spam
- The user is reporting a false positive (regardless of the number of
messages in their corpus)
This method of training has its controversial points as well. All of these
issues revolve around the assumption this approach makes: that you are
likely to receive the same (or a very similar) message again one or more
times in the future.
- Since the message is being retrained repeatedly, the learning curve is
going to be based solely on that one message rather than the natural flow
of similar messages that may contain slightly different text.
- A user may aggressively train a spam they will only receive once, and
training this aggressively could increase their risk of false positives.
- If there is a significant overlap of dictionary tokens between a user's
regular mail and the incoming spams being aggressively trained, the user
could potentially end up retraining with spam, then retraining with
false positives, then retraining with spam again.
In spite of these controversial points, this approach to training has had
successful results with several implementations.
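The test-conditional loop described above can be sketched in Python as follows. This is an illustration under stated assumptions, not DSPAM's code: the classify and train callables, the ToyFilter class, and the choice to fall back to a single training pass when the conditions are not met are all hypothetical.

```python
MAX_ITERATIONS = 5      # maximum stated above
MIN_INNOCENT = 1000     # innocent messages required before spam reports qualify

def retrain(classify, train, message, label, innocent_count):
    """Train on a misclassified message and return the number of passes.

    classify(message) returns "spam" or "innocent"; train(message, label)
    updates the dictionary. Both callables are hypothetical stand-ins.
    """
    train(message, label)
    passes = 1
    # False positives always qualify; spam reports need a large enough corpus.
    qualifies = label == "innocent" or innocent_count >= MIN_INNOCENT
    while qualifies and passes < MAX_ITERATIONS and classify(message) != label:
        train(message, label)
        passes += 1
    return passes

class ToyFilter:
    """Toy model: a message counts as spam after `needed` spam trainings."""
    def __init__(self, needed):
        self.needed = needed
        self.score = 0
    def classify(self, message):
        return "spam" if self.score >= self.needed else "innocent"
    def train(self, message, label):
        self.score += 1 if label == "spam" else -1

f = ToyFilter(needed=3)
print(retrain(f.classify, f.train, "missed spam", "spam", innocent_count=1500))
```

The loop also makes the controversial points visible: every extra pass feeds the same message into the dictionary again, so a single report can carry the weight of up to five.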
DRIVER SPECIFIC CONFIGURE SWITCHES
libdb4_drv:
--with-db4-includes=DIR
Specify a path to the Berkeley db4 includes
--with-db4-libraries=DIR
Specify a path to the Berkeley db4 libraries
libdb3_drv:
--with-db3-includes=DIR
Specify a path to the Berkeley db3 includes
--with-db3-libraries=DIR
Specify a path to the Berkeley db3 libraries
(Currently links to -ldb3, so you may need to symlink libdb-3.3.so to
libdb3.so if it doesn't exist)
mysql_drv:
--with-mysql-includes=DIR
Specify a path to the MySQL includes
--with-mysql-libraries=DIR
Specify a path to the MySQL libraries
(Currently links to -lmysqlclient, also -lcrypto on some systems)
--enable-virtual-users
Tells DSPAM to create virtual user ids. Use this if your users don't
actually exist on the system (e.g. in /etc/passwd if using a password file)
NOTE: Please see the file tools.mysql_drv/README for more information
about configuring the mysql_drv storage driver.
pgsql_drv:
--with-pgsql-includes=DIR
Specify a path to the PgSQL includes
--with-pgsql-libraries=DIR
Specify a path to the PgSQL libraries
(Currently links to -lpq, and netlibs on some systems)
--enable-virtual-users
Tells DSPAM to create virtual user ids. Use this if your users don't
actually exist on the system (e.g. in /etc/passwd if using a password file)
NOTE: Please see the file tools.pgsql_drv/README for more information
about configuring the pgsql_drv storage driver.
ora_drv:
--with-oracle-home=DIR
Specify the Oracle Home (or client home)
--enable-virtual-users
Tells DSPAM to create virtual user ids. Use this if your users don't
actually exist on the system (e.g. in /etc/passwd if using a password file)
NOTE: Please see the file tools.ora_drv/README for more information
about configuring the ora_drv storage driver.
2. BUILDING AND INSTALLING
After you have run configure with the correct options, build and install
DSPAM by performing:
make && make install
If you are a developer wanting to link to the core engine of dspam,
libdspam will be built during this process. Please see the
example.c file for examples of how to link to and use libdspam. Static
and dynamic libraries are built in the .libs directory. Needed headers
will be installed in $prefix$/include/dspam.
3. PERMISSIONS
After install, the DSPAM_HOME will have been created for you automatically
(the default is /var/dspam). Ensure that the directory is writable by both
your MTA and CGI user.
You may need to add root and your MTA user to the directory's [mail] group
in /etc/group. The MTA user is usually 'daemon' or 'smmsp' although on
FreeBSD the default is 'mailnull'. This is very important, as your MTA
user needs to be able to lock and work with files.
IMPORTANT!!!
FreeBSD's mail.local changes its effective uid, and so in order to use it
dspam must be installed as setuid root to work on the commandline properly.
This is done automatically on install.
If you find that DSPAM is erroneously processing all operations as a single
user, chances are that user should be added to trusted.users as an
administrative user.
TRUSTED USERS SECURITY
DSPAM has tighter security for untrusted users on the system, to prevent
them from being able to spoof other users or specify their own passthru
arguments to potentially hijack the delivery agent. This method
of security has been implemented due to the fact that some implementations
(such as those using procmail) may require the DSPAM agent to be setuid or
setgid.
The trusted.users file should be created in $DSPAM_HOME (defaulted to
/var/dspam). This file should contain a list of trusted users who
should be allowed to set the dspam user, passthru parameters, and other
information that would be potentially dangerous for a malicious user to
be able to set. The file should contain one username per line, and will
generally list the usernames of the MTA and CGI users. Example:
root
smmsp
daemon
cgi
mailnull
Where cgi represents the special CGI user you configure Apache to
run your dspam.cgi as.
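The effect of the trusted.users list can be summarized with a short Python sketch. The helper names are hypothetical and the logic is a simplification of the behavior described in this section: a trusted caller may set the dspam user, while an untrusted caller is processed under their own name.

```python
def load_trusted_users(text):
    """Parse trusted.users contents: one username per line."""
    return {line.strip() for line in text.splitlines() if line.strip()}

def effective_dspam_user(caller, requested_user, trusted):
    """Pick the DSPAM user for a request (hypothetical helper).

    A trusted caller may act on behalf of requested_user; an untrusted
    caller is always processed under their own name.
    """
    return requested_user if caller in trusted else caller

trusted = load_trusted_users("root\nsmmsp\ndaemon\ncgi\nmailnull\n")
print(effective_dspam_user("smmsp", "bob", trusted))
print(effective_dspam_user("mallory", "bob", trusted))
```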
Be sure to examine DSPAM_HOME/dspam.debug to ensure that you don't get any
untrusted user warnings when submitting spam or a false positive, as both
of these actions frequently call dspam from a different user than
standard mail delivery.
If you are using an MTA that changes its userid before calling DSPAM to
match the destination user, you should NOT add each user to the trusted
users file, but instead configure a preset commandline. DSPAM will see
that the user is not trusted and automatically set their DSPAM user id
and optionally the passthru delivery agent arguments.
To override an untrusted user's passthru delivery agent arguments
(arguments which could be used to hijack the delivery agent to gain
privileged access to the system) you will need to set up a file called
untrusted.mailer_args in the same directory ($DSPAM_HOME). The first line
should contain the path to the delivery agent followed by a list of
all the LDA arguments to pass through (including a user identity flag if
necessary). This file's information will override any passthru commandline
parameters specified by the user. For example:
/bin/mail -d $u
The variable $u informs DSPAM that you would like the destination username
to be used in the position $u is specified, so when DSPAM calls your LDA
for user 'bob', it will call it with:
/bin/mail -d bob
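The $u expansion shown above can be mimicked in a few lines of Python. This is only a sketch of the substitution: the function name is hypothetical, and splitting the line into arguments before substituting is a choice made for this example, not something this README specifies about DSPAM.

```python
import shlex

def build_lda_command(mailer_args_line, username):
    """Expand $u in the first line of untrusted.mailer_args (sketch).

    The line is split into arguments first and $u is substituted
    afterwards, so the username always stays a single argument.
    """
    return [arg.replace("$u", username) for arg in shlex.split(mailer_args_line)]

print(build_lda_command("/bin/mail -d $u", "bob"))
```

Substituting after the split means a username containing spaces or option-like text cannot introduce extra arguments to the delivery agent.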
NOTE: In the event that ALL of the following are true:
- Your MTA performs a setuid() to the destination user prior to calling
DSPAM
- There are additional _dynamically assigned_ parameters that must be
passed to DSPAM which cannot be specified in configuration
- The delivery agent has no potentially dangerous commandline
options, or you are placing a wrapper around the delivery
agent
Then you may want to remove the untrusted.mailer_args file altogether.
If the file cannot be found, dspam will permit the user to specify their
own passthru arguments to the preconfigured LDA (with some basic sanity
checking), which COULD POTENTIALLY BE INSECURE if improperly set up. It
is strongly recommended you use this file to override the user.
DSPAM logs a warning when it is unable to open the untrusted.mailer_args
file. If you don't want to see this warning, make sure the
untrusted.mailer_args file exists, but leave it empty.
4. SERVER CONFIGURATION
There are two ways DSPAM can be configured:
Mail Server: The default approach integrates DSPAM directly with the mail
server and filters spam as mail comes in.
POP3 Proxy: The alternative approach implements a POP3 proxy where users
connect to the proxy to check their email, and email is filtered when
being downloaded. The POP3 proxy is a much easier approach, as it
requires much less integration work with the mail server (and is ideal
for implementing DSPAM on Exchange, etcetera).