Usage
=====

    kbs [config-file]

If config-file is not provided, a default of $HOME/.kbsrc is used.


Description
===========

kbs works on the simple premise that most people know who will be sending them
email, or at least have a few trusted domains.  It also makes it easy to allow
or block mail based on a special keyword in the subject header.


Config file format
==================

The kbs config file is read from $HOME/.kbsrc (or the parameter supplied) and
has the following sections, in the suggested order:

[config settings]

[include]

[trusted settings]

[blacklist settings]

[subject macros]

[domain settings]

It is *strongly* suggested to make .kbsrc readable only by the user, because
passwords are stored in it.
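
As a minimal sketch of the permission advice above (assuming a POSIX file
system; the helper names are hypothetical and are not part of kbs), the file
can be restricted to its owner and then checked like this:

```python
import os
import stat


def restrict_to_owner(path):
    """Make path readable and writable by its owner only (mode 0600)."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)


def is_private(path):
    """Return True if neither group nor others have any access to path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0
```

For example, restrict_to_owner(os.path.expanduser("~/.kbsrc")) would lock down
the default config file.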

Blank lines and text preceded by a hash (#) are ignored.

Each token is delimited by white space - if the value you want to use
includes spaces, simply quote it (with either single or double quotes).

Escapes aren't used (to make regular expression writing easier), so if you want
to include a quote in a string, simply use the other sort of quote.  If you
want both, sorry!
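
The quoting rules above can be sketched roughly as follows.  This is an
illustration in Python, not the kbs parser itself: the function name is
hypothetical, and it assumes that a hash inside quotes is kept as literal
text and that an unterminated quote is an error.

```python
def tokenize(line):
    """Split one config line into tokens using the rules described above."""
    tokens = []
    i, n = 0, len(line)
    while i < n:
        ch = line[i]
        if ch.isspace():
            # White space only separates tokens
            i += 1
        elif ch == "#":
            # Rest of the line is a comment
            break
        elif ch in "'\"":
            # Quoted token: runs to the next quote of the same sort.
            # There are no escapes, so the other sort of quote is literal.
            end = line.find(ch, i + 1)
            if end == -1:
                raise ValueError("unterminated quote")  # assumed behaviour
            tokens.append(line[i + 1:end])
            i = end + 1
        else:
            # Bare token: runs to the next white space
            start = i
            while i < n and not line[i].isspace():
                i += 1
            tokens.append(line[start:i])
    return tokens
```

With this sketch, a line such as block_subject 'say "hi"' yields the two
tokens block_subject and say "hi".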

For an example config file, see kbsrc in the src directory.


Config settings
===============

The config settings are in the general form:

    set variable value

The following variables are understood:


    hostname <FQDN>	The Fully Qualified Domain Name of the POP3 server.
			Defaults to localhost.

    port <port number>	The port number to connect to.
			Defaults to 110.

    username <username>	The username to give to the server.
			Defaults to guest.

    password <password>	The password to give to the server.
			Defaults to an empty string.

    log <path>		Logs deleted messages to the given file, which is
			appended to.  Note that non-deletion diagnostics go to
			stderr.
			If not set, or set to an illegal path, logging goes
			to $HOME/.kbs-deletelog.  If set to a dash (-),
			logging is disabled.

    verbose <on|off>	Whether logging is verbose.  If it is verbose then
			the username, subject and a trailing report are logged.
			If logging is not verbose, then just usernames and
			the reason the message was blocked are logged.
			Defaults to verbose (on).

    showmatch <on|off>	Whether the matching expression (if applicable) is
    			put in the log file.
			Defaults to showing the matching expression (on).

    timeout <seconds>	Number of seconds to allow for no response from the
    			server.
			Defaults to 60 seconds.

    casesense <on|off>	Whether regular expressions are case sensitive.
			Defaults to case insensitive (off).

    dejunk <on|off>	Whether subjects are dejunked before checking.
			Dejunking here means that anything that is not an
			alphanumeric character is stripped, and all contiguous
			white space is reduced to one space.
			Defaults to off.

    blockhtml <on|off>	Whether messages that are pure HTML (content part just
			reported as "text/html") are blocked.  Also blocks
			messages with an empty, or missing, content type.
			WARNING: this will delete mail sent with old command
			line mailers as they don't set a content-type - however,
			plenty of spam has no valid content-type either.
			Defaults to off.

    testmode <on|off>	Whether things will be really deleted, or just
    			the actions logged.  The log is still filled out as if
			the deletion occurred.
			Defaults to off, though it is recommended to use this
			for early runs to ensure your rules are not too harsh
			(or too easy for that matter).
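
To illustrate what the dejunk setting described above does to a subject, here
is a minimal sketch in Python.  It is not the kbs code: the function name is
hypothetical, "alphanumeric" is assumed to mean ASCII letters and digits,
stripped characters are assumed to be removed outright rather than replaced,
and leading or trailing white space is left alone.

```python
import re


def dejunk(subject):
    """Strip non-alphanumeric junk and squeeze runs of white space."""
    # Remove everything that is neither a letter, a digit nor white space
    kept = re.sub(r"[^A-Za-z0-9\s]", "", subject)
    # Reduce each run of white space to a single space
    return re.sub(r"\s+", " ", kept)
```

Under those assumptions, a subject such as "F.R.E.E   m-o-v-i-e-s!!!" becomes
"FREE movies" before the subject rules are checked.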


Include files
=============

Include files are useful if you want to scan multiple hosts by using multiple
config files, e.g.

In file 1:

    set host		host1
    set username	user1
    set password	pass1

    include "/etc/kbsrc"


In file 2:

    set host		host2
    set username	user2
    set password	pass2

    include "/etc/kbsrc"


Then put all your rules as needed in /etc/kbsrc.


Trusted settings
================

These define users and domains for which mail is let through, regardless of
other tests.  These are not regular expressions - they are simply compared
with a case-insensitive test.

    trusted_users
    {
    	username
	[username]
    }

    trusted_domains
    {
    	domain
	[domain]
    }


Blacklist settings
==================

These define domains for which mail will be deleted with extreme prejudice.

    blacklist
    {
    	<domain regular expression>
    	[<domain regular expression>]
    }

Note that unlike the trusted commands these are regular expressions.  It's
probably fairly useless as I'd imagine most messages will have butchered
'From:' fields, but heck, it's there if you need it.


Subject Macros
==============

This command eases the writing of regular expressions for subjects in which
numbers or alternative characters are often used to try to fool filters:

    subject_macro <source string> <replacement string>

e.g.

    subject_macro "i" "[i1!l]"	# Common spam spellings for 'i'
    subject_macro "e" "[e3]"
    subject_macro "s" "[s5]"
    ...
    ...
    disallow_subject " ?free movies ?"

Subject macros are only expanded once - if the result of an expansion
includes another macro, that will not be expanded.  Also remember
that macros are case sensitive - this allows easy constructions like the
following, which would break if the 'I' in the '[]' were expanded:

    set casesense off
    ...
    ...
    subject_macro "i" "[i1!l]"
    ...
    ...
    disallow_subject "i...[I12]"

Even though it's not obvious from the examples above, the source string can
be any number of characters.
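
The single-pass, case-sensitive expansion described above can be sketched as
follows.  This is an illustration only, not the kbs implementation: the
function name is hypothetical, and it assumes that where several macros could
match at the same position, the one defined first wins.

```python
def expand_macros(pattern, macros):
    """Expand (source, replacement) macros in pattern, in one pass.

    The pattern is scanned left to right.  Replacement text is copied
    straight to the output and never rescanned, so a macro that appears
    inside a replacement is not expanded again.
    """
    out = []
    pos = 0
    while pos < len(pattern):
        for source, replacement in macros:
            # str.startswith is case sensitive, as the macros are
            if source and pattern.startswith(source, pos):
                out.append(replacement)
                pos += len(source)
                break
        else:
            # No macro matches here: copy the character unchanged
            out.append(pattern[pos])
            pos += 1
    return "".join(out)
```

With the three macros from the first example, " ?free movies ?" expands to
" ?fr[e3][e3] mov[i1!l][e3][s5] ?", and "i...[I12]" expands to
"[i1!l]...[I12]" because the upper case 'I' is not a macro.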


Domain settings
===============

These define the rules applied to a certain domain.  The order domains
appear in is important, as the first match found when checking domain names
will be used.

    domain <regular expression>
    {
	[default block|allow]
    	[block_user <username>]
	[allow_to <regular expression>]
    	[allow_subject <regular expression>]
	[block_subject <regular expression>]
    }

The default says what to do if neither allow_subject nor block_subject is
matched.  If not specified, the default is to allow.

The block_user command allows a specific username to be blocked.  For instance,
I've noticed that spammers have a great love of emailing from your username at
a different domain.

The allow_to sets a regular expression that the 'To: ' line in the header
data must match.  This can be handy because, for some reason, even though spam
may be directed to your inbox, the 'To:' line will actually read a nonsense
name for your host.  Anchoring the regular expression is not recommended: if
the mail has been sent to multiple recipients, there is no guarantee where
your name will appear in the list of users.
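
The point about anchoring can be seen with a small sketch.  Python's re
module stands in here for whatever regular expression library kbs uses, and
the function name and addresses are made up for illustration:

```python
import re


def allow_to_matches(pattern, to_line):
    """Return True if pattern is found anywhere in the 'To:' line."""
    # Case insensitive, as with the casesense default of off
    return re.search(pattern, to_line, re.IGNORECASE) is not None


# A 'To:' line with several recipients; our address is in the middle
to_line = "alice@example.org, me@example.com, bob@example.net"
```

Here allow_to_matches(r"me@example\.com", to_line) succeeds, while the
anchored pattern r"^me@example\.com" fails because another recipient comes
first in the list.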

The allow_subject means that subjects that match that regular expression are
always let through.

The block_subject means that subjects that match that regular expression are
always blocked and deleted.

Multiple allow and block commands can be in one domain.

The commands inside a domain can appear in any order, but the checks are always
done in this order:

 1. If a trusted domain, allow message.
 2. If a trusted user, allow message.
 3. If an HTML message and these are blocked, delete message.
 4. If the domain is blacklisted, delete message.
 5. If the domain is not matched in a domain command, allow message.
 6. If the subject is allowed for the domain, allow message.
 7. If an allow_to has been set for the domain, and it doesn't match,
    delete message.
 8. If the username is blocked for the domain, delete message.
 9. If the subject is disallowed for the domain, delete message.
10. Delete the message if the default is to block, otherwise allow.
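
The order of checks above can be sketched as a single function.  This is a
rough model and not the kbs source: every class, field and function name is
hypothetical, Python's re module stands in for the real regular expression
library, matching is case insensitive throughout (casesense off), and the
block_user comparison is assumed to be a plain case-insensitive compare.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Domain:
    pattern: str
    default_block: bool = False
    block_users: list = field(default_factory=list)
    allow_to: list = field(default_factory=list)
    allow_subject: list = field(default_factory=list)
    block_subject: list = field(default_factory=list)


@dataclass
class Config:
    trusted_users: list = field(default_factory=list)
    trusted_domains: list = field(default_factory=list)
    blacklist: list = field(default_factory=list)
    domains: list = field(default_factory=list)
    blockhtml: bool = False


@dataclass
class Message:
    user: str            # sender's username
    domain: str          # sender's domain
    to: str              # contents of the 'To:' line
    subject: str
    content_type: str = "text/plain"


def _in(name, names):
    """Plain case-insensitive comparison (no regular expressions)."""
    return name.lower() in (n.lower() for n in names)


def _any(patterns, text):
    """True if any regular expression in patterns matches text."""
    return any(re.search(p, text, re.IGNORECASE) for p in patterns)


def decide(msg, cfg):
    """Return 'allow' or 'delete', following steps 1 to 10 in order."""
    if _in(msg.domain, cfg.trusted_domains):                      # 1
        return "allow"
    if _in(msg.user, cfg.trusted_users):                          # 2
        return "allow"
    html = msg.content_type.strip().lower() in ("", "text/html")
    if cfg.blockhtml and html:                                    # 3
        return "delete"
    if _any(cfg.blacklist, msg.domain):                           # 4
        return "delete"
    # The first domain command whose expression matches is used
    dom = next((d for d in cfg.domains
                if re.search(d.pattern, msg.domain, re.IGNORECASE)), None)
    if dom is None:                                               # 5
        return "allow"
    if _any(dom.allow_subject, msg.subject):                      # 6
        return "allow"
    if dom.allow_to and not _any(dom.allow_to, msg.to):           # 7
        return "delete"
    if _in(msg.user, dom.block_users):                            # 8
        return "delete"
    if _any(dom.block_subject, msg.subject):                      # 9
        return "delete"
    return "delete" if dom.default_block else "allow"             # 10
```

Writing it this way makes the consequences of the ordering easy to see: a
trusted domain wins over blockhtml, and an allowed subject wins over both
allow_to and block_user.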


-------------------------------------------------------------------------------
$Id: INSTRUCTION,v 1.7 2004-01-26 02:01:49 ianc Exp $