
Usage
=====

    kbs [config-file]

If config-file is not provided, a default of $HOME/.kbsrc is used.


Description
===========

kbs works on the simple premise that most people know who will be sending them
email, or at least have a few trusted domains.  It also makes it easy to
filter on a special keyword in the subject header.


Config file format
==================

The kbs config file is read from $HOME/.kbsrc (or the parameter supplied) and
has sections in the suggested order:

[config settings]

[include]

[trusted settings]

[blacklist settings]

[subject macros]

[domain settings]

It is *strongly* suggested to make .kbsrc readable only by the user, because
passwords are stored in it.
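
As a minimal sketch of the permission advice above (a plain "chmod 600" from
the shell does the same job), the following restricts a file to its owner.
The function name is made up for illustration and is not part of kbs.

```python
import os
import stat


def restrict_to_owner(path):
    """Make path readable and writable by its owner only (mode 0600)."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```

For example, restrict_to_owner(os.path.expanduser("~/.kbsrc")) would apply
this to the default config file.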

Blank lines and text preceded by a hash (#) are ignored.

Each token is delimited by white space - if the value you want to use
includes spaces, simply quote it (with either single or double quotes).

Escapes aren't used (to make regular expression writing easier), so if you want
to include a quote in a string, simply use the other sort of quote.  If you
want both, sorry!
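
The quoting rules above can be sketched as a small tokenizer.  This is only
an illustration of the described behaviour, not the kbs parser itself: the
function name is hypothetical, and the handling of a '#' that appears in the
middle of an unquoted token (kept as part of the token here) is an
assumption.

```python
def tokenize(line):
    """Split one config line into tokens; blank and comment lines give []."""
    tokens = []
    i, n = 0, len(line)
    while i < n:
        ch = line[i]
        if ch.isspace():
            i += 1                      # skip white space between tokens
        elif ch == "#":
            break                       # rest of the line is a comment
        elif ch in "'\"":
            # Quoted token: runs to the next quote of the same kind.
            # There are no escapes, so the other kind of quote is literal.
            end = line.find(ch, i + 1)
            if end == -1:
                raise ValueError("unterminated quote")
            tokens.append(line[i + 1:end])
            i = end + 1
        else:
            # Bare token: runs to the next white space character.
            start = i
            while i < n and not line[i].isspace():
                i += 1
            tokens.append(line[start:i])
    return tokens
```

With this sketch, a value containing double quotes is written inside single
quotes and vice versa, which matches the "use the other sort of quote" rule.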

To see an example kbsrc file, see kbsrc in the src directory.


Config settings
===============

The config settings are in the general form:

    set variable=value

And understand the following variables:


    hostname <FQDN>	The Fully Qualified Domain Name of the POP3 server.
			Defaults to localhost.

    port <port number>	The port number to connect to.
			Defaults to 110.

    username <username>	The username to give to the server.
			Defaults to guest.

    password <password>	The password to give to the server.
			Defaults to an empty string.

    log <path>		Logs deleted messages to the given file, which is
    			appended to.  Note that non-deletion diagnostics go to
			stderr.
			If not set, or set to an invalid path, logging goes to
			$HOME/.kbs-deletelog unless the variable is set to
			a dash (-), in which case logging is disabled.

    verbose <on|off>	Whether logging is verbose.  If it's verbose then
    			the username, subject and a trailing report are logged.
			If logging is not verbose, then just usernames and
			the reason the message was blocked are logged.
			Defaults to verbose (on).

    showmatch <on|off>	Whether the matching expression (if applicable) is
    			put in the log file.
			Defaults to showing the matching expression (on).

    timeout <seconds>	Number of seconds to allow for no response from the
    			server.
			Defaults to 60 seconds.

    casesense <on|off>	Whether regular expressions are case sensitive.
			Defaults to case insensitive (off).

    dejunk <on|off>	Whether subjects are dejunked before checking.
    			Dejunking here means that anything that is not an
			alphanumeric character is stripped, and all contiguous
			white space is reduced to one space.
			Defaults to off.

    blockhtml <on|off>	Whether messages that are pure HTML (content part just
    			reported as "text/html") are blocked.  Also blocks
			messages with an empty, or missing, content type.
			WARNING: this will delete mail sent with old command
			line mailers as they don't set a content-type - however,
			plenty of spam has no valid content-type either.
			Defaults to off.

    testmode <on|off>	Whether things will be really deleted, or just
    			the actions logged.  The log is still filled out as if
			the deletion occurred.
			Defaults to off, though it is recommended to use this
			for early runs to ensure your rules are not too harsh
			(or too easy for that matter).

    dumpheaders <on|off>
    			Simply dumps the parsed From and Subject for each
			message read from the server into the delete log.
			Mainly for debugging purposes.
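
To illustrate the dejunk setting described above, here is a rough sketch.
It assumes that dejunking removes every character that is neither
alphanumeric nor white space and then collapses runs of white space; the
exact character classes kbs uses are not stated here, so treat the pattern
as a guess.

```python
import re


def dejunk(subject):
    """Strip non-alphanumeric junk and squeeze white space to single spaces."""
    # Assumption: white space survives the stripping step so that it can
    # be collapsed afterwards.
    kept = re.sub(r"[^A-Za-z0-9\s]", "", subject)
    return re.sub(r"\s+", " ", kept)
```

Under these assumptions a subject such as "F*R*E*E   m.o.v.i.e.s!!!" is
reduced to "FREE movies" before the subject rules are checked.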


Include files
=============

Include files are useful if you want to scan multiple hosts by using multiple
config files, e.g.

In file 1:

set hostname	host1
set username	user1
set password	pass1

include "/etc/kbsrc"


In file 2:

set hostname	host2
set username	user2
set password	pass2

include "/etc/kbsrc"


Then put all your rules as needed in /etc/kbsrc.


Trusted settings
================

These define users and domains for which mail is let through, regardless of
other tests.  These are not regular expressions - they are simply compared
with a case-insensitive test.

    trusted_users
    {
    	username
	[username]
    }

    trusted_domains
    {
    	domain
	[domain]
    }


Blacklist settings
==================

These define domains for which mail will be deleted with extreme prejudice.

    blacklist
    {
    	<domain regular expression>
    	[<domain regular expression>]
    }

Note that unlike the trusted commands these are regular expressions.  It's
probably fairly useless as I'd imagine most messages will have butchered
'From:' fields, but heck, it's there if you need it.


Subject Macros
==============

This command is simply to ease the writing of regular expressions where you
want to match expressions where numbers or alternative characters are
often used to try and fool filters:

    subject_macro <source string> <replacement string>

e.g.

    subject_macro "i" "[i1!l]"	# Common spam spellings for 'i'
    subject_macro "e" "[e3]"
    subject_macro "s" "[s5]"
    ...
    ...
    disallow_subject " ?free movies ?"

Subject macros are only expanded one time - if the result of an expansion
includes another macro, that will not be expanded.  Also remember
that macros are case sensitive - this allows easy constructions like this
which would break if the 'i' in the '[]' was expanded:

    set casesense off
    ...
    ...
    subject_macro "i" "[i1!l]"
    ...
    ...
    disallow_subject "i...[I12]"

Even though it's not obvious from the examples above, the source string can
be any number of characters.
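
The single-pass, case-sensitive expansion described above can be sketched as
follows.  This is an illustration only: the function name is hypothetical,
and trying longer source strings before shorter ones is an assumption, since
the order kbs uses when several macros could match is not stated.

```python
def expand_macros(pattern, macros):
    """Expand macros in pattern once, left to right, case sensitively."""
    # Assumption: longer source strings take priority over shorter ones.
    sources = sorted(macros, key=len, reverse=True)
    out = []
    i = 0
    while i < len(pattern):
        for src in sources:
            if src and pattern.startswith(src, i):
                # Emit the replacement and move past the source string, so
                # the replacement text itself is never scanned again.
                out.append(macros[src])
                i += len(src)
                break
        else:
            out.append(pattern[i])      # no macro starts here: copy as-is
            i += 1
    return "".join(out)
```

With the macro "i" -> "[i1!l]", the pattern "i...[I12]" becomes
"[i1!l]...[I12]": the lower case 'i' inside the replacement is not expanded
again, and the upper case 'I' is left alone because macros are case
sensitive.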


Domain settings
===============

These define the rules applied to a certain domain.  The order domains
appear in is important, as the first match found when checking domain names
will be used.

    domain <regular expression>
    {
	[default block|allow]
    	[block_user <username>]
	[allow_to <regular expression>]
    	[allow_subject <regular expression>]
	[block_subject <regular expression>]
    }

The default says what to do if neither allow_subject nor block_subject
matches.  If not specified, the default is to allow.

The block_user command blocks a specific username.  For instance, I've
noticed that spammers have a great love of emailing from your username at
a different domain.

The allow_to sets a regular expression that the 'To:' line in the header
data must match.  This can be handy as for some reason, even though spam may
be directed to your inbox, the 'To:' line will actually read a nonsense name
for your host.  Note that it's not recommended to anchor the regular
expression: if the mail has been sent to multiple recipients, it is not
guaranteed where your name will appear in the list of users.

The allow_subject means that subjects that match that regular expression are
always let through.

The block_subject means that subjects that match that regular expression are
always blocked and deleted.

Multiple allow and block commands can be in one domain.

The commands inside a domain can appear in any order, but the checks are always
done in this order:

 1. If a trusted domain, allow message.
 2. If a trusted user, allow message.
 3. If an HTML message and these are blocked, delete message.
 4. If the domain is blacklisted, delete message.
 5. If the domain is not matched in a domain command, allow message.
 6. If the subject is allowed for the domain, allow message.
 7. If an allow_to has been set for the domain, and it doesn't match,
    delete message.
 8. If the username is blocked for the domain, delete message.
 9. If the subject is disallowed for the domain, delete message.
10. Delete the message if the default is to block, otherwise allow.
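
The ten checks above can be sketched as a single function.  This is a
reading of the list, not the kbs source: the dictionary keys, the pre-parsed
message fields and the use of Python's re module as a stand-in for whatever
regular expression library kbs uses are all assumptions, as is comparing
block_user names case insensitively.  Trusted names are assumed to be stored
in lower case.

```python
import re


def decide(msg, cfg):
    """Return "allow" or "delete" for one message (hypothetical structures)."""
    flags = 0 if cfg.get("casesense") else re.IGNORECASE

    def hit(pattern, text):
        return re.search(pattern, text, flags) is not None

    user, domain = msg["user"].lower(), msg["domain"].lower()
    if domain in cfg["trusted_domains"]:                      # step 1
        return "allow"
    if user in cfg["trusted_users"]:                          # step 2
        return "allow"
    if cfg.get("blockhtml") and msg["is_html"]:               # step 3
        return "delete"
    if any(hit(p, msg["domain"]) for p in cfg["blacklist"]):  # step 4
        return "delete"
    # The first domain command whose expression matches is the one used.
    rules = next((r for r in cfg["domains"]
                  if hit(r["pattern"], msg["domain"])), None)
    if rules is None:                                         # step 5
        return "allow"
    if any(hit(p, msg["subject"])
           for p in rules.get("allow_subject", [])):          # step 6
        return "allow"
    allow_to = rules.get("allow_to")
    if allow_to is not None and not hit(allow_to, msg["to"]): # step 7
        return "delete"
    if user in rules.get("block_user", []):                   # step 8
        return "delete"
    if any(hit(p, msg["subject"])
           for p in rules.get("block_subject", [])):          # step 9
        return "delete"
    return "delete" if rules.get("default") == "block" else "allow"  # step 10
```

Note how an allowed subject (step 6) wins over allow_to, block_user and
block_subject, while the trusted lists win over everything else.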



-------------------------------------------------------------------------------
$Id: INSTRUCTION,v 1.8 2004-08-22 19:03:21 ianc Exp $