[e2e] latest spate of cruft postings to e2e

Joe Touch touch at ISI.EDU
Thu Nov 6 15:15:33 PST 2003


Vernon Schryver wrote:

>>From: Joe Touch <touch at ISI.EDU>
> 
> 
>>...
>>The spam filtering on the list is intended to reduce substantial 
>>bandwidth abuse to subscribers, but is not intended as a substitute for 
>>local, receiver-side spam filters.
> 
> I doubt there is such a thing as a "receiver-side spam filter" that
> could reject the spam through this list and not have a high false
> positive rate on other streams (i.e. reject more than 0.1% of legitimate
> mail).  Spam is unsolicited bulk email.  You can filter it with
> 
>   - domain name and SMTP client IP address blacklists
>       These can have low false positive rates but cannot be
>       applied to the bulk mail from this mailing list.
> 
>   - keyword and other scoring filters including so-called "Bayesian"
>     systems
>       Except for some individuals and for them only some of the time,
>       these have non-trivial false positive rates.
> 
>   - procmail and other regular expressions
>       These have worse false positive and false negative rates than
>       the preceding.
> 
>   - Brightmail, Postini, etc.
>       Can you apply those to classic mailing lists?
> 
> The basic problem is that there is no technical difference between
> the legitimate bulk mail from this list and bulk mail from Ralsky etc.
> The only defining difference between spam, or unsolicited bulk mail,
> and other bulk mail is whether the target solicited it.  Many and
> probably most spam filters require that the bulk mail sent via this
> list be white-listed to encode that difference in "solicitedness."
> But that necessarily passes the objectionable bulk mail sent to
> this list to our mailboxes.
> 
> You should suppress a little of your pride in the filtering you are
> doing and consider better spam-filtering on the input side of this
> mailing list.

I don't have any pride in the filtering, FWIW. As I said before, it's 
clearly not optimized.

> Contrary to your repeated claims, zillions of lists
> prove that it is perfectly possible to have an open mailing list that
> does not pass spam, where "open" is defined very liberally.   For
> example, at the low legitimate traffic of this list, you could manually
> approve submissions from non-subscribers.   Also contrary to your
> claims and as demonstrated by the recent note about a rejection for
> a "suspicious header," this mailing list is not entirely "open."  You
> are clearly doing some filtering.

The list is open except as posted in the list policy. There are two sets 
of filters, FYI:

	1- automated anti-spam filters
	2- manual filters to enforce list policy

Occasionally we get false positives on #2; those are tagged with a 
"review" header, which is what Mailman says is so "suspicious".
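
As a rough illustration of a two-stage arrangement like the one described 
above, here is a minimal sketch. It is not the list's actual Mailman 
configuration: the names Message, looks_like_spam, violates_policy and 
triage, the phrase list, and the size limit are all hypothetical 
stand-ins. Stage 1 discards what the automated filter flags; stage 2 
holds anything that trips a policy rule for manual review.

```python
from dataclasses import dataclass

@dataclass
class Message:
    sender: str
    subject: str
    body: str

# Hypothetical stand-ins; a real list would use its own rules.
SPAM_PHRASES = ("make money fast", "act now")
MAX_BODY_CHARS = 10_000

def looks_like_spam(msg: Message) -> bool:
    """Stage 1: automated anti-spam check (toy phrase match)."""
    text = (msg.subject + " " + msg.body).lower()
    return any(phrase in text for phrase in SPAM_PHRASES)

def violates_policy(msg: Message) -> bool:
    """Stage 2: policy check (hypothetical rule: oversized posts)."""
    return len(msg.body) > MAX_BODY_CHARS

def triage(msg: Message) -> str:
    """Return 'discard', 'review' (held for a moderator), or 'accept'."""
    if looks_like_spam(msg):
        return "discard"
    if violates_policy(msg):
        return "review"
    return "accept"
```

A message that returns "review" is the case where a false positive can 
occur: a moderator then passes, discards, or rejects it by hand.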

> That note was very familiar to everyone who runs Mailman.  We all
> (can) receive notifications from Mailman to pass, discard, or reject
> such "suspicious" messages as well as messages from non-subscribers.
> 
> Something puzzles me about the spam I see from this list.  For
> months it has consistently had DCC target counts of "Many" and other
> body checksum counts of 0.  Why is that?  Are you guys at ISI.edu
> feeding some spam to the DCC but with some header or trailer added to
> the bodies?  If so, why not run the DCC on the input side of this list?
> 
> Vernon Schryver    vjs at rhyolite.com

We do run filters on the input side.



