[e2e] latest spate of cruft postings to e2e
Joe Touch
touch at ISI.EDU
Thu Nov 6 15:15:33 PST 2003
Vernon Schryver wrote:
>>From: Joe Touch <touch at ISI.EDU>
>
>
>>...
>>The spam filtering on the list is intended to reduce substantial
>>bandwidth abuse to subscribers, but is not intended as a substitute for
>>local, receiver-side spam filters.
>
> I doubt there is such a thing as a "receiver-side spam filter" that
> could reject the spam through this list and not have a high false
> positive rate on other streams (i.e. reject more than 0.1% of legitimate
> mail). Spam is unsolicited bulk email. You can filter it with
>
> - domain name and SMTP client IP address blacklists
>     These can have low false positive rates but cannot be
>     applied to the bulk mail from this mailing list.
>
> - keyword and other scoring filters, including so-called "Bayesian"
>   systems
>     Except for some individuals, and for them only some of the time,
>     these have non-trivial false positive rates.
>
> - procmail and other regular expressions
>     These have worse false positive and false negative results than
>     the preceding.
>
> - Brightmail, Postini, etc.
>     Can you apply those to classic mailing lists?
>
> The basic problem is that there is no technical difference between
> the legitimate bulk mail from this list and bulk mail from Ralsky etc.
> The only defining difference between spam, or unsolicited bulk mail,
> and other bulk mail is whether the target solicited it. Many and
> probably most spam filters require that the bulk mail sent via this
> list be white-listed to encode that difference in "solicitedness."
> But that necessarily passes the objectionable bulk mail sent to
> this list to our mailboxes.
>
> You should suppress a little of your pride in the filtering you are
> doing and consider better spam-filtering on the input side of this
> mailing list.
I don't have any pride in the filtering, FWIW. As I said before, it's
clearly not optimized.
> Contrary to your repeated claims, zillions of lists
> prove that it is perfectly possible to have an open mailing list that
> does not pass spam, where "open" is defined very liberally. For
> example, at the low legitimate traffic level of this list, you could manually
> approve submissions from non-subscribers. Also contrary to your
> claims and as demonstrated by the recent note about a rejection for
> a "suspicious header," this mailing list is not entirely "open." You
> are clearly doing some filtering.
The list is open except as posted in the list policy. There are two sets
of filters, FYI:
1- automated anti-spam filters
2- manual filters to enforce list policy
Occasionally we get false positives on #2; those are tagged with a
"review" header, which is what Mailman says is so "suspicious".
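
As a rough illustration of that two-stage flow, here is a minimal sketch in Python. It is not the list's actual configuration or Mailman's code: the phrase list, the size limit, the "X-Review" header name, and the function names are all hypothetical stand-ins. Stage 1 is an automated spam check that discards; stage 2 is a policy check that tags a message for manual review instead of dropping it.

```python
from email.message import EmailMessage

# Hypothetical values; the real list's rules are not given in this thread.
SPAM_PHRASES = ("click here to unsubscribe", "act now")
MAX_BODY_CHARS = 40_000
REVIEW_HEADER = "X-Review"  # assumed name for the "review" tag


def looks_like_spam(msg: EmailMessage) -> bool:
    """Stage 1: crude phrase match, standing in for a real spam filter."""
    body = msg.get_content().lower()
    return any(phrase in body for phrase in SPAM_PHRASES)


def violates_policy(msg: EmailMessage) -> bool:
    """Stage 2: hypothetical policy rule; here, only a size limit."""
    return len(msg.get_content()) > MAX_BODY_CHARS


def triage(msg: EmailMessage) -> str:
    """Return 'discard', 'hold', or 'accept' for a plain-text message."""
    if looks_like_spam(msg):
        return "discard"
    if violates_policy(msg):
        # Tag rather than drop, so a moderator can undo a false positive.
        msg[REVIEW_HEADER] = "policy"
        return "hold"
    return "accept"
```

A message that trips stage 2 comes back as "hold" with the review tag set, which roughly corresponds to the held-for-moderation notice described above.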
> That note was very familiar to everyone who runs Mailman. We all
> (can) receive notifications from Mailman to pass, discard, or reject
> such "suspicious" messages as well as messages from non-subscribers.
>
> Something puzzles me about the spam I see from this list. For
> months it has consistently had DCC target counts of "Many" and other
> body checksum counts of 0. Why is that? Are you guys at ISI.edu
> feeding some spam to the DCC but with some header or trailer added to
> the bodies? If so, why not run the DCC on the input side of this list?
>
> Vernon Schryver vjs at rhyolite.com
We do run filters on the input side.