Trudge, trudge, trudge

Oh no, I’ve hit the slump.

I spent some time working on Wicked Cool Emacs today. Spam filtering – not something I’d set up before. I’m writing this section because I promised to write it, but I can’t shake off the feeling that this part would be better done by someone who’s passionate about Emacs and spam filtering. I have Google handle my spam filtering for me, so I haven’t needed to do anything more sophisticated. Still, not everyone’s going to have the same set-up, so it would be good to document that too.

I’m tempted to jump to a different chapter and start working on that, just to make working on the book fun again.

  • I set up spam filtering, not because I don’t have spam filtering on my mail hosting (I do, and it blocks 90+% of email … not spam, mind you, but 90+% of delivery attempts) but because it wasn’t adequate. My email address has stayed the same for 9 years or so and I haven’t tried to keep it secret, so it has accumulated a bit of a following in the spammer community.

    As I said, SpamAssassin and sa-exim block over 90% of email at the server level, but spam still gets through. So, on my laptop, I have Emacs configured to filter email with CRM114. It adds some time to the mail reading process, but I end up with my Inbox being mostly spam-free.

    It wasn’t the most intuitive thing, but it came down to putting this in my .gnus:
    (setq nnimap-split-inbox '("INBOX")
          nnimap-split-rule 'nnimap-split-fancy
          nnimap-split-fancy '(: spam-split))

    Then, in a *Group* buffer, with point on my nnimap groups, I hit “G c” and configure away.

    I’m leaving out the vitally important CRM114 bit, of course. And none of this was much fun, but it did make reading email more bearable.

    (Even Google doesn’t seem to filter out all the spam…)

  • I have a small article about spam filtering in Gnus (3-4 pages) that I wrote 3 or 4 years ago. It’s in Russian, but I can translate it if you need, or you can read it via Google Translate.

  • Well, I recognise the feeling. Writing something that you do not immediately see the need or use for yourself can be pretty boring indeed. Like coding for IE. So if we really wanted to help you, we should all try to circumvent Google’s spam filter and spam you really hard, so that you do feel the need. Somehow, however, this just does not feel right.

    So how do we make spam filtering exciting? As I remember, the one chapter of Peter Seibel’s book Practical Common Lisp that covers spam filtering was one of the few that did not really do much for me (I let Google do the work too…). It might, however, be worthwhile and interesting to port that code to Emacs Lisp. Then you can see it as a venture into the differences and commonalities between these two branches of the Lisp family – taking your mind off spam, but still working on it.

  • Andrey Fedorov

    Norvig once made the point that algorithms of the same complexity work better as you have more data, up to a certain point, after which their effectiveness levels off. I’m fairly certain that for spam filtering, that point is well above a single mailbox’s worth of data, so this is probably a problem that Google is in a much better place to solve than someone with a single mailbox.

  • Hey Sacha, do you have a deadline for your Emacs book? When will it be released? :D

  • Gour

    Hi Sacha!

    The same question: when will the book be available?