Bayesian filtering for annoying blog topics
Next time I find myself with a few free hours, I know what to do: take a Bayesian spam filter and modify it to do away with the following families of blog posts (which make up 80% of my daily weblog intake):
1. I just installed [Whidbey / Longhorn / Whatever] and it rocks!
2. [name] [ says / on to something / points out / links to / announces ] [ Whatever ]
3. I just saw the [ Matrix / LOTR / Whatever ] and it [ sucks / rocks ]
4. Anything with the words “DON” and “BOX” in the same sentence (except for his blog, of course)
But really, has anyone considered a mechanism for filtering blogs? I would love a “bool FilterPost(xmlElement Item)” pluggable interface for SharpReader ... <hint hint>
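
For the curious, here is a rough sketch of what such a filter might look like, written in Python rather than .NET to keep it short. Everything in it is an assumption for illustration: the `NaiveBayesPostFilter` class, the `filter_post` function (a stand-in for the wished-for `FilterPost`), and the choice of a plain naive Bayes classifier with add-one smoothing over the RSS `title` and `description` elements. It is not an actual SharpReader interface.

```python
import math
import re
import xml.etree.ElementTree as ET
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesPostFilter:
    """Hypothetical two-class naive Bayes classifier with add-one smoothing."""

    LABELS = ("annoying", "ok")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.post_counts = {label: 0 for label in self.LABELS}

    def train(self, text, label):
        """Record one hand-labelled post under 'annoying' or 'ok'."""
        self.word_counts[label].update(tokenize(text))
        self.post_counts[label] += 1

    def _log_score(self, tokens, label):
        """Log prior plus summed log likelihoods of the tokens for one label."""
        vocab = set(self.word_counts["annoying"]) | set(self.word_counts["ok"])
        total_words = sum(self.word_counts[label].values())
        total_posts = sum(self.post_counts.values())
        score = math.log((self.post_counts[label] + 1) / (total_posts + 2))
        for tok in tokens:
            count = self.word_counts[label][tok]
            # The extra +1 in the denominator reserves mass for unseen words.
            score += math.log((count + 1) / (total_words + len(vocab) + 1))
        return score

    def is_annoying(self, text):
        tokens = tokenize(text)
        return self._log_score(tokens, "annoying") > self._log_score(tokens, "ok")


def filter_post(item, classifier):
    """Return True if the RSS <item> element should be hidden from the reader."""
    title = item.findtext("title", default="")
    description = item.findtext("description", default="")
    return classifier.is_annoying(title + " " + description)
```

To use it, you would train the classifier on a handful of posts you have labelled by hand, parse each feed entry with `ET.fromstring`, and hide any entry for which `filter_post` returns `True`. In practice it would need far more training data than a few sample titles, and probably a confidence threshold so that borderline posts still get through.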