Abstract

No single filtering algorithm currently used to identify spam achieves an error rate of zero. Filtering approaches differ in their technical and algorithmic aspects, which results in different error rates and different costs for accomplishing the classification goal. It is therefore common practice in larger organizations to implement a spam-classification process that consists of several individual filters. We suggest a general model that aggregates the cost and profit parameters of each filter step into a single output, which represents the goodness of the whole classification process. Optimizing this non-linear function leads to a problem that can be addressed with a heuristic approach.
