TR2005-162

CRM114 versus Mr. X: CRM114 Notes for the TREC 2005 Spam Track
Citation: Assis, F.; Yerazunis, W.; Siefkes, C.; Chhabra, S., "CRM114 versus Mr. X: CRM114 Notes for the TREC 2005 Spam Track", NIST Text REtrieval Conference (TREC), November 2005 (TREC 2005)
Date: November 2005
MERL Contact: William Yerazunis

This paper discusses the design decisions underlying the CRM114 Discriminator software, how it can be configured as a spam filter, and what we may glean from the preliminary TREC 2005 results. Unlike most other filters, CRM114 is not a fixed-purpose antispam filter; rather, it is a general-purpose language meant to expedite the creation of text filters. The pluggable CRM114 architecture allows rapid prototyping and easy support of multiple classifier engines; rather than testing different cutoff parameters, the CRM114 TREC test set compared different classifier algorithms and learning protocols.
