TR2005-162

CRM114 versus Mr. X: CRM114 Notes for the TREC 2005 Spam Track


    •  Assis, F.; Yerazunis, W.; Siefkes, C.; Chhabra, S., "CRM114 versus Mr. X: CRM114 Notes for the TREC 2005 Spam Track", NIST Text REtrieval Conference (TREC), November 2005.
      @inproceedings{Assis2005nov,
        author = {Assis, F. and Yerazunis, W. and Siefkes, C. and Chhabra, S.},
        title = {CRM114 versus Mr. X: CRM114 Notes for the TREC 2005 Spam Track},
        booktitle = {NIST Text REtrieval Conference (TREC)},
        year = 2005,
        month = nov,
        url = {http://www.merl.com/publications/TR2005-162}
      }
  • Research Area: Data Analytics


This paper discusses the design decisions underlying the CRM114 Discriminator software, how it can be configured as a spam filter, and what can be gleaned from the preliminary TREC 2005 results. Unlike most other filters, CRM114 is not a fixed-purpose antispam filter; rather, it is a general-purpose language meant to expedite the creation of text filters. The pluggable CRM114 architecture allows rapid prototyping and easy support of multiple classifier engines, so rather than varying cutoff parameters, the CRM114 TREC test set varied the classifier algorithms and learning protocols.
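
As a rough illustration of what "pluggable classifier engines" and "learning protocols" can mean in practice, the following Python sketch runs every engine against every protocol on the same message stream. It is not CRM114 code and does not use the CRM114 language. All names here (`CountingBayes`, `ENGINES`, `PROTOCOLS`, `run_grid`), the toy add-one-smoothed scoring, and the choice of "train everything" versus "train on error" as the two protocols are assumptions made for the example, not details taken from the paper.

```python
import math
from collections import Counter
from typing import Callable, Protocol


class Classifier(Protocol):
    """Hypothetical engine interface: anything with learn() and classify()."""
    def learn(self, text: str, label: str) -> None: ...
    def classify(self, text: str) -> str: ...


def unigrams(text: str) -> list[str]:
    return text.lower().split()


def bigrams(text: str) -> list[str]:
    words = text.lower().split()
    # Fall back to single words when the text is too short to form a pair.
    return [f"{a} {b}" for a, b in zip(words, words[1:])] or words


class CountingBayes:
    """Toy naive-Bayes-style engine over a pluggable feature extractor."""

    def __init__(self, features: Callable[[str], list[str]]) -> None:
        self.features = features
        self.counts = {"spam": Counter(), "good": Counter()}

    def learn(self, text: str, label: str) -> None:
        self.counts[label].update(self.features(text))

    def _score(self, text: str, label: str) -> float:
        counts = self.counts[label]
        total = sum(counts.values())
        vocab = len(set(self.counts["spam"]) | set(self.counts["good"])) + 1
        # Add-one smoothing keeps unseen features from producing log(0).
        return sum(math.log((counts[f] + 1) / (total + vocab))
                   for f in self.features(text))

    def classify(self, text: str) -> str:
        # Ties (including the untrained state) default to "good".
        return "spam" if self._score(text, "spam") > self._score(text, "good") else "good"


def train_everything(clf: Classifier, stream: list[tuple[str, str]]) -> int:
    """Classify each message, then learn it regardless of the outcome."""
    errors = 0
    for text, label in stream:
        errors += clf.classify(text) != label
        clf.learn(text, label)
    return errors


def train_on_error(clf: Classifier, stream: list[tuple[str, str]]) -> int:
    """Classify each message, and learn it only when the guess was wrong."""
    errors = 0
    for text, label in stream:
        if clf.classify(text) != label:
            errors += 1
            clf.learn(text, label)
    return errors


# Swapping an engine or a protocol means changing one dictionary entry.
ENGINES = {"unigram": lambda: CountingBayes(unigrams),
           "bigram": lambda: CountingBayes(bigrams)}
PROTOCOLS = {"train_everything": train_everything,
             "train_on_error": train_on_error}


def run_grid(stream: list[tuple[str, str]]) -> dict[tuple[str, str], int]:
    """Return the error count for every (engine, protocol) combination."""
    return {(e, p): protocol(make(), stream)
            for e, make in ENGINES.items()
            for p, protocol in PROTOCOLS.items()}
```

Calling `run_grid` on a list of `(text, label)` pairs, with labels `"spam"` or `"good"`, yields one error count per engine and protocol pair. The comparison is across algorithms and training regimes, with no decision threshold being tuned.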