dc.contributor.author | Delany, Sarah Jane | |
dc.contributor.author | Cunningham, Padraig | |
dc.date.accessioned | 2008-01-15T11:53:56Z | |
dc.date.available | 2008-01-15T11:53:56Z | |
dc.date.issued | 2004-08 | |
dc.identifier.citation | Delany, Sarah Jane; Cunningham, Padraig. 'An Analysis of Case-Base Editing in a Spam Filtering System'. - Dublin, Trinity College Dublin, Department of Computer Science, TCD-CS-2004-29, 2004, pp14 | en |
dc.identifier.other | TCD-CS-2004-29 | |
dc.description.abstract | Because of the volume of spam email and its evolving nature, any
deployed Machine Learning-based spam filtering system will need to have
procedures for case-base maintenance. Key to this will be procedures to edit the
case-base to remove noise and eliminate redundancy. In this paper we present a
two-stage process to do this. We present a new noise reduction algorithm called
Blame-Based Noise Reduction that removes cases that are observed to cause
misclassification. We also present an algorithm called Conservative
Redundancy Reduction that is much less aggressive than the state-of-the-art
alternatives and has significantly better generalisation performance in this
domain. These new techniques are evaluated against the alternatives in the
literature on four datasets of 1000 emails each (50% spam and 50% non-spam). | en |
dc.description.sponsorship | This research was supported by funding from Enterprise Ireland under grant no. CFTD/03/219
and funding from Science Foundation Ireland under grant no. SFI-02IN.1I111. | en |
dc.format.extent | 151485 bytes | |
dc.format.mimetype | application/pdf | |
dc.language.iso | en | en |
dc.publisher | Trinity College Dublin, Department of Computer Science | en |
dc.relation.ispartofseries | Computer Science Technical Report | en |
dc.relation.ispartofseries | TCD-CS-2004-29 | en |
dc.relation.haspart | TCD-CS-[no.] | en |
dc.subject | Computer Science | en |
dc.title | An Analysis of Case-Base Editing in a Spam Filtering System | en |
dc.type | Technical Report | en |
dc.identifier.rssuri | https://www.cs.tcd.ie/publications/tech-reports/reports.04/TCD-CS-2004-29.pdf | |
dc.contributor.sponsor | Science Foundation Ireland | |
dc.contributor.sponsor | Enterprise Ireland | |
dc.identifier.uri | http://hdl.handle.net/2262/13258 | |