Overfitting in Wrapper-Based Feature Subset Selection: The Harder You Try the Worse it Gets

File Type: PDF
Item Type: Technical Report
Date: 2005-01-28
Citation: Cunningham, Padraig; Loughrey, John. 'Overfitting in Wrapper-Based Feature Subset Selection: The Harder You Try the Worse it Gets'. - Dublin, Trinity College Dublin, Department of Computer Science, TCD-CS-2005-17, 2005, pp11
Abstract:
In wrapper-based feature selection, the more states that are
visited during the search phase of the algorithm, the greater the
likelihood of finding a feature subset that has high internal accuracy
while generalizing poorly. When this occurs, we say that the algorithm
has overfitted to the training data. We outline a set of experiments to
show this, and we introduce a modified genetic algorithm that addresses this
overfitting problem by stopping the search before overfitting occurs.
This new algorithm, called GAWES (Genetic Algorithm With Early
Stopping), reduces the level of overfitting and yields feature subsets that
have better generalization accuracy.
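
The early-stopping idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' GAWES implementation: the abstract does not state the stopping criterion, so the held-out validation check, the `patience` parameter, the simple one-point crossover, and every function name below are assumptions made purely for illustration. The toy scoring functions stand in for a real classifier and are likewise hypothetical.

```python
import random


def early_stopping_search(n_features, internal_score, validation_score,
                          pop_size=20, max_generations=50, patience=5, seed=0):
    """Evolve boolean feature masks and stop once held-out accuracy stalls.

    Assumes n_features >= 2 and pop_size >= 4 (hypothetical sketch).
    """
    rng = random.Random(seed)
    population = [[rng.random() < 0.5 for _ in range(n_features)]
                  for _ in range(pop_size)]
    best_mask, best_val, stale = None, float("-inf"), 0
    generations_run = 0
    for _ in range(max_generations):
        generations_run += 1
        # Rank candidates by the internal score, as a wrapper search would.
        population.sort(key=internal_score, reverse=True)
        # Score the current leader on data the search does not optimize.
        val = validation_score(population[0])
        if val > best_val:
            best_mask, best_val, stale = list(population[0]), val, 0
        else:
            stale += 1
            if stale >= patience:
                break  # assumed criterion: no held-out gain for `patience` generations
        # Keep the top half and refill with mutated one-point crossovers.
        parents = population[:pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)
            child = a[:cut] + b[cut:]
            flip = rng.randrange(n_features)
            child[flip] = not child[flip]
            children.append(child)
        population = parents + children
    return best_mask, best_val, generations_run


# Toy stand-ins: the internal score rewards every added feature (it can be
# overfitted), while the held-out score rewards only the first three features.
def toy_internal_score(mask):
    return sum(mask)


def toy_validation_score(mask):
    return sum(mask[:3]) - 0.5 * sum(mask[3:])


best_mask, best_val, generations_run = early_stopping_search(
    10, toy_internal_score, toy_validation_score, max_generations=200)
print(best_mask, best_val, generations_run)
```

In this toy setup the internal score keeps rising as features are added, while the held-out score does not, so the search halts well before the generation limit and returns the mask that scored best on held-out data rather than the final leader.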
Sponsor: Science Foundation Ireland
Author: Cunningham, Padraig; Loughrey, John
Publisher: Trinity College Dublin, Department of Computer Science
Type of material: Technical Report
Series/Report no: Computer Science Technical Report; TCD-CS-2005-17
Availability: Full text available
Subject: Computer Science