Combining ensemble of classifiers by using genetic programming for cyber security applications

Abstract

Classification is a relevant task in the cyber security domain, but it must be able to cope with unbalanced and/or incomplete datasets and must also react in real time to changes in the data. Ensembles of classifiers are a useful tool for classification in hard domains, as they combine different classifiers that together provide complementary information. However, most ensemble-based algorithms require an extensive training phase and need to be retrained when the data change. This work proposes a Genetic Programming-based framework to generate a function for combining an ensemble, with some interesting properties: the models composing the ensemble are trained only on a portion of the training set, and they can then be combined and used without any extra training phase; furthermore, when the data change, the function can be recomputed incrementally, with moderate computational effort. Experiments conducted on unbalanced datasets and on a well-known cyber security dataset assess the effectiveness of the approach. © Springer International Publishing Switzerland 2015.
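
As an illustration of what a combiner function for an ensemble might look like, the following is a minimal sketch and not the authors' implementation. It assumes that the function is an expression tree whose leaves are base classifiers and whose internal nodes are operators that need no training (here `avg`, `max`, and `min`, chosen for illustration). All names (`OPS`, `evaluate`, `predict`, `fitness`) and the toy data are hypothetical. The sketch only scores one candidate tree; the evolutionary search that Genetic Programming would run over such trees is omitted.

```python
from statistics import mean

# Hypothetical operators: each maps several per-class probability vectors
# (one per child node) to a single per-class probability vector.
OPS = {
    "avg": lambda vecs: [mean(col) for col in zip(*vecs)],
    "max": lambda vecs: [max(col) for col in zip(*vecs)],
    "min": lambda vecs: [min(col) for col in zip(*vecs)],
}

def evaluate(tree, outputs):
    """Evaluate a combiner tree on one sample.

    tree:    an int (index of a base classifier) or a tuple
             (operator_name, child, child, ...).
    outputs: per-class probability vectors, one per base classifier.
    """
    if isinstance(tree, int):
        return outputs[tree]
    op, *children = tree
    return OPS[op]([evaluate(child, outputs) for child in children])

def predict(tree, outputs):
    """Return the index of the class with the highest combined score."""
    scores = evaluate(tree, outputs)
    return max(range(len(scores)), key=scores.__getitem__)

def fitness(tree, samples, labels):
    """Accuracy of the tree on labelled samples (an assumed fitness measure)."""
    hits = sum(predict(tree, s) == y for s, y in zip(samples, labels))
    return hits / len(labels)

# Toy data: three base classifiers, two classes, two samples.
tree = ("avg", 0, ("max", 1, 2))
samples = [
    [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8]],
    [[0.3, 0.7], [0.6, 0.4], [0.1, 0.9]],
]
labels = [0, 1]

print(fitness(tree, samples, labels))
```

Because the operators in this sketch have no parameters of their own, evaluating a new tree only requires the stored outputs of the base classifiers, which is one way the "no extra training phase" property could be realized.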

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
