Coevolutionary Multi-Population Genetic Programming for Classification

Researchers: Mauša, G. and Galinac Grbac, T.

About the study:

  Evolving diverse ensembles using multi-objective genetic programming (MOGP) was recently proposed for classification problems with unbalanced data. It performed better than the classical individual classification approaches. Multilevel selection strategies have shown even better performance in some applications. This paper analyzes the performance of two selection strategies (coevolutionary colonization and multiple subpopulation migrations) and three ensemble selection strategies (Pareto Front, Full Population and Convex Hull). Their performance is compared on both software defect prediction (SDP) and general-purpose classification data sets with varying levels of data imbalance.

Motivation:

  Population diversity is crucial for evolving algorithms that perform well, and it is influenced by selection pressure and by genetic drift due to mutation and recombination operations. A few techniques exist to prevent genetic drift, premature convergence and stagnation in local optima.

  Multiple subpopulations isolate groups of solutions, and coevolution evolves different species. Unlike other niching methods (crowding, clearing, fitness sharing), they do not require estimates or a priori knowledge of the fitness landscape, nor do they assume that all optima are nearly equidistant or perfectly discriminant. The SDP problem has been shown to be a complex one with unknown fitness landscape properties.
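
  As an illustration of how isolated subpopulations can still exchange solutions, the following is a minimal sketch of a single migration step. It is not taken from the study: the ring topology, the number of migrants, the replace-the-worst policy, and the names migrate_ring and fitness are all assumptions made for this example, and plain numbers stand in for evolved programs.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def migrate_ring(
    islands: List[List[T]],
    fitness: Callable[[T], float],
    n_migrants: int = 1,
) -> List[List[T]]:
    """One migration step between isolated subpopulations (hypothetical sketch).

    Each island sends copies of its best individuals to the next island in a
    ring and drops its own worst individuals to make room for the newcomers.
    Higher fitness is assumed to be better, and n_migrants is assumed to be
    smaller than every island's size.
    """
    # Pick emigrants from every island first, so the result does not depend
    # on the order in which islands are processed.
    emigrants = [
        sorted(island, key=fitness, reverse=True)[:n_migrants]
        for island in islands
    ]
    migrated = []
    for i, island in enumerate(islands):
        incoming = emigrants[i - 1]  # island 0 receives from the last island
        keep = len(island) - len(incoming)
        survivors = sorted(island, key=fitness, reverse=True)[:keep]
        migrated.append(survivors + incoming)
    return migrated


# Toy usage: the individual's value is its own fitness.
islands = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
after = migrate_ring(islands, fitness=float)
```

  Between migration steps each island would evolve on its own, which is what keeps the groups of solutions isolated; how often to migrate is a further design choice this sketch leaves open.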

  The ensemble selection strategies can also improve MOGP classification performance. Promoting phenotypic diversity has been shown to be better than promoting genotypic diversity in the evolution process. Using the Pareto front is a traditional ensemble selection strategy that promotes genotypic diversity, while the convex hull is a novel ensemble selection strategy that promotes phenotypic diversity.
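
  The geometric difference between the Pareto front and convex hull selections can be sketched as follows. This is an illustration only, not the study's implementation: it assumes each classifier is scored on two objectives that are both maximized (for example, accuracy on the minority and on the majority class), and the function names pareto_front and convex_hull_front are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (objective 1, objective 2), both maximized


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by the first objective."""
    front = []
    for p in points:
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and q != p for q in points
        )
        if not dominated:
            front.append(p)
    return sorted(set(front))


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); negative means a clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull_front(points: List[Point]) -> List[Point]:
    """Return the Pareto-front points that also lie on the convex hull.

    The front is sorted left to right, so a monotone-chain pass that keeps
    only clockwise turns drops every point lying in a concave region.
    """
    hull: List[Point] = []
    for p in pareto_front(points):
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


# Toy scores for five classifiers (made-up numbers).
scores = [(0.1, 0.95), (0.4, 0.6), (0.8, 0.5), (0.9, 0.1), (0.3, 0.3)]
front = pareto_front(scores)      # drops (0.3, 0.3), which is dominated
hull = convex_hull_front(scores)  # additionally drops (0.4, 0.6)
```

  In this toy example the point (0.4, 0.6) is non-dominated, so it belongs to the Pareto front, but it lies below the line joining its neighbours and is therefore excluded from the convex hull. The convex hull selection is thus always a subset of the Pareto front selection.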

Results:

  The coevolutionary MOGP approach based on the colonization operator outperformed the single-population and multiple-population MOGP configurations on the SDP datasets. The ensemble selection strategy based on the convex hull, on the other hand, exhibited both positive and negative results in the attempt to improve classification performance.

Supported by: project UIP-2014-09-7945 and research grant 13.09.2.2.16.
