Defect prediction as a multiobjective optimization problem

Gerardo Canfora, Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella*, Sebastiano Panichella

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review

61 Citations (Scopus)

Abstract

In this paper, we formalize the defect-prediction problem as a multiobjective optimization problem. Specifically, we propose an approach, called multiobjective defect predictor (MODEP), based on multiobjective forms of machine learning techniques, namely logistic regression and decision trees, trained using a genetic algorithm. The multiobjective approach allows software engineers to choose predictors that achieve a specific compromise between effectiveness (the number of likely defect-prone classes, or the number of defects, that the analysis would likely discover) and cost (the lines of code to be analysed or tested, which can be considered a proxy for the cost of code inspection). Results of an empirical evaluation on 10 datasets from the PROMISE repository indicate the quantitative superiority of MODEP with respect to single-objective predictors and with respect to a trivial baseline that ranks classes by size in ascending or descending order. MODEP also outperforms an alternative approach for cross-project prediction, based on local prediction upon clusters of similar classes.
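
The cost/effectiveness trade-off described in the abstract can be illustrated with a minimal sketch. This is not the MODEP implementation: the function names, the toy data, and the fixed list of candidate coefficient vectors are all assumptions made for illustration, and the genetic algorithm is replaced here by a plain Pareto filter over hand-picked candidates. Each candidate is a logistic-regression coefficient vector, scored by the lines of code it would flag for inspection (cost) and the number of defect-prone classes among those flagged (effectiveness).

```python
def predict(weights, metrics):
    """Logistic model; weights[0] is the intercept.

    sigmoid(z) > 0.5 is equivalent to z > 0, so the sigmoid is not evaluated.
    """
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], metrics))
    return z > 0

def objectives(weights, classes):
    """Return (cost, effectiveness) for one candidate predictor.

    cost: total LOC of the classes flagged as defect-prone.
    effectiveness: number of truly defect-prone classes among those flagged.
    """
    cost, effectiveness = 0, 0
    for metrics, loc, defective in classes:
        if predict(weights, metrics):
            cost += loc
            effectiveness += int(defective)
    return cost, effectiveness

def dominates(a, b):
    """a dominates b if it costs no more, finds no fewer, and is not identical."""
    return a[0] <= b[0] and a[1] >= b[1] and a != b

def pareto_front(candidates, classes):
    """Keep only the candidates whose scores are not dominated by any other."""
    scored = [(w, objectives(w, classes)) for w in candidates]
    return [(w, s) for w, s in scored
            if not any(dominates(t, s) for _, t in scored)]

# Hypothetical toy data: (metric values, lines of code, is defect-prone)
classes = [((5.0,), 100, True), ((1.0,), 20, False),
           ((3.0,), 50, True), ((0.5,), 10, False)]
# Hypothetical candidates: (intercept, coefficient of the single metric)
candidates = [(-4.0, 1.0), (-2.0, 1.0), (0.0, 1.0), (-10.0, 1.0)]

front = pareto_front(candidates, classes)
for weights, (cost, effectiveness) in front:
    print(weights, cost, effectiveness)
```

In this toy setting the candidate `(0.0, 1.0)` flags every class and is dropped because `(-2.0, 1.0)` finds the same two defect-prone classes at lower cost. The remaining candidates form the set from which an engineer would pick a compromise.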

Original language: English
Pages (from-to): 426-459
Number of pages: 34
Journal: Software Testing Verification and Reliability
Volume: 25
Issue number: 4
Publication status: Published - 1 Jun 2015

Keywords

  • cost-effectiveness
  • cross-project defect prediction
  • defect prediction
  • multiobjective optimization
