Standard

A Systematic Comparison of Search Algorithms for Topic Modelling—A Study on Duplicate Bug Report Identification. / Panichella, Annibale.

Search-Based Software Engineering - 11th International Symposium, SSBSE 2019, Proceedings. ed. / Shiva Nejati; Gregory Gay. Springer, 2019. p. 11-26 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11664 LNCS).

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

Harvard

Panichella, A 2019, A Systematic Comparison of Search Algorithms for Topic Modelling—A Study on Duplicate Bug Report Identification. in S Nejati & G Gay (eds), Search-Based Software Engineering - 11th International Symposium, SSBSE 2019, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11664 LNCS, Springer, pp. 11-26, 11th International Symposium on Search-Based Software Engineering, SSBSE 2019, Tallinn, Estonia, 31/08/19. https://doi.org/10.1007/978-3-030-27455-9_2

APA

Panichella, A. (2019). A Systematic Comparison of Search Algorithms for Topic Modelling—A Study on Duplicate Bug Report Identification. In S. Nejati, & G. Gay (Eds.), Search-Based Software Engineering - 11th International Symposium, SSBSE 2019, Proceedings (pp. 11-26). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11664 LNCS). Springer. https://doi.org/10.1007/978-3-030-27455-9_2

Vancouver

Panichella A. A Systematic Comparison of Search Algorithms for Topic Modelling—A Study on Duplicate Bug Report Identification. In Nejati S, Gay G, editors, Search-Based Software Engineering - 11th International Symposium, SSBSE 2019, Proceedings. Springer. 2019. p. 11-26. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/978-3-030-27455-9_2

Author

Panichella, Annibale. / A Systematic Comparison of Search Algorithms for Topic Modelling—A Study on Duplicate Bug Report Identification. Search-Based Software Engineering - 11th International Symposium, SSBSE 2019, Proceedings. editor / Shiva Nejati ; Gregory Gay. Springer, 2019. pp. 11-26 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

BibTeX

@inproceedings{a3603e0f8d874735987650cc7aebf1f8,
title = "A Systematic Comparison of Search Algorithms for Topic Modelling—A Study on Duplicate Bug Report Identification",
abstract = "Latent Dirichlet Allocation (LDA) has been used to support many software engineering tasks. Previous studies showed that default settings lead to sub-optimal topic modeling with a dramatic impact on the performance of such approaches in terms of precision and recall. For this reason, researchers used search algorithms (e.g., genetic algorithms) to automatically configure topic models in an unsupervised fashion. While previous work showed the ability of individual search algorithms in finding near-optimal configurations, it is not clear to what extent the choice of the meta-heuristic matters for SE tasks. In this paper, we present a systematic comparison of five different meta-heuristics to configure LDA in the context of duplicate bug reports identification. The results show that (1) no master algorithm outperforms the others for all software projects, (2) random search and PSO are the least effective meta-heuristics. Finally, the running time strongly depends on the computational complexity of LDA while the internal complexity of the search algorithms plays a negligible role.",
keywords = "Duplicate Bug Report, Evolutionary Algorithms, Latent Dirichlet Allocation, Search-based Software Engineering, Topic modeling",
author = "Annibale Panichella",
year = "2019",
month = jan,
day = "1",
doi = "10.1007/978-3-030-27455-9_2",
language = "English",
isbn = "9783030274542",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer",
pages = "11--26",
editor = "Shiva Nejati and Gregory Gay",
booktitle = "Search-Based Software Engineering - 11th International Symposium, SSBSE 2019, Proceedings",
note = "11th International Symposium on Search-Based Software Engineering, SSBSE 2019 ; Conference date: 31-08-2019 Through 01-09-2019",

}

RIS

TY - GEN

T1 - A Systematic Comparison of Search Algorithms for Topic Modelling—A Study on Duplicate Bug Report Identification

AU - Panichella, Annibale

PY - 2019/1/1

Y1 - 2019/1/1

N2 - Latent Dirichlet Allocation (LDA) has been used to support many software engineering tasks. Previous studies showed that default settings lead to sub-optimal topic modeling with a dramatic impact on the performance of such approaches in terms of precision and recall. For this reason, researchers used search algorithms (e.g., genetic algorithms) to automatically configure topic models in an unsupervised fashion. While previous work showed the ability of individual search algorithms in finding near-optimal configurations, it is not clear to what extent the choice of the meta-heuristic matters for SE tasks. In this paper, we present a systematic comparison of five different meta-heuristics to configure LDA in the context of duplicate bug reports identification. The results show that (1) no master algorithm outperforms the others for all software projects, (2) random search and PSO are the least effective meta-heuristics. Finally, the running time strongly depends on the computational complexity of LDA while the internal complexity of the search algorithms plays a negligible role.

KW - Duplicate Bug Report

KW - Evolutionary Algorithms

KW - Latent Dirichlet Allocation

KW - Search-based Software Engineering

KW - Topic modeling

UR - http://www.scopus.com/inward/record.url?scp=85072853501&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-27455-9_2

DO - 10.1007/978-3-030-27455-9_2

M3 - Conference contribution

AN - SCOPUS:85072853501

SN - 9783030274542

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 11

EP - 26

BT - Search-Based Software Engineering - 11th International Symposium, SSBSE 2019, Proceedings

A2 - Nejati, Shiva

A2 - Gay, Gregory

PB - Springer

T2 - 11th International Symposium on Search-Based Software Engineering, SSBSE 2019

Y2 - 31 August 2019 through 1 September 2019

ER -
