Is your dataset big enough? Sample size requirements when using artificial neural networks for discrete choice analysis

Ahmad Alwosheel*, Sander van Cranenburgh, Caspar G. Chorus

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review

215 Citations (Scopus)
2151 Downloads (Pure)

Abstract

Artificial Neural Networks (ANNs) are increasingly used for discrete choice analysis. At present, however, it is unknown what sample size requirements are appropriate when using ANNs in this context. This paper fills that knowledge gap: we empirically establish a rule-of-thumb for ANN-based discrete choice analysis based on analyses of synthetic and real data. To investigate how the complexity of the data generating process affects the minimum required sample size, we conduct extensive Monte Carlo analyses using a series of model specifications with different levels of complexity, including Random Utility Maximization (RUM) and Random Regret Minimization (RRM) models, with and without random taste parameters. Based on our analyses, we advise using a minimum sample size of fifty times the number of weights in the ANN; note that the number of weights is generally much larger than the number of parameters in a discrete choice model. This rule-of-thumb is considerably more conservative than the one most often used in the ANN community, which advises using at least ten times the number of weights.
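
As a rough illustration of the rule-of-thumb stated in the abstract, the sketch below counts the weights of a fully connected feed-forward network and multiplies by fifty. The layer sizes are hypothetical, the function names are ours, and counting bias terms as weights is an assumption; the abstract does not say how the paper counts them.

```python
def count_weights(layer_sizes, include_biases=True):
    """Count trainable weights in a fully connected feed-forward network.

    layer_sizes lists the number of nodes per layer, input first.
    Counting biases as weights is an assumption made for this sketch.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out          # connection weights between two layers
        if include_biases:
            total += n_out             # one bias per node in the receiving layer
    return total


def minimum_sample_size(n_weights, factor=50):
    """Apply an 'N times the number of weights' rule-of-thumb."""
    return factor * n_weights


# Hypothetical network: 6 input attributes, 10 hidden nodes, 3 alternatives.
layers = [6, 10, 3]
n_weights = count_weights(layers)

print("weights:", n_weights)
print("fifty-times rule:", minimum_sample_size(n_weights))
print("ten-times rule:", minimum_sample_size(n_weights, factor=10))
```

Even this small hypothetical network has far more weights than a typical discrete choice model has parameters, which is why the two rules imply such different sample sizes.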

Original language: English
Pages (from-to): 167-182
Number of pages: 16
Journal: Journal of Choice Modelling
Volume: 28
Publication status: Published - 2018

Bibliographical note

Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.
