Cross-validation under sample selection bias can, in principle, be done by importance-weighting the empirical risk. However, the importance-weighted risk estimator produces suboptimal hyperparameter estimates in problem settings where large weights arise with high probability. We study its sampling variance as a function of the training data distribution and introduce a control variate to increase its robustness to problematically large weights.
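To make the idea in the abstract concrete, here is a minimal sketch of an importance-weighted risk estimate with and without a control variate. It is an illustration under stated assumptions, not necessarily the estimator proposed in the paper: the function names `iw_risk` and `iw_risk_control_variate` are hypothetical, and the control variate used is the weight itself, relying on the assumption that the importance weights have expectation 1 under the training distribution (true when they are exact density ratios).

```python
import numpy as np


def iw_risk(losses, weights):
    """Plain importance-weighted risk: the sample mean of w_i * loss_i."""
    return float(np.mean(weights * losses))


def iw_risk_control_variate(losses, weights):
    """Importance-weighted risk with the weights used as a control variate.

    Assumption (not taken from the paper): E[w] = 1 under the training
    distribution, so (mean(w) - 1) has zero mean and can be subtracted,
    scaled by beta, without changing the target of the estimator.
    """
    weighted = weights * losses
    var_w = np.var(weights, ddof=1)
    if var_w == 0.0:
        # Constant weights: the control variate carries no information.
        return float(np.mean(weighted))
    # Variance-minimising coefficient, estimated from the same sample.
    beta = np.cov(weighted, weights, ddof=1)[0, 1] / var_w
    return float(np.mean(weighted) - beta * (np.mean(weights) - 1.0))


# Illustrative data: training inputs from N(0, 1), target distribution N(1, 1).
# The density ratio is then exp(x - 0.5), which can become large in the tail.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=200)
w = np.exp(x - 0.5)
loss = (x - 0.3) ** 2  # squared error of a hypothetical constant predictor

plain = iw_risk(loss, w)
corrected = iw_risk_control_variate(loss, w)
```

Because `beta` is estimated from the same sample, the corrected estimate carries a small finite-sample bias; in a cross-validation loop either function would be applied to the held-out fold's losses and weights for each hyperparameter setting.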

Original language: English
Title of host publication: 2019 IEEE 29th International Workshop on Machine Learning for Signal Processing (MLSP)
Place of Publication: Piscataway
Publisher: IEEE
Pages: 1-6
Number of pages: 6
ISBN (Electronic): 978-1-7281-0824-7
ISBN (Print): 978-1-7281-0825-4
Publication status: Published - 2019
Event: 29th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2019 - Pittsburgh, United States
Duration: 13 Oct 2019 to 16 Oct 2019

Conference

Conference: 29th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2019
Country: United States
City: Pittsburgh
Period: 13/10/19 to 16/10/19

Research areas

• Cross-validation, Sample selection bias

ID: 68748044