Abstract
Semi-supervised classification methods try to improve a classifier learned from labeled data with the help of unlabeled data. In many cases one assumes a certain structure on the data, for example the manifold assumption, the smoothness assumption, or the cluster assumption. Self-training is a method that needs no such assumptions on the data itself: the idea is to use the supervised classifier to label the unlabeled points and thereby enlarge the training data. This paper aims to show that a self-training approach with soft labeling is preferable in many cases in terms of expected loss (risk) minimization. The main idea is to use soft labels to minimize the risk on labeled and unlabeled data together; hard-labeled self-training is then an extreme case.
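The procedure in the abstract can be sketched as follows. This is a minimal illustration, not the paper's exact method: the binary toy data, the choice of logistic regression, and the use of per-class sample weights to encode the soft labels are all assumptions made here for clarity.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy data: two Gaussian blobs; only a few points are labeled.
X_lab = np.vstack([rng.normal(-2, 1, (10, 2)), rng.normal(2, 1, (10, 2))])
y_lab = np.array([0] * 10 + [1] * 10)
X_unl = np.vstack([rng.normal(-2, 1, (100, 2)), rng.normal(2, 1, (100, 2))])

# Step 1: train a supervised classifier on the labeled data alone.
clf = LogisticRegression().fit(X_lab, y_lab)

# Step 2: soft-label the unlabeled points with predicted class probabilities.
probs = clf.predict_proba(X_unl)  # shape (n_unlabeled, 2)

# Step 3: retrain on labeled + unlabeled data together. Each unlabeled
# point appears once per class, weighted by its class probability.
# Rounding probs to 0/1 would recover hard-labeled self-training.
X_aug = np.vstack([X_lab, X_unl, X_unl])
y_aug = np.concatenate([y_lab, np.zeros(len(X_unl)), np.ones(len(X_unl))])
w_aug = np.concatenate([np.ones(len(X_lab)), probs[:, 0], probs[:, 1]])
clf_soft = LogisticRegression().fit(X_aug, y_aug, sample_weight=w_aug)
```

Weighting the augmented training set by the predicted probabilities is one common way to minimize a risk over labeled and soft-labeled points jointly; the paper's own formulation of that joint risk may differ.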
Original language | English |
---|---|
Title of host publication | 2016 23rd International Conference on Pattern Recognition (ICPR) |
Place of Publication | Piscataway, NJ |
Publisher | IEEE |
Pages | 2604-2609 |
Number of pages | 6 |
ISBN (Electronic) | 978-1-5090-4847-2 |
ISBN (Print) | 978-1-5090-4848-9 |
DOIs | |
Publication status | Published - 2016 |
Event | ICPR 2016: 23rd International Conference on Pattern Recognition, Cancún, Mexico, 4 Dec 2016 – 8 Dec 2016 (Conference number: 23) |
Conference
Conference | ICPR 2016 |
---|---|
Country/Territory | Mexico |
City | Cancún |
Period | 4/12/16 → 8/12/16 |
Keywords
- Labeling
- Minimization
- Linear programming
- Probability distribution
- Mathematical model
- Pattern recognition
- Risk management