Abstract
Semi-supervised classification methods try to improve a classifier learned from labeled data with the help of unlabeled data. In many cases one assumes a certain structure on the data, for example the manifold assumption, the smoothness assumption, or the cluster assumption. Self-training is a method that needs no such assumptions on the data itself: the idea is to use the supervised classifier to label the unlabeled points and thereby enlarge the training data. This paper aims to show that a self-training approach with soft labeling is preferable in many cases in terms of expected loss (risk) minimization. The main idea is to use soft labels to minimize the risk on labeled and unlabeled data together; hard-labeled self-training is then an extreme case.
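The procedure in the abstract can be sketched as follows. This is a minimal illustration, not the paper's exact method: the binary toy data, the choice of logistic regression, and the use of per-class sample weights to encode the soft labels are all assumptions made here for clarity.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy data: two Gaussian blobs; only a few points are labeled.
X_lab = np.vstack([rng.normal(-2, 1, (10, 2)), rng.normal(2, 1, (10, 2))])
y_lab = np.array([0] * 10 + [1] * 10)
X_unl = np.vstack([rng.normal(-2, 1, (100, 2)), rng.normal(2, 1, (100, 2))])

# Step 1: train a supervised classifier on the labeled data alone.
clf = LogisticRegression().fit(X_lab, y_lab)

# Step 2: soft-label the unlabeled points with predicted class probabilities.
probs = clf.predict_proba(X_unl)  # shape (n_unlabeled, 2)

# Step 3: retrain on labeled + unlabeled data together. Each unlabeled
# point appears once per class, weighted by its class probability.
# Rounding probs to 0/1 would recover hard-labeled self-training.
X_aug = np.vstack([X_lab, X_unl, X_unl])
y_aug = np.concatenate([y_lab, np.zeros(len(X_unl)), np.ones(len(X_unl))])
w_aug = np.concatenate([np.ones(len(X_lab)), probs[:, 0], probs[:, 1]])
clf_soft = LogisticRegression().fit(X_aug, y_aug, sample_weight=w_aug)
```

Weighting the augmented training set by the predicted probabilities is one common way to minimize a risk over labeled and soft-labeled points jointly; the paper's own formulation of that joint risk may differ.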
Original language | English |
---|---|
Title of host publication | 2016 23rd International Conference on Pattern Recognition (ICPR) |
Place of Publication | Piscataway, NJ |
Publisher | IEEE |
Pages | 2604-2609 |
Number of pages | 6 |
ISBN (Electronic) | 978-1-5090-4847-2 |
ISBN (Print) | 978-1-5090-4848-9 |
DOIs | |
Publication status | Published - 2016 |
Event | ICPR 2016: 23rd International Conference on Pattern Recognition, Cancún, Mexico, 4 Dec 2016 – 8 Dec 2016 (Conference number: 23) |
Conference
Conference | ICPR 2016 |
---|---|
Country/Territory | Mexico |
City | Cancún |
Period | 4/12/16 → 8/12/16 |
Keywords
- Labeling
- Minimization
- Linear programming
- Probability distribution
- Mathematical model
- Pattern recognition
- Risk management