Recurrent Knowledge Distillation

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

1 Citation (Scopus)
21 Downloads (Pure)

Abstract

Knowledge distillation compacts deep networks by letting a small student network learn from a large teacher network. The accuracy of knowledge distillation has recently benefited from adding residual layers. We propose to reduce the size of the student network even further by recasting multiple residual layers in the teacher network into a single recurrent student layer. We propose three variants of adding recurrent connections to the student network and show experimentally on CIFAR-10, Scenes, and MiniPlaces that we can reduce the number of parameters with little loss in accuracy.
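
To illustrate the core parameter-saving idea, the following is a minimal sketch, not the authors' exact architecture or one of their three proposed variants: a teacher-style stage stacks several residual blocks with distinct weights, while a student-style stage applies one residual block recurrently, sharing its weights across steps. All class names and hyperparameters here are illustrative.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        y = F.relu(self.bn1(self.conv1(x)))
        y = self.bn2(self.conv2(y))
        return F.relu(x + y)  # residual (skip) connection

class TeacherStage(nn.Module):
    """Teacher-style stage: `depth` residual blocks, each with its own weights."""
    def __init__(self, channels, depth):
        super().__init__()
        self.blocks = nn.Sequential(*[ResidualBlock(channels) for _ in range(depth)])

    def forward(self, x):
        return self.blocks(x)

class RecurrentStudentStage(nn.Module):
    """Student-style stage: one residual block unrolled `depth` times,
    so the stage has roughly 1/depth of the teacher stage's parameters."""
    def __init__(self, channels, depth):
        super().__init__()
        self.block = ResidualBlock(channels)
        self.depth = depth

    def forward(self, x):
        for _ in range(self.depth):  # reuse the same weights at every step
            x = self.block(x)
        return x

if __name__ == "__main__":
    x = torch.randn(2, 64, 32, 32)
    teacher = TeacherStage(64, depth=4)
    student = RecurrentStudentStage(64, depth=4)
    print("teacher params:", sum(p.numel() for p in teacher.parameters()))
    print("student params:", sum(p.numel() for p in student.parameters()))
    print(teacher(x).shape, student(x).shape)

In a distillation setup of this kind, the recurrent student would then be trained to match the teacher's outputs (and possibly intermediate activations), trading a small accuracy loss for the roughly depth-fold reduction in stage parameters.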
Original language: English
Title of host publication: 2018 25th IEEE International Conference on Image Processing (ICIP)
Subtitle of host publication: Proceedings
Place of publication: Piscataway
Publisher: IEEE
Pages: 3393-3397
Number of pages: 5
ISBN (Electronic): 978-1-4799-7061-2
ISBN (Print): 978-1-4799-7062-9
DOIs
Publication status: Published - 2018
Event: 25th IEEE International Conference on Image Processing - Athens, Greece
Duration: 7 Oct 2018 - 10 Oct 2018
Conference number: 25

Conference

Conference: 25th IEEE International Conference on Image Processing
Abbreviated title: ICIP 2018
Country/Territory: Greece
City: Athens
Period: 7/10/18 - 10/10/18

Keywords

  • Knowledge distillation
  • compacting deep representations for image classification
  • recurrent layers
