Both human listeners and machines need to adapt their sound categories whenever they encounter a new speaker. This perceptual learning is driven by lexical information. In previous work, we showed that deep neural network (DNN)-based ASR systems can learn to adapt their phoneme category boundaries from a few labeled examples after exposure (i.e., training) to ambiguous sounds, as humans have been found to do. Here, we investigate the time-course of phoneme category adaptation in a DNN in more detail, with the ultimate aim of investigating the DNN’s ability to serve as a model of human perceptual learning. We do so by providing the DNN with an increasing number of ambiguous retraining tokens (in 10 bins of 4 ambiguous items) and comparing classification accuracy on the ambiguous items in a held-out test set across the bins. The results show that DNNs, like human listeners, exhibit a step-like function: the DNNs show perceptual learning already after the first bin (only 4 tokens of the ambiguous phone), with little further adaptation for subsequent bins. In follow-up research, we plan to test specific predictions made by the DNN about human speech processing.
Original language: English
Title of host publication: Statistical Language and Speech Processing
Subtitle of host publication: 7th International Conference, SLSP 2019
Editors: C. Martín-Vide, M. Purver, S. Pollak
Place of Publication: Cham
Publisher: Springer
Pages: 3-15
Number of pages: 13
ISBN (Electronic): 978-3-030-31372-2
ISBN (Print): 978-3-030-31371-5
Publication status: Published - 2019
Event: SLSP 2019: Statistical Language and Speech Processing - Ljubljana, Slovenia
Duration: 14 Oct 2019 to 16 Oct 2019
Conference number: 7th

Publication series

Name: Part of the Lecture Notes in Computer Science book series, also part of the Lecture Notes in Artificial Intelligence book sub series
Publisher: Springer
Volume: 11816

Conference

Conference: SLSP 2019
Country: Slovenia
City: Ljubljana
Period: 14/10/19 to 16/10/19

Research areas

  • Phoneme category adaptation, Human perceptual learning, Deep neural networks, Time-course

ID: 57212394