Curriculum Learning Strategies for IR: An Empirical Study on Conversation Response Ranking

Gustavo Penha*, Claudia Hauff

*Corresponding author for this work

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

22 Citations (Scopus)

Abstract

Neural ranking models are traditionally trained on a series of random batches, sampled uniformly from the entire training set. Curriculum learning has recently been shown to improve neural models’ effectiveness by sampling batches non-uniformly, going from easy to difficult instances during training. In the context of neural Information Retrieval (IR), curriculum learning has not been explored yet, and so it remains unclear (1) how to measure the difficulty of training instances and (2) how to transition from easy to difficult instances during training. To address both challenges and determine whether curriculum learning is beneficial for neural ranking models, we need large-scale datasets and a retrieval task that allows us to conduct a wide range of experiments. For this purpose, we resort to the task of conversation response ranking: ranking responses given the conversation history. To deal with challenge (1), we explore scoring functions that measure the difficulty of conversations based on different input spaces. To address challenge (2), we evaluate different pacing functions, which determine the pace at which we move from easy to difficult instances. We find that, overall, by just intelligently sorting the training data (i.e., by performing curriculum learning) we can improve the retrieval effectiveness by up to 2%. The source code is available at https://github.com/Guzpenha/transformers_cl.
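
The interplay between a scoring function and a pacing function can be illustrated with a minimal sketch. It is not the authors' implementation: the word-count difficulty score, the linear pacing schedule, the `start_fraction` default, and all function names are assumptions chosen only to make the idea concrete.

```python
import random


def difficulty_by_length(instance):
    # Hypothetical scoring function: assume a longer conversation
    # context (in whitespace-separated words) is a harder instance.
    return len(instance["context"].split())


def linear_pacing(step, total_steps, start_fraction=0.33):
    # Hypothetical pacing function: the fraction of the difficulty-sorted
    # training set that may be sampled at `step`. It grows linearly from
    # `start_fraction` and is capped at 1.0.
    if total_steps <= 0:
        return 1.0
    progress = min(step / total_steps, 1.0)
    return start_fraction + (1.0 - start_fraction) * progress


def curriculum_batches(instances, score_fn, pacing_fn, total_steps,
                       batch_size, seed=0):
    # Sort once from easy to difficult, then sample each batch uniformly
    # from the easiest portion that the pacing function currently allows.
    rng = random.Random(seed)
    ordered = sorted(instances, key=score_fn)
    for step in range(total_steps):
        fraction = pacing_fn(step, total_steps)
        available = max(batch_size, int(fraction * len(ordered)))
        pool = ordered[:min(available, len(ordered))]
        yield rng.sample(pool, min(batch_size, len(pool)))
```

With ten instances, five steps, and a batch size of two, the first batch is drawn only from the three easiest instances, and later batches draw from a progressively larger pool. Replacing `pacing_fn` with a function that always returns 1.0 recovers ordinary uniform sampling, which is the baseline the abstract contrasts against.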

Original language: English
Title of host publication: Advances in Information Retrieval - 42nd European Conference on IR Research, ECIR 2020
Subtitle of host publication: Proceedings
Editors: Joemon M. Jose, Emine Yilmaz, João Magalhães, Flávio Martins, Pablo Castells, Nicola Ferro, Mário J. Silva
Place of Publication: Cham
Publisher: Springer
Pages: 699-713
Number of pages: 15
ISBN (Electronic): 978-3-030-45439-5
ISBN (Print): 978-3-030-45438-8
Publication status: Published - 2020
Event: 42nd European Conference on IR Research, ECIR 2020 - Lisbon, Portugal
Duration: 14 Apr 2020 to 17 Apr 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12035
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 42nd European Conference on IR Research, ECIR 2020
Country/Territory: Portugal
City: Lisbon
Period: 14/04/20 to 17/04/20

Keywords

  • Conversation response ranking
  • Curriculum learning
