We study the problem of Protein Remote Homology Detection, which assesses the functional similarity of two proteins. We approach this as a binary multiple-instance learning (MIL) problem that aims to distinguish between homologous and non-homologous proteins. The particular MIL approach employed is based on the dissimilarity representation, in which various schemes for combining N-gram representations are considered. This approach allows us to cope with longer N-grams, capturing a richer biological context, and results in a versatile framework offering competitive performance compared to the state of the art.
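As a rough illustration of the general idea, and not the authors' actual method, the sketch below treats a protein sequence as a bag of overlapping N-grams and represents it by its dissimilarities to a set of prototype sequences. All names here (`ngrams`, `hamming`, `bag_dissimilarity`, `dissimilarity_vector`) are hypothetical, and both the Hamming distance between N-grams and the mean-of-minimum bag dissimilarity are assumptions chosen for simplicity; the abstract does not specify which measures or combination schemes are used.

```python
from typing import List


def ngrams(seq: str, n: int) -> List[str]:
    """Return all overlapping n-grams of a sequence (the instances of a bag)."""
    if len(seq) < n:
        raise ValueError("sequence is shorter than n")
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length n-grams."""
    return sum(x != y for x, y in zip(a, b))


def bag_dissimilarity(bag_a: List[str], bag_b: List[str]) -> float:
    """Mean, over instances in bag_a, of the distance to the closest
    instance in bag_b (an assumed, asymmetric bag dissimilarity)."""
    return sum(min(hamming(a, b) for b in bag_b) for a in bag_a) / len(bag_a)


def dissimilarity_vector(seq: str, prototypes: List[str], n: int = 3) -> List[float]:
    """Represent a sequence by its bag dissimilarity to each prototype."""
    bag = ngrams(seq, n)
    return [bag_dissimilarity(bag, ngrams(p, n)) for p in prototypes]


if __name__ == "__main__":
    # Toy sequences, made up for illustration only.
    print(dissimilarity_vector("ACDEF", ["ACDEF", "GGGGG"], n=3))
```

Under these assumptions, the resulting fixed-length vectors could be passed to any standard binary classifier to separate homologous from non-homologous proteins.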

Original language: English
Pages (from-to): 231-236
Number of pages: 6
Journal: Pattern Recognition Letters
Volume: 128
Publication status: Published - 1 Dec 2019

    Research areas

  • Dissimilarity representation, Multiple-instance learning, Protein Remote Homology Detection
