Data masking for recommender systems: Prediction performance and rating hiding

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

Abstract

Data science challenges allow companies, and other data holders, to collaborate with the wider research community. In the area of recommender systems, the potential of such challenges to move forward the state of the art is limited due to concerns about releasing user interaction data. This paper investigates the potential of privacy-preserving data publishing for supporting recommender system challenges. We propose a data masking algorithm, Shuffle-NNN, with two steps: Neighborhood selection and value swapping. Neighborhood selection preserves valuable item similarity information. The data shuffling technique hides (i.e., changes) ratings of users for individual items. Our experimental results demonstrate that the relative performance of algorithms, which is the key property that a data science challenge must measure, is comparable between the original data and the data masked with Shuffle-NNN.
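
The two steps named in the abstract can be illustrated with a small sketch. The abstract does not specify how neighborhoods are built or how values are swapped, so everything below is an assumption made for illustration and is not the authors' Shuffle-NNN implementation: the function names, the parameter k, the use of cosine item similarity, and the choice to permute values among a user's rated items plus their k most similar unrated items are all hypothetical.

```python
import numpy as np


def item_cosine_similarity(R):
    """Cosine similarity between item columns of a user x item matrix (0 = unrated)."""
    norms = np.linalg.norm(R, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for items nobody rated
    S = (R.T @ R) / np.outer(norms, norms)
    np.fill_diagonal(S, 0.0)  # an item is not its own neighbor
    return S


def shuffle_within_neighborhood(R, k=2, seed=0):
    """Hypothetical two-step masking: neighborhood selection, then value swapping.

    ASSUMPTION: a user's neighborhood is the set of items they rated plus the
    k unrated items most similar to those rated items. The abstract does not
    define the neighborhood; this is only one plausible reading.
    """
    rng = np.random.default_rng(seed)
    S = item_cosine_similarity(R)
    masked = R.copy()
    n_items = R.shape[1]
    for u in range(R.shape[0]):
        rated = np.flatnonzero(R[u])
        if rated.size == 0:
            continue
        # Step 1 (assumed): score unrated items by total similarity to rated items.
        scores = S[:, rated].sum(axis=1)
        scores[rated] = -np.inf  # rated items are already in the neighborhood
        n_extra = min(k, n_items - rated.size)
        extra = np.argsort(scores)[::-1][:n_extra]
        neighborhood = np.concatenate([rated, extra])
        # Step 2 (assumed): permute the user's values inside the neighborhood.
        masked[u, neighborhood] = rng.permutation(R[u, neighborhood])
    return masked


def hidden_fraction(R, masked):
    """Share of originally rated (user, item) cells whose value changed."""
    rated = R != 0
    return float(np.mean(masked[rated] != R[rated]))
```

Under these assumptions, each user keeps the same multiset of rating values, while the item a given value is attached to may change. Calling `shuffle_within_neighborhood(R, k=2)` on a small user x item array and then `hidden_fraction(R, masked)` gives a rough measure of how many original ratings were changed.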

Original language: English
Title of host publication: ACM RecSys LBR 2019 ACM RecSys 2019 Late-breaking Results
Subtitle of host publication: Proceedings of ACM RecSys 2019 Late-breaking Results co-located with the 13th ACM Conference on Recommender Systems (RecSys 2019)
Editors: Marko Tkalcic, Sole Pera
Publisher: CEUR-WS
Pages: 21-25
Number of pages: 5
Publication status: Published - 2019
Event: 2019 ACM Conference on Recommender Systems Late-breaking Results, ACM RecSys LBR 2019 - Copenhagen, Denmark
Duration: 16 Sept 2019 to 20 Sept 2019

Publication series

Name: CEUR Workshop Proceedings
Volume: 2431
ISSN (Print): 1613-0073

Conference

Conference: 2019 ACM Conference on Recommender Systems Late-breaking Results, ACM RecSys LBR 2019
Country/Territory: Denmark
City: Copenhagen
Period: 16/09/19 to 20/09/19

Keywords

  • Data masking
  • Privacy-preserving data publishing
  • Recommender systems
