• Yongming Luo
• George H. L. Fletcher
• Jan Hidders
• Paul De Bra
Computing containment relations between massive collections of sets is a fundamental operation in data management, for example in graph analytics and data mining applications. Motivated by recent hardware trends, in this paper we present two novel solutions for computing set-containment joins over massive sets: the Patricia Trie-based Signature Join (PTSJ) and PRETTI+, a Patricia-trie-enhanced extension of the state-of-the-art PRETTI join. The compact trie structure not only enables efficient use of main memory, but also significantly boosts the performance of both approaches. By carefully analyzing the algorithms and conducting extensive experiments with various synthetic and real-world datasets, we show that, in many practical cases, our algorithms are an order of magnitude faster than the state of the art.
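To make the operation concrete, the following is a minimal sketch of a set-containment join with a bitmask signature filter. It is a naive nested-loop baseline for illustration only, not PTSJ or PRETTI+ and not the authors' implementation. The names `signature`, `containment_join`, and `SIG_BITS`, the 64-bit signature length, the integer element type, and the join direction (pairs where a set from `R` contains a set from `S`) are all assumptions made for this example.

```python
from typing import FrozenSet, Iterable, List, Tuple

SIG_BITS = 64  # assumed signature length; a tuning choice, not taken from the paper


def signature(items: Iterable[int], bits: int = SIG_BITS) -> int:
    """Hash each element to one bit position and OR the bits together."""
    sig = 0
    for x in items:
        sig |= 1 << (x % bits)
    return sig


def containment_join(
    R: List[FrozenSet[int]], S: List[FrozenSet[int]]
) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) such that R[i] is a superset of S[j].

    If s is a subset of r, every bit set in sig(s) is also set in sig(r),
    so a failed bit test safely rules the pair out. A passed bit test can
    be a false positive, hence the exact subset check afterwards.
    """
    r_sigs = [signature(r) for r in R]
    s_sigs = [signature(s) for s in S]
    result = []
    for i, r in enumerate(R):
        for j, s in enumerate(S):
            # Cheap filter first, exact verification second.
            if s_sigs[j] & ~r_sigs[i] == 0 and s <= r:
                result.append((i, j))
    return result


R = [frozenset({1, 2, 3}), frozenset({2, 4})]
S = [frozenset({1, 2}), frozenset({2}), frozenset({5})]
pairs = containment_join(R, S)
```

The sketch compares every pair of signatures, which costs time quadratic in the number of sets. The abstract attributes the speed-up of both proposed approaches to organizing the data in a compact Patricia trie.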
Original language: English
Title of host publication: 31st IEEE International Conference on Data Engineering, ICDE 2015, Seoul, South Korea, April 13-17, 2015
Editors: Johannes Gehrke, Wolfgang Lehner, Kyuseok Shim, Sang Kyun Cha, Guy M. Lohman
Publisher: IEEE
Pages: 303-314
Number of pages: 12
Publication status: Published - 2015

ID: 46974159