To-many or to-one? All-in-one! Efficient purely functional multi-maps with type-heterogeneous hash-tries

Michael J. Steindorfer, Jurgen J. Vinju

Research output: Chapter in Book/Conference proceedings/Edited volumeConference contributionScientificpeer-review

1 Citation (Scopus)

Abstract

An immutable multi-map is a many-to-many map data structure with expected fast insert and lookup operations. This data structure is used for applications processing graphs or many-to-many relations as applied in compilers, runtimes of programming languages, or in static analysis of object-oriented systems. Collection data structures are assumed to carefully balance execution time of operations with memory consumption characteristics and need to scale gracefully from a few elements to multiple gigabytes at least. When processing larger in-memory data sets the overhead of the data structure encoding itself becomes a memory usage bottleneck, dominating the overall performance. In this paper we propose AXIOM, a novel hash-trie data structure that allows for a highly efficient and type-safe multi-map encoding by distinguishing inlined values of singleton sets from nested sets of multi-mappings. AXIOM strictly generalizes over previous hash-trie data structures by supporting the processing of fine-grained type-heterogeneous content on the implementation level (while API and language support for type-heterogeneity are not scope of this paper). We detail the design and optimizations of AXIOM and further compare it against state-of-the-art immutable maps and multi-maps in Java, Scala and Clojure. We isolate key differences using microbenchmarks and validate the resulting conclusions on a case study in static analysis. AXIOM reduces the key-value storage overhead by 1.87 x; with specializing and inlining across collection boundaries it improves by 5.1 x.

Original languageEnglish
Title of host publicationPLDI 2018 - Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation
PublisherAssociation for Computing Machinery (ACM)
Pages283-295
Number of pages13
ISBN (Electronic)9781450356985
DOIs
Publication statusPublished - 11 Jun 2018
Event39th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2018 - Philadelphia, United States
Duration: 18 Jun 201822 Jun 2018

Conference

Conference39th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2018
Country/TerritoryUnited States
CityPhiladelphia
Period18/06/1822/06/18

Keywords

  • Data structures
  • Functional programming
  • Graph
  • Hashtable
  • JVM
  • Many-to-many relation
  • Multi-map
  • Optimization
  • Performance
  • Persistent data structures

Fingerprint

Dive into the research topics of 'To-many or to-one? All-in-one! Efficient purely functional multi-maps with type-heterogeneous hash-tries'. Together they form a unique fingerprint.

Cite this