ISWC 2020 Paper: Tentris – A Tensor-Based Triple Store

This page provides access to all the scripts, data and code needed to reproduce the results of the paper:

Alexander Bigerl, Felix Conrads, Charlotte Behning, Mohamed Ahmed Sherif, Muhammad Saleem and Axel-Cyrille Ngonga Ngomo (2020) Tentris – A Tensor-Based Triple Store. In: The Semantic Web – ISWC 2020

Cite as:

@InProceedings{bigerl2020tentris,
  author = {Bigerl, Alexander and Conrads, Felix and Behning, Charlotte and Sherif, Mohamed Ahmed and Saleem, Muhammad and Ngonga Ngomo, Axel-Cyrille},
  booktitle = {The Semantic Web -- ISWC 2020},
  publisher = {Springer International Publishing},
  title = { {T}entris -- {A} {T}ensor-{B}ased {T}riple {S}tore},
  pages = {56--73},
  url = {https://papers.dice-research.org/2020/ISWC_Tentris/iswc2020_tentris_public.pdf},
  year = 2020,
  isbn = {978-3-030-62419-4}
}

Abstract

The number and size of RDF knowledge graphs grows continuously. Efficient storage solutions for these graphs are indispensable for their use in real applications. We present such a storage solution dubbed Tentris. Our solution represents RDF knowledge graphs as sparse order-3 tensors using a novel data structure, which we dub hypertrie. It then uses tensor algebra to carry out SPARQL queries by mapping SPARQL operations to Einstein summation. By being able to compute Einstein summations efficiently, Tentris outperforms the commercial and open-source RDF storage solutions evaluated in our experiments by at least 1.8 times with respect to the average number of queries it can serve per second on three datasets of up to 1 billion triples.
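
The mapping from SPARQL to Einstein summation described in the abstract can be illustrated with a minimal sketch. This is not the Tentris implementation and does not model the hypertrie: it assumes a made-up example graph, a simple dictionary encoding of terms to integers, and hypothetical helper names (`slice_predicate`, `einsum_path2`). The graph is stored as a sparse order-3 tensor (only the coordinates of non-zero entries are kept), and a two-pattern query is evaluated as the summation R[x, z] = sum over y of A[x, y] * B[y, z].

```python
from collections import Counter

# Hypothetical example graph (not taken from the paper's datasets).
triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "knows", "carol"),
    ("carol", "knows", "dave"),
    ("bob", "knows", "dave"),
]

# Dictionary-encode every RDF term to an integer id.
terms = sorted({term for triple in triples for term in triple})
ident = {term: i for i, term in enumerate(terms)}

# Sparse order-3 Boolean tensor: the set of (s, p, o) coordinates that are 1.
T = {(ident[s], ident[p], ident[o]) for s, p, o in triples}


def slice_predicate(tensor, p):
    """Fix the predicate dimension, giving an order-2 tensor over (s, o)."""
    return {(s, o) for s, pred, o in tensor if pred == p}


def einsum_path2(a, b):
    """Compute R[x, z] = sum_y a[x, y] * b[y, z] for sparse Boolean matrices.

    The value stored for (x, z) is the number of bindings of y, which
    corresponds to the multiplicity of that solution under bag semantics.
    """
    successors = {}
    for y, z in b:
        successors.setdefault(y, []).append(z)
    result = Counter()
    for x, y in a:
        for z in successors.get(y, []):
            result[(x, z)] += 1
    return result


# SELECT ?x ?z WHERE { ?x :knows ?y . ?y :knows ?z }
knows = slice_predicate(T, ident["knows"])
solutions = einsum_path2(knows, knows)

# Decode integer ids back to terms for display.
decoded = {(terms[x], terms[z]): n for (x, z), n in solutions.items()}
print(decoded)
```

In this sketch the pair (alice, dave) has count 2 because it is reachable via both bob and carol; collapsing all counts to 1 would give the DISTINCT result.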

Supplementary Material

Extended Example: pdf

Proof of Hypertrie Space Complexity: pdf

Impact of Small and Large Queries: csv | plots

Evaluation Results

HTTP benchmarks | CLI benchmarks | data loading stats

Reproducing Evaluation

We provide an Ansible playbook that automatically sets up the triple stores, datasets, queries, benchmarking tools and scripts needed to run the benchmarks on a test machine. The machine should have at least 32 cores, 768 GB of RAM and 1 TB of free space on /home. We tested the setup on Ubuntu 20.04 and Debian Buster.
Visit Ansible playbook
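
Before running the playbook, it may help to check the test machine against the hardware requirements stated above. The following is a minimal sketch and not part of the provided playbook: the function names are hypothetical, the thresholds are interpreted as decimal units (1 GB = 10^9 bytes), and the RAM probe assumes a Linux system where `os.sysconf` exposes `SC_PHYS_PAGES` and `SC_PAGE_SIZE`.

```python
import os
import shutil

# Thresholds from the requirements above, read as decimal units (assumption).
MIN_CORES = 32
MIN_RAM_BYTES = 768 * 10**9
MIN_FREE_BYTES = 1 * 10**12


def find_problems(cores, ram_bytes, free_bytes):
    """Return a list of unmet requirements; the list is empty if all are met."""
    problems = []
    if cores < MIN_CORES:
        problems.append(f"only {cores} cores, need {MIN_CORES}")
    if ram_bytes < MIN_RAM_BYTES:
        problems.append(f"only {ram_bytes / 10**9:.0f} GB RAM, need 768 GB")
    if free_bytes < MIN_FREE_BYTES:
        problems.append(f"only {free_bytes / 10**12:.2f} TB free, need 1 TB")
    return problems


def probe(path="/home"):
    """Read core count, physical RAM and free disk space on a Linux host."""
    cores = os.cpu_count() or 0
    ram_bytes = os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    free_bytes = shutil.disk_usage(path).free
    return cores, ram_bytes, free_bytes
```

On the target machine, `find_problems(*probe())` returns an empty list when all three thresholds are met. Keeping the comparison separate from the probing makes the check easy to try with made-up values on a machine that does not have a /home partition.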

Direct download links for the benchmarks and Tentris-binaries used in the evaluation are provided below.

Tentris Binaries:
v1.0.4 | v1.0.4 2-way join | v1.0.4 random label order

SWDF Benchmark:
rdf data | queries | queries’ stats

DBpedia 2015-10 en Benchmark:
rdf data [1] | queries | queries’ stats

WatDiv Benchmark:
generator [2] | queries | queries’ stats

[1] concatenation of those files
[2] run with scale factor 10000