Track

Design Science

Abstract

The current “data deluge” has flooded the Web of Data with very large RDF datasets. They are hosted and queried through SPARQL endpoints, which act as nodes of a semantic net built on the principles of the Linked Data project. Although this is a realistic philosophy for global data publishing, query performance degrades when the RDF engines behind the endpoints manage these huge datasets: their indexes cannot be fully loaded in main memory, so these systems must perform slow disk accesses to resolve SPARQL queries. This paper addresses the problem with a compact indexed RDF structure, called k2-triples, which applies compact k2-tree structures to the well-known vertical-partitioning technique. It obtains an ultra-compressed representation of large RDF graphs and allows SPARQL queries to be resolved fully in memory without decompression. We show that k2-triples clearly outperforms the state of the art in compressibility and traditional vertical partitioning in query resolution, while remaining very competitive with multi-index solutions.
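
The idea of combining vertical partitioning with k2-trees can be illustrated with a small sketch. It is not the paper's implementation: it assumes k = 2, a single shared dictionary for subjects and objects, plain Python lists instead of compressed bitmaps, and a naive linear-time rank. All names (`build_k2tree`, `has_cell`, `index_triples`, `ask`) are hypothetical. Each predicate gets its own binary matrix (rows are subject IDs, columns are object IDs), and each matrix is stored as a level-order k2-tree whose bits can be navigated without decompressing it.

```python
from collections import defaultdict

K = 2  # branching factor per dimension (assumed; each node has K*K children)


def build_k2tree(points, size):
    """Build level-order bit lists (T, L) for a size x size binary matrix.

    `points` holds the (row, col) cells set to 1; `size` is a power of K.
    T stores the bits of internal levels, L the bits of the last level.
    """
    T, L = [], []
    level = [(0, 0, size, list(points))]  # (row0, col0, side, points inside)
    while level:
        next_level = []
        for r0, c0, side, pts in level:
            sub = side // K
            for i in range(K):
                for j in range(K):
                    rr, cc = r0 + i * sub, c0 + j * sub
                    inside = [(r, c) for r, c in pts
                              if rr <= r < rr + sub and cc <= c < cc + sub]
                    bit = 1 if inside else 0
                    if sub == 1:
                        L.append(bit)          # single cells: leaf level
                    else:
                        T.append(bit)
                        if bit:                # only non-empty areas are expanded
                            next_level.append((rr, cc, sub, inside))
        level = next_level
    return T, L


def has_cell(T, L, size, row, col):
    """Check one cell by walking from the root down, using rank on T."""
    bits = T + L
    base, side = 0, size  # `base` is where the current node's children start
    while True:
        sub = side // K
        pos = base + (row // sub) * K + (col // sub)
        if bits[pos] == 0:
            return False
        if pos >= len(T):                      # reached the leaf level
            return True
        base = sum(T[:pos + 1]) * K * K        # naive rank1; real systems use o(n) rank
        row, col, side = row % sub, col % sub, sub


def index_triples(triples):
    """Vertical partitioning: one k2-tree per predicate."""
    terms = sorted({t for s, _, o in triples for t in (s, o)})
    ids = {term: i for i, term in enumerate(terms)}
    size = K
    while size < len(ids):
        size *= K
    by_pred = defaultdict(list)
    for s, p, o in triples:
        by_pred[p].append((ids[s], ids[o]))
    trees = {p: build_k2tree(pts, size) for p, pts in by_pred.items()}
    return ids, size, trees


def ask(index, s, p, o):
    """Answer a fully bound triple pattern (s, p, o)."""
    ids, size, trees = index
    if p not in trees or s not in ids or o not in ids:
        return False
    T, L = trees[p]
    return has_cell(T, L, size, ids[s], ids[o])
```

For example, after `idx = index_triples([("alice", "knows", "bob")])`, the call `ask(idx, "alice", "knows", "bob")` walks only the tree for `knows`. Patterns with an unbound subject or object would correspond to traversing a whole column or row of that tree, which this sketch leaves out.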
