Towards Efficient SPARQL Query Processing on RDF Data
【摘要】：Efficient support for querying large-scale resource description framework (RDF) triples plays an important role in semantic web data management. This paper presents an efficient RDF query engine to evaluate SPARQL queries, where the inverted index structure is employed for indexing the RDF triples. A set of operators on the inverted index was developed for query optimization and evaluation. Then a main-tree-shaped optimization algorithm was developed that transforms a SPARQL query graph into the op-timal query plan by effectively reducing the search space to determine the optimal joining order. The opti-mization collects a set of RDF statistics for estimating the execution cost of the query plan. Finally the opti-mal query plan is evaluated using the defined operators for answering the given SPARQL query. Extensive tests were conducted on both synthetic and real datasets containing up to 100 million triples to evaluate this approach with the results showing that this approach can answer most queries within 1 s and is extremely efficient and scalable in comparison with previous best state-of-the-art RDF stores.