An Evaluation of Triple-Store Technologies for Large Data Stores

Rohloff, Dean, Emmons, Ryder, Sumner

rdf semantic web triple store evaluation query

Looks at Sesame, Jena, and AllegroGraph when working with large datasets

Primarily Lehigh University Benchmark (LUBM) datasets, mentions University Ontology Benchmark

Metrics: Load time, repository size, response time

  • Cumulative load time: How long it takes to load data and ontologies (hours!)
  • Query response time: Mean execution time over four runs of the same query
  • Completeness: Complete iff returns all correct responses
  • Soundness: Sound iff only returns correct responses
  • Disk-Space usage: Amount of disk space used to load data and ontologies
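
The response-time, completeness, and soundness definitions above can be sketched in a few lines of Python. This is only an illustration of the definitions, not the paper's actual harness: the names `mean_response_time`, `is_complete`, `is_sound`, and the `run_query` callable are hypothetical, and answers are assumed to be hashable values that can be compared as sets.

```python
import time
from statistics import mean


def mean_response_time(run_query, repeats=4):
    """Mean wall-clock time over `repeats` identical runs of `run_query`.

    `run_query` is a hypothetical zero-argument callable that executes
    one query against the store under test.
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return mean(timings)


def is_complete(returned, expected):
    """Complete iff every correct answer appears in the returned answers."""
    return set(expected) <= set(returned)


def is_sound(returned, expected):
    """Sound iff every returned answer is a correct answer."""
    return set(returned) <= set(expected)
```

For example, a store that returns only two of three correct answers is sound but not complete, while a store that returns all three plus one wrong answer is complete but not sound.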

Multiple styles of query used

  • Low volume, low complexity (Lehigh query 1)
  • High volume, low complexity (Lehigh query 2)
  • High complexity (Lehigh query 9)

Terms used for the query styles

  • High volume: A large portion of the data is returned in the result set
  • Complexity: Substantial processing time is required

Queries were manually ported to the query language each store accepts

