RDFPeers: A Scalable Distributed {RDF} Repository Based on a Structured Peer-to-peer Network
Cai and Frank
rdf semanticweb p2p query search triple store
@inproceedings{cai:www-2004,
title={RDFPeers: A Scalable Distributed {RDF} Repository Based
on a Structured Peer-to-peer Network},
author={Cai, M. and Frank, M.},
booktitle={13th International Conference on the World Wide Web},
pages={650--657},
year={2004}
}
Attempts to provide a distributed RDF triple store in a scalable fashion
Takes each triple and stores it at three hosts
- Each host is determined by hashing the subject, predicate, and object respectively
- URIs and string literals are SHA-1 hashed to determine the key
- A locality-preserving hash is applied to numeric literals
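A minimal sketch of the key derivation in the bullets above, not the paper's code. The function names, the 160-bit identifier space, and the linear scaling used as the locality-preserving hash are all assumptions for illustration; the notes only say that such a hash exists.

```python
import hashlib

ID_BITS = 160  # assumed identifier-space width (matches SHA-1 output size)


def sha1_key(term: str) -> int:
    """Map a URI or string literal to an identifier via SHA-1."""
    return int.from_bytes(hashlib.sha1(term.encode("utf-8")).digest(), "big")


def numeric_key(value: float, lo: float, hi: float) -> int:
    """Order-preserving map of a number in [lo, hi] onto the identifier space.

    Linear scaling is a stand-in chosen for this sketch; nearby numbers
    land on nearby keys, which is what makes range queries routable.
    """
    if not lo <= value <= hi:
        raise ValueError("value outside the assumed numeric range")
    frac = (value - lo) / (hi - lo)
    return min(int(frac * 2**ID_BITS), 2**ID_BITS - 1)


def triple_keys(subject: str, predicate: str, obj, lo: float = 0.0, hi: float = 1e6) -> dict:
    """Return the three keys under which one triple would be stored."""
    if isinstance(obj, (int, float)):
        obj_key = numeric_key(obj, lo, hi)  # numeric literal
    else:
        obj_key = sha1_key(obj)  # URI or string literal
    return {
        "subject": sha1_key(subject),
        "predicate": sha1_key(predicate),
        "object": obj_key,
    }
```

Usage: triple_keys("http://example.org/alice", "http://example.org/age", 42) yields one key per attribute, and each key selects the host that stores a copy of the triple.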
Uses MAAN---Multi-Attribute Addressable Network---to spread triples
- Extension to Chord to allow multiple keys
- Storing a triple takes O(M log N) messages in total
- N nodes, M attributes (M is always 3 in RDFPeers)
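A rough sketch of where the O(M log N) figure comes from, assuming the standard Chord rule that a key belongs to the first node whose identifier is equal to or follows it on the ring. The function names and the use of ceil(log2 N) as the per-lookup hop count are illustrative assumptions; constant factors are ignored.

```python
import bisect
import math


def successor(sorted_node_ids: list, key: int) -> int:
    """Return the node responsible for key: the first id >= key, wrapping around."""
    i = bisect.bisect_left(sorted_node_ids, key)
    return sorted_node_ids[i % len(sorted_node_ids)]


def estimated_store_messages(num_attributes: int, num_nodes: int) -> int:
    """Order-of-magnitude estimate: one ~log2(N)-hop lookup per attribute."""
    return num_attributes * math.ceil(math.log2(num_nodes))
```

With M = 3 and N = 1024, the estimate is 3 lookups of about 10 hops each.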
Querying follows directly from the three-way placement: hash a bound subject, predicate, or object from the query and route to the node responsible for that key
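A small sketch of resolving a single triple pattern under this placement. The preference order (subject, then object, then predicate) and the names routing_attribute and matches are assumptions for illustration, not taken from the paper; None stands for a query variable.

```python
import hashlib


def sha1_key(term: str) -> int:
    """Map a URI or string literal to an identifier via SHA-1."""
    return int.from_bytes(hashlib.sha1(term.encode("utf-8")).digest(), "big")


def routing_attribute(pattern: dict) -> tuple:
    """Pick a bound term to route on and return (attribute name, key).

    Predicates are tried last here on the assumption that they are the
    least selective; this ordering is a guess made for the sketch.
    """
    for attr in ("subject", "object", "predicate"):
        if pattern.get(attr) is not None:
            return attr, sha1_key(pattern[attr])
    raise ValueError("at least one term of the pattern must be bound")


def matches(triple: dict, pattern: dict) -> bool:
    """Local filter run at the destination node: bound terms must be equal."""
    return all(v is None or triple[k] == v for k, v in pattern.items())
```

The node that owns the returned key already holds every triple with that term, so it can apply matches to its local store and return the results.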
Edutella and its successors are closely related work
- But they are based on flooding and super-peers, with some routing based on fixed, known schemas
Triples are replicated on neighboring nodes to guard against node failure
Popular values are a significant problem in this scheme, since every triple sharing a value hashes to the same node
- E.g., rdf:type predicates will be plentiful, to say the least
- RDFPeers manages this by tracking how frequent each predicate value is and no longer indexing on it once a threshold is crossed
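A minimal sketch of the threshold idea in the last bullet, kept to a single process. The class name, the counter, and the exact cutoff rule are assumptions for illustration; how RDFPeers tracks frequency across nodes is not covered in these notes.

```python
from collections import Counter


class PredicateIndexPolicy:
    """Stop indexing on a predicate once it has been seen too often."""

    def __init__(self, threshold: int):
        self.threshold = threshold  # hypothetical cutoff, chosen by the operator
        self.counts = Counter()

    def should_index(self, predicate: str) -> bool:
        """Record one more occurrence and report whether to index on it."""
        self.counts[predicate] += 1
        return self.counts[predicate] <= self.threshold
```

A triple whose predicate is over the threshold would still be stored under its subject and object keys; only the predicate-keyed copy is skipped.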