Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications
Stoica, Morris, Karger, Kaashoek, Balakrishnan
distributed hashtable dht
@inproceedings{stoica:sigcomm-2001,
title={{Chord}: A Scalable Peer-to-Peer Lookup
Service for {Internet} Applications},
author={Stoica, I. and Morris, R. and Karger, D. and
Kaashoek, M.F. and Balakrishnan, H.},
booktitle={Conference on Applications, Technologies, Architectures, and
Protocols for Computer Communications ({SIGCOMM})},
pages={149--160},
year={2001},
organization={{ACM}}
}
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications. Stoica, Morris, Karger, Kaashoek, Balakrishnan. SIGCOMM 2001.
This paper, and the Chord work, are trying to solve one basic problem: Given a key, find the node responsible for that key in a large network of nodes. A primary, and reasonable, assumption is that there are too many keys and nodes for any node to keep track of many of either. Furthere, there is no natural hierarchical or other structure to the network, as imposed by services such as DNS. How the key and the responsible node are used is up to the application, but typically involves that node storing data associated with that key.
Chord provides a relatively simple solution to this problem, based on hashing keys and nodes onto a one-dimensional space and then hopping through nodes in that space to search for responsible nodes. It provides asymptotic bounds logarithmic in the number of nodes for both storage per node and search hops.
A significant shortcoming that Chord appears to have is the difficulty of using partial keys. This makes it difficult to support tasks such as searching for data by keywords. Even with simple keywords, at first glance it seems difficult to assign a structure to the keywords such that similar (subset) searches produce similar (i.e. commonly prefixed) keys and some mechanism adopted to search the Chord network for matching keys (e.g. to find all nodes within a key range). This sounds even more difficult for more structured data and queries, such as finding all recent data or data containing particular values. However, neither of these tasks is what Chord is intended for, although they are closely related, keyword searching in particular.
It's also not exactly clear how practical Chord would be on real networks, both Internet or otherwise. For example, the frequent outages and partitions on mobile networks sound problematic, although the authors do describe some preliminary ideas on replication and detecting partitions. The protocol would also become more complicated and likely generate more network traffic if it did more to route more efficiently in terms of network latency in addition to search hops.