Papers /

Li-P 2 P 2003

Reading

Outdoors

Games

Hobbies

LEGO

Food

Code

Events

Nook

sidebar

Li-P 2 P 2003

On the Feasibility of Peer-to-Peer Web Indexing and Search

Li et al

distributed hashtables p2p search keywords approximate matching

@article{li:p2p-2003,
  title={On the Feasibility of Peer-to-Peer Web Indexing and Search},
  author={Li, J. and Loo, B. and Hellerstein, J. and Kaashoek, M. and
          Karger, D. and Morris, R.},
  journal={Peer-to-Peer Systems II},
  pages={207--215},
  year={2003},
  publisher={Springer-Verlag}
}

Notably in this context, core DHT mechanics are based around flat labels: Content identifiers consist of exact numeric or bit string identifiers. Those identifiers are generated either from some natural key associated with content, e.g., a URI, or other exact specification, e.g., hashing the content itself. DHT mechanisms rely on having exact content identification and do not directly provide more expressive addressing and querying models, e.g., keyword search for content.

The most natural mechanism to provide search in a DHT stores content at its identifier as usual. A set of extracted features of the content, such as keywords, are then also mapped to identifiers, with lists of associated content identifiers kept by the controlling nodes of each. Search proceeds by mapping search terms to identifiers, fetching the lists associated with edge, applying any query logic to them, e.g., intersecting results, and fetching from the remaining object identifiers. This approach, however, may introduce significant network or storage resource consumption~\cite{li:p2p-2003} from the replicated content pointers, and becomes significantly worse once practical issues such as substrings and misspellings are considered. Alternative DHT structures address these latter two points, such as approximate matching based on metric keyspaces~\cite{wong:hotos-2007,wong:cubit-2008}, but the basic paradigm of replicating object pointers out to controlling nodes for each search feature remains.

Recent Changes (All) | Edit SideBar Page last modified on February 14, 2011, at 12:03 PM Edit Page | Page History