Saroiu-OSDI 2002

An Analysis of Internet Content Delivery Systems

Saroiu, Gummadi, Dunn, Gribble, Levy

content-delivery cdn dissemination internet p2p usage web monitoring

@inproceedings{saroiu:osdi-2002,
  title={An Analysis of Internet Content Delivery Systems},
  author={Saroiu, S. and Gummadi, K.P. and Dunn, R.J.
          and Gribble, S.D. and Levy, H.M.},
  booktitle={Symposium on Operating Systems Design and Implementation},
  pages={315--328},
  month={December},
  year={2002},
  location={Boston, MA}
}

Substantial amount of data collected from the University of Washington network

  • Looking at traffic and usage patterns
    • HTTP Web Traffic
    • Akamai
    • Kazaa
    • Gnutella
  • E.g., average Kazaa user consumes 90 times more bandwidth than the average web user
  • Most Web objects are small (5--10 KB), but the size distribution is heavy-tailed, so large objects do exist
  • Web objects & servers accessed with Zipf popularity distribution
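To make the Zipf observation concrete, here is a minimal sketch of how skewed such a popularity distribution is. The catalog size (10,000 objects) and exponent (alpha = 1.0) are illustrative assumptions, not values taken from the paper.

```python
def zipf_weights(n_objects, alpha=1.0):
    """Normalized Zipf probabilities for popularity ranks 1..n_objects."""
    raw = [1.0 / (rank ** alpha) for rank in range(1, n_objects + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def top_share(weights, k):
    """Fraction of all requests that go to the k most popular objects."""
    return sum(weights[:k])


# Assumed parameters: 10,000 objects, alpha = 1.0 (hypothetical values).
weights = zipf_weights(10_000, alpha=1.0)
print(f"share of requests to the top 1% of objects: {top_share(weights, 100):.1%}")
```

Under these assumptions, the most popular 1% of objects attracts roughly half of all requests, which is why a modest cache can absorb a large share of the traffic.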

Web caching helps alleviate network and server loads

  • Cache hit rates of 40--50% on Web traffic may be achieved
    • Hit rate increases only logarithmically with user growth
    • Constrained by dynamic content
  • Web content delivery networks work mostly through DNS interposition or URL rewriting
    • Do reduce average download times, but DNS redirection adds latency
    • Possibly they merely steer clients away from the worst server, rather than toward the optimal one
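The diminishing return from a growing user population can be illustrated with a small simulation. This is a sketch under stated assumptions: requests are drawn from a Zipf distribution (alpha = 1.0) over 100,000 objects, the cache is infinite, and every object is cacheable. It ignores dynamic content and is not meant to reproduce the paper's measured hit rates.

```python
import random
from itertools import accumulate


def zipf_cum_weights(n_objects, alpha=1.0):
    """Cumulative (unnormalized) Zipf weights for ranks 1..n_objects."""
    return list(accumulate(1.0 / rank ** alpha for rank in range(1, n_objects + 1)))


def ideal_hit_rate(n_requests, cum_weights, seed=0):
    """Hit rate of an infinite cache: a request hits if the object was seen before."""
    rng = random.Random(seed)
    requests = rng.choices(range(len(cum_weights)), cum_weights=cum_weights, k=n_requests)
    seen = set()
    hits = 0
    for obj in requests:
        if obj in seen:
            hits += 1
        else:
            seen.add(obj)
    return hits / n_requests


# Assumed catalog size; request volume stands in for user population.
cum = zipf_cum_weights(100_000)
volumes = (1_000, 10_000, 100_000)
rates = [ideal_hit_rate(n, cum) for n in volumes]
for n, rate in zip(volumes, rates):
    print(f"{n:>7} requests -> hit rate {rate:.2f}")
```

Each tenfold increase in request volume raises the hit rate, but never to 100%, because new requests keep reaching into the long tail of rarely requested objects.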

P2P usage patterns are very different from Web use

  • Tend more toward non-interactive batch downloads and larger object transfers, about three orders of magnitude larger than Web objects
  • Most providers are end users with low availability and network resources

Several common P2P search structures: Centralized, (overlay) broadcast, super-peers (hybrid)

  • Most download directly from provider; some download fragments in parallel from multiple providers (BitTorrent, Kazaa)
  • Request rate is low compared to WWW, but transfers last about 1000x longer, so many more P2P connections are live at any point
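The link between a low request rate and many live connections follows from Little's law: average concurrency equals arrival rate times average duration. The sketch below uses made-up rates and durations purely for illustration. Only the 1000x duration ratio comes from the notes above.

```python
def concurrent_transfers(requests_per_second, mean_duration_seconds):
    """Little's law: average number of transfers in progress at any instant."""
    return requests_per_second * mean_duration_seconds


# Hypothetical workload numbers, not measurements from the paper.
web_rate, web_duration = 100.0, 0.5           # many short transfers
p2p_rate, p2p_duration = 1.0, 0.5 * 1000      # few transfers, 1000x longer

web_live = concurrent_transfers(web_rate, web_duration)
p2p_live = concurrent_transfers(p2p_rate, p2p_duration)
print(f"Web: {web_live:.0f} live transfers, P2P: {p2p_live:.0f} live transfers")
```

With these assumed numbers, P2P issues 100 times fewer requests per second than the Web workload yet holds ten times as many connections open.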

Gnutella network does not restructure according to network topology, causing many queries to go outside the immediate area

In theory, Web traffic hits many popular sites hard, while P2P traffic has fewer hotspots and a wider distribution of load

  • This is not true in practice

Based on this study at UW

  • P2P traffic outweighs Web traffic by a factor of three
  • P2P nodes consume bandwidth in both directions
  • Not obvious that P2P traffic scales well at all
    • Its use of the network is so intense that it can easily overwhelm available resources
    • A small number of objects accounts for a disproportionate share of the bandwidth used
  • Placing P2P caches at gateways could provide substantial savings on inbound/outbound bandwidth consumption
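
A rough way to see why a gateway cache could help: if a few large objects are fetched repeatedly, an ideal cache crosses the gateway once per object instead of once per request. The sketch below computes that upper bound. The workload (object sizes in MB and request counts) is entirely hypothetical, and the cache is assumed to be infinite and to serve every repeat request.

```python
def ideal_byte_savings(workload):
    """Upper bound on bandwidth saved by an infinite cache.

    workload: list of (object_size, request_count) pairs.
    Without a cache every request crosses the gateway; with an ideal
    cache each object crosses it only once.
    """
    uncached = sum(size * count for size, count in workload)
    cached = sum(size for size, count in workload if count > 0)
    return 1.0 - cached / uncached


# Hypothetical workload: two large, popular objects and many small, rare ones.
workload = [(700, 40), (650, 25)] + [(5, 1)] * 200
print(f"ideal byte savings: {ideal_byte_savings(workload):.1%}")
```

Real savings would be lower, since this ignores cache capacity, object churn, and transfers that abort partway through.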