
HADI: Mining Radii of Large Graphs

U. KANG and CHARALAMPOS E. TSOURAKAKIS, Carnegie Mellon University
ANA PAULA APPEL, Universidade de São Paulo at São Carlos
CHRISTOS FALOUTSOS, Carnegie Mellon University
JURE LESKOVEC, Stanford University

Given large, multimillion-node graphs (e.g., Facebook, Web-crawls, etc.), how do they evolve over time? How are they connected? What are the central nodes and the outliers? In this article we define the Radius plot of a graph and show how it can answer these questions. However, computing the Radius plot is prohibitively expensive for graphs reaching the planetary scale. There are two major contributions in this article: (a) We propose HADI (HAdoop DIameter and radii estimator), a carefully designed and fine-tuned algorithm to compute the radii and the diameter of massive graphs, that runs on the top of the HADOOP/MAPREDUCE system, with excellent scale-up on the number of available machines; (b) We run HADI on several real world datasets including YahooWeb (6B edges, 1/8 of a Terabyte), one of the largest public graphs ever analyzed. Thanks to HADI, we report fascinating patterns on large
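To make the quantities in the abstract concrete, here is a minimal sketch of an exact radius computation on a toy graph. It assumes the usual reading of a node's radius as the number of hops to the farthest node reachable from it, with the diameter as the largest radius and the Radius plot as the count of nodes at each radius. This is not HADI itself: it is a plain breadth-first search run from every node, which is the kind of exact computation that becomes too expensive on very large graphs. The function names and the toy graph are hypothetical and chosen only for illustration.

```python
from collections import Counter, deque


def node_radius(adj, source):
    """Return the hop count from `source` to the farthest node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:  # first visit gives the shortest hop count
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


def radius_plot(adj):
    """Return (radius -> number of nodes with that radius, diameter)."""
    radii = {v: node_radius(adj, v) for v in adj}
    return Counter(radii.values()), max(radii.values())


# Hypothetical toy input: an undirected path 0 - 1 - 2 - 3 - 4.
toy_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
counts, diameter = radius_plot(toy_graph)
```

On this path graph the middle node has the smallest radius and the two endpoints have the largest, which is how a plot of this kind separates central nodes from outliers. Running one search per node costs time proportional to the number of nodes times the size of the graph, which is impractical at the scales the abstract mentions.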



Publisher: Association for Computing Machinery
Copyright: © 2011 by ACM Inc.
ISSN: 1556-4681
DOI: 10.1145/1921632.1921634


Journal

ACM Transactions on Knowledge Discovery from Data (TKDD), Association for Computing Machinery

Published: Feb 1, 2011
