corner
corner

Phys. Rev. E 70, 066111 (2004) [6 pages]

Finding community structure in very large networks

Download: PDF (194 kB) Buy this article Export: BibTeX or EndNote (RIS)

Aaron Clauset1, M. E. J. Newman2, and Cristopher Moore1,3
1Department of Computer Science, University of New Mexico, Albuquerque, New Mexico 87131, USA
2Department of Physics and Center for the Study of Complex Systems, University of Michigan, Ann Arbor, Michigan 48109, USA
3Department of Physics and Astronomy, University of New Mexico, Albuquerque, New Mexico 87131, USA

Received 30 August 2004; published 6 December 2004

The discovery and analysis of community structure in networks is a topic of considerable recent interest within the physics community, but most methods proposed so far are unsuitable for very large networks because of their computational cost. Here we present a hierarchical agglomeration algorithm for detecting community structure which is faster than many competing algorithms: its running time on a network with n vertices and m edges is O(md log n) where d is the depth of the dendrogram describing the community structure. Many real-world networks are sparse and hierarchical, with mn and d∼log n, in which case our algorithm runs in essentially linear time, O(n log2 n). As an example of the application of this algorithm we use it to analyze a network of items for sale on the web site of a large on-line retailer, items in the network being linked if they are frequently purchased by the same buyer. The network has more than 400 000 vertices and 2×106 edges. We show that our algorithm can extract meaningful communities from this network, revealing large-scale patterns present in the purchasing habits of customers.

© 2004 The American Physical Society

URL:
http://link.aps.org/doi/10.1103/PhysRevE.70.066111
DOI:
10.1103/PhysRevE.70.066111
PACS:
89.75.Hc, 05.10.−a, 87.23.Ge, 89.20.Hh