Triangle Sparsifiers

Publication: Article, 01 Jan 2011, English

Publisher: Journal of Graph Algorithms and Applications

Journal: Journal of Graph Algorithms and Applications, volume 15, pages 703-726 (eISSN: 1526-1719)

Authors: Charalampos E. Tsourakakis; Mihail N. Kolountzakis; Gary L. Miller

doi: 10.7155/jgaa.00245


Abstract

In this work, we introduce the notion of triangle sparsifiers, i.e., sparse graphs that approximately preserve the triangle count of the original graph. This results in a practical triangle counting method with strong theoretical guarantees. For instance, for unweighted graphs we show a randomized algorithm for approximately counting the number of triangles in a graph \(G\), which proceeds as follows: keep each edge independently with probability \(p\), enumerate the triangles in the sparsified graph \(G'\), and return the number of triangles found in \(G'\) multiplied by \(p^{-3}\). We prove that under mild assumptions on \(G\) and \(p\) our algorithm returns a good approximation of the number of triangles with high probability. Specifically, we show that if \(p \geq \max \left(\frac{\text{polylog}(n)\,\Delta}{t}, \frac{\text{polylog}(n)}{t^{1/3}}\right)\), where \(n\), \(t\), \(\Delta\), and \(T\) denote the number of vertices in \(G\), the number of triangles in \(G\), the maximum number of triangles in which an edge of \(G\) is contained, and our triangle count estimate, respectively, then \(T\) is strongly concentrated around \(t\): \[ \mathbf{Pr}\left[|T-t|\geq\epsilon t\right] \leq n^{-K}. \] We illustrate the efficiency of our algorithm on various large real-world datasets where we obtain significant speedups. Finally, we investigate cut and spectral sparsifiers with respect to triangle counting and show that they are not optimal.
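The sampling procedure described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `count_triangles` and `estimate_triangles`, the edge-list input format, and the neighbor-intersection method used for exact counting are all assumptions made for this example. The rescaling by \(p^{-3}\) reflects that a triangle survives only if all three of its edges are kept, which happens with probability \(p^3\) under independent sampling, so the rescaled count equals \(t\) in expectation.

```python
import random
from itertools import combinations


def count_triangles(edges):
    """Exactly count triangles in an undirected simple graph.

    Assumption: `edges` is an iterable of (u, v) pairs with comparable
    vertex labels (e.g. integers). Self-loops are ignored.
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    total = 0
    for u in adj:
        for v in adj[u]:
            if u < v:
                # Common neighbors of u and v close a triangle on edge (u, v).
                total += len(adj[u] & adj[v])
    # Each triangle is found once per edge, i.e. three times.
    return total // 3


def estimate_triangles(edges, p, rng=None):
    """Keep each edge independently with probability p, count triangles
    in the sparsified graph, and rescale the count by p**-3."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    rng = rng or random.Random()
    kept = [e for e in edges if rng.random() < p]
    return count_triangles(kept) / p**3


if __name__ == "__main__":
    # Complete graph on 5 vertices: C(5, 3) = 10 triangles.
    k5 = list(combinations(range(5), 2))
    print(count_triangles(k5))          # exact count: 10
    print(estimate_triangles(k5, 0.5))  # randomized estimate, varies per run
```

On a graph this small the estimate is very noisy; the concentration bound in the abstract applies only when \(p\) satisfies the stated lower bound in terms of \(n\), \(t\), and \(\Delta\).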

Keywords

Graph algorithms (graph-theoretic aspects), triangle sparsifiers, triangle count, sparse graphs

Impact (BIP!)

selected citations: 40
popularity: Top 10%
influence: Top 10%
impulse: Top 10%


Fields of Science

natural sciences

computer and information sciences
