From Graph Theory in the Information Age by Fan Chung:
Empirically, most real-world graphs have the following properties:
- sparsity — The number of edges is within a constant multiple of the number of vertices.
- small world phenomenon — Any two vertices are connected by a short path. Two vertices having a common neighbor are more likely to be neighbors.
- power law degree distribution — The degree of a vertex is the number of its neighbors. The number of vertices with degree j (or having j neighbors) is proportional to j-β for some fixed constant β.
5 thoughts on “Three properties of real-world graphs”
I would rather split “short path” (i.e., the small-world phenomenon) from “vertices with a common neighbor are probably neighbors” (i.e., high transitivity).
The small-world phenomenon presents itself even in purely random graphs, where there is no edge correlation. Formally it means that the average geodesic path grows logarithmically with network size.
On the other hand, transitivity depends on the “kingdom” from where the graph comes. Social and biological networks usually present high transitivity, while some technological networks present low transitivity. This is related to occurrence of communities, but not necessarily.
I highly recommend the SIAM review of Mark Newman (“The structure and function of complex networks”) to introduce yourself on the topic. For a lighter read, “Linked” from Albert-Lászlo Barabási is also a nice introduction, and the book responsible to hook me into this research topic :)
I’m not a graph theorist.
Are these properties independent of each other? The last two don’t really seem to be.
Also, are you sure that “sparse” is actually a feature of “real graphs”, or just a feature of “phenomena that we tend to use graphs to study”? And is sparsity a useful feature such that when we encounter a non-sparse graph we’ll automatically transform the data until the result actually is sparse?
Can you expand on the first point? For any given graph, if we assume no duplicate edges then of course |E| <= |V| * |V| (where we view the first |V| as our "constant"). Do we mean that given a collection of graphs, |E_n| 0?
Eh, either I made a mistake or the formatting got messed up. In words: do we mean that for any collection of sparse/real-world graphs the size of |E| is bounded by a constant times |V|? If so, would not the constant depend on the source of the real world graphs, and couldn’t we call nearly complete graphs sparse by putting them in a collection with sufficiently large constant of proportionality?
Power law or lognornal? Cosma shalizi has a paper on this issue, if I remember well.