Networks and Simulation — Tutorial Two

Supplementary Notes: Generative Models

These notes are intended as pre-tutorial reading alongside Chapters 16, 17.3, and 18.1 of the Atlas. They do not replace the reading but bridge it to the tutorial exercises, filling in details the Atlas covers briefly and making explicit the conceptual connections between the models.

1. Why Build Synthetic Networks?

Before examining specific models, it is worth being precise about why we generate random networks at all. There are three distinct uses, and conflating them leads to confusion about what a model needs to do.

Null models for hypothesis testing. The most common use. Given an observed measurement on a real network — say, that the Medici family has the highest centrality score in the Florentine trade network — we want to ask: is this result surprising, or is it what we would expect from a network with this network’s basic properties? To answer this, we generate an ensemble of synthetic networks that share some properties with the real network (but not others) and check whether our measurement falls outside the typical range. The synthetic network is a baseline, not a model of reality. What matters is choosing the right baseline for the right question.

Forward simulation. When we cannot observe a network directly — a future contact network during a pandemic, a hypothetical infrastructure system, a communication network being designed — we need to generate plausible synthetic instances to study. Here the model must be realistic enough to support valid inference: the synthetic network should reproduce the structural features that matter for the process being studied.

Theoretical understanding. Random graph models are analytically tractable in ways real networks are not. Proving things about ER graphs gives us theorems with precise conditions. This theoretical understanding, even when it does not perfectly describe real networks, gives us intuitions and benchmarks.

These uses impose different requirements on a model. A null model only needs to preserve the properties being controlled for. A forward simulation model needs to capture whatever structural features drive the outcome of interest. A theoretical model needs to be simple enough to analyse. No single model is optimal for all three purposes simultaneously — and choosing between models is a scientific judgement, not a technical one.

2. The Erdős–Rényi Model

You will have covered the ER model in the reading. A brief summary of the key results relevant to the exercises:

The ER model G(n, p) places n nodes and includes each possible edge independently with probability p. The expected degree of every node is ⟨k⟩ = p(n − 1), and the degree distribution converges to a Poisson distribution with mean ⟨k⟩ as n → ∞:

\[P(k) = e^{-\langle k \rangle} \frac{\langle k \rangle^k}{k!}\]

The Poisson distribution has a crucial property: its variance equals its mean. This means ER graphs have very homogeneous degree sequences — most nodes have degree close to ⟨k⟩, and very high-degree nodes are exponentially rare.

Two results are worth keeping in mind for the exercises:

Phase transition. Below p_c = 1/(n − 1) (equivalently ⟨k⟩ = 1), the graph consists of many small isolated components. Above this threshold a giant connected component emerges and grows rapidly. The transition is sharp and well-defined.

Short path lengths. Above the phase transition, the diameter of an ER graph scales as log(n)/log(⟨k⟩). Even for very large n, typical path lengths are short. A network of one million nodes with mean degree 10 has an average path length of around 6. This is the mathematical substrate of “six degrees of separation.”

What ER does not capture: high clustering. The clustering coefficient of a sparse ER graph is C ≈ p ≈ ⟨k⟩/n, which vanishes as n grows. Real social networks have clustering coefficients that are orders of magnitude higher than this prediction.

3. The Watts–Strogatz Model and the Small-World Property

The Watts–Strogatz (WS) model was introduced specifically to explain a pattern observed in real networks that ER cannot reproduce: high local clustering coexisting with short global path lengths. This combination — called the small-world property — is nearly universal in social, biological, and technological networks.

Construction. Start with a regular ring lattice: n nodes arranged in a circle, each connected to its k nearest neighbours on each side (so each node has degree k). This structure has high clustering (your neighbours are likely connected to each other) but long path lengths (to reach the far side of the ring requires O(n/k) steps). Then, independently rewire each edge with probability β: with probability β, replace one endpoint of the edge with a node chosen uniformly at random.

The key result. Even very small values of β — just a few percent of edges rewired — dramatically reduce average path lengths while barely affecting clustering. The intuition is asymmetric: a single random rewiring can create a “shortcut” that connects two distant parts of the network, reducing path length globally; but clustering, which measures local triangular structure, is barely affected by a small number of such shortcuts. The “small-world regime” lies at intermediate β values where both properties are simultaneously satisfied.

What WS captures and misses. The WS model does produce the small-world property. However, for any β > 0 the degree distribution is approximately Poisson — all nodes still have approximately the same degree. Real social networks, as you will see in Exercise 2, have much more heterogeneous degree distributions. The WS model is therefore not a satisfactory description of real social networks, but it establishes an important theoretical point: high clustering and short path lengths are not contradictory and can coexist in a simple model.

4. Degree Distributions: What Real Networks Look Like

The Poisson degree distribution of ER graphs (and approximately of WS graphs) implies that nodes are essentially interchangeable: none has dramatically more connections than others. Real networks are typically very different.

In many real social, technological, and biological networks, the degree distribution has a heavy tail: most nodes have low degree, but a small number of “hubs” have degree far above the mean. For some networks (the World Wide Web, citation networks, metabolic networks) the distribution is approximately a power law:

\[P(k) \propto k^{-\gamma}\]

with exponent γ typically between 2 and 3. These are called scale-free networks. The power law has no characteristic scale — there is no “typical” degree, and hubs with degree many times the mean are not exponentially rare.

The practical implication for modelling is significant. In a Poisson network, the ratio ⟨k²⟩/⟨k⟩ (which controls many network properties including epidemic thresholds and centrality concentration) is close to ⟨k⟩ + 1. In a power-law network with γ < 3, ⟨k²⟩ diverges in the infinite network limit, meaning this ratio can be very large — the network behaves fundamentally differently. A null model based on ER will drastically misrepresent these properties.

The configuration model, described in the next section, was designed specifically to address this: rather than specifying a degree distribution, it takes the actual degree sequence of a real network as input.

5. The Configuration Model

The configuration model generates random graphs that exactly preserve a specified degree sequence while randomising all other structural features. It is the right null model when you want to ask: “Is this observation explained by the degree sequence alone, or does it require something more?”

Construction: stub matching. Assign each node i a number of “stubs” (half-edges) equal to its desired degree k_i. The total number of stubs is 2m (since every edge contributes two stubs). Then repeatedly draw two stubs uniformly at random and join them to form an edge, until no stubs remain.

The multigraph problem. This procedure can produce self-loops (a stub matched to another stub of the same node) and multi-edges (two nodes matched more than once). The resulting object is technically a multigraph, not a simple graph. The expected number of self-loops is approximately (⟨k²⟩ − ⟨k⟩)/(2⟨k⟩), which grows with the heterogeneity of the degree sequence. For networks with heavy-tailed degree distributions, self-loops and multi-edges are not negligible.

There are two principled approaches to this problem. The first is to simply discard self-loops and multi-edges after the fact, yielding a simple graph; this is fast and usually acceptable for sparse networks. The second, due to Fosdick et al. (2016), is to use a Markov chain Monte Carlo approach that samples directly from a specified graph space — either the vertex-labeled simple graph space (each simple graph equally likely) or the stub-labeled multigraph space (each stub-matching equally likely). The two spaces assign different probabilities to the same simple graph, and for heterogeneous networks they can give meaningfully different results. For most practical purposes the difference is small, but it is worth knowing the distinction exists.

What the configuration model preserves and does not preserve. By construction, the model preserves the degree sequence: every realisation has exactly the same degree for every node as the original network. It does not preserve:

Clustering. The randomisation of connections destroys triangular structure. Configuration model graphs typically have much lower clustering than the original network.
Community structure. Nodes in the same community have no higher probability of being connected.
Degree–degree correlations. Whether high-degree nodes tend to connect to other high-degree nodes (assortative) or to low-degree nodes (dissortative) is not preserved.
Any higher-order structure. Motif counts, local patterns, and spatial organisation are all destroyed.

This means that the configuration model is the right null model precisely for questions where these higher-order features are what you are trying to explain. If you observe that a network has significantly more clustering than a configuration model ensemble, that clustering is genuinely not explained by the degree sequence and requires a structural explanation.

6. A Note on the Spectral Gap

An idea that will recur in Tutorial 3 is worth introducing here. A graph is called an expander if every subset of nodes has many edges leaving it — there are no bottlenecks that would isolate part of the network. Intuitively, expanders are networks where information, disease, or influence spreads rapidly and cannot be trapped.

The quality of a graph as an expander is quantified by the spectral gap: the difference λ₁ − λ₂ between the largest and second-largest eigenvalues of the adjacency matrix. A large spectral gap means the leading eigenvector is dominant — one direction of the matrix captures most of the network’s structure. A small spectral gap means several eigenvectors are comparably important, often reflecting the presence of distinct clusters or communities.

ER graphs are good expanders with high probability once above the connectivity threshold: their spectral gap is large and grows with n. The configuration model inherits its spectral properties from the degree sequence; networks with heterogeneous degrees tend to have larger spectral gaps because hubs create strong central structure. Watts–Strogatz graphs, particularly at low β, can have small spectral gaps reflecting their ring-lattice structure.

In Tutorial 3, the leading eigenvector of the adjacency matrix is exactly what eigenvector centrality computes. The spectral gap determines how concentrated centrality is: a large gap means one or a few nodes dominate; a small gap means centrality is spread more evenly. Keeping this connection in mind as you work through the exercises will help you build a coherent picture of how network structure and network analysis methods relate to each other.