
9.4 Graph Theory - Consensus

convergence of agents' behavior

Dept. of Electrical and Systems Engineering
University of Pennsylvania


Lecture notes

1 Reading

Material related to this page can be found in

2 Learning Objectives

By the end of this page, you should know:

  • what a consensus protocol is
  • the consensus theorem
  • how the eigenvalues/eigenvectors of the Laplacian matrix relate to consensus in a graph

3 Consensus Protocols

Consider a collection of $N$ agents that communicate along a set of undirected links described by a graph $G$. Each agent has state $x_i(t) \in \mathbb{R}$, with initial value $x_i(0)$, and together they wish to determine the average of the initial states $\text{avg}(\vv x(0)) = \frac{1}{N} \sum_{i=1}^N x_i(0)$.

The agents implement the following consensus protocol:

$$\dot{x}_i = \sum_{j \in N_i} (x_j - x_i) = -|N_i| \left(x_i - \text{avg}(x_{N_i})\right),$$

where $\text{avg}(x_{N_i}) = \frac{1}{|N_i|} \sum_{j \in N_i} x_j$ is the average of the states of the neighbors of agent $i$. This is equivalent to the first-order homogeneous linear ordinary differential equation:

$$\dot{\vv x} = -L \vv x. \quad (\text{AVG})$$
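To see the protocol in action, here is a minimal numerical sketch: we forward-Euler integrate $\dot{\vv x} = -L\vv x$ on a small triangle graph (the graph, initial condition, and step size are illustrative choices, not fixed by the notes) and watch every state approach the average of the initial states.

```python
import numpy as np

# Graph Laplacian of a triangle on 3 nodes (an illustrative choice,
# not one fixed by the notes): every node has degree 2.
L = np.array([[ 2., -1., -1.],
              [-1.,  2., -1.],
              [-1., -1.,  2.]])

x = np.array([1.0, 5.0, 9.0])  # initial states x(0); avg(x(0)) = 5
dt = 0.01                      # forward-Euler step size (also illustrative)

# Integrate x' = -L x with forward Euler
for _ in range(2000):
    x = x - dt * (L @ x)

print(x)  # every entry is (numerically) avg(x(0)) = 5
```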

Based on our previous analysis of such systems, we know that the solution to (AVG) is given by

$$\vv x(t) = c_1 e^{\lambda_1 t} \vv v_1 + \cdots + c_n e^{\lambda_n t} \vv v_n, \quad \vv x(0) = \begin{bmatrix} \vv v_1 & \cdots & \vv v_n \end{bmatrix} \vv c, \quad (\text{SOL})$$

where $(\lambda_i, \vv v_i)$, $i = 1, \ldots, n$, are the eigenvalue/eigenvector pairs of the negative graph Laplacian $-L$. Thus, the behavior of the consensus system (AVG) is determined by the spectrum of $L$. We will spend the rest of this lecture on understanding the following theorem: if the graph $G$ is connected, then $\vv x(t) \to \text{avg}(\vv x(0)) \mathbf{1}$ as $t \to \infty$, for every initial condition $\vv x(0)$.

This result is extremely intuitive! It says that so long as the information at one node can eventually reach every other node in the graph, then we can achieve consensus via the protocol (AVG). Let’s try to understand why. As in the previous lecture, we order the eigenvalues of $-L$ in decreasing order: $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n$.

Our first observation is that $\lambda_1 = 0$, $\vv v = \mathbf{1}$ is an eigenvalue/eigenvector pair for $-L$. This follows from the fact that each row of $L$ sums to 0, and so:

$$-L \mathbf{1} = \mathbf{0} = 0 \cdot \mathbf{1}.$$
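We can check this numerically; the path graph below is a hypothetical example, chosen only to illustrate that each row of $L$ sums to zero:

```python
import numpy as np

# Laplacian of the path graph 1 - 2 - 3 (a hypothetical example)
L = np.array([[ 1., -1.,  0.],
              [-1.,  2., -1.],
              [ 0., -1.,  1.]])

ones = np.ones(3)
# Each row of L sums to zero, so -L @ 1 is the zero vector:
print(-L @ ones)
```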

A fact that we’ll show is true later is that the eigenvalues of $L$ are all nonnegative, and thus we know that $\lambda_i \leq 0$ for the eigenvalues of $-L$. As such, we know that $\lambda = 0$ is the largest eigenvalue of $-L$: hence we label this pair $\lambda_1 = 0$, $\vv v_1 = \mathbf{1}$.

Next, we recall that for an undirected graph, the Laplacian $L$ is symmetric, and hence is diagonalized by an orthonormal eigenbasis: $-L = Q \Lambda Q^T$, where $Q = \begin{bmatrix} \vv u_1 & \cdots & \vv u_n \end{bmatrix}$ is an orthogonal matrix composed of orthonormal eigenvectors of $L$, and $\Lambda = \text{diag}(\lambda_1, \ldots, \lambda_n)$. Although we do not know $\vv u_2, \ldots, \vv u_n$, we know that $\vv u_1 = \frac{\vv v_1}{\|\vv v_1\|} = \frac{1}{\sqrt{N}} \mathbf{1}$.

We can therefore rewrite (SOL) as:

$$\begin{align*} \vv x(t) &= c_1 e^{0 t} \frac{1}{\sqrt{N}} \mathbf{1} + c_2 e^{\lambda_2 t} \vv u_2 + \cdots + c_n e^{\lambda_n t} \vv u_n \\ &= c_1 \frac{1}{\sqrt{N}} \mathbf{1} + c_2 e^{\lambda_2 t} \vv u_2 + \cdots + c_n e^{\lambda_n t} \vv u_n, \end{align*}$$

where now we can compute $\vv c$ by solving $\vv x(0) = Q \vv c \Rightarrow \vv c = Q^T \vv x(0)$, as $Q$ is an orthogonal matrix.

Let’s focus on computing c1c_1:

$$c_1 = \vv u_1^T \vv x(0) = \frac{1}{\sqrt{N}} \mathbf{1}^T \vv x(0) = \frac{1}{\sqrt{N}} \sum_{i=1}^N x_i(0).$$

Plugging this back into the expression for $\vv x(t)$ above, we get:

$$\begin{align*} \vv x(t) &= \frac{1}{N} \sum_{i=1}^N x_i(0) \cdot \mathbf{1} + c_2 e^{\lambda_2 t} \vv u_2 + \cdots + c_n e^{\lambda_n t} \vv u_n \\ &= \text{avg}(\vv x(0)) \mathbf{1} + c_2 e^{\lambda_2 t} \vv u_2 + \cdots + c_n e^{\lambda_n t} \vv u_n. \end{align*}$$

This is very exciting! We have shown that the solution $\vv x(t)$ to (AVG) is composed of a sum of the final consensus state $\vv x^* = \text{avg}(\vv x(0)) \mathbf{1}$ and exponential functions $c_i e^{\lambda_i t} \vv u_i$, $i = 2, \ldots, n$, evolving in the subspace $\vv u_1^\perp$ orthogonal to the consensus direction $\frac{1}{\sqrt{N}} \mathbf{1}$. Thus, if we can show that $\lambda_2, \ldots, \lambda_n < 0$, we will have established our result.
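A quick numerical sanity check of this decomposition, on a hypothetical path graph: computing $\vv c = Q^T \vv x(0)$ from the orthonormal eigendecomposition of $-L$ and evolving each mode as $c_i e^{\lambda_i t}$ reproduces convergence to the average.

```python
import numpy as np

# Laplacian of the path graph on 3 nodes (hypothetical example); symmetric
L = np.array([[ 1., -1.,  0.],
              [-1.,  2., -1.],
              [ 0., -1.,  1.]])

x0 = np.array([0.0, 3.0, 6.0])  # avg(x(0)) = 3

# Orthonormal eigendecomposition -L = Q diag(lam) Q^T
lam, Q = np.linalg.eigh(-L)     # eigh: eigenvalues in ascending order
c = Q.T @ x0                    # coordinates of x(0) in the eigenbasis

t = 50.0
x_t = Q @ (np.exp(lam * t) * c) # x(t) = sum_i c_i e^{lam_i t} u_i
print(x_t)                      # ~ [3, 3, 3] = avg(x(0)) * 1
```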

To establish this result, we start by stating a widely used theorem for localizing eigenvalues: Gershgorin’s circle theorem. It says that every eigenvalue of a matrix $A$ lies in at least one of the Gershgorin regions $G_i(A) = \{\lambda \mid |\lambda - a_{ii}| \leq r_i\}$, where $r_i = \sum_{j \neq i} |a_{ij}|$ is the sum of the absolute values of the off-diagonal entries of row $i$.

Let’s apply this theorem to a graph Laplacian $L$. The diagonal elements of $L = \Delta - A$ are given by $\Delta_{ii} = \text{out}(v_i)$, the out-degree of node $i$. Further, the radii $r_i = \text{out}(v_i)$ as well, since $a_{ij} = 1$ if node $i$ is connected to node $j$, and 0 otherwise. Therefore, for row $i$, we have the following Gershgorin intervals:

$$G_i(L) = \{\lambda \in \mathbb{R} \mid |\lambda - \text{out}(v_i)| \leq \text{out}(v_i)\}.$$

These are intervals of the form $[0, 2\,\text{out}(v_i)]$, and therefore the union $G(L) = \bigcup_{i=1}^n G_i(L) = [0, 2 d_{\text{max}}]$, where $d_{\text{max}} = \max_i \text{out}(v_i)$ is the maximal out-degree of a node in the graph. Negating everything, we conclude that $G(-L) = [-2 d_{\text{max}}, 0]$.
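We can verify the Gershgorin bound numerically on a small example (a star graph, chosen arbitrarily): every eigenvalue of $L$ lands in $[0, 2 d_{\text{max}}]$.

```python
import numpy as np

# Star graph on 4 nodes: center 0 joined to 1, 2, 3 (an arbitrary example)
A = np.zeros((4, 4))
for j in (1, 2, 3):
    A[0, j] = A[j, 0] = 1.0

L = np.diag(A.sum(axis=1)) - A   # Laplacian L = Delta - A
eigs = np.linalg.eigvalsh(L)     # eigenvalues of the symmetric matrix L
d_max = A.sum(axis=1).max()      # maximal degree, here 3

print(eigs)                      # all lie in [0, 2 * d_max] = [0, 6]
```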

This tells us that $\lambda_i \leq 0$, $i = 1, 2, \ldots, n$, for the eigenvalues of $-L$. This is almost what we wanted. We still need to show that $\lambda_1 = 0$ is the only zero eigenvalue, i.e., that $\lambda_n \leq \cdots \leq \lambda_2 < 0$. To answer this question, we rely on the following proposition: the zero eigenvalue of the Laplacian of an undirected graph $G$ has algebraic multiplicity equal to the number of connected components of $G$; in particular, it is a simple eigenvalue if and only if $G$ is connected.

Unfortunately, proving this result would take us too far astray. Instead, we highlight the intuitive nature of the result in terms of the consensus system (AVG). This proposition tells us that if the communication graph $G$ is connected, i.e., if everyone’s information eventually reaches everyone, then $\vv x(t) \to \vv x^* = \text{avg}(\vv x(0)) \mathbf{1}$ at a rate governed by the slowest decaying mode $e^{\lambda_2 t}$.

In contrast, suppose the graph $G$ is disconnected, and consists of the disjoint union of two connected graphs $G_1 = (\mathcal{V}_1, \mathcal{E}_1)$ and $G_2 = (\mathcal{V}_2, \mathcal{E}_2)$, i.e., $G = (\mathcal{V}_1 \cup \mathcal{V}_2, \mathcal{E}_1 \cup \mathcal{E}_2)$ with $\mathcal{V}_1 \cap \mathcal{V}_2 = \emptyset$ and $\mathcal{E}_1 \cap \mathcal{E}_2 = \emptyset$. Then if we run the consensus protocol (AVG) on $G$, the system decouples into two parallel systems, each evolving on its own graph and blissfully unaware of the other:

$$\dot{\vv x}_1 = -L_1 \vv x_1 \quad \text{and} \quad \dot{\vv x}_2 = -L_2 \vv x_2.$$

Here we use $\vv x_1$ to denote the state of agents in $G_1$, with Laplacian $L_1$, and similarly for $\vv x_2$. By the above discussion, if $G_1$ and $G_2$ are both connected, then $\vv x_i(t) \to \text{avg}(\vv x_i(0)) \mathbf{1}$, and $\lambda = 0$, $\vv v = \mathbf{1}$ is an eigenvalue/eigenvector pair for each graph.

If we now consider the joint graph $G$ composed of the two disjoint graphs $G_1$ and $G_2$, nothing about the consensus protocol needs to change: each subsystem $\dot{\vv x}_i = -L_i \vv x_i$ will evolve as it did before.

To see how this manifests in the algebraic multiplicity of the 0 eigenvalue of $L = \begin{bmatrix} L_1 & \\ & L_2 \end{bmatrix}$, note that for the composite system with state $\vv x = \begin{bmatrix} \vv x_1 \\ \vv x_2 \end{bmatrix}$, we have the consensus dynamics:

$$\begin{bmatrix} \dot{\vv x}_1 \\ \dot{\vv x}_2 \end{bmatrix} = \begin{bmatrix} -L_1 & \\ & -L_2 \end{bmatrix} \begin{bmatrix} \vv x_1 \\ \vv x_2 \end{bmatrix},$$

which has $\lambda_1 = 0$ with $\vv v_1 = \begin{bmatrix} \mathbf{1} \\ \mathbf{0} \end{bmatrix}$ and $\lambda_2 = 0$ with $\vv v_2 = \begin{bmatrix} \mathbf{0} \\ \mathbf{1} \end{bmatrix}$, so that:

$$\begin{bmatrix} \vv x_1^* \\ \vv x_2^* \end{bmatrix} = \begin{bmatrix} \mathbf{1} \\ \mathbf{0} \end{bmatrix} \text{avg}(\vv x_1(0)) + \begin{bmatrix} \mathbf{0} \\ \mathbf{1} \end{bmatrix} \text{avg}(\vv x_2(0)).$$

This is of course expected, as all we have done is rewrite the two decoupled systems using block vectors and matrices; we have not changed anything about the consensus protocol.
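A short check of this multiplicity argument: building the block-diagonal Laplacian of a hypothetical disconnected graph (a triangle plus a single edge) and counting zero eigenvalues numerically gives one zero per component.

```python
import numpy as np

# Block-diagonal Laplacian of a hypothetical disconnected graph:
# a triangle on nodes 0-2 and a single edge on nodes 3-4.
L1 = np.array([[ 2., -1., -1.],
               [-1.,  2., -1.],
               [-1., -1.,  2.]])
L2 = np.array([[ 1., -1.],
               [-1.,  1.]])
L = np.block([[L1, np.zeros((3, 2))],
              [np.zeros((2, 3)), L2]])

eigs = np.linalg.eigvalsh(L)
n_zero = int(np.sum(np.isclose(eigs, 0.0, atol=1e-9)))
print(n_zero)  # 2: one zero eigenvalue per connected component
```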

4 Python Break: Determining if a Graph is Connected!

In this section, let’s look at an efficient algorithm to check if an (undirected) graph $G$ is connected, i.e., has exactly one connected component. Of course, we showed above that the zero eigenvalue has algebraic multiplicity one in the Laplacian of $G$ if and only if $G$ is connected, which gives us one avenue of attack: compute the eigenvalues of the Laplacian of $G$.

Instead, we will explore how a much simpler class of algorithms, known as graph traversal algorithms, can be used to find the connected components of a graph (and count how many there are). The particular class of algorithms we will look at are called tree-based graph traversal algorithms. At a high level, the algorithm starts at an arbitrary vertex and maintains a set of visited vertices; at each step, it chooses an unvisited vertex that is adjacent to a visited vertex and visits it. A visual can be found below:


From left to right, top to bottom, we see that at each iteration, we choose a vertex (marked in blue) that is adjacent to a green vertex and make it green (visit it). If we start this procedure from a vertex $v$, then it will in fact visit all vertices in the connected component of $v$. Try to convince yourself of this fact! To start, recall that $u$ is in the connected component of $v$ if and only if there is a path $P = v \to \cdots \to u$ from $v$ to $u$ in the graph.

Below, we have an implementation of this procedure in Python:

import random
from random import randrange
random.seed(6928)

class Graph:
    def __init__(self, n):
        self.n_vertices = n
        self.neighbors = [set() for _ in range(n)]

    def add_edge(self, i, j):
        self.neighbors[i].add(j)
        self.neighbors[j].add(i)

    def n_edges(self):
        return int(sum([len(s) for s in self.neighbors]) / 2)
    
    def print_edges(self):
        for v in range(self.n_vertices):
            for neighbor in self.neighbors[v]:
                if neighbor > v:
                    print('(' + str(v) + ', ' + str(neighbor) + ')')

    def connected_component(self, v):
        """Return the set of vertices in the connected component containing v."""
        cc = set()

        # visited[u] is True once u has been added to the component
        visited = [False for _ in range(self.n_vertices)]
        # frontier holds candidate vertices adjacent to visited ones
        frontier = [v]

        while len(frontier) > 0:
            # pick an arbitrary (here: random) vertex from the frontier
            index = randrange(len(frontier))
            v = frontier.pop(index)
            if visited[v]:
                continue  # already visited via another neighbor; skip

            visited[v] = True
            cc.add(v)
            # all neighbors become candidates for future visits
            for neighbor in self.neighbors[v]:
                frontier.append(neighbor)

        return cc

g = Graph(5)
g.add_edge(0, 1)
g.add_edge(1, 2)
g.add_edge(2, 0)
g.add_edge(3, 4)

print('Edges of g:')
g.print_edges()
print('Connected component of 0:', g.connected_component(0))
print('Connected component of 1:', g.connected_component(1))
print('Connected component of 2:', g.connected_component(2))
print('Connected component of 3:', g.connected_component(3))
print('Connected component of 4:', g.connected_component(4))

print()

h = Graph(5)
h.add_edge(0, 1)
h.add_edge(1, 2)
h.add_edge(2, 0)
h.add_edge(3, 4)
h.add_edge(0, 4)

print('Edges of h:')
h.print_edges()
print('Connected component of 0:', h.connected_component(0))
print('Connected component of 1:', h.connected_component(1))
print('Connected component of 2:', h.connected_component(2))
print('Connected component of 3:', h.connected_component(3))
print('Connected component of 4:', h.connected_component(4))
Edges of g:
(0, 1)
(0, 2)
(1, 2)
(3, 4)
Connected component of 0: {0, 1, 2}
Connected component of 1: {0, 1, 2}
Connected component of 2: {0, 1, 2}
Connected component of 3: {3, 4}
Connected component of 4: {3, 4}

Edges of h:
(0, 1)
(0, 2)
(0, 4)
(1, 2)
(3, 4)
Connected component of 0: {0, 1, 2, 3, 4}
Connected component of 1: {0, 1, 2, 3, 4}
Connected component of 2: {0, 1, 2, 3, 4}
Connected component of 3: {0, 1, 2, 3, 4}
Connected component of 4: {0, 1, 2, 3, 4}

Above, we define two graphs g and h and find the connected component of each vertex in the two graphs using the connected_component method. Note that g has two connected components, {0, 1, 2} and {3, 4}, whereas h has only one connected component, {0, 1, 2, 3, 4}, meaning that h is connected and g is not.

Let’s understand how this algorithm implements the procedure we described above. At a high level, at the start of each iteration of the while loop, the cc set contains all the vertices we have already visited (the visited array also records this), and the frontier list contains the candidate vertices adjacent to vertices we have already visited (possibly including some already-visited duplicates, which are simply skipped). At each iteration, we pick a random vertex from the frontier list and visit it, and we repeat until there are no more vertices to explore (i.e., we have explored the entire connected component). At the end of this procedure, the cc set contains all the vertices in our connected component!
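Building on this idea, one can count all connected components by restarting the traversal from each not-yet-visited vertex. The sketch below uses a plain adjacency-list dict instead of the Graph class above, and the function name `connected_components` is our own choice.

```python
# Counting connected components by repeated traversal. This sketch uses a
# plain adjacency-list dict instead of the Graph class above, and the
# function name is our own choice.
def connected_components(neighbors):
    """neighbors: dict mapping each vertex to the set of adjacent vertices."""
    visited = set()
    components = []
    for start in neighbors:
        if start in visited:
            continue  # start is already inside a discovered component
        cc = set()
        frontier = [start]
        while frontier:
            v = frontier.pop()
            if v in visited:
                continue
            visited.add(v)
            cc.add(v)
            frontier.extend(neighbors[v])
        components.append(cc)
    return components

# Same example as graph g above: a triangle {0, 1, 2} plus an edge {3, 4}
g_adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}, 3: {4}, 4: {3}}
print(connected_components(g_adj))  # two components: {0, 1, 2} and {3, 4}
```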

To conclude, we have a few remarks about the way we choose vertices from the frontier list. Currently, we select vertices at random, but we could instead use any rule to choose the next vertex. For example:

  • If we choose the vertex in frontier which was added earliest (out of all vertices which are in frontier), then this algorithm is called breadth-first search, or BFS.

  • If we choose the vertex in frontier which was most recently added, then this algorithm is called depth-first search, or DFS.
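The two rules above can be sketched in a few lines: with the frontier stored as a deque, popping from the left gives BFS while popping from the right gives DFS. The helper `traverse` and its adjacency-dict input are illustrative, not from the notes (and marking vertices as seen on enqueue is one of several common variants).

```python
from collections import deque

# Sketch: BFS and DFS differ only in which end of the frontier is popped.
# Vertices are marked seen on enqueue, one of several common variants.
def traverse(neighbors, start, mode="bfs"):
    visited = []
    frontier = deque([start])
    seen = {start}
    while frontier:
        v = frontier.popleft() if mode == "bfs" else frontier.pop()
        visited.append(v)
        for u in sorted(neighbors[v]):  # sorted for a deterministic order
            if u not in seen:
                seen.add(u)
                frontier.append(u)
    return visited

adj = {0: {1, 2}, 1: {0, 3}, 2: {0}, 3: {1}}
print(traverse(adj, 0, "bfs"))  # [0, 1, 2, 3]: visits by distance from 0
print(traverse(adj, 0, "dfs"))  # [0, 2, 1, 3]: dives down one branch first
```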

A visual of BFS versus DFS is below:


If you take a computer science class, you’ll definitely see these very commonly used algorithms in a lot more detail!
