Distributed Nash Equilibrium Seeking by A Consensus Based Approach

arXiv:1602.00771v3 [math.OC] 26 Mar 2017

Maojiao Ye and Guoqiang Hu

Abstract— In this paper, Nash equilibrium seeking among a
network of players is considered. Different from many existing
works on Nash equilibrium seeking in non-cooperative games,
the players considered in this paper cannot directly observe
the actions of the players who are not their neighbors. Instead,
the players are supposed to be capable of communicating with
each other via an undirected and connected communication
graph. By a synthesis of a leader-following consensus protocol
and the gradient play, a distributed Nash equilibrium seeking
strategy is proposed for the non-cooperative games. Analytical
analysis on the convergence of the players’ actions to the Nash
equilibrium is conducted via Lyapunov stability analysis. For
games with non-quadratic payoffs, where multiple isolated Nash
equilibria may coexist in the game, a local convergence result
is derived under certain conditions. Then, a stronger condition
is provided to derive a non-local convergence result for the
non-quadratic games. For quadratic games, it is shown that
the proposed seeking strategy enables the players’ actions to
converge to the Nash equilibrium globally under the given
conditions. Numerical examples are provided to verify the
effectiveness of the proposed seeking strategy.
Index Terms— Nash equilibrium; gradient play; leaderfollowing consensus; neighboring communication

I. INTRODUCTION
The past decade witnessed the penetration of game theory
into various research fields including biology, economy,
computer science, just to name a few. With the development
of game theory, Nash equilibrium seeking in non-cooperative
games emerges to be of both theoretical significance and
practical relevance (e.g., see [1]-[20] and the references
therein).
The authors in [2] formulated pure strategy Nash equilibria
seeking as a mixed-integer linear programming problem for
pool-based electricity market. Gradient play was leveraged
for finding differential Nash equilibria in continuous games
in [3]. Dynamic fictitious play and gradient play were
exploited in [4] for a continuous-time form of repeated
matrix games. Policy evaluation and policy improvement
were utilized for the computation of the Nash equilibrium in
differential graphical games [5]. The discrete-time stochastic
algorithm developed in [6] allows the players to take actions
in both simultaneous and asynchronous fashions. Based on
the state-based potential games, the authors in [7] considered
game design for distributed optimization problems, in which
a distributed process was proposed to obtain the equilibrium.
By utilizing the saddle point dynamics, convergence to the
Nash equilibria of a two-network zero-sum game was derived
M. Ye and G. Hu are with the School of Electrical and Electronic
Engineering, Nanyang Technological University, 639798, Singapore (Email:
mjye@ntu.edu.sg,gqhu@ntu.edu.sg).
This work was supported by Singapore Economic Development Board
under EIRP grant S14-1172-NRF EIRP-IHL.

in [10]. Switching communications were considered for the
two-network zero-sum games in [11]. Nash equilibrium seeking in generalized convex games was considered in [12]-[14].
The authors in [12] and [13] solved the generalized convex
game by a discrete-time distributed algorithm and Lemke’s
method was adapted for the computation of generalized
Nash equilibrium in convex games with quadratic payoffs
subject to linear constraints in [14]. Besides, extremum
seeking based approaches were proposed to seek for the
Nash equilibrium (e.g., see [1] and [15]-[18]). These methods
vary from integrator-type extremum seeking [1], discretetime extremum seeking [15], stochastic extremum seeking
[16], Shahshahani gradient-like extremum seeking [17], to
Lie bracket approximation based extremum seeking [18], etc.
A common characteristic of these methods is that no explicit
model information is required for the implementation of the
methods.
In non-cooperative games, the players’ payoff functions
are determined by the players’ own actions, together with
the other players’ actions [21]. Hence, a body of the existing
works require the players’ observations over their opponents’
actions to search for the Nash equilibrium. However, full
communication is impractical in many engineering systems
(e.g., multi-agent systems, ad hoc networks) [22]. Motivated
by the penetration of the game theoretic approaches into
cooperative control and distributed optimization problems in
engineering systems where full communication is not available (see, e.g., [5], [7], [8], [9]), this paper addresses Nash
equilibrium seeking under local communication network, i.e.,
the players communicate with their neighbors only.
To solve games with limited information, the main idea
of this paper is to utilize a consensus protocol to broadcast
local information. Consensus problems have been extensively
investigated in the existing literature (e.g., see [23]-[30] [42],
[43]). For instance, a class of consensus controllers were
proposed for networked dynamical systems in [42] and a
consensus based approach was studied in [43] for distributed
coordination of the generation, load and storage devices in a
microgrid. In particular, leader-following consensus concerns
with the synchronization of the agents’ states to a common
value, which is equal to the reference signal provided by the
leader [26]. The proposed Nash equilibrium seeking strategy
is based on an adaptation of a leader-following consensus
protocol and the gradient play. More specifically, each agent
acts as a virtual leader to provide its action as the reference
signal and the agents generate their estimates on the players’
actions by utilizing a leader-following consensus protocol.
Based on the estimates, the gradient play is implemented for
each player.
Related works: Two-network zero sum games were in-

vestigated in [10] and [11] under communication graphs.
Distributed learning for games on communication graph
was investigated in [31] and [32] for large-scale multiagent games by an adaptation of the fictitious play. Though
the communication setting in [32] was similar to the presented work, the authors considered games with identical
permutation-invariant utilities or exact potential games with
a permutation-invariant potential function, which are distinct
from the games considered in this paper. In [33], the authors
considered Nash equilibrium seeking for aggregative games
on graph, where each player’s payoff function depends on
their own actions and an aggregate of all the players’
actions. A discrete-time gossip algorithm was proposed to
solve it. A similar problem was addressed in [34] for the
energy consumption game in smart grids. The continuoustime method in [34] was based on a dynamic average
consensus protocol and the primal-dual dynamics. The idea
on solving games without using full information from all
the players was then generalized in [22], where the players’
payoff functions depend on the players’ actions in a more
general manner. The Nash equilibrium was characterized
by a variational inequality and a gossip-based algorithm
was proposed. The players were equipped with a waking
clock and they updated their actions asynchronously to reach
the Nash equilibrium. Different from [22], we consider the
Nash equilibrium seeking problem in a deterministic and
continuous-time scenario. The continuous-time algorithm
activates the powerful analysis tools in control theory to
solve games. Compared with the existing works, the main
contributions of the paper are summarized as follows.
1) Nash equilibrium seeking for non-cooperative games,
where the players have no direct access to the actions of the players who are not their neighbors, is
investigated in this paper. Based on a leader-following
consensus protocol and the gradient play, a Nash
equilibrium seeking strategy is designed. In the proposed algorithm, the players only need to communicate
with their neighbors on their estimates of the players’ actions. Avoiding full communication among the
players broadens the applicability of game theory to
engineering systems where only local communication
is attainable.
2) The convergence of the players’ actions to the Nash
equilibrium by utilizing the proposed seeking strategy
is analytically explored. Based on the Lyapunov stability analysis, it is shown that the proposed method
enables the players’ actions to converge to the Nash
equilibrium under certain conditions. Non-quadratic
games are firstly investigated followed by quadratic
games.
The rest of the paper is organized as follows. The problem
is formulated in Section II and the main results are presented in Section III. In the main result part, non-quadratic
games are firstly investigated followed by quadratic games.
Numerical examples are presented in Section IV to verify
the effectiveness of the proposed method. Conclusions are
given in Section V.

II. P ROBLEM F ORMULATION
Problem 1: Consider a game with N players. The set
of players is denoted by N = {1, 2, · · · , N }. The payoff
function of player i is fi (x), where x = [x1 , x2 , · · · , xN ]T ∈
RN is the vector of the players’ actions and xi ∈ R is the
action of player i. Suppose that if player j is not a neighbor
of player i, then, player i has no direct access to player
j’s action. Given that the Nash equilibrium of the game
exists, design a Nash equilibrium seeking strategy that can
be adopted by the players to learn the Nash equilibrium of
the non-cooperative game.
For the convenience of the readers, the definition of the
Nash equilibrium is given below.
Nash equilibrium is an action profile on which no player
can gain more payoff by unilaterally changing its own action,
i.e., an action profile x∗ = (x∗i , x∗−i ) is the Nash equilibrium
if [37]
fi (x∗i , x∗−i ) ≥ fi (xi , x∗−i ), ∀i ∈ N ,
(1)
where x−i = [x1 , x2 , · · · , xi−1 , xi+1 , · · · , xN ]T . Note that
fi (x) and x might alternatively be written as fi (xi , x−i ) and
(xi , x−i ), respectively, in this paper.
Remark 1: The objective for the players in the game is
different from the objective of the agents in distributed
optimization problems. Given the other players’ actions, the
players in the game intend to maximize their own payoffs by
adjusting their own actions. In contrast, the agents engaged in
distributed optimization problems collaboratively maximize
the sum of the agents’ objective functions, i.e.,
maxx

N
X

fi (x),

(2)

i=1

where fi (x) is the objective function of agent i and x is the
vector of the decision variables (see, e.g., [35]).
The following assumptions on the communication graph
(see Section VI-A for the related definitions) and the payoff
functions will be utilized in the rest of the paper.
Assumption 1: The players can communicate with each
other via an undirected and connected communication graph.
Assumption 2: The players’ payoff functions fi (x), ∀i ∈
N are C 2 functions.
III. M AIN R ESULTS
In this section, a distributed Nash equilibrium seeking
strategy will be proposed based on a leader-following consensus protocol and the gradient play. Non-quadratic games
are firstly considered followed by quadratic games.
Strategy design: Noticing that the players have no direct
access to the actions of the players who are not their
neighbors, we suppose that each player generates estimates
on the players’ actions. Let yi = [yi1 , yi2 , · · · , yiN ]T ∈ RN
denote player i’s estimates on the players’ actions and yij is
player i’s estimate on player j’s action. Then, the action of
player i is updated according to
ẋi = ki

∂fi
(yi ), i ∈ N ,
∂xi

(3)

where ki = δ k̄i , δ is a small positive parameter and k̄i is
a positive, fixed parameter. Furthermore, yij , ∀i, j ∈ N are
generated by
!
N
X
ẏij = −
aik (yij − ykj ) + aij (yij − xj ) , (4)
k=1

where aij is the element on the ith row and jth column of
the adjacency matrix of the communication graph.
Insight into the strategy design: Let τ = δt. Then, at the
τ -time scale,
∂fi
dxi
(yi ),
= k̄i
dτ
∂xi
!
(5)
N
X
dyij
δ
=−
aik (yij − ykj ) + aij (yij − xj ) ,
dτ
k=1

q
∀i, j ∈ N . Define yij
as the quasi-steady state of yij , for all
q
i, j ∈ N , i.e., yij for i, j ∈ N satisfy

N
X
q
q
q
−(
aik (yij
− ykj
) + aij (yij
− xj )) = 0, ∀i, j ∈ N . (6)
k=1

q
Then, yij
= xj , ∀i, j ∈ N as the communication graph is
undirected and connected.
Letting δ = 0 “freezes” yij , ∀i, j ∈ N on the quasi-steady
q
state on which yij = yij
= xj , ∀i, j ∈ N . Then, the reducedsystem is
∂fi
dxi
(x), ∀i ∈ N
(7)
= k̄i
dτ
∂xi
by which the convergence to the Nash equilibrium can be
derived under certain conditions (see, e.g., [1] and [3]).

A. Games with Non-quadratic Payoffs
In the following, we show that the seeking strategy in
(3) and (4) enables the players’ actions to converge to the
Nash equilibrium. To facilitate the subsequent analysis, the
following assumptions are made.
Assumption 3: There exists at least one, possibly multiple
Nash equilibria x∗ = [x∗1 , x∗2 , · · · , x∗N ]T such that for all
i ∈ N,
∂ 2 fi ∗
∂fi ∗
(x ) = 0,
(x ) < 0.
∂xi
∂x2i
Assumption 4: The matrix


2
2
2



B=



∂ f1
(x∗ )
∂x2
1
2
∂ f2
(x∗ )
∂x2 ∂x1

..
.
2

∂ fN
(x∗ )
∂xN ∂x1

∂ f1
(x∗ )
∂x1 ∂x2
2
∂ f2
(x∗ )
∂x2
2

···
..

2

∂ fN
(x∗ )
∂xN ∂x2

∂ f1
(x∗ )
∂x1 ∂xN
2
∂ f2
(x∗ )
∂x2 ∂xN

.
∂ 2 fN
(x∗ )
∂x2
N
2




,



is strictly diagonally dominant, i.e., | ∂∂xf2i (x∗ )| >
i
PN
∂ 2 fi
∗
|
(x
)|,
∀i
∈
N
[39].
j=1,j6=i ∂xi ∂xj
Remark 2: Assumptions 3-4 are adoptions of Assumptions 4.3-4.4 in [1]. By Theorem 2 in [3], the Nash equilibria
that satisfy Assumptions 3-4 are isolated. The objective of
this paper is to design a Nash equilibrium seeking strategy
that can be utilized by the players to learn the Nash equilibrium of the non-cooperative games. The characterizations

on the existence, uniqueness and isolation issues of the Nash
equilibria are beyond the scope of the paper. The readers are
referred to [3], [21] for the related explorations.
For notational convenience, let diag{pij }, pij ∈ R, i, j ∈
N be a diagonal matrix whose diagonal elements are
p11 , p12 , · · · , p1N , p21 , · · · , pN N , successively. Similarly,
diag{pi }, pi ∈ R, i ∈ N denotes an N × N dimensional
diagonal matrix whose ith
h diagonal element is ipi .
T

N (x)
1 (x) ∂f2 (x)
,
= ∂f∂x
, ∂x2 , · · · , ∂f∂x
Moreover, define ∂G(x)
∂x
1
N
h
iT
∂f2
∂fN
∂f1
∂G
=
,
∂x (y)
∂x1 (y1 ), ∂x2 (y2 ), · · · , ∂xN (yN )
T T
y
=
[y1T , y2T , · · · , yN
] ,
h1 (x)
=
−[a11 x1 , a12 x2 , · · · , a1N xN , a21 x1 , a22 x2 , · · · , aN N xN ]T ,
and B0 = diag{aij }, i, j ∈ N . Then, the concatenated form
of (4) is

ẏ = − ((L ⊗ IN ×N + B0 )y + h1 (x)) ,

(8)

where L is the Laplacian matrix of the communication graph,
IN ×N denotes an N × N dimensional identity matrix and
⊗ is the Kronecker product. Note that −(L ⊗ IN ×N + B0 )
is Hurwitz as the communication graph is undirected and
connected.
Theorem 1: Suppose that Assumptions 1-2 hold, and the
agents update their actions according to (3)-(4). Then, for
each x∗ that further satisfies Assumptions 3-4, there exists a
positive constant δ ∗ such that for each δ ∈ (0, δ ∗ ), (x∗ , 1N ⊗
x∗ ) is exponentially stable.
Proof: See Section VI-B for the proof.

Remark 3: Theorem 1 can be derived by utilizing singular perturbation analysis (see e.g., [1][38]). The detailed
Lyapunov stability analysis is conducted in the proof for
the convenience of the subsequent non-local convergence
analysis. From the proof, it can be seen that the convergence
result depends on the convergence of the players’ actions to
each isolated Nash equilibrium under the gradient play, i.e.,
∂fi
dxi
(x), ∀i ∈ N ,
= k̄i
dt
∂xi

(9)

which is ensured if the matrix k̄B, where k̄ = diag{k̄i }, i ∈
N , is Hurwitz. Hence, Assumption 4 is conservative to derive
the result (see also [1] for the argument).
Note that Assumption 4 is different from the condition
provided in Proposition 2 of [3]. In [3], the gradient play is
governed by
dxi
∂fi
(x), ∀i ∈ N .
(10)
=
dt
∂xi
Linearizing (10) at the Nash equilibrium point, it can be
derived that the Nash equilibrium is exponentially stable
under (10) if B is Hurwitz [3].
In Theorem 1, a local convergence result to each isolated
Nash equilibrium that satisfies the given conditions is presented. In the following, non-local convergence results are
investigated under stronger conditions.
Assumption 5: Each player’s payoff function fi (xi , x−i )
is concave in xi , ∀i ∈ N . Furthermore, for x, z ∈ RN ,


∂G(x) ∂G(z)
T
≤ −m||x − z||2 ,
(11)
(x − z)
−
∂x
∂z

where m > 0 is a constant.
Remark 4: By this assumption, it can be derived that the
Nash equilibrium of the game is unique. In addition, the
= 0N , where 0N denotes an N
stationary condition ∂G(x)
∂x
dimensional column vector composed of 0, is a sufficient
and necessary condition for x = x∗ , which indicates that
(x − x∗ )T

∂G(x)
≤ −m||x − x∗ ||2 ,
∂x

(12)

for x ∈ RN . Similar to [21], the uniqueness of the equilibrium point can be derived by supposing that


∂G(x) ∂G(z)
< 0,
(13)
−
(x − z)T
∂x
∂z
for all distinct x, z in the considered domain. Assumption 5
is slightly stronger than (13) to derive exponential stability
of the Nash equilibrium under the proposed seeking strategy.
Theorem 2: Suppose that Assumptions 1, 2 and 5 are
satisfied and the players update their actions according to
(3)-(4). Then, for each positive constant ∆, there is a
positive constant δ ∗ (∆) such that for each δ ∈ (0, δ ∗ (∆)),
(x, y) converges exponentially to (x∗ , 1N ⊗ x∗ ) for every
||[(x(0) − x∗ )T , (y(0) − 1N ⊗ x∗ )T ]T || ≤ ∆.
Proof: See Section VI-C for the proof.
Corollary 1: Suppose that Assumptions 1, 2 and 5 are
satisfied, the players update their actions according to (3)i (x)
, ∀i ∈ N are globally Lipschitz.
(4) and the functions ∂f∂x
i
Then, there is a positive constant δ ∗ such that for each δ ∈
(0, δ ∗ ), (x∗ , 1N ⊗ x∗ ) is globally exponentially stable.
Proof: See Section VI-D for the proof.
Remark 5: A typical example that satisfies Assumption
5 is a potential game1 in which the potential function is
strongly concave2.
B. Quadratic Games
In this section, quadratic games are considered. Suppose
that player i’s payoff function is
fi (x) =

N
N N
X
1 XX i
vji xj + gi ,
hjk xj xk +
2 j=1
j=1

(16)

k=1

where hijk , vji , gi are the coefficients of the quadratic terms,
monomial terms and constant terms, respectively. Furthermore, hiii < 0, hijk = hikj , ∀i, j, k ∈ N .
1 Given that the players’ payoff functions are continuously differentiable,
the game is a potential game if there exists a function F (x) such that

∂fi (xi , x−i )
∂F (xi , x−i )
=
,
∂xi
∂xi

(14)

∀i ∈ N . Furthermore, the function F is the potential function [36].
2 A differentiable function f is strongly convex if the following inequality
holds for all points x, y in its domain:


∂f (y)
∂f (x)
−
≥ m||x − y||2 ,
(15)
(x − y)T
∂x
∂y
for some parameter m > 0. A function f is strongly concave if −f is
strongly convex [40].

Assumption 6: The matrix
 1
h11 h112
 h221 h222

H = .
 ..
hN
N1

hN
N2


h11N
h22N 

,

N
· · · hN N

···
···
..
.

>
is strictly diagonally dominant, i.e., |hiii |
P
N
i
|h
|,
∀i
∈
N
.
ij
j=1,j6=i
Remark 6: By Assumption 6, the Nash equilibrium of the
quadratic games exists and is unique. Moreover, the unique
Nash equilibrium is given by x∗ = −H −1 v, where v =
N T
[v11 , v22 , · · · , vN
] [1], [41].
Corollary 2: Suppose that Assumptions 1, 2 and 6 hold,
and the agents update their actions according to (3)-(4). Then,
there exists a positive constant δ ∗ such that for each δ ∈
(0, δ ∗ ), (x∗ , 1N ⊗ x∗ ) is globally exponentially stable.
Proof: See Section VI-E for the proof.

Corollary 3: Suppose that Assumptions 1-2 are satisfied,
matrix H is Hurwitz and the agents update their actions
according to (3)-(4). Then, there exists a positive constant
δ ∗ such that for each δ ∈ (0, δ ∗ ), the Nash equilibrium is
globally exponentially stable given that the quadratic game
is a potential game.
Proof: See Section VI-F for the proof.

Remark 7: The presented results still hold if the estimation dynamics in (4) is changed to
!
N
X
ẏij = −mij
aik (yij − ykj ) + aij (yij − xj ) , (17)
k=1

i, j ∈ N , where mij is a fixed positive constant as the matrix
−m(L ⊗ IN ×N + B0 ), where m = diag{mij }, i, j ∈ N , is
Hurwitz.
Remark 8: The proposed seeking strategy can be leveraged to seek for the Nash equilibrium for games where each
player’s payoff function depends on its own action and all
the other players’ actions. However, for aggregative games
in which each player’s payoff function depends on its own
action and an aggregate of the all the player’s actions, it
is not necessary for each player to estimate all the players’
actions. Instead, a dynamic average consensus protocol can
be utilized to estimate the aggregate of the players’ actions
to reduce the computation cost (see e.g., [34]). Moreover,
for games where the players’ payoff functions depend on
a subset of the players’ actions, an interference graph can
be introduced to describe the interactions among the players
(see e.g., [44]). To further reduce the computation cost, the
adaptation of the proposed seeking strategy for games on
interference graph will be considered in future work.

IV. N UMERICAL E XAMPLES
In this section, three games with a network of 5 players
are considered. The communication graph for the players in
the following examples is depicted in Fig. 1.

5
2.5
x (t)

1

4

1
*
1

x

2

2

3

Fig. 1: Communication graph for the players in the numerical
examples.

The players' actions

x (t)
2
*
2

x

1.5

x (t)
3
*
3

x

1

x (t)
4
*
4

x

0.5

x (t)
5
*
5

x

0

-0.5
0

A. Non-quadratic games

(18)

where m1 = 1, m2 = 5, m3 = 2, m4 = 3, m5 = 2, di =
0, ∀i ∈ {1, 2, · · · , 5} in the simulation and

1 4
x + 5x21 + 2x1 x2 + 5x22 + x2 x3
f (x) = −
12 1

5
+x2 x5 + x23 + x3 x4 + 5x24 + 2x4 x5 + 3x25 .
2
The function f (x) is at its maximum when x = 05 , which
is also the Nash equilibrium of the game. Note that in this
game, all the players’ payoffs are maximized if f (x) is
maximized. Hence, any deviation from the Nash equilibrium
reduces all the players’ payoffs, indicating that adopting the
proposed seeking strategy to seek for the Nash equilibrium
would benefit all the players.
3 Note that in the simulations, the distributed control gain discussed in

Remark 7 is included in the seeking strategy.

4

6

8

10

Time (Seconds)

Fig. 2: The plot of xi (t), i ∈ {1, 2, 3, 4, 5} produced by
(3)-(4) in Example 1.

20
x1 (t)

15

x2 (t)
x3 (t)

The players' actions

Example 1: The players’ payoff functions are f1 (x) =
−x31 + 3x1 x2 , f2 (x) = −(−2x1 + 4x2 + 12 x4 + x5 )2 +
48x2 , f3 (x) = −(x1 + 4x3 − x4 − x5 )2 , f4 (x) = −(2x1 +
4x3 + 8x4 − x5 )2 , f5 (x) = −(x1 + 4x3 + 8x4 + 17x5 )2 , for
players 1-5, respectively.
i (x)
= 0, ∀i ∈ {1, 2, · · · , 5} gives x =
Letting ∂f∂x
i
19 1
1 T
1 1 T
[−1, 1, 72
, 9 , − 18
] or x = [ 23 , 94 , − 19
48 , − 6 , 12 ] . How∂ 2 f1 (x)
> 0 for x1 < 0, the point x =
ever, as
∂x21
1
1 T
[−1, 1, 19
,
,
−
]
does not satisfy Assumptions 3-4. From
72 9
18
the other aspect, the matrix B is strictly diagonally dominant
1 1 T
at x = [ 23 , 94 , − 19
48 , − 6 , 12 ] , with its diagonal elements
1 1 T
being negative. Hence, x = [ 32 , 94 , − 19
satisfies
48 , − 6 , 12 ]
the conditions in Theorem 1 by which there is a δ ∗ > 0
such that for each δ ∈ (0, δ ∗ ), the Nash equilibrium is
exponentially stable.
The players’ actions generated by the seeking strategy
in (3)-(4)3 are plotted in Fig. 2. The initial values for the
variables are set as x(0) = [1 2 0 0 0]T and yi (0) =
[1 2 0 0 0]T , ∀i ∈ {1, 2, · · · , 5}, which are close to x∗ as
only local convergence to the Nash equilibrium is ensured
by Theorem 1. From the simulation result, it can be seen
that the players’ actions generated by the proposed seeking
strategy converge to the Nash equilibrium point under the
given initial conditions, which verifies Theorem 1.
Example 2: The payoff function for player i is
fi (x) = mi f (x) + di , i ∈ {1, 2, · · · , 5},

2

10

x4 (t)
x5 (t)

5

0

0
-5
-10
-15
0

2

4

6

8

10

Time (Seconds)

Fig. 3: The plot of xi (t), i ∈ {1, 2, 3, 4, 5} produced by
(3)-(4) in Example 2.

By direct calculation, it can be verified that in this game,
Assumption 5 is satisfied. Hence, for any bounded initial
condition, there is a δ ∗ > 0 such that for each δ ∈ (0, δ ∗ ),
the players’ actions generated by the proposed method converge to the unique Nash equilibrium by Theorem 2. In
the simulation, the initial values of the variables are set as
xi (0) = 20, yij (0) = 20, ∀i, j ∈ {1, 2, · · · , 5}, which are
far away from the equilibrium point. The players’ actions
generated by the proposed method in (3)-(4) are shown in
Fig. 3. It can be seen from the simulation result that the
players’ actions generated by the proposed method in (3)-(4)
converge to the Nash equilibrium though the initial values of
the variables are far away from the equilibrium point, which
verifies Theorem 2.
B. A quadratic game
Example 3: The payoff function of player i is
fi (x) = −ρi (xi − xdi )2 − (p0

N
X
i=1

xi + q0 )xi ,

R EFERENCES

25

The players' actions

20

x (t)
1
*
x
1

15

x (t)
2
*
2

x

10

x (t)
3
*
3

5

x

x (t)
4
*
4

0

x

x (t)
5
*
5

-5

x

-10
0

2

4

6

8

10

Time (Seconds)

Fig. 4: The plot of xi (t), i ∈ {1, 2, 3, 4, 5} produced by
(3)-(4) in Example 3.

for i ∈ {1, 2, · · · , 5}, where ρi > 0, xdi for i ∈ {1, 2, · · · , 5},
p0 > 0 and q0 are constants. The given example can be
utilized to model the energy consumption game for heating
ventilation and air conditioning systems (see, e.g., [34] and
the references therein).
In the simulation, the parameters are set as ρi = 1, for i ∈
{1, 2, · · · , 5}, p0 = 0.1, q0 = 10 and xdi for i ∈ {1, 2, · · · , 5}
are set as 10, 15, 20, 25, 30, respectively. By direct calculation, it can be derived that the unique Nash equilibrium is
x∗ = [2.0147, 6.7766, 11.5385, 16.3004, 21.0623]T . For this
example, H defined in Assumption 6 is strictly diagonally
dominant with all the diagonal elements being negative.
Hence, the conditions in Corollary 2 are satisfied by which,
it can be concluded that there is a δ ∗ > 0 such that for
each δ ∈ (0, δ ∗ ), the equilibrium is globally exponentially
stable under the given strategy. In the simulation, the initial
conditions of all the variables in (3)-(4) are set as −10. The
players’ actions generated by the proposed method in (3)-(4)
are plotted in Fig. 4. From the simulation result, it can be
seen that though the initial conditions are far away from the
equilibrium, the players’ actions still converge to the Nash
equilibrium, which verifies Corollary 2.

V. C ONCLUSION
Distributed Nash equilibrium seeking by neighboring communication for non-cooperative games among a network of
players is studied in this paper. The players are supposed to
be equipped with an undirected and connected communication graph. Based on a leader-following consensus protocol
and the gradient play, a Nash equilibrium seeking algorithm
is designed. For non-cooperative games, local convergence
to the Nash equilibrium is firstly provided under mild conditions. Then, non-local convergence results are derived for the
non-quadratic games under stronger conditions. For quadratic
games, it is proven that under the proposed seeking strategy,
the Nash equilibrium is globally exponentially stable under
certain conditions.

[1] P. Frihauf, M. Krstic and T. Basar, “Nash equilibrium seeking in noncooperative games,” IEEE Trans. Autom. Control, vol. 57, no. 5, pp.
1192-1207, May 2012.
[2] D. Pozo and J. Contreras, “Finding multiple Nash Equilibria in poolbased markets: A stochastic EPEC approach,” IEEE Trans. Power
Syst., vol. 26, no. 3, pp. 1744-1752, Aug. 2011.
[3] L. Ratliff, S. Burden and S. Sastry, “On the characterization of local
Nash Equilibria in continuous games,” IEEE Trans. Autom. Control,
vol. 61, no. 8, pp. 2301-2307, Aug. 2016.
[4] J. Shamma and G. Arslan, “Dynamic fictitious play, dynamic gradient
play, and distributed convergence to Nash equilibria,” IEEE Trans.
Autom. Control, vol. 50, no. 3, pp. 312-327, Mar. 2005.
[5] K. Vamvoudakis, F. Lewis and G. Hudas, “Multi-agent differential
graphical games: online adaptive learning solution for synchronization
with optimality,” Automatica, vol. 48, no. 8, pp. 1598-1611, Aug. 2012.
[6] J. Poveda, A. Teel and D. Nesic, “Flexible Nash seeking using
stochastic difference inclusion,” in Proc. Amer. Control Conf., pp.
2236-2241, Jul. 2015.
[7] N. Li and J. Marden, Designing games for distributed optimization,”
IEEE J. Sel. Topics Signal Process., vol. 7, no. 2, pp. 230-242, 2013.
[8] J. Marden, G. Arslan and J. Shamma, “Cooperative control and
potential games,” IEEE Trans. Syst., Man, Cybern. B, Cybern., vol.
39, no. 6, pp. 1393-1407, 2009.
[9] T. Tanaka, F. Farokhi and C. Langbort, “A faithful distributed implementation of dula decomposition and average consensus algorithms,”
in Proc. IEEE Conf. Decis. Control, pp. 2985-2990, 2013.
[10] B. Gharesifard and J. Cortes, “Distributed convergence to Nash
equilibria in two-network zero-sum games,” Automatica, vol. 49, no.
6, pp. 1683-1692, Jun. 2013.
[11] Y. Lou, Y. Hong, L. Xie, G. Shi and K. Johansson, “Nash equilibrium
computation in subnetwork zero-sum games with switching communications,” IEEE Trans. Autom. Control, vol. 61, 2016.
[12] M. Zhu and E. Frazzoli. “On distributed equilibrium seeking for
generalized convex games,” in Proc. IEEE Conf. Decis. Control,, pp.
4858-4863, Dec. 2012.
[13] M. Zhu and E. Frazzoli, “Distributed robust adaptive
equilibrium computation for generalized convex games,”
http://arxiv.org/abs/1507.01530.
[14] D. Schiro, J. Pang, and U. Shanbhag, ”On the solution of affine
generalized Nash equilibrium problems with shared constraints by
Lemke’s method,” Math. Program., vol. 142, no. 1-2, pp. 1-46, May
2012.
[15] M. Stankovic, K. Johansson, and D. Stipanovic, “Distributed seeking
of Nash Equilibria with applications to mobile sensor networks,” IEEE
Trans. Autom. Control, vol. 57, no. 4, pp. 904-919, Apr. 2012.
[16] S. Liu and M. Krstic, “Stochastic Nash equilibrium seeking for games
with general nonlinear payoffs,” SIAM J. Control Optim., vol. 49, no.
4, pp. 1659-1679, Jan. 2011.
[17] J. Poveda and N. Quijano, “Shahshahani gradient-like extremum
seeking,” Automatica, vol. 58, pp. 51-59, Aug. 2015.
[18] H. Durr, M. Stankovic, C. Ebenbauer and K. Johansson, “Lie bracket
approximation of extremum seeking systems,” Automatica, vol. 49,
no. 6, pp. 1538-1552, Jun. 2013.
[19] A. Mohsenian-Rad, V. Wong, J. Jatskevich, R. Schober and A.
Leon-Garcia, “Autonomous demand-side management based on gametheoretic energy consumption scheduling for the future smart grid,”
IEEE Trans. Smart Grid, vol. 1, no. 3, pp. 320-331, Dec. 2010.
[20] E. Nekouei, T. Alpcan, and D. Chattopadhyay, “Game-Theoretic
Frameworks for demand response in electricity markets,” IEEE Trans.
Smart Grid, vol. 6, no. 2, pp. 748-758, Mar. 2015.
[21] J. Rosen, “Existence and uniquness of equilibrium points for concave
N -person games,” Econometrica, vol. 33, no. 3, p. 520, Jul. 1965.
[22] F. Salehisadaghiani and L. Pavel, “Nash equilibrium seeking by a
gossip-based algorithm,” in Proc. IEEE Conf. Decis. Control, pp.
1155-1160, Dec. 2014.
[23] W. Ren, R. Beard and E. Atkins, “Information consensus in multivehicle cooperative control,” IEEE Control Syst. Mag., vol. 27, no. 2,
pp. 71-82, Apr. 2007.
[24] R. Olfati-Saber and R. Murray, “Consensus problems in networks of
agents with switching Topology and time-delays,” IEEE Trans. Autom.
Control, vol. 49, no. 9, pp. 15201533, Sep. 2004.
[25] Y. Hong, J. Hu and L. Gao, “Tracking control for multi-agent
consensus with an active leader and variable topology,” Automatica,
vol. 42, no. 7, pp. 1177-1182, Jul. 2006.

[26] G. Hu, “Robust consensus tracking of a class of second-order multiagent dynamic systems,” Systems and Control Letters, vol. 61, no. 1,
pp. 134-142, Jan. 2012.
[27] Y. Dong and J. Huang, “A leader-following rendezvous problem of
double integrator multi-agent systems,” Automatica, vol. 49, no. 5,
pp. 1386-1391, May 2013.
[28] G. Wen, G. Hu, W. Yu, J. Cao and G. Chen, “Consensus tracking for
higher-order multi-agent systems with switching directed topologies
and occasionally missing control inputs,” Systems and Control Letters,
vol. 62, no. 12, pp. 1151-1158, Dec. 2013.
[29] R. Freeman, P. Yang and K. Lynch, “Stability and convergence
properties of dynamic average consensus estimators,” in Proc. IEEE
Conf. Decis. Control, pp. 398-403, Dec. 2006.
[30] A. Menon and J. Baras, “Collaborative extremum seeking for welfare
optimization,” in Proc. IEEE Conf. Decis. Control, pp. 346-351, Dec.
2014.
[31] B. Swenson, S. Kar and J. Xavier, “Distributed learning in large-scale
multi-agent games: a modified fictitious play approach,” in Conf. Rec.
46th Asilomar Conf. Signals, Systems and Comput., pp. 1490-1495,
Nov. 2012.
[32] B. Swenson, S. Kar and J. Xavier, “Empirical centroid fictitious play:
an approach for distributed learning in multi-agent games,” IEEE
Trans. Signal Process. , vol. 63, no. 15, pp. 3888-3901, Aug. 2015.
[33] J. Koshal, A. Nedic and U. Shanbhag, “A gossip algorithm for
aggregative games on graphs,” in Proc. IEEE Conf. Decis. Control,
pp. 4840-4845, Dec. 2012.
[34] M. Ye and G. Hu, “Game design and analysis for price-based demand
response: an aggregate game approach,” IEEE Trans. Cybern., vol. 47,
no. 3, pp. 720-730, 2017.
[35] M. Ye and G. Hu, “Distributed extremum seeking for constrained
networked optimization and its application to energy consumption
control in smart grid,” IEEE Trans. Control Syst. Technol., vol. 24,
no. 6, pp. 2048-2058, 2016.
[36] D. Monderer and L. Shapley, “Potential games,” Games Econ. Behav.,
vol. 14, no. 1, pp. 124143, May 1996.
[37] J. Nash, “Non-cooperative games,” Ann. Math., vol. 54, no. 2, p. 286,
Sep. 1951.
[38] H. Khailil, Nonlinear System, Prentice Hall, 3rd edtion, 2002.
[39] R. Horn and C. Johnson, Matrix Analysis, Cambridge, U.K.:Cambridge
Univ. Press, 1985.
[40] D. Bertsekas, A. Nedic and A. Ozdaglar, Convex Analysis and Optimization, Athena Scientific, 2003.
[41] T. Basar and G. Olsder, Dynamic Noncooperative Game Theory, 2nd
ed. Philadelphia, PA: SIAM, 1999.
[42] M. Andreasson, D. Dimarogonas, H. Sandberg and K. Johansson,
“Distributed control of networked dynamical systems: state feedback,
integral action and consensus,” IEEE Trans. Autom. Control, vol. 59,
no. 7, pp. 1750-1764, 2014.
[43] G. Hug, S. Kar and C. Wu, “Consensus+ innovations approach for
distributed multiagent coordination in a microgrid,” IEEE Trans. Smart
Grid, vol. 6, no. 4, pp. 1893-1903, 2015.
[44] C. Tekin, M. Liu, R. Southwell, J. Huang and S. Ahmad, “Atomic
congestion games on graphs and their applications in networking,”
IEEE/ACM Trans. Netw, vol. 20, no. 5, pp. 1541 - 1552, 2012.

VI. A PPENDIX
A. Basics on Graph Theory
For a graph defined as G = (V, E), where E is the edge
set satisfying E ⊆ V × V with V = {1, 2, · · · , N } being
the set of nodes in the network, it is undirected if for every
(i, j) ∈ E, (j, i) ∈ E. Furthermore, j is a neighbor of agent
i if (j, i) ∈ E. An undirected graph is connected if there
is a path between any pair of distinct vertices. The element
on the ith row and jth column of the adjacency matrix A
is defined as aij = 1 if node j is connected with node i,
else, aij = 0. Moreover, we suppose that aii = 0 in this
paper. The Laplacian matrix for the graph L is defined as
L = D−A, where D is a diagonal matrix whose ith diagonal

entry is equal to the out degree of node i, represented by
N
X
aij [26].
j=1

B. Proof of Theorem 1
To facilitate the subsequent analysis, define an auxiliary
system as
∂fi (x)
,i ∈ N.
(19)
ẋi = k̄i
∂xi
Linearizing (19) at x∗ gives
dx
= k̄B(x − x∗ ),
dt

(20)

where k̄ = diag{k̄i }, i ∈ N . Since B is strictly diagonally
dominant with all the diagonal elements being negative,
k̄B is Hurwitz by the Gershgorin Circle Theorem [39].
Hence, the equilibrium point is exponentially stable under
(19). Therefore, there is a function W1 : D0 → R, where
D0 = {x ∈ RN |||x − x∗ || ≤ r0 } for some positive constant
r0 such that
c1 ||x − x∗ ||2 ≤ W1 (x) ≤ c2 ||x − x∗ ||2
T 


∂G(x)
∂W1 (x)
≤ −c3 ||x − x∗ ||2
k̄
∂x
∂x
∂W1 (x)
≤ c4 ||x − x∗ ||,
∂x

(21)

for some positive constants c1 , c2 , c3 , c4 by the Lyapunov
Converse Theorem [38].
Define
ȳ = y − yq ,

(22)

q
q
q
q
q
q
T
where yq = [y11
, y12
, · · · , y1N
, y21
, · · · , y2N
, · · · , yN
N]
q
and yij = xj , ∀i, j ∈ N . Then,

ȳ˙ = ẏ − ẏq

= −((L ⊗ IN ×N + B0 )(ȳ + yq ) + h1 (x)) − ẏq
= −(L ⊗ IN ×N + B0 )ȳ − ẏq .

(23)

Define a Lyapunov candidate function as
V = cW1 (x) + (1 − c)ȳT P1 ȳ,

(24)

where c ∈ (0, 1) is a constant and P1 is a symmetric positive
definite matrix such that
P1 (L ⊗ IN ×N + B0 ) + (L ⊗ IN ×N + B0 )T P1 = Q1 , (25)
where Q1 is a symmetric positive definite matrix as −(L ⊗
IN ×N + B0 ) is Hurwitz by noticing that the communication
graph is undirected and connected.
Define z = [(x − x∗ )T , ȳT ]T . Then, there exists a domain
2
D1 = {z ∈ RN +N |||z|| ≤ r1 }, for some positive constant
r1 such that the time derivative of the Lyapunov candidate

function satisfies
T 


∂G(x)
∂W1 (x)
k̄
V̇ =cδ
∂x
∂x
T 


∂G
∂G
∂W1 (x)
k̄
(y) −
(x)
+ cδ
∂x
∂x
∂x
+ (1 − c)(−(L ⊗ IN ×N + B0 )ȳ − ẏq )T P1 ȳ

(x, y) that belongs to a compact set, the time derivative of
the Lyapunov candidate function satisfies

(26)

+ (1 − c)ȳT P1 (−(L ⊗ IN ×N + B0 )ȳ − ẏq )

≤ − cδc3 ||x − x∗ ||2 − (1 − c)ȳT Q1 ȳ
T

q

∂G
(y)
∂x
+ (1 − c)ȳ˙ T P1 ȳ + (1 − c)ȳT P1 ȳ˙


∂G
∂G
∂G
=δc(x − x∗ )T
(x) + δc(x − x∗ )T
(y) −
(x)
∂x
∂x
∂x
+ (1 − c)(−(L ⊗ IN ×N + B0 )ȳ − ẏq )T P1 ȳ

V̇ =δc(x − x∗ )T

+ (1 − c)ȳT P1 (−(L ⊗ IN ×N + B0 )ȳ − ẏq )

∗

− 2(1 − c)ȳ P1 ẏ + δl1 ||x − x ||||ȳ||,

∂fi
for some positive constant l1 by noticing that the ∂x
(x) for
i
i ∈ {1, 2, · · · , N } are Lipschitz. Let λmin (Q1 ) denote the
minimal eigenvalue of Q1 . Then,

V̇ ≤ − δcc3 ||x − x∗ ||2 − (1 − c)λmin (Q1 )||ȳ||2

T
∂(yq )T
dx
∗
T
+ δl1 ||x − x ||||ȳ|| − 2(1 − c)ȳ P1
∂x
dt
∗ 2
2
≤ − δcc3 ||x − x || − (1 − c)λmin (Q1 )||ȳ||
+ δl2 ||x − x∗ ||||ȳ|| + δl3 ||ȳ||2 ,

(27)
for some positive constants l2 and l3 by noticing that
∂fi
∂fi
∂fi
dxi
= δ k̄i ∂x
(yi ) = δ k̄i ∂x
(yi ) − δ k̄i ∂x
(x∗ ) ≤
dt
i
i
i
∗
δ l̄i1 ||ȳ|| + δ l̄i2 ||x − x ||, for some positive constants l̄i1 and
l̄i2 , for i ∈ N .

 (1−c)λmin (Q1 )
− l3 − l22
δ
. Then, if
Define B1 =
cc3
− l22
3 λmin (Q1 )
0 < δ < 4(1−c)cc
, matrix B1 is symmetric positive
l2 +4cc3 l3
2

definite. If this is the case, V̇ ≤ −δλmin (B1 )||z||2 , where
λmin (B1 ) is the minimal eigenvalue of B1 and λmin (B1 ) >
0.
Noticing that there exist positive constants c̄1 and c̄2 such
that c̄1 ||z||2 ≤ V ≤ c̄2 ||z||2 , it can be derived that
r
(B1 )
c̄2 − δλmin
t
2c̄2
||z(0)||,
(28)
||z(t)|| ≤
e
c̄1
by utilizing the Comparison Lemma [38].
Furthermore, define Er (t) = [(x − x∗ )T , (y −
q
y (x∗ ))T ]T . Then, ||Er (t)|| q≤ ||z(t)|| + ||yq (x) −

≤ − δcm||x − x∗ ||2 + δcl1 ||x − x∗ ||||ȳ||
− (1 − c)ȳT Q1 ȳ − 2(1 − c)ȳT P1 ẏq ,

(30)
for some positive constant l1 .
Hence, for any (x, y) that belongs to the compact set,
V̇ ≤ − δcm||x − x∗ ||2 + δcl1 ||x − x∗ ||||ȳ||

T
∂(yq )T
dx
T
T
− (1 − c)ȳ Q1 ȳ − 2(1 − c)ȳ P1
∂x
dt
≤ − δcm||x − x∗ ||2 − (1 − c)λmin (Q1 )||ȳ||2
+ δcl2 ||x − x∗ ||||ȳ|| + δl3 ||ȳ||2 ,

(31)
for some positive constants l2 and l3 , in which λmin (Q1 )
is the minimal eigenvalue of Q1 . The second inequality in
(31) is derived by using the fact that for (x, y) that belongs
∂fi
∂fi
∂fi
(yi ) = ∂x
(yi ) − ∂x
(x∗ ) ≤
to the compact set, ∂x
i
i
i
∂fi
∂fi
∂fi
∂fi
∗
≤ ¯li1 ||ȳ|| +
∂xi (yi ) − ∂xi (x) +
∂xi (x) − ∂xi (x )
∗
l̄i2 ||x − x ||, for some positive constants l̄i1 and l̄i2 , for all

i ∈ N.
The rest of the proof follows the proof of Theorem 1 and
is omitted here.
D. Proof of Corollary 1

The proof follows the proof of Theorem 2 by further
2
noticing that (30) and (31) hold for any z ∈ RN +N given
i (x)
, ∀i ∈ N , are globally Lipschitz.
that the functions ∂f∂x
i

yq (x∗ )|| ≤ K1 ||z(t)|| ≤ K1 c̄c̄21 e− 2c̄2 t ||z(0)|| ≤
q
δλ
(B1 )
t
− min
2c̄2
K1 c̄c̄21 e
(||Er (0)|| + ||yq (x∗ ) − yq (x(0))||) ≤
q
δλmin (B1 )
K2 c̄c̄21 e− 2c̄2 t ||Er (0)||, for some positive constants
K1 and K2 . Hence, the conclusion is derived.

E. Proof of Corollary 2
If Assumption 6 is satisfied, all the conditions in Theorem
1 are satisfied for the quadratic games. In addition, the Nash
equilibrium is unique for the quadratic game. By the result
in Theorem 1, there exists a δ ∗ > 0 such that for each δ ∈
(0, δ ∗ ), the equilibrium is exponentially stable. Moreover,
the system in (3)-(4) is a linear system, by which it can be
derived that the equilibrium is globally exponentially stable.

C. Proof of Theorem 2

F. Proof of Corollary 3

δλmin (B1 )

Define the Lyapunov candidate function as
c
V = (x − x∗ )T k̄−1 (x − x∗ ) + (1 − c)ȳT P1 ȳ,
2

i (x)
= ∂F∂x(x)
, ∀i ∈ N ,
Note that for potential games ∂f∂x
i
i
2

(29)

where c ∈ (0, 1) is a constant, k̄ = diag{k̄i }, i ∈ N , P1
is defined in (25) and ȳ is defined in (22). Then, there
exist positive constants c̄1 and c̄2 such that c̄1 ||z||2 ≤ V ≤
c̄2 ||z||2 , where z = [(x − x∗ )T , ȳT ]T . Furthermore, for any

2

(x)
(x)
which indicates that ∂∂xfii∂x
= ∂∂xFi ∂x
, ∀i, j ∈ N . Hence,
j
j

∂ 2 fj (x)
∂ 2 fi (x)
∂xi ∂xj = ∂xi ∂xj , ∀i, j ∈ N . Therefore, H is a symmetric

negative definite matrix.
Before we facilitate the subsequent analysis, we firstly
show that k̄H is Hurwitz. Define an auxiliary system as
Φ̇ = k̄HΦ,

(32)

√
√
k̄ is a
where k̄ = diag{k̄i }, i ∈ N . Let Φ = k̄Ψ, where
p
diagonal matrix whose ith diagonal element is k̄i . Then,
p
p
(33)
Ψ̇ = k̄H k̄Ψ.
√
√
Since ΨT k̄H k̄Ψ = ΦT HΦ < 0 for every ||Φ||
√ 6=√0,
hence for every ||Ψ|| 6= 0, it can be derived that, k̄H k̄
is Hurwitz, which indicates that the equilibrium point of (33)
is exponentially stable. Hence, the equilibrium point of (32)
is exponentially stable, by which it can be derived that k̄H
is Hurwitz.
Noticing that k̄H is Hurwitz, the Nash equilibrium is
exponentially stable under the gradient play in (19). The
rest of the proof follows the proof of Theorem 1 by further
2
noticing that the proof of Theorem 1 holds for z ∈ RN +N
for the quadratic potential games.

