1434

IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 63, NO. 5, MAY 2018

Distributed Adaptive Convex Optimization on Directed Graphs via
Continuous-Time Algorithms
Zhenhong Li

, Zhengtao Ding , Senior Member, IEEE, Junyong Sun
and Zhongkui Li , Member, IEEE

Abstract—This note considers the distributed optimization problem on directed graphs with nonconvex local objective functions
and the unknown network connectivity. A new adaptive algorithm
is proposed to minimize a differentiable global objective function.
By introducing dynamic coupling gains and updating the coupling
gains using relative information of system states, the nonconvexity
of local objective functions, unknown network connectivity, and the
uncertain dynamics caused by locally Lipschitz gradients are tackled concurrently. Consequently, the global asymptotic convergence
is established when the global objective function is strongly convex
and the gradients of local objective functions are only locally Lipschitz. When the communication graph is strongly connected and
weight-balanced, the algorithm is independent of any global information. Then, the algorithm is naturally extended to unbalanced directed graphs by using the left eigenvector of the Laplacian matrix
associated with the zero eigenvalue. Several numerical simulations
are presented to verify the results.
Index Terms—Adaptive control, consensus control, distributed
convex optimization.

I. INTRODUCTION
The last decades have witnessed a growing interest of research in
distributed optimization, due to its potential applications in a variety
of scenarios such as sensor networks, distributed parameters estimation, power system economic dispatch, and regression of distributed
data (see, e.g., [1]–[4]). An important class of distributed optimization problems is to minimize a global objective function which is the
sum of local objective functions, by local computation and information
exchange with neighboring agents. This kind of distributed optimization problems have been addressed by many researchers from various
perspectives (see, e.g., [5]–[20]).
Most of existing algorithms are based on discrete-time dynamics
(see e.g., [5]–[10], [21]). By designing the consensus-based dynamics,
these discrete-time algorithms can find the solution of the optimization
problem. Recently, continuous-time algorithms have been introduced
Manuscript received March 13, 2017; revised June 23, 2017; accepted
August 10, 2017. Date of publication September 7, 2017; date of current
version April 24, 2018. This work was supported in part by the National
Natural Science Foundation of China under Grant 61473005, and in part
by a Foundation for the Author of National Excellent Doctoral Dissertation
of China, 111 Project (B08015). Recommended by Associate Editor F.
R. Wirth. (Corresponding author: Zhengtao Ding.)
Z. H. Li and Z. Ding are with the School of Electrical and Electronic Engineering, University of Manchester, Manchester M13 9PL,
U.K. (e-mail: zhenhong.li@postgrad.manchester.ac.uk; zhengtao.ding@
manchester.ac.uk).
J. Sun and Z. K. Li are with the State Key Laboratory for Turbulence
and Complex Systems, Department of Mechanics and Engineering Science, College of Engineering, Peking University, Beijing 100871, China
(e-mail: sjymath@pku.edu.cn; zhongkli@pku.edu.cn).
Color versions of one or more of the figures in this paper are available
online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TAC.2017.2750103

,

to solve distributed optimization problems (see, e.g., [11]–[20]). In
[14], [16], and [19], the Newton–Raphson and the zero-gradient-sumbased continuous-time algorithms achieve the global convergence on
undirected graphs using the positive bounded Hessian of local objective
functions. Since the requirement of the positive bounded Hessian, local
objective functions are assumed to be twice differentiable and convex.
The projection based algorithm in [15] removes the requirement of the
twice differentiability of local objective functions by using the projection of their gradients. Two adaptive schemes are designed in [20] to
solve the distributed optimization problem for general linear dynamics
with undirected communications. The global convergence is established when local objective functions are convex and local gradients
are error bounded. The algorithms in [12], [13], and [17] successfully
solve the distributed optimization problem on weight-balanced directed
graphs, which is more challenging than the undirected case. To deal
with the unidirectional gradient flow, the global Lipschitz constants of
local gradients and the network connectivity (i.e., the smallest nonzero
eigenvalue of the Laplacian matrix) are used in the algorithms. However, these parameters require the knowledge of entire network connections and are difficult to get for large scale networks. The global
asymptotic convergence of the algorithms is established when all local
objective functions with global Lipschitz gradients are convex. Moreover, if the gradients of local objective functions are only locally Lipschitz, the global convergence will degrade to semiglobal. The aforementioned results require local objective functions to be convex, which
may bring difficulties in decentralizing optimization problems.
Two main challenges for the distributed optimization with directed
communications are: 1) removing the requirement of global information and establishing the global convergence with only locally Lipschitz
gradients; and 2) relaxing the common assumption (see, e.g., [11]–[20]
and [22]) of convexity of local objective functions. It is well known
in the consensus control design that adaptive techniques can be used
to deal with the unknown network connectivity (see, e.g., [23]–[25]).
However, different from simply borrowing the adaptive techniques
from consensus control design, the design of adaptive schemes in the
distributed optimization has to tackle different features: the nonlinearity of local gradients and the coupled dynamics between system states
and internal states caused by asymmetric communications.
In this note, we propose a new adaptive algorithm on weightbalanced directed graphs. Consequently, the requirements of the Lipschitz constants and the network connectivity are removed. Unlike the
previous results, the convexity properties of local objective functions is
not used in our convergence analysis, which makes local objective functions allowed to be nonconvex. The global asymptotic convergence can
be guaranteed if the sum of local objective functions is strongly convex.
Another contribution of this note is that we extend our results to general unbalanced directed graphs. The global asymptotic convergence
can be guaranteed on unbalanced directed graphs provided that the left
eigenvector of the Laplacian matrix associated with the zero eigenvalue
is available.

0018-9286 © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications standards/publications/rights/index.html for more information.

Authorized licensed use limited to: XiangTan University. Downloaded on June 05,2026 at 03:45:53 UTC from IEEE Xplore. Restrictions apply.

IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 63, NO. 5, MAY 2018

The remaining sections of this note are organized as follows.
Section II is devoted to notations and mathematical preliminaries.
In Section III, the distributed optimization problem is formulated. In
Section IV, an adaptive distributed algorithm is proposed for strongly
connected weight-balanced directed graphs. In Section V, a different
adaptive algorithm is designed for unbalanced directed graphs. Simulations are included in Section VI, and conclusions are drawn in
Section VII.
II. NOTATIONS AND PRELIMINARIES
Throughout this section, we introduce our notations and some basic
concepts of convex functions and graph theory.
A. Notations
Let R, Rn , and Rn ×m denote the sets of real numbers, real vectors
of dimension n, and real matrices of size n × m, respectively. R> 0
denotes the positive real numbers. The superscript T denotes the transpose of a real matrix. The identity matrix of dimension n is denoted
by In , and the column vector of size n with all entries equal to one is
denoted by 1n . For a vector a ∈ Rn , a is the Euclidean norm of a;
for a matrix A ∈ Rn ×n , |A| is the spectral norm of A (also known as
its maximum singular value). The ith eigenvalue of the matrix A is denoted by λi (A). Besides, the symbol ⊗ denotes the Kronecker product
of the matrices, which has the properties that (B ⊗ C)(D ⊗ E) =
(BD) ⊗ (CE) and (B ⊗ C)T = B T ⊗ C T , where B, C, D, and
E are matrices with proper dimensions. For a differentiable function f : Rn → R, f denotes the gradient of f ; f is strongly convex over a convex set Ω ⊆ Rn iff there exists m ∈ R> 0 such that
(x − y)T (f (x) − f (y)) > m x − y2 for all x, y ∈ Ω, x = y; f
is locally Lipschitz at x ∈ Rn , if there exists a neighborhood W of
x and M ∈ R≥0 such that |f (y) − f (z)| ≤ M y − z for y, z ∈ W;
f is locally Lipschitz on Rn , if it is locally Lipschitz at x for all
x ∈ Rn .
B. Graph Theory
The information flow among agents is described by a directed graph.
Let a triplet G = (V, E, A) be a directed graph, where V = {1, . . . , N }
is a set of nodes, E ⊆ V × V is a set of edges, and A = [ai j ] ∈ RN ×N
is a weighted adjacency matrix. An edge (i, j) ∈ E represents that ith
agent can receive the information from jth agent, but not vice versa.
The jth agent is a neighbor of ith agent if (i, j) ∈ E. A directed path
from node i1 to node iq is a sequence of ordered edges in the form of
(i1 , i2 ), . . . , (iq −1 , iq ). A directed graph is strongly connected, if there
exists a directed path connecting every pair of nodes. The weighted
adjacency matrix A is defined as ai j > 0 if (i, j) ∈ E, otherwise
ai j = 0. Due to the fact that there is no self-loop in graph, ai i = 0
for all nodes. The Laplacian matrix L = [li j ] ∈ RN ×N associated with

the directed graph G is defined as li i = N
j = 1 ai j and li j = −ai j
for i = j. The eigenvalues of a symmetric L can be ordered as
0 ≤ λ2 (L) ≤ · · · ≤ λN (L). A directed graph G is weight balanced iff
1TN L = 0TN .
Lemma 1 (see [26]): Let L ∈ RN ×N be the Laplacian matrix of a
strongly connected directed graph G. The following properties hold.
1) Matrix L has a simple zero eigenvalue corresponding to the right
eigenvector 1N , and all nonzero eigenvalues have positive real
part.
2) Let r = [r1 , r2 , . . . , rN ]T , ri ∈ R> 0 , i = 1, 2, . . . , N , be the
left eigenvector of L associated with the zero eigenvalue and
T
> λ2N( L̄) ,
R = diag(r1 , r2 , . . . , rN ). Then, minζ T x = 0 , x = 0 xx TL̄x
x
where L̄  RL + LT R and ζ is any vector with positive entries. Moreover, r = 1N iff G is strongly connected and weight
balanced.

1435

III. PROBLEM STATEMENT AND EQUIVALENT FORMULATIONS
In this section, we consider a set of N agents interacting over a
directed connection graph. Each agent has a local cost function
fi :

f
(z).
Rn → R. The global cost function is defined as f (z) = N
i= 1 i
The objective of this note is to design a continuous-time distributed
algorithm such that each agent can solve the optimization problem
minz ∈Rn f (z)

(1)

by using its own and neighboring information.
Following assumptions are supposed to be satisfied throughout this
note.
Assumption 1: The global cost function f is differentiable and
strongly convex over Rn . The local cost function fi is differentiable
and its gradient is locally Lipschitz on Rn , i.e., for any compact set
U ⊂ Rn , there always exists Mi ∈ R≥0 such that |fi (y) − fi (z)| ≤
Mi y − z for y, z ∈ U.
From Assumption 1, the strong convexity of the global cost function
f guarantees the unique solution of the problem (1).
Assumption 2: The communication graph G is strongly connected.
Under Assumption 2, the problem (1) can be reformulated as
minx i ∈Rn f˜ =

N


fi (xi ), subject to (L ⊗ In )x = 0

(2)

i= 1

where xi is the state of ith agent and x = [xT1 , xT2 , . . . , xTN ]T is the
state of network. In view of Lemma 1, (L ⊗ In )x = 0 iff x = 1N ⊗ τ ,
for some τ ∈ Rn . Then, it can be concluded that the problem (1) is
equivalent to the problem (2). By reformulating the problem (1), the
problem is transformed into a distributed minimization problem under
a consensus condition.
IV. DISTRIBUTED ADAPTIVE CONTINUOUS-TIME CONVEX
OPTIMIZATION ALGORITHM ON WEIGHT-BALANCED
DIRECTED GRAPHS
In this section, we propose a fully distributed algorithm to solve the
problem (2). Consider the following algorithm with dynamic coupling
gains:
v̇i = γ1 (αi + βi )

N


ai j (xi − xj )

(3a)

j=1

ẋi = − γ2 fi (xi ) − γ1 (αi + βi )

N


ai j (xi − xj )

j=1

−

N


ai j (vi − vj )

(3b)

j=1

α̇i = eTi ei

(3c)

whereβi = eTi ei , vi ∈ Rn is the internal state of the algorithm,
ei = N
j = 1 ai j (xi − xj ) is the relative error, γ1 , γ2 ∈ R> 0 are control
gains, and αi and βi are the dynamic coupling gains with αi (0) ∈ R> 0 .
Note that (3) is distributed, since
each agent only communicates with
its neighboring agents. The term N
j = 1 ai j (vi − vj ) in (3b) implies
that agents need to transmit the internal states of the algorithm via the
communication graph G.
The dynamics of the network can be written in a compact form as
v̇ = γ1 [(α̂ + β̂)L ⊗ In ]x

(4a)

ẋ = − γ2 f˜(x) − γ1 [(α̂ + β̂)L ⊗ In ]x − (L ⊗ In )v

(4b)

α̇i = eTi ei

(4c)

Authorized licensed use limited to: XiangTan University. Downloaded on June 05,2026 at 03:45:53 UTC from IEEE Xplore. Restrictions apply.

1436

IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 63, NO. 5, MAY 2018

where βi = eTi ei , α̂ = diag(α1 , α2 , . . . , αN ), β̂ = diag(β1 , β2 , . . . ,
βN ), v = [v1T , v2T , . . . , vNT ]T is the internal state of the network, and
f˜(x) = [f1 (x1 )T , f2 (x2 )T , . . . , fN (xN )T ]T is the vector of the
network gradient.
Lemma 2: Under Assumptions 1 and 2, if the communication graph
G is weight-balanced, the equilibrium point of (4) is an optimal solution
of the distributed optimization problem (2).
Proof: We can obtain the equilibrium point (x̃, ṽ) of (4) from
0 = γ1 [(α̂ + β̂)L ⊗ In ]x̃

(5)

0 = −γ2 f˜(x̃) − γ1 [(α̂ + β̂)L ⊗ In ]x̃ − (L ⊗ In )ṽ.

(6)

In the sequel, we will show that the equilibrium point is a solution
of the problem (2). Deducing from (5) and (6), the equilibrium point
satisfies

Consider following positive definite functions
V1 =

1
(αi − α)2
2 i= 1

(12)

V2 =

1
(2αi + βi )ηiT ηi
2 i= 1

(13)

N

N

where α is a positive scalar to be designed later. The time derivatives
of (12) and (13) along the trajectory of (11) are given by
V̇1 =

N


(αi − α)eTi ei

i= 1

=

N


ηiT (αi − α)ηi = η T [(α̂ − αIN ) ⊗ In ]η

(14)

i= 1

x̃i = x , i = 1, 2, . . . , N
N


(7)
V̇2 =

ai j (ṽi − ṽj ) = − γ2 fi (x )


(8)

(2αi + 2βi )ηiT η˙i + βi α̇i

i= 1

= − 2γ2 η T [(α̂ + β̂)L ⊗ In ]h

j=1

γ2

N


N


fi (x ) = (1TN L ⊗ In )ṽ = 0

(9)

i= 1

− γ1 η T {[(α̂ + β̂)(L + LT )(α̂ + β̂)] ⊗ In }η.

N

where x ∈ Rn . Since f˜ is strongly convex, i = 1 fi (x ) = 0 implies that (x̃, ṽ) is a global minimizer of (2). Note that if (x̃, ṽ) is a

solution of (2), so is (x̃, ṽ + 1N ⊗ κ), κ ∈ Rn .
Theorem 1: Under Assumptions 1 and 2 , if the communication
graph G is weight-balanced, the dynamic algorithm (3) solves the distributed optimization problem (2) for any xi (0), vi (0) ∈ Rn . Furthermore, the dynamic coupling gains αi converge to some finite steadystate values.
Proof: Theorem 1 is proved by showing that the trajectories of
(x, v) converge to the equilibrium point of (4). Transferring the
equilibrium point (x̃, ṽ) to the origin by the state transformation
μ = v − ṽ, g = x − x̃, we can further write the network dynamics as
μ̇ = γ1 [(α̂ + β̂)L ⊗ In ]g

(10a)

j=1

(10b)

Let η̃ = [(α̂ + β̂) ⊗ In ]η. Then, we can obtain η̃ T [(α̂ + β̂)−1 1N ⊗
1n ] = g T (LT 1N ⊗ 1n ) = 0.
Since all the entries of (α̂ + β̂)−1 1N ⊗ 1n are positive, in light of
Lemma 1, it follows that:
η T {[(α̂ + β̂)L(α̂ + β̂) + (α̂ + β̂)LT (α̂ + β̂)] ⊗ In }η
≥


N
T
where βi = [ N
j = 1 ai j (gi − gj )] [
j = 1 ai j (gi − gj )]. To get (10),
we have used (L ⊗ In )[(α̂ + β̂) ⊗ In ](L ⊗ In ) = [L(α̂ + β̂) ⊗
In ](L ⊗ In ).
The distributed optimization problem (2) is solved by the algorithm
(3) if limt →∞ μ(t) = 1N ⊗ κ, κ ∈ Rn , and limt →∞ g(t) = 0.
Introducing another state transformation = (L ⊗ In )μ, η = (L ⊗
In )g, we obtain the dynamics

where L̂ = L + LT and the equality holds iff η = 0. By incorporating
this fact into (15), we have
V̇1 + V̇2 ≤ −

γ1 λ2 (L̂) T
η [(α̂ + β̂)2 ⊗ In ]η
N

⊗ In ] − 2γ2 η T [(α̂ + β̂)L ⊗ In ]h.

α̇i = ηiT ηi
where h = f˜(g + x̃) − f˜(x̃) and βi = ηiT ηi .

(16)

H = {(x, v) ∈ RN n × RN n |
x − x̃ ≤ x(0) − x̃ , for any v ∈ RN n }.
Since x̃ is unique, H is compact for x. Based on Assumption 1, there exists Mi ∈ R> 0 such that fi (xi ) − fi (x̃i ) ≤ Mi xi − x̃i  , i =
1, 2, . . . , N , for (x, v) ∈ H. In the following, we show that η and
converge to 0. Using Young’s inequality, we deduce that
−2η T [(α̂ + β̂)L ⊗ In ] ≤

(11a)

γ1 λ2 (L̂) T
η [(α̂ + β̂)2 ⊗ In ]η
4N
+

η̇ = − γ2 (L ⊗ In )h − γ1 [L(α̂ + β̂) ⊗ In ]η
− (L ⊗ In )

λ2 (L̂) T
η [(α̂ + β̂)2 ⊗ In ]η
N

Define a convex set containing (x̃, ṽ) as
(10c)

j=1

˙ = γ1 [L(α̂ + β̂) ⊗ In ]η

(15)

+ η T [(α̂ + β̂ − αIN ) ⊗ In ]η − 2η T [(α̂ + β̂)L

ġ = − γ2 (f˜(g + x̃) − f˜(x̃)) − γ1 [(α̂ + β̂)L ⊗ In ]g
− (L ⊗ In )μ
 N
T  N



ai j (gi − gj )
ai j (gi − gj )
α̇i =

− 2η T [(α̂ + β̂)L ⊗ In ] + η T (β̂ ⊗ In )η

(11b)

−2γ2 η T [(α̂ + β̂)L ⊗ In ]h ≤

4N λN (LT L) T
γ1 λ2 (L̂)

(17)

γ1 λ2 (L̂) T
η [(α̂ + β̂)2 ⊗ In ]η
2N

(11c)
+

2N γ22 Mm2 a x λN (LT L) T
g g (18)
γ1 λ2 (L̂)

Authorized licensed use limited to: XiangTan University. Downloaded on June 05,2026 at 03:45:53 UTC from IEEE Xplore. Restrictions apply.

IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 63, NO. 5, MAY 2018

1437

where Mm a x = max(M1 , M2 , . . . , MN ). To obtain (18), we have applied the fact


hT (LT L ⊗ In )h ≤ LT L h2

2 ( L̂)+ 2 λN (L L))
Choosing α ≥ 1 7 N λN (L L)(λ
+ + 1 + γ λN ( L̂) ,
γ 1 λ2 ( L̂) 3
1 2
we can obtain from (24)
⎡
⎤

2
2N
T ⎣ γ1 λ2 (L̂)
IN
⊗ In ⎦ η
−
(α̂ + β̂) −
V̇ ≤ η
4N
γ1 λ2 (L̂)

≤ Mm2 a x λN (LT L)g T g.

(19)

Based on (16), (17), and (18), we can obtain

− ηT η −

γ1 λ2 (L̂) T
V̇1 + V̇2 ≤ −
η [(α̂ + β̂)2 ⊗ In ]η
4N
+ η T [(α̂ + β̂ − αIN ) ⊗ In ]η +
+

4N λN (LT L) T
γ1 λ2 (L̂)

2N γ22 Mm2 a x λN (LT L) T
g g.
γ1 λ2 (L̂)

(20)

1
( + η)T ( + η).
2

(21)

Consider
V3 =

The time derivative of V3 along (11) can be written as
V̇3 =

T

˙ + η T η̇ +

T

η̇ + η T ˙

= − γ2 η T (L ⊗ In )h − η T (L ⊗ In ) − γ2 T (L ⊗ In )h
−

T

(L ⊗ In )

γ 2 M 2 λN (LT L)(λ2 (L̂) + 8) T
g g.
+ 2 m ax
4λ2 (L̂)

(22)

To get (22), we have used (19) and the facts that
(L ⊗ In ) =

T

(L̂ ⊗ In )
, λi (LLT ) = λi (LT L).
2

Consider a Lyapunov function candidate for the whole closed-loop
system as
V = V1 + V2 +

17N λN (LT L)
γ1 λ2 (L̂)2

V3 .

(23)

Applying the results (20) and (22), and from (23), we have

γ1 λ2 (L̂)
T
(α̂ + β̂)2 + (α̂ + β̂) − αIN
−
V̇ ≤ η
4N


17N λN (LT L)(λ2 (L̂) + 2λN (LT L))

IN ⊗ In η
γ1 λ2 (L̂)3

2N γ22 Mm2 a x λN (LT L)
N λN (LT L) T
+
−
4γ1 λ2 (L̂)
γ1 λ2 (L̂)
+

+

17N γ22 Mm2 a x λN (LT L)2 (λ2 (L̂) + 8)
4γ1 λ2 (L̂)3

g T g.

(24)

Let δ ∈ R> 0 be an arbitrary small positive constant. Since
g(x(t))T g(x(t)) ≤ g(x(0))T g(x(0)), for (x, v) ∈ H, when η T η ≥ δ,
there always exists a sufficiently large positive scalar ∈ R> 0 such that
ηT η ≥

2
T
2 N γ 22 M m
a x λN (L L)

γ 1 λ2 ( L̂)

+

2
T
2
1 7 N γ 22 M m
a x λN (L L) (λ2 ( L̂)+ 8 )

4 γ 1 λ2 ( L̂) 3

N λN (LT L) T
4γ1 λ2 (L̂)

T

.

(25)

For the reason that V̇ is continuous and negative definite for
η T η ≥ δ, V (t) is bounded and αi converge to some positive values. Applying LaSalle’s invariance principle, we can conclude that
converges
asymptotically
to zero and η converges to a residual set


D = η| η2 ≤ δ . Since δ can be chosen as an arbitrary small
constant, we assume that η converges to zero. It then follows that
(x, v) converges to the set I = {(x, v) ∈ H | x = x̃ + 1N ⊗ ξ, v =
ṽ + 1N ⊗ κ, ξ, κ ∈ Rn } as t → ∞. Note that , δ, and α are auxiliary
variables only used for convergence analysis. These variables are not
the parameters in the algorithm (3).
In the following, we prove ξ = 0n by seeking a contradiction. Assume ξ = 0n , based on (9) and (3b), the dynamics of ξ can be written
as
1
˙ = 1 (1T ⊗ In )ẋ
ξ˙ = (1TN ⊗ In )(x̃˙ + 1N ⊗ ξ)
N
N N

λ2 (L̂) + 2λN (LT L) T
λ2 (L̂) T
η η−
≤
4
λ2 (L̂)

T

T

g T g.

=−

N
N
γ2 
γ2 
fi (x + ξ) +
fi (x )
N i= 1
N i= 1

=−

γ2
f (x + ξ).
N

(26)

From (26), we can deduce that ξ moves toward the point that satisfies
f (x + ξ) = 0. For the reason that f is strongly convex, the critical
point x of f is unique, which, since ξ = 0, is a contradiction.
Finally, we can conclude that the trajectories of (3) that
start from x(0), v(0) ∈ RN n converge to the global minimizer
(x̃, ṽ + 1N ⊗ κ), for some κ ∈ Rn . The algorithm (3) solves the
distributed convex optimization problem.

Remark 1: For any xi (0), vi (0) ∈ Rn , we can always define a convex set H such that all the trajectories of algorithm (3) converge to the
global minimizer set I = {(x, v) ∈ H | x = x̃, v = ṽ + 1N ⊗ κ, κ ∈
Rn }. The convergence of (3) is global, while the results shown
in [12] and [13] can only achieve semiglobal convergence when
fi , i = 1, 2, . . . , N , are locally Lipschitz.
Remark 2: The global convergence of the algorithm can be guaranteed for any γ1 , γ2 ∈ R> 0 . The values of γ1 , γ2 , and αi (0) can be
used to improve the transient process of the algorithm. The static control gains γ1 , γ2 can be interpreted as the weights of local gradient
and agent connectivity, respectively; increasing γ1 and γ2 will increase
convergence rate, however, high static control gains may cause more oscillations. The transient performances of the proposed algorithms also
are related to the static control gain ratio m in (αγi2(0 ))γ 1 . By increasing
the ratio m in (αγi2(0 ))γ 1 , the consensus effects will be enhanced.
Remark 3: Many previous results have been obtained for the undirected connected graph, but the value of network connectivity is required when the communication graph is a weight-balanced directed
graph (e.g., [12] and [13]). By introducing the dynamic coupling gains
αi and βi and continuously updating the coupling gains using the relative errors ei , we solve the distributed optimization without using the
network connectivity. Of course, algorithm (3) can also be applied to
undirected connected graphs.

Authorized licensed use limited to: XiangTan University. Downloaded on June 05,2026 at 03:45:53 UTC from IEEE Xplore. Restrictions apply.

1438

IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 63, NO. 5, MAY 2018

Fig. 1. Simulation results of Example 1; trajectories of the states xi (top) and the adaptive coupling gains α i (bottom). (a) Weight-balanced graphs
case, γ1 = 4, γ2 = 1; (b) weight-balanced graphs case, γ1 = 4, γ2 = 8; and (c) unbalanced graphs case, γ1 = 4, γ2 = 1.

where f¯(x) = [f¯1 (x1 )T , f¯2 (x2 )T , . . . , f¯N (xN )T ]T . It follows
that the equilibrium point satisfies

N


x̃i = x , i = 1, 2, . . . , N

(30)

ai j (ṽi − ṽj ) = − γ2 fi (x )

(31)

j=1

Fig. 2.

Strongly connected weight-balanced communication graph G1 .

γ2

N


ri f¯i (x ) = γ2

i= 1

N


fi (x )

i= 1
T

= (r L ⊗ 1Tn )ṽ = 0.
Remark 4: In Assumption 1, only the global cost function is required to be strictly convex, while the local cost functions fi can be
any differentiable functions. Note that the local cost functions are also
assumed to be strongly convex in [12]–[15] and [22]. Different from
[14], there is no restriction on the local cost functions fi to be twice
differentiable with the proposed scheme.
V. DISTRIBUTED ADAPTIVE CONTINUOUS-TIME CONVEX
OPTIMIZATION ALGORITHM ON UNBALANCED DIRECTED GRAPHS
In the previous section, a distributed adaptive algorithm is proposed
to solve the problem (2) with strongly connected and weight-balanced
graphs. Here, we extend our analysis for strongly connected unbalanced
graphs. The distributed optimization algorithm with dynamic coupling
gains can be designed as
v̇i = γ1 (αi + βi )

N


ai j (xi − xj )

(27a)

j=1

ẋi = − γ2 f¯i (xi ) − γ1 (αi + βi )

N


Since f˜ is strongly convex, invoking (32), one can obtain that (x̃, ṽ)

is a solution of (2) and so is (x̃, ṽ + 1N ⊗ κ), κ ∈ Rn .
Theorem 2: Under Assumptions 1 and 2, if the communication
graph G is unbalanced, the dynamic algorithm (27) solves the distributed optimization problem (2) for any xi (0), vi (0) ∈ Rn . Moreover, the dynamic coupling gains αi will converge to some finite
steady-state values.
Proof: The proof is stated in the Appendix.

Remark 5: Similar to the algorithm (3), the adaptive algorithm (27)
guarantees the global stability for the local cost functions fi with
locally Lipschitz gradients. Only the global cost function is assumed
to be strongly convex, whereas the local cost functions fi can be any
differentiable functions.
Remark 6: In the algorithm (27), r is the left eigenvector of L
associated with the zero eigenvalue, which implies that the algorithm
(27) needs the information of the Laplacian matrix for unbalanced
directed graphs.

ai j (xi − xj )

VI. SIMULATION STUDIES

j=1

−

N


ai j (vi − vj )

(27b)

j=1

α̇i = eTi ei

(32)

(27c)

where βi = eTi ei , f¯i (xi ) = r1i fi (xi ), and r = [r1 , r2 , . . . , rN ]T

is the left eigenvector of L associated with the zero eigenvalue. Other
notations are the same as in the previous section.
Lemma 3: Under Assumptions 1 and 2 , if the communication graph
G is unbalanced, the equilibrium point of (27) is an optimal solution of
the distributed optimization problem (2).
Proof: The equilibrium point (x̃, ṽ) of (27) is obtained as
0 = γ1 [(α̂ + β̂)L ⊗ In ]x̃

(28)

0 = −γ2 f¯(x̃) − γ1 [(α̂ + β̂)L ⊗ In ]x̃ − (L ⊗ In )ṽ

(29)

In this section, we show two simulation examples. The first example illustrate the effectiveness of above-mentioned theoretical results.
The second example is an application of our algorithms in solving a
regression problem.
A. Example 1
Consider a network of 60 agents whose local cost functions on R
are described by
fj = sin(x + j),
4

f1 0 + j = cos(ln(x + j)2 + 1)),

f2 0 + j = (x + j) 3 + e0 . 1 (x + j ) ,

f3 0 + j = (x + j − 4)4 ,

f4 0 + j = (x + j + 3)2 ,

f5 0 + j = 

(x + j)2
(x + j)2 + 1

Authorized licensed use limited to: XiangTan University. Downloaded on June 05,2026 at 03:45:53 UTC from IEEE Xplore. Restrictions apply.

(33)

IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 63, NO. 5, MAY 2018

1439

Fig. 3. Simulation results of Example 2. (a) Estimated errors xi − x  (top) and the adaptive coupling gains α i (bottom) of Example 2 in
continuous-time mode; and (b) estimated errors xi − x  (top) and the adaptive coupling gains α i (bottom) in discrete-time mode.

for j = 1, 2, . . . , 10. Note that f1 , f2 , . . . , f2 0 are periodic functions
that are nonconvex. The gradients of f2 1 , f2 2 , . . . , f4 0 are locally Lipschitz. Moreover, the gradients of f2 1 , f2 2 , . . . , f3 0 are undifferentiable,
i.e., we cannot get the
Hessians of f2 1 , f2 2 , . . . , f3 0 . Since the global
cost function f (x) = 6i =0 1 fi (x) is strongly convex, the global minimizer x is unique.
Two cases of connection graphs are considered for this example.
When the connection graph is strongly connected weight balanced, the
adaptive algorithm (3) is applied to solve the distributed optimization
problem. The initial states of xi (0), vi (0) ∈ R are chosen randomly
within [−2, 0.5], and the initial values of coupling gains αi (0) = 0.01.
For the convergence performance comparisons, the static control gains
are chosen as γ1 = 4, γ2 = 1 and γ1 = 4, γ2 = 8, respectively.
When the connection graph is strongly connected unbalanced, the
adaptive algorithm (27) is applied. The initial values are the same as
that of the weight-balanced case, and the parameters are chosen as
γ1 = 4, γ2 = 1.
In Fig. 1(a) (top), (b) (top), and (c) (top), it can be observed that all
the trajectories of xi converge to the global minimizer x (in a black
dash-dot line). In Fig. 1(a) (bottom), (b) (bottom), and (c) (bottom), it
can be observed that the dynamic coupling gains αi converge to some
positive steady-state values. From the simulation results in Fig. 1(a) and
(b), we can see that, when γ2 is increased from 1 to 8, more consensus
efforts are needed to deal with the gradients of local objective functions.
The dynamic gains αi in Fig. 1(b) (bottom) converge to larger positive
values than the αi in Fig. 1(a) (bottom).
From Fig. 1(a) and (c), we can conclude that our adaptive optimization algorithms can solve the distributed convex optimization problem
with the unknown network connectivity and the nonconvex local objective functions on both balanced and unbalanced directed graphs.
B. Example 2
In this example, we examine the performance of our proposed algorithms in a practical scenario (e.g., regression problem [16]). Due to the
limitation of pages, we only show an example of applying algorithm (3).
The objective of this task is to obtain a predictor of house value by
using UCI Housing datasets (available at http://archive.ics.uci.edu/ml
/datastes/Housing). Sometimes datasets come from different users, and
they do not want to share their private information with others. Hence,
it is meaningful to employ distributed optimization algorithms.
Consider a network of 6 users interacting over G1 shown in Fig. 2,
and each user has 50 datasets. The local cost functions are obtained as
fi (xi ) =

50

1
j=1

2

(νj − dTj xi )2

∀i = 1, 2, . . . , 6

where xi ∈ R3 is the vector of coefficient for linear predictor, ν̂j =
dTj xi is the predicted median monetary value of the house, νj ∈ R
is the median monetary value of the house, dj = [cj , pj , 1]T ∈ R3 ,
and cj , pj ∈ R are the per capita crime rate by town and lower status
of the population, respectively. The static control gains are chosen
as γ1 = 1, γ2 = 0.2, and other parameters are chosen in the same
way as that of Example 1. Fig. 3(a) illustrates that the estimated xi
converge to the global optimal value x ∈ R3 , which is verified by a
centralized least squares method. The optimization errors xi − x 
are up bounded by 0.001 after 300 s, and the dynamic coupling gains αi
converge to positive steady-state values. We also emulate the simulation
in discrete-time mode, by setting sample time as 0.1 s. Fig. 3(b) shows
that the optimization errors are up bounded by 0.001 after 300 s (3000
iterations). Although the trajectories of xi − x  and αi are slightly
different in the discrete-time case, the algorithm still guarantee the
convergence.
VII. CONCLUSION
In this note, we have proposed two new adaptive algorithms to
solve the distributed optimization problem on directed graphs. By carefully designing adaptive laws, our proposed algorithms achieve global
asymptotic convergence when the global cost function is strongly convex and the gradients of local objective functions are locally Lipschitz.
For the strongly connected and weight-balanced graphs, the proposed
algorithm is independent of any global information of communication
graphs, and hence fully distributed. For strongly connected unbalanced
graphs, the left eigenvector of the Laplacian matrix associated with
the zero eigenvalue is required. Simulation results have illustrated the
effectiveness and potential applications of the theoretical results.

APPENDIX
PROOF OF THEOREM 2
Applying the two same state transformations used in Section IV, the
network dynamics can be written as
˙ = γ1 [L(α̂ + β̂) ⊗ In ]η

(34a)

η̇ = − γ2 (L ⊗ In )h̄ − γ1 [L(α̂ + β̂) ⊗ In ]η
− (L ⊗ In )
α̇i = ηiT ηi
where h̄ = f¯(g + x̃) − f¯(x̃) and βi = ηiT ηi .

Authorized licensed use limited to: XiangTan University. Downloaded on June 05,2026 at 03:45:53 UTC from IEEE Xplore. Restrictions apply.

(34b)
(34c)

1440

IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 63, NO. 5, MAY 2018

Let

1
ri (2αi + βi )ηiT ηi .
V4 =
2 i= 1
N

(35)

Following similar analysis in (14)–(20), for (x, v) ∈ H, it is easy to
get the time derivative of V1 + V4 along the trajectory of (34)
V̇1 + V̇4 ≤ −

·

T

+

2
T
4N rm
a x λN (L L)
γ1 λ2 (L̄)

where R = diag(r1 , r2 , . . . , rN ), L̄ = RL + LT R,
max(r1 , r2 , . . . , rN ), and rm in = min(r1 , r2 , . . . , rN ).
Consider the following positive definite function:
1
ri ( i + ηi )T ( i + ηi ).
2 i= 1

(36)
rm a x =

N

(37)

V̇5 =

ri ( Ti ˙ i + ηiT η̇i +

T
T
i η̇i + ηi ˙ i )

i= 1

≤

2
T
λ2 (L̄) + 2rm
λ2 (L̄) T
a x λN (L L) T
η η−
4
λ2 (L̄)

+

2
2
T
γ22 rm
a x M m a x λN (L L)(λ2 (L̄) + 8) T
g g
2
4rm
in λ2 (L̄)

(38)

where we have used Lemma 1 and the fact T (RL ⊗ In ) =
T ( L̄⊗I )
n
.
2
A Lyapunov function candidate for the whole closed-loop system is
chosen as
V̄ = V1 + V4 +

2
T
17N rm
a x λN (L L)
V5 .
2
γ1 λ2 (L̄)

(39)

Applying the results (36) and (38), and from (39), we can obtain

γ1 λ2 (L̄)
V̄˙ ≤ η T
(α̂ + β̂)2 + (α̂ + Rβ̂) − αIN
−
2N

2
T
2
T
17N rm
a x λN (L L)(λ2 (L̄) + 2rm a x λN (L L))
+
I
N
γ1 λ2 (L̄)3


2
2
T
4N γ22 rm
a x M m a x λN (L L)
⊗ In η +
2
γ1 rm in λ2 (L̄)

2 4
2
17N γ2 rm a x Mm a x λN (LT L)2 (λ2 (L̄) + 8)
+
gT g
2
3
4γ1 rm
in λ2 (L̄)
−

such

¯η T η ≥

that

4
2
T
2
1 7 N γ 22 r m
a x M m a x λN (L L) (λ2 ( L̄)+ 8 )
2
4γ1 rm
λ ( L̄) 3
in 2

2
2
T
4 N γ 22 r m
a x M m a x λN (L L)
2
γ1 rm
λ ( L̄)
in 2

g T g.

2
T
N rm
a x λN (L L) T
.
4γ1 λ2 (L̄)

By Young’s inequality, we have

γ1 λ2 (L̄) 2
γ1 λ2 (L̄) 2
η T (α̂ + Rβ̂)η ≤ η T
β +
α
4N
4N


2
N (rm
a x + 1)
IN ⊗ In η.
+
γ1 λ2 (L̄)

V̄˙ ≤ η T
−


−

4N
γ1 λ2 (L̄)
(α̂ + β̂)2 −
In
4N
γ1 λ2 (L̄)




⊗ IN η

2
T
N rm
a x λN (L L) T
4γ1 λ2 (L̄)

≤ − η T (α̂(0) ⊗ In )η −

2
T
N rm
a x λN (L L) T
4γ1 λ2 (L̄)

(42)

where we have used the fact that αi are monotonically increasing and
αi (0) ∈ R> 0 . The rest of proof follows similarly as that of Theorem 1.
REFERENCES

The time derivative of (37) is described by
N


+

¯ ∈ R> 0

2
T
2
T
N (r 2 + 1x )
17N rm
a x λN (L L)(λ2 ( L̄)+ 2 r m a x λN (L L))
+ γ 1 λm2 (aL̄)
+ γ 1 4λN
, we have
γ 1 λ2 ( L̄) 3
2 ( L̄)

2
2
T
4N γ22 rm
a x M m a x λN (L L) T
g g
2
γ1 rm in λ2 (L̄)

V5 =

scalar

By incorporating this fact and (41) into (40), and choosing α ≥ ¯ +

γ1 λ2 (L̄) T
η [(α̂ + β̂)2 ⊗ In ]η
2N

+ η T [(α̂ + Rβ̂ − αIN ) ⊗ In ]η +

Let δ̄ ∈ R> 0 be an arbitrary small positive constant. Since
for
(x, v) ∈ H,
when
g(x(t))T g(x(t)) ≤ g(x(0))T g(x(0)),
η T η ≥ δ̄, there always exists a sufficiently large positive

(40)

(41)

[1] X. Nguyen, M. I. Jordan, and B. Sinopoli, “A kernel-based learning approach to ad hoc sensor network localization,” ACM Trans. Sensor Netw.,
vol. 1, no. 1, pp. 134–152, Aug. 2005.
[2] S. S. Ram, V. V. Veeravalli, and A. Nedic, “Distributed and recursive
parameter estimation in parametrized linear state-space models,” IEEE
Trans. Automat. Control, vol. 55, no. 2, pp. 488–492, Feb. 2010.
[3] A. Cherukuri and J. Corts, “Distributed generator coordination for initialization and anytime optimization in economic dispatch,” IEEE Trans.
Control Netw. Syst., vol. 2, no. 3, pp. 226–237, Sep. 2015.
[4] S. S. Ram, A. Nedic, and V. V. Veeravalli, “A new class of distributed
optimization algorithms: Application to regression of distributed data,”
Optim. Methods Softw., vol. 27, no. 1, pp. 71–88, 2012.
[5] B. Johansson, M. Rabi, and M. Johansson, “A randomized incremental
subgradient method for distributed optimization in networked systems,”
SIAM J. Optim., vol. 20, no. 3, pp. 1157–1170, 2010.
[6] A. Nedic and A. Ozdaglar, “Distributed subgradient methods for multiagent optimization,” IEEE Trans. Automat. Control, vol. 54, no. 1,
pp. 48–61, Jan. 2009.
[7] J. Koshal, A. Nedic, and U. V. Shanbhag, “Multiuser optimization: Distributed algorithms and error analysis,” SIAM J. Optim., vol. 21, no. 3,
pp. 1046–1081, 2011.
[8] I. Lobel and A. Ozdaglar, “Distributed subgradient methods for convex optimization over random networks,” IEEE Trans. Automat. Control, vol. 56,
no. 6, pp. 1291–1306, Jun. 2011.
[9] J. C. Duchi, A. Agarwal, and M. J. Wainwright, “Dual averaging for distributed optimization: Convergence analysis and network scaling,” IEEE
Trans. Automat. Control, vol. 57, no. 3, pp. 592–606, Mar. 2012.
[10] M. Zhu and S. Martinez, “On distributed convex optimization under inequality and equality constraints,” IEEE Trans. Automat. Control, vol. 57,
no. 1, pp. 151–164, Jan. 2012.
[11] J. Wang and N. Elia, “A control perspective for centralized and distributed
convex optimization,” in Proc. 50th IEEE Conf. Decis. Control Eur. Control Conf., Dec. 2011, pp. 3800–3805.
[12] B. Gharesifard and J. Corts, “Distributed continuous-time convex optimization on weight-balanced digraphs,” IEEE Trans. Automat. Control,
vol. 59, no. 3, pp. 781–786, Mar. 2014.
[13] S. S. Kia, J. Corts, and S. Martnez, “Distributed convex optimization via
continuous-time coordination algorithms with discrete-time communication,” Automatica, vol. 55, pp. 254–264, 2015.
[14] J. Lu and C. Y. Tang, “Zero-gradient-sum algorithms for distributed convex
optimization: The continuous-time case,” IEEE Trans. Autom. Control,
vol. 57, no. 9, pp. 2348–2354, Sep. 2012.
[15] Q. Liu and J. Wang, “A second-order multi-agent network for boundconstrained distributed optimization,” IEEE Trans. Automat. Control,
vol. 60, no. 12, pp. 3310–3315, Dec. 2015.
[16] D. Varagnolo, F. Zanella, A. Cenedese, G. Pillonetto, and L. Schenato,
“Newton–Raphson consensus for distributed convex optimization,” IEEE
Trans. Automat. Control, vol. 61, no. 4, pp. 994–1009, Apr. 2016.

Authorized licensed use limited to: XiangTan University. Downloaded on June 05,2026 at 03:45:53 UTC from IEEE Xplore. Restrictions apply.

IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 63, NO. 5, MAY 2018

1441

[17] S. Lee, A. Ribeiro, and M. M. Zavlanos, “Distributed continuous-time
online optimization using saddle-point methods,” in Proc. 55th IEEE Conf.
Decis. Control, Dec. 2016, pp. 4314–4319.
[18] S. Yang, Q. Liu, and J. Wang, “Distributed optimization based on a multiagent system in the presence of communication delays,” IEEE Trans.
Syst., Man, Cybern., Syst., vol. 47, no. 5, pp. 717–728, May 2016.
[19] Y. Song and W. Chen, “Finite-time convergent distributed consensus optimisation over networks,” IET Control Theory Appl., vol. 10, no. 11,
pp. 1314–1318, 2016.
[20] Y. Zhao, Y. Liu, G. Wen, and G. Chen, “Distributed optimization
of linear multi-agent systems: Edge- and node-based adaptive designs,” IEEE Trans. Automat. Control, vol. 62, no. 7, pp. 3602–3609,
Jul. 2017.
[21] P. Bianchi and J. Jakubowicz, “Convergence of a multi-agent projected
stochastic gradient algorithm for non-convex optimization,” IEEE Trans.
Automat. Control, vol. 58, no. 2, pp. 391–405, Feb. 2013.

[22] B. Touri and B. Gharesifard, “Continuous-time distributed convex optimization on time-varying directed networks,” in Proc. 54th IEEE Conf.
Decis. Control, Dec. 2015, pp. 724–729.
[23] Z. Li, G. Wen, Z. Duan, and W. Ren, “Designing fully distributed
consensus protocols for linear multi-agent systems with directed
graphs,” IEEE Trans. Automat. Control, vol. 60, no. 4, pp. 1152–1157,
Apr. 2015.
[24] Z. Li and Z. Ding, “Distributed adaptive consensus and output tracking of unknown linear systems on directed graphs,” Automatica, vol. 55,
pp. 12–18, 2015.
[25] Z. Ding, “Consensus disturbance rejection with disturbance observers,” IEEE Trans. Ind. Electron., vol. 62, no. 9, pp. 5829–5837,
Sep. 2015.
[26] Z. Li and Z. Duan, Cooperative Control of Multi-Agent Systems:
A Consensus Region Approach. Boca Raton, FL, USA: CRC Press,
2014.

Authorized licensed use limited to: XiangTan University. Downloaded on June 05,2026 at 03:45:53 UTC from IEEE Xplore. Restrictions apply.

