Initialization-free Distributed Algorithms for Optimal Resource Allocation with
Feasibility Constraints and Application to Economic Dispatch of Power Systems✩
Peng Yia , Yiguang Hongb,, Feng Liuc
a Department of Electrical & Computer Engineering, University of Toronto
b Key Lab of Systems and Control, Academy of Mathematics and Systems Science, Chinese Academy of Sciences,

arXiv:1510.08579v2 [math.OC] 9 Apr 2017

c

Department of Electrical Engineering, Tsinghua University

Abstract
In this paper, the distributed resource allocation optimization problem is investigated. The allocation decisions are made to minimize
the sum of all the agents’ local objective functions while satisfying both the global network resource constraint and the local
allocation feasibility constraints. Here the data corresponding to each agent in this separable optimization problem, such as the
network resources, the local allocation feasibility constraint, and the local objective function, is only accessible to individual agent
and cannot be shared with others, which renders new challenges in this distributed optimization problem. Based on either projection
or differentiated projection, two classes of continuous-time algorithms are proposed to solve this distributed optimization problem
in an initialization-free and scalable manner. Thus, no re-initialization is required even if the operation environment or network
configuration is changed, making it possible to achieve a “plug-and-play” optimal operation of networked heterogeneous agents.
The algorithm convergence is guaranteed for strictly convex objective functions, and the exponential convergence is proved for
strongly convex functions without local constraints. Then the proposed algorithm is applied to the distributed economic dispatch
problem in power grids, to demonstrate how it can achieve the global optimum in a scalable way, even when the generation cost, or
system load, or network configuration, is changing.
Keywords: Resource allocation, Distributed optimization, Multi-agent system, Plug-and-play algorithm, Gradient flow, Projected
dynamical system, Economic dispatch

1. Introduction
Resource allocation is one of the most important problems
in network optimization, which has been widely investigated
in various areas such as economics systems, communication
networks, sensor networks, and power grids. The allocation
decisions may be made centrally by gathering all the network
data together to a decision-making center, and then sent back
to corresponding agents (referring to Ibaraki & Katoh (1988)).
On the other hand, differing from this centralized policy, the
master-slave-type decentralized algorithms, either price-based
(Arrow & Hurwicz (1960)) or resource-based (Heal (1969)),
are constructed to achieve the optimal allocations by the local
computations in the slave agents under the coordinations of the
master/center through a one-to-all communication architecture.
However, these methods may not be suitable or effective for
the resource allocation in large-scale networks with numerous
✩ This paper was not presented at any IFAC meeting.
This work
is supported by Beijing Natural Science Foundation (4152057), NSFC
(61333001, 61573344), and Program 973 (2014CB845301/2/3).
This
work is also partly supported by the National Natural Science Foundation of China (No. 51377092), Foundation of Chinese Scholarship Council (CSC No. 201506215034). Corresponding author: Yiguang Hong, Tel.
+86(010)82541824. Fax +86(010)62587343.
Email addresses: peng.yi@utoronto.ca (Peng Yi),
yghong@iss.ac.cn (Yiguang Hong ), lfeng@mail.tsinghua.edu.cn
(Feng Liu)

Preprint submitted to Automatica

heterogeneous agents due to complicated network structures,
heavy communication burden, privacy concerns, unbearable
time delays, and unexpected single-point failures. Therefore,
fully distributed resource allocation optimization algorithms are
highly desirable.
Distributed optimization, which cooperatively achieves
optimal decisions by the local manipulation with private data
and the diffusion of local information through a multi-agent
network, has drawn more and more research attention in
recent years. To circumvent the requirement of control
center or master, various distributed optimization models or
algorithms have been developed (Nedic, Ozdaglar, & Parrilo
(2010), Sayed (2014), Lou, Shi, Johansson, & Hong (2014),
and Yi & Hong (2014)). In light of the increasing attention to distributed optimization and the seminal work on
distributed resource allocation in Ho, Servi, & Suri (1980),
some distributed algorithms for resource allocation optimization have been proposed in Xiao & Boyd (2006),
Lakshmanan & Farias (2008), Necoara, Nesterov, & Glineur
(2011),
Ghadimi, Shames, & Johansson
(2013),
and
Beck, Nedic, Ozdaglar, & Teboulle (2014).
Continuous-time gradient flow algorithms have been
widely investigated for convex optimization after the pioneer work Arrow, Huwicz, & Uzawa (1958), and detailed
references can be found in Liao, Qi, & Qi (2004) and
Bhaya & Kaszkurewicz (2006). Gradient flow algorithms
October 15, 2018

have been applied to network control and optimization (
Low, Paganini, & Doyle (2002), Feijer & Paganini (2010) and
Ferragut & Paganini (2014)), neural networks (Liao, Qi, & Qi
(2004)), and stochastic approximation (Dupuis & Kushner
(1987)).
Recently, continuous-time gradient flow algorithms have been adopted for solving unconstrained distributed optimization problems (see Wang & Elia (2011),
Gharesifard & Cortés (2014), Droge, Kawashima & Egerstedt
(2014), and Kia, Cortés, & Martinez (2015)). Furthermore, the
projection-based gradient flow dynamics have been employed
for solving the complicated constrained optimization problems
in Venets (1985), Nagurney & Zhang (1995), Xia & Wang
(2000), Gao (2003) and Cherukuri, Mallada, & Cortés (2016),
and the projected gradient flow ideas began to be applied to
distributed constrained optimization (see Liu & Wang (2015),
Qiu, Liu, & Xie (2014) and Yi, Hong & Liu (2015)).
The economic dispatch, one of the key concerns in power
grids, is to find the optimal secure generation allocation to balance the system loads, and hence, can be regarded as a special resource allocation problem. In recent years, there has
been increasing research attention in solving economic dispatch problems through a multi-agent system in a distributed
manner to meet the ever growing challenges raised by increasing penetration of renewable energies and deregulation of
power infrastructure (Cavraro, Carli, & Zampieri (2014) and
Zhang, Liu, Wang, Liu, & Ferrese (2015)). Mathematically,
this boils down to a particular distributed resource allocation optimization problem. Furthermore, there were various
continuous-time algorithms for the Distributed Economic Dispatch Problem (DEDP). For example, Zhao, Topcu, Li, & Low
(2014) showed that the physical power grid dynamics could
serve as a part of a primal-dual gradient flow algorithm to
solve the DEDP, and in fact, it considered physical network interconnections and generator dynamics explicitly, providing a
quite comprehensive method and inspiring insights. Moreover,
Cherukuri, & Cortés (2015) solved the DEDP by combining
the penalty method and the distributed continuous-time algorithm in Ho, Servi, & Suri (1980), and proposed a procedure to
fulfill the initialization requirement, while Cherukuri & Cortés
(2014) constructed a novel initialization-free distributed algorithm to achieve DEDP given one agent knowing the total system loads.
Motivated by various practical problems, including the
DEDP in power grids, we study a Distributed Resource Allocation Optimization (DRAO) problem, where each agent can only
manipulate its private data, such as the local objective function,
Local Feasibility Constraint (LFC), and local resource data.
Such data in practice cannot be shared or known by other
agents. As the total network resource is the sum of individual
agent’s local resources, the agents need to cooperatively achieve
the optimal resource allocation in a distributed way, so that
the global objective function (as the sum of all local objective
functions) is minimized with all the constraints (including the
network resource constraint and LFCs) satisfied. Note that the
LFC is critical for the (secure) operation of practical networks
(referring to the communication system in Johari & Tsitsiklis
(2004) and D’Amico, Sanguinetti & Palomar (2014) as an

example), even though it was not considered in most existing
DRAO works. Particularly, for the DEDP in power grids,
the generation of each generator must be limited within its
box-like capacity bounds. The consideration of LFCs brings
remarkable difficulties to existing distributed algorithms
designed for the DRAO without LFCs, because the KKT
(optimality) conditions for the DRAO with and without
LFCs are totally different (referring to Remark 3.3). So
far, many DEDP works (such as Zhao, Topcu, Li, & Low
(2014),
Zhang, Liu, Wang, Liu, & Ferrese
(2015) and
Cherukuri, & Cortés (2015) and Cherukuri & Cortés (2014))
have only considered the box-like LFCs. However, the requirement from power industries, such as the secure operation of
inverter-based devices in smart grids, promotes the demand to
deal with non-box LFCs. This extension is nontrivial, and we
will show how to handle it systematically by using projected
dynamics in this paper.
Another crucial albeit difficult problem is the initialization
coordination among all agents. Many existing results are based
on initialization coordination procedures to guarantee that the
initial allocations satisfy the network resource constraint, which
may only work well for static networks. However, for a dynamical network, the resource has to be re-allocated once the
network configuration changes. Therefore, the initialization coordination has to be re-performed whenever these optimization
algorithms re-start, which considerably degrades their applicability. Taking the DEDP as an example, the initialization needs
to be coordinated among all agents whenever local load demand
or generation capacity/cost changes, or any distributed generator plugs in or leaves off (see Cherukuri, & Cortés (2015) for
an initialization procedure). This issue has to be well addressed
for achieving highly-flexible power grids with the integration of
ever-increasing renewables.
The objective of this paper is to propose an initialization-free
methodology to solve the DARO with local LFCs. The main
technical contributions of this paper are highlighted as follows:
• By employing the (differentiated) projection operation, two fully distributed continuous-time algorithms
are proposed as a kind of projected dynamics, with
the local allocation of each agent kept within its
own LFC set. Moreover, the algorithms ensure the
network resource constraint asymptotically without requiring it being satisfied at the initial points.
Therefore, it is initialization-free, different from those
given in Xiao & Boyd (2006), Lakshmanan & Farias
(2008), Necoara, Nesterov, & Glineur (2011) and
Ghadimi, Shames, & Johansson (2013), and moreover,
provides novel initialization-free algorithms different
from the one given in Cherukuri & Cortés (2014).
• The convergence of the two projected algorithms is shown
by the properties of Laplacian matrix and projection operation as well as the LaSalle invariance principle. The
result can be regarded as an extension of some existing distributed optimization algorithms ( Wang & Elia (2011),
Kia, Cortés, & Martinez (2015), Qiu, Liu, & Xie (2014),
and Liu & Wang (2015)) and an application of projected
2

dynamics for variational inequalities (Gao (2003) and
Xia & Wang (2000)) to the DRAO problem.

The following notations describe the geometry properties of
the convex set. Denote CΩ (x) as the normal cone of Ω at x,
that is, CΩ (x) = {v : hv, y − xi ≤ 0, ∀y ∈ Ω}. Define cΩ (x)
as cΩ (x) = {v : ||v|| = 1, hv, y − xi ≤ 0, ∀y ∈ Ω} if x ∈ ∂Ω,
and cΩ = {0} if x ∈ int(Ω). The feasible direction cone of Ω
at x is given as KΩ (x) = {d : d = β(y − x), y ∈ Ω, β ≥ 0}.
The tangent cone of set Ω at x is defined as T Ω (x) = {v : v =
k
, τk ≥ 0, τk → 0, xk ∈ Ω, xk → x}. Then T Ω (x) is the
limk→∞ x τ−x
k
closure of KΩ (x) when Ω is a closed convex set, and is the polar
cone to CΩ (x), that is, T Ω (x) = {y : hy, di ≤ 0, ∀d ∈ CΩ (x)}
(referring to Lemma 3.13 of Ruszczynski (2006)).
Define the projection of x onto a closed convex set Ω by
PΩ (x) = arg miny∈Ω ||x − y||. The basic property of projection
operation is

• The proposed algorithms can be directly applied to
the DEDP in power systems considering generation capacity limitations. It enables the plug-and-play operation for power grids with high-penetration of flexible renewables. Our algorithms are essentially different from the ones provided in Cherukuri, & Cortés
(2015) and Cherukuri & Cortés (2014), and address
multi-dimensional decision variables and general non-box
LFCs. Simulation results demonstrate that the algorithm
effectively deals with various data and network configuration changes, and also illustrate the algorithm scalability.
The reminder of this paper is organized as follows. The preliminaries are given and then the DRAO with LFCs is formulated with the basic assumptions in Section 2. Then a distributed algorithm in the form of projected dynamics is proposed with its convergence analysis in Section 3. In Section 4,
a differentiated projected algorithm is proposed with its convergence analysis for DRAO with strongly convex objective functions, and an exponential convergence rate is obtained in the
case without LFCs. Moreover, the application to the DEDP
in power systems is shown in section 5 with numerical experiments. Finally, the concluding remarks are given in Section
6.
Notations: Denote R≥0 as the set of nonnegative real numbers. Denote 1m = (1, ..., 1)T ∈ Rm and 0m = (0, ..., 0)T ∈ Rm .
Denote col(x1 , ...., xn ) = (xT1 , · · · , xTn )T as the column vector
stacked with vectors x1 , ..., xn. In denotes the identity matrix
in Rn×n . For a matrix A = [ai j ], ai j or Ai j stands for the matrix
entry in the ith row and jth column of A. A ⊗ B denotes the
Kronecker product of matrixes A and B. Denote ×i=1,...,nΩi as
the Cartesian product of the sets Ωi , i = 1, ..., n. Denote the set
of interiors of set K as int(K), and the boundary set of set K as
∂K.

hx − PΩ (x), PΩ (x) − yi ≥ 0, ∀x ∈ Rm , ∀y ∈ Ω.
The following relationships can be derived from (2),

||x − PΩ (x)||22 + ||PΩ (x) − y||22 ≤ ||x − y||22 , ∀x ∈ Rm , ∀y ∈ Ω, (3)
and
||PΩ (x) − PΩ (y)|| ≤ ||x − y||, ∀x, y ∈ Rm .

(4)

The normal cone CΩ (x) can also be defined as (Lemma 2.38 of
Ruszczynski (2006))
CΩ (x) = {v : PΩ (x + v) = x}.

(5)

For a closed convex set Ω, point x ∈ Ω and direction v, we define the differentiated projection operator as
(Dupuis & Kushner (1987) and Nagurney & Zhang (1995)),
ΠΩ (x, v) = lim

δ→0

PΩ (x + δv) − x
.
δ

(6)

The basic properties of the differentiated projection operator are
given as follows (Brogliato, Daniilidis, Lemarchal, & Acary
(2006)).
Lemma 2.1. (i):If x ∈ int(Ω), then ΠΩ (x, v) = v; (ii): x ∈ ∂Ω,
and maxn∈cΩ (x) hv, ni ≤ 0, then ΠΩ (x, v) = v; (iii): x ∈ ∂Ω,
and maxn∈cΩ (x) hv, ni ≥ 0, then ΠΩ (x, v) = v − hv, n∗ in∗ , where
n∗ = arg maxn∈cΩ (x) hv, ni. Therefore, the operator ΠΩ (x, v) in
(6) is equivalent with the projection of v onto T Ω (x), i.e.,

2. Preliminaries and problem formulation
In this section, we first give the preliminary knowledge related to convex analysis and graph theory, and then formulate
the DRAO problem of interest.

ΠΩ (x, v) = PT Ω (x) (v).

2.1. Convex analysis and projection
2.2. Graph theory

The following concepts and properties about convex functions, convex sets, and projection operations come from
Ruszczynski (2006) and Bertsekas (2009). A differentiable
convex function f : Rm → R has the locally Lipschitz continuous gradient, if, given any compact set Q, there is a constant kQ such that ||∇ f (x) − ∇ f (y)|| ≤ kQ ||x − y||, ∀x, y ∈ Q. A
differentiable function f (x) is called µ-strongly convex on Rm
if there exists a constant µ > 0 such that, for any x, y ∈ Rm ,
f (y) ≥ f (x) + ∇T f (x)(y − x) + 12 µ||y − x||2 , or equivalently,
(x − y)T (∇ f (x) − ∇ f (y)) ≥ µ||x − y||2 .

(2)

The following concepts of graph theory can be found in
Mesbahi & Egerstedt (2010). The information sharing or exchanging among the agents is described by graph G = (N, E).
The edge set E ⊂ N × N contains all the information interactions. If agent i can get information from agent j, then ( j, i) ∈ E
and agent j belongs to agent i’s neighbor set Ni = { j|( j, i) ∈ E}.
G is said to be undirected when (i, j) ∈ E if and only if ( j, i) ∈ E.
A path of graph G is a sequence of distinct agents in N such
that any consecutive agents in the sequence corresponding to
an edge of graph G. Agent j is said to be connected to agent i if

(1)
3

there is a path from j to i. G is said to be connected if any two
agents are connected.
Define adjacency matrix A = [ai j ] of G with ai j = 1 if j ∈
Ni and ai j = 0 otherwise. Define the degree matrix Deg =
P
P
diag{ nj=1 a1 j , ..., nj=1 an j }. Then the Laplacian of graph G is
L = Deg − A. When G is a connected undirected graph, 0 is a
simple eigenvalue of Laplacian L with the eigenspace {α1n |α ∈
R}, and L1n = 0n , 1Tn L = 0Tn , while all other eigenvalues are
positive. Denote the eigenvalues of L in an ascending order as
0 < s2 ≤ · · · ≤ sn . Then, by the Courant-Fischer Theorem,
min xT Lx = s2 ||x||22 ,
x,0,
1T x=0

max xT Lx = sn ||x||22 .
x,0

Remark 2.2. Define the recession cone of a convex set Ω as
RΩ = {d : x + αd ∈ Ω, ∀ α ≥ 0, ∀x ∈ Ω}. Then the sufficient and necessary condition for the existence of finite optimal
solution to (8) is (referring to Proposition 3.2.2 of Bertsekas
(2009))
×i∈N RΩi ∩ {Null(1Tn ⊗ Im )} ∩ ×i∈N R fi = 0,
where RΩi is the recession cone of Ωi and R fi is the recession
cone of any nonempty level set of fi (xi ): {xi ∈ Rm | fi (xi ) ≤ γ}.
In many practical cases, we have R fi = 0 (taking the
quadratic function as an example). Furthermore, RΩi = 0 when
Ωi is compact. Therefore, the existence of a finite solution can
be easily guaranteed and verified in many practical problems.

(7)

2.3. Problem formulation
Consider a group of agents with the index set N = {1, ..., n} to
make an optimal allocation of network resource under both the
network resource constraint and LFCs. Agent i can decide its
local allocation xi ∈ Rm , and can access the local resource data
P
di ∈ Rm . The total network resource is i∈N di , and therefore,
the allocation should satisfy the network resource constraint:
P
P
i∈N xi =
i∈N di . Furthermore, the allocation of agent i should
satisfy the local feasibility constraint (LFC): xi ∈ Ωi , where
Ωi ⊂ Rm is a closed convex set only known by agent i. Agent i
also has a local objective function fi (xi ) : Rm → R associated
with its local allocation xi . Denote X = col(x1 , ..., xn ) ∈ Rmn as
the allocation vector of the whole network. Then the task for
the agents is to collectively find the optimal allocation corresponding to the DRAO problem as follows:
Distributed Resource
X Allocation Optimization :
min f (X) =
fi (xi )
xi ∈Rm , i∈N
X i∈N X
sub ject to
xi =
di ,
i∈N

The local objective function fi (xi ), resource data di and LFC
set Ωi are the private data for agent i, which are not shared
with other agents. This makes (8) a distributed optimization
problem. To fulfill the cooperations between agents for solving (8), the agents have to share their local information through
a network G = (N, E). Next follows an assumption about the
connectivity of G to guarantee that any agent’s information can
reach any other agents, which is also quite standard for distributed optimization (Liu & Wang (2015)).
Assumption 3. The information sharing graph G = (N, E) is
undirected and connected.
In a sum, the task of this paper is to design fully distributed
algorithms for the agents to cooperatively find the optimal resource allocation to (8) without any center. In other words,
agent i needs to find its optimal allocation x∗i by manipulating
its local private data di , Ωi , and fi (xi ) and by cooperations with
its neighbor agents through G.

(8)

i∈N

xi ∈ Ωi , i ∈ N.

3. Projected algorithm for DRAO

Clearly, problem (8) is an extension of the previous optimization models in Xiao & Boyd (2006), Lakshmanan & Farias
(2008) and Necoara, Nesterov, & Glineur (2011) by introducing the additional LFCs, that is, xi ∈ Ωi . Clearly,
xi ∈ Ωi also generalizes previous box constraints in
Johari & Tsitsiklis (2004), D’Amico, Sanguinetti & Palomar
(2014) and Cherukuri & Cortés (2014).
The following assumptions are given for (8), which were also
adopted for the distributed optimization or resource allocation
in Feijer & Paganini (2010), Kia, Cortés, & Martinez (2015),
and Liu & Wang (2015).

In this section, a distributed algorithm for (8) based on projected dynamics is proposed and analyzed. The distributed algorithm for agent i is given as follows:
Projected algorithm for agent i :
ẋi = PΩX
(xi − ∇ fi (xi ) +X
λi ) − x i
i
(zi − z j ) + (di − xi )
(λi − λ j ) −
λ̇i = −
j∈Ni
Xj∈Ni
(λi − λ j )
z˙i =
j∈Ni

Assumption 1. The functions fi (xi ), i ∈ N are continuously
differentiable convex functions with locally Lipschitz continuous gradients and positive definite Hessians over Rm .

In algorithm (9), xi ∈ Rm is the local allocation of agent i,
and λi , zi ∈ Rm are two auxiliary variables of agent i. Note
that the algorithm (9) is fully distributed because agent i only
needs the local data (including di , fi (xi ), and Ωi ) and the shared
information {λ j , z j , j ∈ Ni } from its neighbor agents. Thereby,
(9) does not need any center to handle all the data or coordinate
all the agents. With the distributed algorithm (9), each agent
has the autonomy and authority to formulate its own objective
function and feasibility set, and hence, the privacy is kept within
each agent. Because each agent can instantaneously react to

Assumption 1 implies that fi (xi )’s are strictly convex, and
hence guarantees the uniqueness of the optimal solution to (8).
Assumption 2. There exists a finite optimal solution X ∗ to
problem (8). The Slater’s constraint condition is satisfied for
DRAO (8), namely, there exists x̃i ∈ int(Ωi ), ∀i ∈ N, such that
P
P
i∈N x̃i =
i∈N di .

(9)

4

Also, at the equilibrium point, ẋi = 0 implies that PΩi (x∗i −
∇ fi (x∗i )+λ∗ ) = x∗i . It follows from Lemma 2.38 of Ruszczynski
(2006) that

its local data changes, it can quickly adapt its local decision.
Therefore, the algorithm can be easily applied to large-scale
networks.
The algorithm (9) can be understood based on the following
observations. The duality of (8) with multiplier λ ∈ Rm is
X
X
maxm q(λ) =
qi (λ) =
inf { fi (xi ) − λT xi + λT di }. (10)
λ∈R

i∈N

i∈N

−∇ fi (x∗i ) + λ∗ ∈ CΩi (x∗i ), i = 1, ..., n.
Therefore, the equilibrium point col(X ∗ , Λ∗ , Z ∗ ) of (9) satisfies
0mn ∈ ∇ f (X ∗ ) − 1n ⊗ λ∗ + CΩ (X ∗ ),
(14)
(1Tn ⊗ Im )X ∗ = (1Tn ⊗ Im )D, X ∗ ∈ Ω,

xi ∈Ωi

Although some existing distributed algorithms in
Nedic, Ozdaglar, & Parrilo (2010) and Sayed (2014) addressed the dual problem (10), they need to solve a subproblem
at each time (iteration) to calculate the gradients. In other
words, two “time scales” are needed if applying existing
distributed algorithms to (8). Here we aim to develop a simple
algorithm without solving any subproblems. To this end, we
formulate a constrained optimization problem with Laplacian
matrix L and Λ = col(λ1 , ..., λn ) ∈ Rmn as
P
maxΛ=(λ1 ,...,λn)
Q(Λ) = i∈N qi (λi )
(11)
sub ject to
(L ⊗ Im )Λ = 0mn .

which is exactly the optimality condition (KKT) for DRAO (8)
by Theorem 3.34 in Ruszczynski (2006). Thus, the conclusion
follows.
✷

Remark 3.2. It can be shown that the equilibrium point of (9)
has λ∗ as the dual optimal solution to problem (10), and Z ∗
as the dual optimal solution to problem (11), following a similar analysis routine of Theorem 3.1 and Proposition 5.3.2 in
Bertsekas (2009). It can also be shown that any col(X ∗ , 1m ⊗
λ∗ , Z ∗ ), with X ∗ as the optimal solution to (8), λ∗ as the dual
optimal solution to (10) and Z ∗ as the dual optimal solution to
(11), corresponds to an equilibrium point of (9). We do not
discuss their details here for space limitations.

The augmented Lagrangian duality of (11) with multipliers Z =
col(z1 , ..., zn ) ∈ Rmn is
X
1
min max Q(Λ, Z) =
qi (λi ) − Z T (L ⊗ Im )Λ − ΛT (L ⊗ Im )Λ.
Z
Λ
2
i∈N
(12)
Then the projected dynamics (9) is derived by applying the gradient flow to (10) and (12) along with the projection operation
to guarantee the feasibility of LFCs.
From the KKT condition, we can show that the equilibrium
point of (9) yields the optimal solution to problem (8). Denote X = col(x1 , ..., xn), ∇ f (X) = col(∇ f1 (x1 ), · · · , ∇ fn (xn )),
D = col(d1 , · · · , dn ), and Ω = ×i∈N Ωi . Write algorithm (9) in a
compact form as
Ẋ
Λ̇
Ż

=
=
=

PΩ (X − ∇ f (X) + Λ) − X,
−(L ⊗ Im )Λ − (L ⊗ Im )Z + (D − X),
(L ⊗ Im )Λ.

Remark 3.3. The differences between our work and some existing ones are listed as follows:
• The KKT condition for DRAO without LFC is
∇ f (X ∗ ) = 1n ⊗ λ∗ , (1Tn ⊗ Im )X ∗ = (1Tn ⊗ Im )D.

The KKT condition (14) for DRAO with LFC and the
condition (15) for DRAO without LFC are totally different. (15) requires the optimal allocations to be the points
satisfying the network resource constraint with the same
marginal costs (gradients), while (14) requires the optimal
allocations to be feasible in both network resource constraint and LFCs. The optimal allocations in (14) should
also satisfy a variational inequality related to both the objective functions’ gradients and the normal cones of the
LFC sets. In fact, the marginal costs (gradients) at the
optimal allocations of (14) do not necessarily reach the
same levels, and the differences can be seen as the “price
of allocation feasibility”.

(13)

Theorem 3.1. Under Assumptions 1-3, if the initial point
xi (0) ∈ Ωi , ∀i ∈ N, then xi (t) ∈ Ωi , ∀t ≥ 0, ∀i ∈ N, and
col(X ∗ , Λ∗ , Z ∗ ) is the equilibrium point of the distributed algorithm (9) with X ∗ as the optimal solution to (8).

• The previous algorithms (except the one in
Cherukuri & Cortés (2014)) kept the network resource
constraint satisfied (ensured its eventual feasibility) by
setting feasible initial points through the initialization
coordination procedure. In other words, the network
resource constraint can be guaranteed only if it is
satisfied at the initial moment. However, the initialization
for the network resource constraint is quite restrictive
for large-scale dynamical networks because it involves
global coordination and has to be performed every time
the network data/configuration changes. Moreover, it
is not trivial to achieve the initialization coordination
with both the LFCs and the network resource constraint
(refer to Cherukuri, & Cortés (2015) for an initialization
procedure with one dimensional interval constraint).

Proof: Note that ẋi ∈ T Ωi (xi ), ∀xi ∈ Ωi because PΩi (xi −
∇ fi (xi ) + λi ) ∈ Ωi . Given the initial point xi (0) ∈ Ωi , by
Nagumo’s theorem (referring to page 174 and page 214 of
Aubin & Cellina (1984)), xi (t) ∈ Ωi , ∀t ≥ 0 (Related proofs
can also be found in Liao, Qi, & Qi (2004)).
To obtain the equilibrium point, we set Ż = 0mn and get Λ∗ =
1n ⊗ λ∗ , λ∗ ∈ Rm , because the graph G is connected. Λ̇ = 0mn
implies that (L ⊗ Im )Z ∗ = D − X ∗ . Because the graph G is
undirected, 1Tn L = 0Tn and (1Tn ⊗ Im )(L ⊗ Im )Z = (1Tn L) ⊗ Im Z =
P
P
(1Tn ⊗ Im )(D − X) = 0m . Hence, i∈N di = i∈N x∗i . Then, at the
equilibrium point,
X
X
Λ∗ = 1n ⊗ λ∗ , λ∗ ∈ Rm , (L ⊗ Im )Z ∗ = D − X ∗ ,
di =
x∗i .
i∈N

(15)

i∈N

5

Define H(S ) = PΩ̄ (S − F(S )), and then give a Lyapunov function as

• In fact, proportional-integral (PI)-type consensus dynamics (Wang & Elia (2011) and Gharesifard & Cortés
(2014)) and projected gradient flows (Liu & Wang
(2015)), which both have been utilized for distributed optimization, are combined together to obtain the algorithm
(9) for the KKT condition (14). Local λi acts as the local
shadow price, and all the local shadow prices must reach
consensus to be the global market clearing price. Therefore, a second-order PI-based consensus dynamics is incorporated into (9) with the integral variable zi summing
up the disagreements between λi and {λ j , j ∈ Ni }. Meanwhile, the dynamics of xi adjusts the local allocation by
comparing the local shadow price and the local gradient,
and also utilizes the projection operation in order to make
the local allocation flowing within its LFC set all the time.

1
1
Vg = −hF(S ), H(S ) − S i − ||H(S ) − S ||22 + ||S − S ∗ ||22 ,
2
2
where S ∗ = col(X ∗ , Λ∗ , Z ∗ ), X ∗ is the optimal solution to (8),
Λ∗ is the optimal solution to (11), and Z ∗ is the dual optimal
solution to (12).
Notice that X ∗ is a finite point from Assumption 2. Because
the Slater’s condition is satisfied with Assumption 2, the dual
optimal solution λ∗ for (10) exists and is finite by Proposition
5.3.1 of Bertsekas (2009). Since the function fi (xi ) is strictly
convex, ∇qi (λ∗ ) = di − x∗i by Theorem 2.87 of Ruszczynski
(2006) and the saddle point property of (10). Then the KKT
condition of (11) is LΛ∗ = 0 and D − X ∗ − LΛ∗ − LZ ∗ = 0.
Hence, LZ ∗ = D − X ∗ implies the finiteness of the dual optimal
solution Z ∗ for (11). Therefore, S ∗ is a finite point.
In fact, with the KKT conditions to (10) and (11), F(S ∗ ) =
col((∇ f (X ∗ ) − Λ∗ ), 0n , 0n ), and −∇ f (X ∗ ) + Λ∗ ∈ CΩ (X ∗ ), we
have

Therefore, the algorithms given in Ho, Servi, & Suri (1980),
Lakshmanan & Farias (2008), Necoara, Nesterov, & Glineur
(2011) and Ghadimi, Shames, & Johansson (2013) failed to
solve (8) because they cannot ensure the LFCs given in (14).
Note that the algorithm (9) can ensure LFCs even during the
algorithm flow with projection operations. It only requires that
each agent has its initial allocation belonging to its LFC set,
which can be trivially accomplished by each agent with onestep local projection operation. Furthermore, algorithm (9) ensures the network resource constraint asymptotically without
caring about whether it is satisfied at the initial points, and
therefore, is free of initialization coordination procedure. Due
to free of any center and initialization, algorithm (9) can adaptively handle online data without re-initialization whenever the
network data/configuration changes. Moreover, it can work in a
“plug-and-play” manner for dynamical networks with leavingoff or plugging-in of agents.
Next, let us analyze the convergence of (9), The analysis techniques are inspired by the projected dynamical systems for variational inequalities (referring to Xia & Wang
(2000) and Gao (2003)) and distributed optimization (referring
to Shi, Johanssan, & Hong (2013 ), Kia, Cortés, & Martinez
(2015), Qiu, Liu, & Xie (2014) and Liu & Wang (2015)).

H(S ∗ ) = PΩ̄ (S ∗ − F(S ∗ )) = S ∗ , −F(S ∗ ) ∈ CΩ̄ (S ∗ ).

(16)

Due to (2) and (3),
−hF(S ), H(S ) − S i − 21 hH(S ) − S , H(S ) − S i
= 12 [||F(S )||22 − ||F(s) + H(S ) − S ||22 ]
= 21 [||S − F(S ) − S ||22 − ||H(S ) − (S − F(S ))||22]
≥ 12 ||S − H(S )||22.
Hence, Vg ≥ 12 ||S − H(S )||22 + 21 ||S − S ∗ || ≥ 0, and Vg = 0 if
and only if S = S ∗ .
By Theorem 3.2 of Fukushima (1992), any asymmetric
variational inequality can be converted to a differentiable optimization problem. As a result,
V̇g = (F(S ) − [JF (S ) − I](H(S ) − S ) + S − S ∗ )T (H(S ) − S ),
where JF (S ) is the Jacabian matrix of F(S ) defined as
 2

 ∇ f (X) −I 0 


I
L L  .
JF (S ) = 


0
−L 0

Theorem 3.4. Under Assumptions 1-3, and given bounded initial points xi (0) ∈ Ωi , ∀i ∈ N, the trajectories of the algorithm
(9) are bounded and converge to an equilibrium point of (9),
namely, agent i asymptotically achieves its optimal allocation
x∗i of (8) with (9).

With Assumptions 1 and 3,

S T JF (S̄ )S = X T ∇2 f (X̄)X+ΛT LΛ > 0,

∀S̄ ∈ Ω̄, ∀S , 0 ∈ R3n .

With (2), taking x = S − F(S ) and y = S ∗ gives hS − F(S ) −
H(S ), H(S ) − S ∗i ≥ 0, which implies hS − H(S ) − F(S ), H(S ) −
S + S − S ∗ i ≥ 0. Hence, −||H(S ) − S ||22 + hS − H(S ), S − S ∗ i +
h−F(S ), H(S ) − S i + h−F(S ), S − S ∗ i ≥ 0, or equivalently,

Proof: Take m = 1 without loss of generality. Denote Ω̄ =
Ω × Rn × Rn . Define a new vector S = col(X, Λ, Z) and the
vector function F(S ) : R3n → R3n as




∇ f (X) − Λ


F(S ) =  LΛ + LZ − (D − X)  .


−LΛ

hS − H(S ), S − S ∗ + F(S )i ≥ ||H(S ) − S ||22 + hF(S ), S − S ∗ i.
Consequently,
V˙g

Recalling the form in (13) and the fact that PRn (x) = x, ∀x ∈
Rn , the dynamics of all the agents can be written as
Ṡ = PΩ̄ (S − F(S )) − S .

6

≤ −(H(S ) − S )T JF (S )(H(S ) − S ) + ||H(S ) − S ||22
+hS − S ∗ + F(S ), H(S ) − S i
≤ −hF(S ), S − S ∗ i
≤ −hF(S ), S − S ∗ i + hF(S ∗ ), S − S ∗ i − hF(S ∗ ), S − S ∗ i
≤ −hF(S ) − F(S ∗ ), S − S ∗ i − hF(S ∗ ), S − S ∗ i.

′

′

′

′

In fact, hF(S ) − F(S ), S − S i = h∇ f (X) − ∇ f (X ), X − X i +
′
′
′
hΛ − Λ , L(Λ − Λ )i ≥ 0, ∀S , S ∈ Ω̄, because the local objective functions are convex, and the Laplacian matrix is positive semi-definite by Assumptions 1 and 3. With (16), we have
hF(S ∗ ), S − S ∗ i ≥ 0. Then V˙g ≤ 0, and any finite equilibrium
point S ∗ of (9) is Lyapunov stable. Furthermore, there exists a
forward compact invariance set given any finite initial points,
1
IS = {col(X, Λ, Z) ||S − S ∗ || ≤ Vg (S (0))}.
2

Table 1: Parameters setting of Example 3.6

a1 , d1
a2 , d2
a3 , d3
a4 , d4

≤
≤
≤
≤

600s ∼ 1200s
(0.1,0.3), (8,2)
(-17,15), (3,4)
(0.13,8), (-5,12)
(4,20), (1,15)

1200s ∼
(0.1,0.3), (12,-3)
(-17,15), (0,7)
(3,0.7), (-5,12)
(5,17), (1,15)

(17)

Therefore, with the local Lipstchitz continuity of the objective functions’ gradients in Assumption 1 and the nonexpansive property of projection operation (4), P(S − F(S )) − S
is Lipstchitz over the compact set IS in (17). There exists a unique solution to (9) with time domain [0, ∞). Also,
the compactness and convexity of IS implies the existence of
equilibrium point to dynamics (9) (referring to page 228 of
Aubin & Cellina (1984)). By Theorem 3.1 and without loss
of generality, the equilibrium point is assumed to be S ∗ .
Furthermore, there exists c∗ ∈ CΩ (X ∗ ) such that −∇ f (X ∗ ) +
Λ∗ = c∗ , LZ ∗ = D − X ∗ , and Λ∗ = 1n λ∗ .
V˙g

0 ∼ 600s
(8,2), (8,2)
(4,7), (3,4)
(0.13,8), (3,8)
(4,20), (10,2)

Figure 1: The trajectories of the allocations of agent 1 and agent 2.

network resource all the time in a time-varying resource case
in Cherukuri & Cortés (2014), while each agent only knows
its local resource in (9). Moreover, our techniques introduce a
variational inequality viewpoint in addition to Lyapunov methods with the invariance principle.

−hF(S ), S − S ∗ i = −hX − X ∗ , ∇ f (X) − Λi
−hΛ − Λ∗ , LΛ + LZ − (D − X ∗ )i − hZ − Z ∗ , −LΛi
−hX − X ∗ , ∇ f (X) − Λ − ∇ f (X ∗ ) + Λ∗ − c∗ i
−hΛ − Λ∗ , L(Λ − Λ∗ )i − hΛ − Λ∗ , L(Z − Z ∗ )i
−hΛ − Λ∗ , LZ ∗ − (D − X)i − hZ − Z ∗ , −L(Λ − Λ∗ )i
−hX − X ∗ , ∇ f (X) − ∇ f (X ∗ )i + hX − X ∗ , c∗ i
−hΛ − Λ∗ , L(Λ − Λ∗ )i
−hX − X ∗ , ∇ f (X) − ∇ f (X ∗ )i − hΛ − Λ∗ , L(Λ − Λ∗ )i,

The following example illustrates how (9) “adaptively”
achieves the optimal resource allocation without reinitialization for a dynamical network.
Notice that the
following example cannot be directly addressed by the
algorithm in Cherukuri & Cortés (2014).

where the last step follows from hX − X ∗ , c∗ i ≤ 0. Denote the
set of points satisfying V̇g = 0 as Eg = {(X, Λ, Z) V̇g = 0}.
Because the Hessian matrix of ∇2 f (X) is positive definite,
R1
∇ f (X) = ∇ f (X ∗ ) + 0 ∇2 f (τX + (1 − τ)X ∗ )T (X − X ∗ )dτ and
the null space for L imply that

Example 3.6. Four agents cooperatively optimize problem (8).
The allocation variable and resource data for agent i are xi =
(xi,1 , xi,2 )T ∈ R2 , di = (di,1 , di,2 )T ∈ R2 , respectively. The objective functions fi (xi ) are parameterized with ai = (ai,1 , ai,2 )T ∈
R2 as follows:

Eg = {(X, Λ, Z) X = X ∗ , Λ ∈ span{α1n }}.
Then we claim that the maximal invariance set within the
set Eg is exactly the equilibrium point of (9). In fact, Λ ∈
span{α1n } implies Z = Z ∗ . Hence, Λ̇ = LZ ∗ − (D − X ∗ ). However, LZ ∗ − (D − X ∗ ) must be zero; otherwise Λ will go to infinity, which contradicts that Eg is a compact set within IS . Thus,
Λ̇ = 0 and Λ = Λ∗ .
By the LaSalle invariance principle and Lyapunov stability
of the equilibrium point, the system (9) converges to its equilibrium point, which implies the conclusion.
✷

fi (xi ) = (xi,1 + ai,1 xi,2 )2 + xi,1 + ai,2 xi,2 + 0.001(x2i,1 + x2i,2 ).
The LFCs of the four agents are given as follows: Ω1 = {x1 ∈
R2 |(x1,1 − 2)2 + (x1,2 − 3)2 ≤ 25}, Ω2 = {x2 ∈ R2 |x2,1 ≥ 0, x2,1 ≥
0, x2,1 + 2x2,2 ≤ 4}, Ω3 = {x3 ∈ R2 |4 ≤ x3,1 ≤ 6, 2 ≤ x3,2 ≤ 5}
and Ω4 = {x4 ∈ R2 |0 ≤ x4,1 ≤ 15, 0 ≤ x4,2 ≤ 20}, respectively,
and their boundaries are shown in Figures 1 and 2.

Remark 3.5. Although an initialization-free algorithm has
also been proposed and investigated for the DEDP, a special case of DRAO, in Cherukuri & Cortés (2014), algorithm
(9) provides a different algorithm to address the DRAO problem without initialization. Additionally, algorithm (9) can
handle general multi-dimensional LFCs explicitly with the
projection operation, while Cherukuri & Cortés (2014) only
addressed one-dimensional box constraints with a penalty
method. Moreover, one agent was required to know the total

Figure 2: The trajectories of the allocations of agent 3 and agent 4.

7

algorithm for agent i given as follows:
Differentiated projected algorithm for agent i :
ẋi = ΠΩX
(xi , −∇ fi (xi ) +X
λi )
i
(zi − z j ) + (di − xi )
(λi − λ j ) −
λ̇i = −
j∈N
j∈N
i
X i
(λi − λ j )
z˙i =

(18)

j∈Ni

Remark 4.1. The algorithm (18) is a direct extension of (9)
by differentiating the projection operator, where each agent is
required to project −∇ fi (xi ) + λi onto the tangent cone T Ωi (xi ).
Thereby, (18) has the additional burden for the tangent cone
computation compared with (9). However, for some specific
convex sets such as polyhedron, Euclidean ball, and boxes, it is
not hard to get the close form of the tangent cone at any given
point.

Figure 3: The trajectories of the Lagrangian multiplies λi ’s

Similar to the algorithm (9), the algorithm (18) is also a distributed algorithm, and it does not need any initialization coordination procedure. Therefore, it can efficiently process online
data for dynamical networks.
Although algorithm (18) is a discontinuous dynamical system, the solution to (18) is well-defined in the Caratheodory
sense (an absolutely continuous function col(X(t), Λ(t), Z(t)) :
[0, T ] → R3mn is a solution of (18) if (18) is satisfied
for almost all t ∈ [0, T ], referring to Definition 2.5 in
Nagurney & Zhang (1995)). The existence of an absolutely
continuous solution to (18) can be found in Theorem 3.1 of
Cojocaru & Jonker (2004), and the condition when the solution can be extended to interval [0, ∞] is given in Theorem 1
of Brogliato, Daniilidis, Lemarchal, & Acary (2006).
The following result shows the correctness of algorithm (18).

Figure 4: Network resource constraint and optimality condition

The agents share information with a ring graph G:
1 ↔ 2 ↔ 3 ↔ 4 ↔ 1.
The initial allocation xi (0) of agent i in algorithm (9) is randomly chosen within its LFC set, and Λ, Z are set with zero
initial values. The data ai and di switches as Table 3, while the
allocation variables remain unchanged when the data switches.
The simulation results are shown in Figures 1, 2, 3, and 4.
Figures 1 and 2 show that the agents’ allocation variables always remain within the corresponding LFC sets, while Figure
3 shows that the Lagrangian multipliers reach consensus after
the transient processes. Figure 4 demonstrates that the network
resource constraint can be satisfied asymptotically even though
it is violated each time the data/configuration changes, and Figure 4 also reveals that ||Ẋ||22 + ||Λ̇||22 + ||Ż||22 always converges to
zero, guaranteeing the optimality of the resource allocations.

Theorem 4.2. Suppose that Assumptions 1-3 hold. If the initial
point xi (0) ∈ Ωi , ∀i ∈ N, then xi (t) ∈ Ωi , ∀t ≥ 0, ∀i ∈ N,
and there is the equilibrium point of the algorithm (18) with
X ∗ = col(x∗1 , ..., x∗n ) as the optimal solution to (8).
Proof: Obviously, ẋi ∈ T Ωi (xi ), ∀xi ∈ Ωi according to Lemma 2.1. It follows that (18) has an absolutely
continuous solution on interval [0, ∞] by Theorem 1 of
Brogliato, Daniilidis, Lemarchal, & Acary (2006). Moreover,
Theorem 3.2 of Cojocaru & Jonker (2004) shows that the solution of (18) coincides with a slow solution of a differential
inclusion. Given the initial point xi (0) ∈ Ωi , xi (t) ∈ Ωi , ∀t ≥
0 holds in light of the viability theorem in Aubin & Cellina
(1984).
By Lemma 2.1, we have that ΠΩi (xi , −∇ fi (xi ) + λi ) = 0 if at
least one of the following cases is satisfied: (i):xi ∈ int(Ωi ), and
−∇ fi (xi ) + λi = 0; (ii): xi ∈ ∂Ωi , and −∇ fi (xi ) + λi = 0; (iii):
xi ∈ ∂Ωi , and −∇ fi (xi ) + λi ∈ CΩi (xi ). Hence, ΠΩi (xi , −∇ fi (xi ) +
λi ) = 0 implies −∇ fi (xi ) + λi ∈ CΩi (xi ). Following similar
analysis of Theorem 3.1, at the equilibrium point we have

4. Differentiated projected algorithm for DRAO

In this section, the differentiated projection operator
(6) is applied to derive an algorithm for (8).
In
fact, the projected dynamics based on the operator in
(6) was firstly introduced in the study of constrained
stochastic approximation in Dupuis & Kushner (1987), and
later was utilized to solve variational inequalities and
constrained optimization problems in Nagurney & Zhang
(1995), Brogliato, Daniilidis, Lemarchal, & Acary (2006), and
Cherukuri, Mallada, & Cortés (2016). Here the operator (6) is
applied to the construction of the distributed resource allocation
8

Λ∗ = 1n ⊗ λ∗ , λ∗ ∈ Rm , x∗i ∈ Ωi
P
P
(L ⊗ Im )Z ∗ = D − X ∗ , i∈N x∗i = i∈N di ,
−∇ fi (x∗i ) + λ∗ ∈ CΩi (x∗i ).

(19)

Thus, the optimality condition (14) of (8) is satisfied by the
equilibrium point of (18).
✷
Next result shows the convergence of (18) when the local
objective functions are strongly convex.

Denote s1 ≤ s2 ≤ ... ≤ sn as the ordered eigenvalues of
Laplacian matrix L. Obviously, s1 = 0 and s2 > 0 when the
graph G is connected. By (7),
V̇1s ≤ −αµ̄θT θ − αs2 ηT2 η2 − γs2 δT2 δ2 − γηT2 θ2 − γδ2 θ2 .

Theorem 4.3. Suppose that Assumptions 1-3 hold, and the local objective functions fi (xi ) are µi -strongly convex functions
with ki -Lipschitz continuous gradients. Given bounded initial
points xi (0) ∈ Ωi , ∀i ∈ N, the trajectories of algorithm (18)
converge to its equilibrium point. Furthermore, if there are no
LFCs (that is, Ωi = Rm , i = 1, · · · , n), then algorithm (18) exponentially converges to its equilibrium point.

According to −γηT2 θ2 ≤ 12 γ2 θ2T θ2 + 21 ηT2 η2 , and −γδT2 θ2
1 T
1 2 T
2 γ θ2 θ2 + 2 δ2 δ2 , we have

1
1
V̇1s ≤ −(αµ̄ − γ2 )θT θ − (αs2 − )ηT2 η2 − (γs2 − )δT2 δ2 . (24)
2
2
2

Take γ and α such that γ > 2s12 , and α > max{ γµ̄ , 2s12 }. Then we
have V̇1s < 0, which leads to the convergence of algorithm (18).
Next, we estimate the convergence rate of (18) when Ωi =
Rm , i = 1, · · · , n. In this case ΠΩi (xi , −∇ fi (xi ) +λi ) = −∇ fi (xi ) +
λi , and β(xi )ni (xi ) = 0. Still take V1s in (23), and then (24) still
holds in this case.
Take V2s = ε(θ − η)T (θ − η), and we have

Proof: We still take m = 1 without loss of generality.
At first, we show the convergence of (18). By Lemma 2.1,
ΠΩi (xi , −∇ fi (xi ) + λi ) = −∇ fi (xi ) + λi − β(xi )ni (xi ),

(20)

where ni (xi ) ∈ cΩi (xi ), β(xi ) ≥ 0. Notice that there exist
β(x∗i ) ≥ 0 and ni (x∗i ) ∈ cΩi (x∗i ) at the equilibrium point such
that ∇ fi (x∗i ) − λ∗ = −β(x∗i )ni (x∗i ).
Define the following variables
Y = X − X∗,
V = Λ − Λ∗ ,
W = Z − Z∗,

θ = [r R]T Y,
η = [r R]T V,
δ = [r R]T W,

Y = [r R]θ,
V = [r R]η,
W = [r R]δ,

V̇2s

(21)
≤

V̇ s
(22)

=
+

1
1
T
T
T
2 α(θ θ + η η) + 2 (α + γ)δ2 δ2
1
T
2 γ(η2 + δ2 ) (η2 + δ2 ),

(23)

where α, γ are positive constants to be determined later. Obviously, 12 α||p||22 ≤ V1s ≤ ( 21 (α+γ)+γ)||p||22 where p = col(θ, η, δ2 ).
The derivative of V1s along (18) is
V̇1s

=
+
+

=
−
−

−(αµ̄ − γ2 + ε(µ − 1 − s2n − 21 M 2 ))θT θ
1
1
T
T
2 εη η − (γs2 − 2 − ε)δ2 δ2
1
1 2
1
(αs2 − 2 + ε(s2 − 2 − 2 sn ))ηT2 η2 .

(25)

1
1
Choose γ ≥ 3ε+1
2s2 such that γs2 − 2 − ε ≥ 2 ε. Select α such
that
1
1
(26)
αµ̄ − γ2 + ε(µ − 1 − s2n − M 2 ) ≥ ε,
2
2
and
1
1 1
αs2 − + ε(s2 − − s2n ) ≥ 0.
(27)
2
2 2
As a result,
1
V̇ s ≤ ε||p||22.
2
1
1
2
s
Then, with 2 α||p||2 ≤ V ≤ ( 2 (α + γ) + γ + 2ε)||p||22, we have
r
2ε
α + 3γ + 4ε
||p(0)||e− α+3γ+4ε t ,
(28)
||p|| ≤
α

where h = ∇ f (Y + X ∗ ) − ∇ f (X ∗ ) + NΩ (X) − NΩ (X ∗ ),
NΩ (X) = col(β(x1 )n1 (x1 ), ..., β(xn)nn (xn )), and NΩ (X ∗ ) =
col(β(x∗1 )n1 (x∗1 ), ...., β(x∗n)nn (x∗n )).
Construct the following function
V1s

−εY T h + εθT θ + εθ2 RT LRη2 + εθ2 RT LRδ2
+εηT [r, R]T h − εηT η − εη2 RT LRη2 − εη2 RT LRδ2
−(εµ − ε)θT θ + 12 εs2n θT θ + 12 εηT2 ηT2 + 21 εs2n θT θ
+ 21 εδ2 δT2 + 12 εηT η + 21 εM 2 θT θ − εηT η − εs2 ηT2 η2
+ 12 εs2n ηT2 η2 + 12 εδT2 δ2
−ε(µ − 1 − s2n − 12 M 2 )θT θ − 12 εηT η
+εδT2 δ2 − ε(s2 − 21 − 12 s2n )ηT2 η2 ,

with M = max{k1 , · · · , kn }, by using the inequality xT y ≤
1
2
2
2 ||x||2 + ||y||2 and (7) in the first step of (25).
s
With V = V1s + V2s , it is easy to see that

Then the dynamics of the variables θ, η, δ can be derived with
(18), (19), (20) and (21) as follows,
θ̇2 = −RT h + η2 ;
η̇2 = −θ2 − RT LRη2 − RT LRδ2 ;
δ̇2 = RT LRη2 ,

=
≤

with r = √1n 1n and rT R = 0Tn such that RT R = In−1 and RRT =
In − n1 1n 1Tn . We partition the variables θ, η, δ as col(θ1 , θ2 ),
col(η1 , η2 ), col(δ1 , δ2 ) with θ1 , η1 , δ1 ∈ R and θ2 , η2 , δ2 ∈ Rn−1 .

θ̇1 = −rT h + η1 ;
η̇1 = −θ1 ;
δ̇1 = 0;

≤

α(−Y T h − η2 RT LRη2 − η2 RT LRδ2 )
γ(−ηT2 θ2 − η2 RT LRδ2 − δ2 θ2 − δT2 RT LRδ2 )
(α + γ)δT2 RT LRη2 .

which leads to the exponential convergence of algorithm (18)
to its equilibrium point.
✷

ni (x∗i )

Because
∈ cΩi (x∗i ) and β(x∗i ) ≥ 0, β(x∗i )hxi −
∗
∗
xi , ni (xi )i ≤ 0. Moreover, ni (xi ) ∈ cΩi (xi ) and β(xi ) ≥ 0 imply that β(xi )hxi − x∗i , ni (xi )i ≥ 0. Because the local objecT
∗ T

Remark 4.4. In fact, the exponential convergence speed
2ε
(α+3γ+4ε) in (28) can be estimated by solving the following optimization problem

tive functions are strongly convex, Y h = (X − X ) (∇ f (X) −
∇ f (X ∗ ) + NΩ (X) − NΩ (X ∗ )) = (X − X ∗ )T (∇ f (X) − ∇ f (X ∗ )) +
Pn
Pn
∗
∗
∗
∗
T
i=1 hxi − xi , +β(xi )ni (xi )i + i=1 hxi − xi , −β(xi )ni (xi )i ≥ µ̄θ θ
where µ̄ = min{µ1 , · · · , µn }.

max

2ε

α,γ,ε≥0 (α + 3γ + 4ε)

9

s. t. γ ≥

3ε + 1
, (26), (27).
2s2

Denote
To get a simple estimation, we take γ = 3ε+1
2s2 .
2
2
2
2
̺1 = (sn + 1 − 2s2 ) and ̺2 = s2 (6 + 4sn + 2M − 4µ̄).
2
2ε
1 9ε +ε(6+̺2 )+1
≥
}, we have (α+3γ+4ε)
With taking α = max{ 1+ε̺
2s2 ,
4s2 µ̄
2s2
min{ 8+6s
,
+s2
2

n

The algorithm (30) ensures generation capacity bounds explicitly, and converges without the initialization procedure,
which is crucially important for the “plug-and-play” operation
in the future smart grid.

2

4s22 µ̄

(3+2s2n +M2 +6µ̄)s22 +9µ̄s2 +3

√

6µ̄s2 +1+3

}.

Remark 5.1. The optimization problem (8) can also be applied
to model the multi-period demand response in power systems.
The objective functions describe the dis-utility of cutting loads
in each area. The resource constraint specifies the amount
of loads to be shed in the multi-periods. Particularly, the local load shedding constraints concern with the total power demands in the multi-periods and other specifications of that area.
Then, the previous algorithms (9) and (18) can be applied to
solve this multi-dimensional DRAO with general LFCs in a distributed manner. This issue is interesting but beyond the scope
of this paper, and will be discussed elsewhere.

5. Distributed economic dispatch in power grids
In this section, the algorithm proposed in Section 4 is applied
to the DEDP in power grids to find the optimal secure generation allocations for power balancing in a distributed manner.
Example 5.2 is given to show that the distributed algorithm can
efficiently adapt to online network data/configuration changes,
including load demands, generation costs/capacities, and
plugging-in/leaving-off of buses, while Example 5.3 with a
large-scale network illustrates the scalability of the proposed
algorithm.
Suppose that there exist control areas N = {1, ..., n} with
area i having local generators to supply power Pgi ∈ R≥0 and
local load demands Pdi ∈ R≥0 to be met. The local generation must be kept within the capacity or security bounds
g
g
Pi ≤ Pi ≤ P̄i , Pi , P̄i ∈ R≥0 . fi (Pi ) : R≥0 → R represents the
local generation cost in control area i with respect to its local
generation Pgi , and it satisfies Assumption 1. Then the DEDP
formulation can be derived in the form of (8)
Distributed Economic
Dispatch Problem :
X
g
min f (P ) =
fi (Pgi )
Pgi ,i∈N
X i∈N X
sub ject to
Pgi =
Pdi
i∈N

The following two examples are presented to further show
the online adaptation property and scalability of our algorithm.
Firstly, the standard IEEE 118-bus system is adopted to illustrate the performance of algorithm (30).
Example 5.2 (Optimality and adaptability). Consider the
DEDP (29) in the IEEE 118-bus system with 54 generators. Each generator has a local quadratic generation cost
2
function as fi (Pgi ) = ai Pgi + bi Pgi + ci , whose coefficients
belong to the intervals ai ∈ [0.0024, 0.0679](M$/MW 2),
bi ∈ [8.3391, 37.6968](M$/MW), and ci ∈ [6.78, 74.33](M$).
The generation capacity bounds of the generators are drawn
from Pi ∈ [5, 150](MW) and P̄i ∈ [150, 400](MW), while the
load of each bus varies as Pdi ∈ [0, 300](MW).
The corresponding agents share information on an undirected ring graph with additional undirected edges (1, 4),
(15, 25), (25, 35), (35, 45) and (45, 50). The simulations are
performed with the differentiated projected algorithm (30). The
initial generation Pgi in algorithm (30) is set within its local capacity bounds, while variables λi ’s and zi ’s are set with zero
initial values.
Next explains how the network data/configuration changes at
different times.

(29)

i∈N

Pi ≤ Pgi ≤ Pi ,

N = {1, ..., n}

Here a multi-agent network is introduced to solve
the DEDP (29) motivated by recent DEDP works like
Cherukuri, & Cortés (2015), Cherukuri & Cortés (2014),
Zhao, Topcu, Li, & Low (2014), Cavraro, Carli, & Zampieri
(2014) and Zhang, Liu, Wang, Liu, & Ferrese (2015). Agent
g
i is responsible to decide the generation Pi in control area
i to minimize the global cost as the sum of all individuals’
generation costs, while meeting the total load demands within
its capacity bounds. In addition, each agent can react to the
changing local environment in real time, and adapt its own
behavior or preference by adjusting its local data, including
Pdi , Pi , P̄i , fi (Pgi ). The agents can also share information with
their neighbors to facilitate the cooperations.
Then applying (18) to (29), the distributed algorithm for
agent i is

• Load variations: At time 100s, 18 buses are randomly
chosen with randomly varying their loads by −20 ∼
+20%.
• Generation capacity variations: At time 200s, 18 generators randomly vary their capacity lower bounds by
−50 ∼ +50%, and another 18 generators randomly vary
their capacity upper bounds by −20 ∼ +20%.

Distributed algorithm for agent i :
P̄ −Pg

Ṗgi = [−∇ fi (Pgi ) + λi ]Pig −Pi
i
i
X
X
(zi − z j ) + (Pdi − Pgi )
(λi − λ j ) −
λ̇i = −
j∈Ni
Xj∈Ni
(λi − λ j )
z˙i =

• Generation cost variations: At time 300s, 18 generators
randomly vary their ai by 0 ∼ 50%, and another 18 generators randomly vary their bi by −50 ∼ 0%.

(30)

j∈Ni

• Leaving-off of buses: At time 400s, generator 2 and 3
disconnect from the system, and the communication edges
associated to them are also removed.

2 −x
= 0 if x − c1 = 0 and v ≤ 0 or c2 − x = 0 and
where [v]cx−c
1
2 −x
= v.
v ≥ 0, otherwise [v]cx−c
1

10

Figure 6: The total load curve

Figure 5: Algorithm performance indexes: (i), ||LΛ||’s always decrease to zero
even with different L’s, implying that λ′ s always reach consensus. (ii), The
power balance constraint (i.e., resource constraint) is violated if any network
configuration changes. But the power balance gap 1Tn (Pd − Pg ) asymptotically
decreases to zero, even without any re-initialization coordination. (iii), The
optimality condition ||Ṗg ||2 + ||Λ̇||2 + ||Ż||2 = 0 can always be satisfied asymptotically, implying the economic efficiency of the generation dispatch.

• Plugging-in of bus: At time 500s, generator 3 plugs in the
system with re-generated configurations ai , bi , ci , Pdi , Pi ,
and P̄i . The undirected edge (3, 4) is also added to the
communication graph.
When the data/configuration changes, each agent only
projects its local generation onto to its local capacity bounds if
necessary. The trajectrories of dynamics (30) are derived with
a first-order Euler discretization using Matlab. The trajectories
of some algorithm performance indexes are shown in Figure 5.
It indicates that algorithm (30) can adaptively find the optimal
solutions to (29) in a fully distributed way, even without any
initialization coordination procedure, when the network data
or configuration changes.

Figure 7: The performance indexes at time t = 80s in histograms: (i), The
consensus error ||LΛ||2 at t = 80s always decreases to a rather lower level due
to the second-order proportional-integral consensus dynamics. (ii), The power
balance is almost achieved at t = 80s without the initialization coordination.
(iii), The optimality condition can always be satisfied at t = 80s.

nominal values.
In each period, a connected graph is re-generated with
random graph model G(1000, P) as the information sharing
graph of that period. In G(1000, P), every possible edge occurs independently with the probability of P. Here, the probability P in each period is randomly drawn from the interval
[0.0015, 0.005].
For each period, the computation time is set as 80s. Figure 7
shows the histogram of consensus error ||LΛ||2 , power balance
P g P
gap Pi − Pdi and optimality condition ||Ṗg ||2 + ||Λ̇||2 + ||Ż||2
at time 80s. It indicates that the agents can always find the
economic power dispatch with varying loads and generation
costs/capacities, and evidently demonstrates the scalability of
the proposed method.

Example 5.3 (Scalability). This example considers a network of 1000 control areas to achieve economic dispatch
during a normal day. Control area i has cost function
2
fi (Pgi ) = ai Pgi + bi Pgi as well as generation capacity upper bound P̄i and lower bound Pi . The control areas are
divided equally into two groups. The first group, named
as fuel group, is mainly supported with traditional thermal
generators, and has relatively higher generation costs and
larger capacity ranges with the nominal values of ai , bi , Pi , P̄i
randomly drawn from the intervals, [3, 7](M$/MW 2 ),
[5, 9](M$/MW), [2, 6](MW), [15, 23](MW), respectively. The
second group, named as renewable group, is mainly supported
with renewable energies, and has lower generation costs and
smaller capacity ranges with the nominal values of ai , bi , Pi , P̄i
randomly drawn from the intervals, [ 21 , 2](M$/MW 2 ),
[ 12 , 4](M$/MW), [0, 1](MW), [ 23 , 7](MW), respectively.
The daily 96-point load data (15 minutes for each period) of
each control area is generated from a typical load curve for a
distribution system added with certain random perturbations.
The total load curve of the network is given in Fig 6. In each
period, 10% of the control areas in the renewable group are
randomly chosen to change their generation costs and capacities like Example 5.2 with the variations less that ±20% of its

6. Conclusions
In this paper, a class of projected continuous-time distributed
algorithms have been proposed to solve resource allocation optimization problems with the consideration of LFCs. The proposed algorithms are scalable and free of initialization coordination procedure, and therefore, are adaptable to working condition variations. These salient features have important implications in the DEDP in power systems. Firstly it allows quite
11

general non-box LFCs, which is crucial when inverter-based
devices are involved because their LFCs are usually depicted in
a quadratic form. Secondly, our method is initialization free,
which may facilitate the implementation of the so-called “plugand-play” operation for future smart grids in a dynamic environment. Our application examples illustrate such implications,
showing an appealing potential in the smart operation of future
power grids.
We would like to note that many challenging DRAO problems still remain to be investigated, including the design of
algorithms for the non-smooth objective functions based on
differential inclusions, the estimation of convergence rates
for the proposed algorithms with general LFCs, and the development of stochastic algorithms to achieve the DRAO
with noisy data observations. Furthermore, inspired by
Zhao, Topcu, Li, & Low (2014), we also hope to combine our
algorithms with physical dynamics of power grids to derive a
more comprehensive solution for the DEDP in power systems.

Ferragut, A., & Paganini, F. (2014). Network resource allocation for users
with multiple connections: fairness and stability. IEEE/ACM Transactions
on Networking (TON), 22(2), 349-362.
Fukushima, M. (1992). Equivalent differentiable optimization problems and
descent methods for asymmetric variational inequality problems, Mathematical programming, 53(1-3), 99-110.
Gao, X. B. (2003). Exponential stabiliby of globally projected dynamic systems. IEEE Transactions on Neural Networks, 14(2), 426-431.
Ghadimi, E., Shames, I., & Johansson, M. (2013). Multi-step gradient methods for networked optimization. IEEE Transactions on Signal Processing,
61(21), 5417-5429.
Gharesifard, B., & Cortés, J. (2014). Distributed continuous-time convex optimization on weight-balanced digraphs. IEEE Transactions on Automatic
Control, 59(3), 781-786.
Heal, G. M. (1969). Planning without Prices. Review of Economic Studies,
36(107), 347-362.
Ho, Y. C., Servi, L., & Suri, R. (1980). A class of center-free resource allocation algorithms. Large Scale Systems, 1: 51-62.
Ibaraki, T., & Katoh, N. (1988). Resource Allocation Problems: Algorithmic
Approaches, Cambridge: MIT Press.
Johari, R., & Tsitsiklis, J. N. (2004). Efficiency Loss in a Network Resource
Allocation Game, Mathematics of operations research, 29(3), 407C435.
Kia, S., Cortés, J., & Martinez, S. (2015). Distributed convex optimization
via continuous-time coordination algorithms with discrete-time communication. Automatica, 55, 254-264.
Lakshmanan, H., & Farias, D. P. (2008). Decentralized Resource Allocation
in Dynamic Networks of Agents. SIAM Journal of Optimization, 19(2), 911940.
Liao, L. Z., Qi, H., & Qi, L. (2004). Neurodynamical optimization. Journal
of Global Optimization, 28(2), 175-195.
Liu, Q., & Wang, J. (2015). A Second-order Multi-agent Network for BoundConstrained Distributed Optimization. IEEE Transactions on Automatic
Control, 60(12), 3310-3315.
Lou, Y., Shi, G., Johansson, K.H., & Hong, Y. (2014). Approximate Projected Consensus for Convex Intersection Computation: Convergence Analysis and Critical Error Angle. IEEE Transactions on Automatic Control,
59(7), 1722-1736.
Low, S. H., Paganini, F., & Doyle, J. C. (2002). Internet congestion control.
IEEE Control Systems magazine , 22(1), 28-43.
Mesbahi, M., & Egerstedt, M. (2010). Graph Theoretic Methods for Multiagent Networks, Princeton: Princeton University Press.
Nagurney, A., & Zhang, D. (1995). Projected dynamical systems and variational inequalities with applications (Vol. 2). Springer Science & Business
Media.
Necoara, I., Nesterov, Y., & Glineur, F. (2011). A random coordinate descent
method on large optimization problems with linear constraints. Technical
Report, University Politehnica Bucharest, Romanian.
Nedic, A., Ozdaglar, A., & Parrilo, A. P.(2010). Constrained Consensus and
Optimization in Multi-Agent Networks. IEEE Transactions on Automatic
Control, 55(4), 922-938.
Qiu, Z., Liu, S., & Xie, L.(2016). Distributed constrained optimal consensus
of multi-agent systems. Automatica, 68, 209-215.
Ruszczynski, A. P. (2006). Nonlinear Optimization, Princeton: Princeton university press.
Sayed, A. H. (2014). Adaptation, learning, and optimization over networks.
Foundations and Trends in Machine Learning, 7(4-5), 311-801.
Shi, G., Johansson, K. H., & Hong, Y. (2013). Reaching an optimal consensus:
dynamical systems that compute intersections of convex sets. IEEE Transactions on Automatic Control, 58(3), 610-622.
Venets, V. I. (1985). Continuous algorithms for solution of convex optimization
problems and finding saddle points of contex-coneave functions with the use
of projection operations. Optimization, 16(4), 519-533.
Wang, J., & Elia, N.(2011). A control perspective for centralized and distributed convex optimization. Proceding of IEEE Conference on Decision
and Control, Orlando, USA (pp. 3800-3805)
Xia, S., & Wang, J. (2000). On the stability of globally projected dynamical
systems. Journal of Optimization Theory and Applications, 106(1), 129-150.
Xiao, L., & Boyd, S. (2006). Optimal Scaling of a gradient method for distributed resource allocation. Journal of Optimization Theory and Applications, 129(3), 469-488.
Yi, P. & Hong, Y. (2014). Quantized Subgradient Algorithm and Data-Rate

7. References
References
Arrow, K., Huwicz, L., & Uzawa, H. (1958). Studies in Linear and Non-linear
Programming, Stanford University Press.
Arrow, K. J., & Hurwicz, L. (1960). Decentralization and Computation in
Resource Allocation. Economics and Econometrics, 34-104, University of
North Carolina Press.
Aubin, J.P., & Cellina, A. (1984). Differential Inclusions, Berlin: SpringerVerlag.
Beck, A., Nedic, A., Ozdaglar, A., & Teboulle, M. (2014). Optimal
Distributed Gradient Methods for Network Resource Allocation Problems.
IEEE Transactions on Control of Network Systems, 1(1),64-74.
Bertsekas, D. P. (2009). Convex Optimization Theory, MIT: Athena Scientific.
Bhaya, A., & Kaszkurewicz, E. (2006). Control perspectives on numerical
algorithms and matrix problems, 10, SIAM.
Brogliato, B., Daniilidis, A., Lemarchal, C., & Acary, V. (2006). On the
equivalence between complementarity systems, projected systems and differential inclusions. Systems & Control Letters, 55(1), 45-51.
Cavraro, G., Carli, R., & Zampieri, S. (2014). A distributed control algorithm for the minimization of the power generation cost in smart micro-grid.
IEEE 53rd Annual Conference on Decision and Control (CDC), Los Angeles, USA (pp. 5642-5647).
Cherukuri, A., & Cortés, J. (2014) Initialization-free distributed coordination for economic dispatch under varying loads and generator commitment.
Available in arXiv at:http://arxiv.org/abs/1409.4382
Cherukuri, A., & Cortés, J. (2015). Distributed generator coordination for
initialization and anytime optimization in economic dispatch. IEEE Transactions on Control of Network Systems, 2(3): 226 - 237.
Cherukuri, A, Mallada E, & Cortés J. (2016). Asymptotic convergence of
constrained primal-dual dynamics. Systems & Control Letters, 87: 10-15.
Cojocaru, M. G., & Jonker, L. (2004). Existence of solutions to projected
differential equations in Hilbert spaces. Proceedings of the American Mathematical Society, 132(1), 183-193.
D’Amico, A., Sanguinetti, L. & Palomar, D.P., (2014). Convex separable problems with linear constraints in signal processing and communictions, IEEE
transaction on signal processing, 62(22), 6045-6058.
Droge, G., Kawashima, H., & Egerstedt, M. B. (2014). Continuous-time
proportional-integral distributed optimisation for networked systems. Journal of Control and Decision, 1(3), 191-213.
Dupuis, P., & Kushner, H. J. (1987). Asymptotic behavior of constrained
stochastic approximations via the theory of large deviations. Probability theory and related fields, 75(2), 223-244.
Feijer, D., & Paganini, F. (2010). Stability of primalCdual gradient dynamics
and applications to network optimization. Automatica, 46(12), 1974-1981.

12

Analysis for Distributed Optimization. IEEE Transactions on Control of
Network Systems, 1(4), 380-392.
Yi, P., Hong, Y., & Liu, F. (2015). Distributed Gradient Algorithm for Constrained Optimization with Application to Load Sharing in Power Systems.
Systems & Control Letter, 83, 45-52.
Zhang, W., Liu, W., Wang, X., Liu, L., & Ferrese, F. (2015). Online Optimal
Generation Control Based on Constrained Distributed Gradient Algorithm.
IEEE Transactions on Power Systems, 30(1), 35-45.
Zhao, C., Topcu, U., Li, N., & Low, S. H. (2014). Design and stability of
load-side primary frequency control in power systems. IEEE Transactions
on Automatic Control, 59(5), 1177-1189.

13

