CN112069631A - Distributed projection method considering communication time delay and based on variance reduction technology - Google Patents

Distributed projection method considering communication time delay and based on variance reduction technology

Info

Publication number
CN112069631A
CN112069631A (application CN202010614853.XA)
Authority
CN
China
Prior art keywords
local
optimization problem
agent
follows
variance reduction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010614853.XA
Other languages
Chinese (zh)
Other versions
CN112069631B (en)
Inventor
李华青
胡锦辉
夏大文
陈欣
王政
吕庆国
黄廷文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest University
Original Assignee
Southwest University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southwest University
Priority to CN202010614853.XA
Publication of CN112069631A
Application granted
Publication of CN112069631B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 30/00: Computer-aided design [CAD]
    • G06F 30/10: Geometric CAD
    • G06F 30/18: Network design, e.g. design based on topological or interconnect aspects of utility systems, piping, heating ventilation air conditioning [HVAC] or cabling
    • G06F 30/20: Design optimisation, verification or simulation
    • G06F 2111/00: Details relating to CAD techniques
    • G06F 2111/04: Constraint-based CAD

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Geometry (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computational Mathematics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a distributed projection method that accounts for communication time delay and is based on a variance reduction technique, comprising the following steps: step 1, formulating an original optimization problem model (1) for a multi-agent system subject to both local set constraints and local equality constraints; step 2, equivalently converting the original optimization problem model (1) obtained in step 1 into a convex optimization problem model (2) convenient for distributed processing; step 3, proposing a distributed projection algorithm (3) based on a variance reduction technique to solve the constrained convex optimization problem model (2), in which a local stochastic average gradient provides an unbiased estimate of the local full gradient, thereby relieving the heavy computational burden of evaluating the full gradients of all local objective functions at every iteration; step 4, carrying out convergence analysis. The invention can greatly reduce the computation cost of all agents in the network, thereby reducing the communication and computation pressure of the whole multi-agent system, and has high practicability.

Description

Distributed projection method considering communication time delay and based on variance reduction technology
Technical Field
The invention relates to the technical field of intelligent communication, in particular to a distributed projection method considering communication time delay and based on a variance reduction technology.
Background
In recent years, with the rapid development of high technology, emerging fields such as cloud computing and big data have appeared. Distributed optimization theory and its applications have received growing attention and have gradually permeated many aspects of scientific research, engineering application, and social life. Distributed optimization accomplishes an optimization task through cooperative coordination among multiple agents, and can solve large-scale, complex optimization problems that many centralized algorithms cannot handle. However, when existing distributed optimization algorithms face a large-scale convex optimization problem with relatively complex local constraints, the gradient computation is heavy and the computational burden on the agents in the network is large, so the computation and communication efficiency of the multi-agent system is low; such algorithms therefore cannot meet practical requirements.
Disclosure of Invention
The invention provides a distributed projection algorithm based on a variance reduction technique, which can greatly reduce the computation cost of the agents in the network, thereby reducing the communication and computation pressure of the whole multi-agent system.
The invention adopts the following technical scheme:
a distributed projection method based on variance reduction technology and considering communication delay comprises the following steps:
step 1, providing an original optimization problem model (1) for a multi-intelligent system simultaneously provided with local set constraint and local equality constraint;
step 2, equivalently converting the original optimization problem model (1) obtained in the step 1 into a convex optimization problem model (2) convenient for distribution processing;
step 3, a distributed projection algorithm (3) based on a variance reduction technology is provided to solve a convex optimization problem model (2) with constraints, namely, a local random average gradient is adopted to estimate a local full gradient unbiased, so that heavy calculation burden caused by calculation of full gradients of all local objective functions in each iteration is relieved;
step 4, carrying out convergence analysis on the distributed projection algorithm (3) based on the variance reduction technology, which is provided in the step 3;
as a preferred technical scheme of the invention, the specific construction process and form of the original optimization problem model (1) in the step 1 are as follows:
firstly: defining an agent cluster V ═ {1, …, m }, communication network edge set
Figure RE-GDA0002752674210000021
And a contiguous matrix
Figure RE-GDA0002752674210000022
Directed communication network
Figure RE-GDA0002752674210000023
And simple network G has no self-loops; when agent (i, j) is E, aij=aji> 0, otherwise aij=aji0; degree of agent i is represented as
Figure RE-GDA0002752674210000024
For diagonal matrix D ═ diag { D1,d2,...,dmThe Laplacian matrix of the undirected network G is defined as
Figure RE-GDA0002752674210000025
If the undirected network G is connected, then the Laplace matrix
Figure RE-GDA0002752674210000026
Are symmetrical and semi-positive;
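The graph quantities above can be checked numerically. The following sketch (the 5-agent ring topology is an illustrative choice, not fixed by the patent) builds $A$, $D$, and $L = D - A$ and verifies that the Laplacian of a connected undirected network is symmetric and positive semi-definite with a single zero eigenvalue:

```python
import numpy as np

# Hypothetical 5-agent ring network, used only for illustration
m = 5
A = np.zeros((m, m))
for i in range(m):
    A[i, (i + 1) % m] = A[(i + 1) % m, i] = 1.0  # a_ij = a_ji > 0 iff (i, j) is an edge

D = np.diag(A.sum(axis=1))       # degree matrix D = diag{d_1, ..., d_m}
L = D - A                        # Laplacian of the undirected network G

eigvals = np.linalg.eigvalsh(L)  # L is symmetric, so eigvalsh applies
assert np.allclose(L, L.T)                         # symmetric
assert eigvals.min() > -1e-10                      # positive semi-definite
assert abs(eigvals[0]) < 1e-10 and eigvals[1] > 1e-10  # connected: one zero eigenvalue
```

The connectivity check (second-smallest eigenvalue strictly positive) is exactly the condition used in Assumption 2 below.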
Secondly, the original optimization problem model (1) takes the form

$$\min_{\tilde{x} \in \mathbb{R}^n} \; f(\tilde{x}) = \sum_{i=1}^{m} f_i(\tilde{x}), \quad \text{s.t. } \tilde{x} \in X = \bigcap_{i=1}^{m} X_i, \;\; B_i \tilde{x} = b_i, \; i \in V, \tag{1}$$

where the objective function $f_i$ represents the samples of the real problem to be processed by agent $i$, $\tilde{x} \in \mathbb{R}^n$ is the decision vector, and $q_i$ is the total number of local samples assigned to agent $i$. The local objective function is further decomposed as $f_i(\tilde{x}) = \sum_{h=1}^{q_i} f_i^h(\tilde{x})$, where $f_i^h$, $h \in \{1, \ldots, q_i\}$, is the $h$-th sub-function of the local objective function. Based on the above, the sets $X_i \subseteq \mathbb{R}^n$ are closed and convex with non-empty intersection $X$; the matrices $B_i \in \mathbb{R}^{p_i \times n}$ have full column rank, and $b_i \in \mathbb{R}^{p_i}$. The optimal solution of the constrained convex optimization problem (1) is denoted $\tilde{x}^*$.
As a preferred technical solution of the present invention, the convex optimization problem model (2) in step 2 has the following specific form:

$$\min_{x} \; \sum_{i=1}^{m} f_i(x_i), \quad \text{s.t. } \mathbf{L} x = 0, \;\; B x = b, \;\; x \in \mathcal{X}, \tag{2}$$

where $x_i$ is agent $i$'s estimate of the decision vector $\tilde{x}$. The matrix $B = \mathrm{diag}\{B_1, \ldots, B_m\}$ is block diagonal with full column rank, $b = [b_1^T, \ldots, b_m^T]^T$ is the stacked vector, $\mathcal{X} = X_1 \times \cdots \times X_m$ is the Cartesian product, and $\mathbf{L} = L \otimes I_n$, where $\otimes$ denotes the Kronecker product. Let $q_{\max}$ and $q_{\min}$ denote the maximum and minimum of the $q_i$ (where $q_{\min} \geq 1$, namely: each agent processes at least one sample). From the above statements, $\lambda_{\min}(B^T B)\, q_{\min} > 0$ can be obtained. Based on the convex optimization problem model (2), the following assumptions and definitions are made:
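The stacked quantities can be illustrated concretely. In the sketch below, the sizes ($m = 3$, $n = 2$) and the path-graph Laplacian are assumptions for illustration only; it builds the block-diagonal $B$, the stacked $b$, the Kronecker lift $\mathbf{L} = L \otimes I_n$, and checks the stated condition $\lambda_{\min}(B^T B)\, q_{\min} > 0$:

```python
import numpy as np

# Illustrative sizes (not fixed by the patent): m = 3 agents, n = 2
m, n = 3, 2
rng = np.random.default_rng(0)

# Column-full-rank local matrices B_i (here square and well-conditioned for simplicity)
B_blocks = [rng.standard_normal((n, n)) + 2 * np.eye(n) for _ in range(m)]

# B = diag{B_1, ..., B_m}: block-diagonal stacking
B = np.zeros((m * n, m * n))
for i, Bi in enumerate(B_blocks):
    B[i * n:(i + 1) * n, i * n:(i + 1) * n] = Bi

b = np.concatenate([rng.standard_normal(n) for _ in range(m)])  # stacked vector b

# Kronecker lift of the graph Laplacian: bold L = L (x) I_n
L = np.array([[1., -1., 0.], [-1., 2., -1.], [0., -1., 1.]])  # path graph, illustrative
L_bold = np.kron(L, np.eye(n))

q = np.array([4, 1, 3])                      # q_i samples per agent; q_min >= 1
lam_min = np.linalg.eigvalsh(B.T @ B).min()
assert lam_min * q.min() > 0                 # lambda_min(B^T B) * q_min > 0, as stated
```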
Assumption 1: each local sub-objective function $f_i^h$ is strongly convex and has a Lipschitz continuous gradient; namely, for all $i \in V$, $h \in \{1, \ldots, q_i\}$, and all $x, y \in \mathbb{R}^n$, the following hold:

$$f_i^h(y) \geq f_i^h(x) + \nabla f_i^h(x)^T (y - x) + \frac{\mu}{2} \|y - x\|^2,$$
$$\|\nabla f_i^h(x) - \nabla f_i^h(y)\| \leq L_f \|x - y\|,$$

where $0 < \mu \leq L_f$. Then, under Assumption 1, the globally optimal solution of the constrained convex optimization problem (2) is unique and is denoted $x^*$.
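Assumption 1 can be verified directly for quadratic sub-functions: for $f(x) = \frac{1}{2}x^T Q x + c^T x$ with $Q \succ 0$, the constants are $\mu = \lambda_{\min}(Q)$ and $L_f = \lambda_{\max}(Q)$. A numerical check (the quadratic form is an illustrative stand-in, not the patent's objective):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
M = rng.standard_normal((n, n))
Q = M.T @ M + np.eye(n)            # symmetric positive definite, so f is strongly convex
c = rng.standard_normal(n)

grad = lambda v: Q @ v + c          # gradient of f(x) = 0.5 x^T Q x + c^T x
f = lambda v: 0.5 * v @ Q @ v + c @ v
mu, Lf = np.linalg.eigvalsh(Q).min(), np.linalg.eigvalsh(Q).max()

x, y = rng.standard_normal(n), rng.standard_normal(n)
# Lipschitz gradient: ||grad f(x) - grad f(y)|| <= L_f ||x - y||
assert np.linalg.norm(grad(x) - grad(y)) <= Lf * np.linalg.norm(x - y) + 1e-12
# Strong convexity: f(y) >= f(x) + grad f(x)^T (y - x) + (mu/2) ||y - x||^2
assert f(y) >= f(x) + grad(x) @ (y - x) + 0.5 * mu * np.linalg.norm(y - x) ** 2 - 1e-12
assert 0 < mu <= Lf
```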
Assumption 2: the undirected network $G$ is connected.
Assumption 3: the communication delays are uniformly bounded; that is, for all $i, j \in V$ and all $k \geq 0$, the delay satisfies $0 \leq \tau_{ij}^k \leq B_0$, where $B_0$ is a positive integer.
Definition 1: define global vectors that collect the local variables $x_{i,k}$, $y_{i,k}$, $w_{i,k}$, $g_{i,k}$, and $\nabla f_i(x_{i,k})$ as follows:

$$x_k = [x_{1,k}^T, \ldots, x_{m,k}^T]^T, \quad y_k = [y_{1,k}^T, \ldots, y_{m,k}^T]^T, \quad w_k = [w_{1,k}^T, \ldots, w_{m,k}^T]^T,$$
$$g_k = [g_{1,k}^T, \ldots, g_{m,k}^T]^T, \quad \nabla f(x_k) = [\nabla f_1(x_{1,k})^T, \ldots, \nabla f_m(x_{m,k})^T]^T,$$

together with the locally delayed versions of the global vectors $x_k$ and $w_k$:

$$x_k[i] = \big[x_{1,k-\tau_{i1}^k}^T, \ldots, x_{m,k-\tau_{im}^k}^T\big]^T, \qquad w_k[i] = \big[w_{1,k-\tau_{i1}^k}^T, \ldots, w_{m,k-\tau_{im}^k}^T\big]^T.$$

Then, at the $k$-th iteration, the communication delay $\tau_{ij}^k$, $i, j \in V$, is determined jointly by agent $i$ and agent $j$, so the globally delayed vectors $x_k[i]$ and $w_k[i]$ are held only by agent $i$.
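The bounded-delay model of Assumption 3 can be realized with a short per-neighbor buffer: each agent keeps only the last $B_0 + 1$ states of each neighbor and reads a value at most $B_0$ iterations old. A minimal sketch (the buffer layout and the random delay draw are illustrative assumptions):

```python
import random

B0 = 3                                  # delay bound from Assumption 3
history = {j: [] for j in range(5)}     # per-neighbour buffers of past states x_{j,t}

def record(j, x):
    """Store neighbour j's newest state, keeping only the last B0 + 1 values."""
    history[j].append(x)
    del history[j][:-(B0 + 1)]

def delayed_read(j, k):
    """Return x_{j, k - tau} for a random delay 0 <= tau <= min(B0, k)."""
    tau = random.randint(0, min(B0, k))
    return history[j][-(tau + 1)]

for k in range(10):                     # agent j's scalar state at iteration k is just k
    for j in range(5):
        record(j, k)

reads = [delayed_read(j, 9) for j in range(5)]
assert all(9 - B0 <= v <= 9 for v in reads)  # every read is at most B0 iterations stale
```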
As a preferred technical solution of the present invention, the specific iterative process of the distributed projection algorithm (3) based on the variance reduction technique in step 3 is as follows:
Initialization: for all agents $i \in V$, initialize $x_{i,0}$, the auxiliary variables $y_{i,0}$ and $w_{i,0}$, and the anchor points $\phi_{i,0}^h = x_{i,0}$, $h \in \{1, \ldots, q_i\}$.
Set $k = 0$.
For each agent $i = 1, \ldots, m$:
1: arbitrarily select a sample index $s_{i,k}$ from the set $\{1, \ldots, q_i\}$;
2: compute the local stochastic average gradient $g_{i,k}$ as defined below;
3: set $\phi_{i,k+1}^{s_{i,k}} = x_{i,k}$ and store the gradient $\nabla f_i^{s_{i,k}}\big(\phi_{i,k+1}^{s_{i,k}}\big)$;
4: update the variable $x_{i,k+1}$ by a projected primal step onto the local constraint set $X_i$ (compact form (9a));
5: update the variable $y_{i,k+1}$ as $y_{i,k+1} = y_{i,k} + B_i x_{i,k+1} - b_i$;
6: update the variable $w_{i,k+1}$ as $w_{i,k+1} = w_{i,k} + \beta x_{i,k+1}$.
End of loop.
Set $k = k + 1$ and repeat the loop until a stopping condition is met.
Here $\phi_{i,k}^h \in \mathbb{R}^n$ denotes the stored iterate (anchor point) of the sub-function $f_i^h$, $h \in \{1, \ldots, q_i\}$, of the local objective function at the $k$-th iteration.
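The per-agent loop above can be sketched for scalar least-squares samples. The exact primal update of step 4 is rendered as an image in the original text, so the projected-gradient form below (and the sample objective) is an assumption for illustration only; steps 5 and 6 follow the stated updates verbatim:

```python
import numpy as np

# One agent's iteration (steps 1-6) for scalar samples f_i^h(x) = 0.5*(a_h*x - c_h)^2.
rng = np.random.default_rng(2)
qi, alpha, beta = 8, 0.05, 0.5
a, c = rng.standard_normal(qi), rng.standard_normal(qi)
grad_h = lambda h, v: a[h] * (a[h] * v - c[h])

x, y, w = 0.0, 0.0, 0.0
Bi, bi = 1.0, 0.0                       # local equality constraint B_i x = b_i
phi = np.zeros(qi)                      # anchor points phi_i^h
table = np.array([grad_h(h, 0.0) for h in range(qi)])   # stored sub-gradients

for k in range(200):
    s = rng.integers(qi)                                # step 1: pick a sample
    g = grad_h(s, x) - table[s] + table.mean()          # step 2: variance-reduced gradient
    phi[s], table[s] = x, grad_h(s, x)                  # step 3: refresh the table
    v = x - alpha * (g + Bi * y + beta * w)             # step 4 (assumed projected form)
    x = min(max(v, -1.0), 1.0)                          # projection onto X_i = [-1, 1]
    y = y + Bi * x - bi                                 # step 5: equality-constraint dual
    w = w + beta * x                                    # step 6: auxiliary update
```

Only one new sub-gradient is evaluated per iteration, which is the source of the computational saving claimed in step 3.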
As a preferable technical means of the present invention, the anchor points $\phi_{i,k}^h$ obey the iteration rule

$$\phi_{i,k+1}^h = \begin{cases} x_{i,k}, & h = s_{i,k}, \\ \phi_{i,k}^h, & \text{otherwise}. \end{cases}$$

At iteration $k$, for agent $i$, the local stochastic average gradient is defined as

$$g_{i,k} = \nabla f_i^{s_{i,k}}(x_{i,k}) - \nabla f_i^{s_{i,k}}\big(\phi_{i,k}^{s_{i,k}}\big) + \frac{1}{q_i}\sum_{h=1}^{q_i} \nabla f_i^h\big(\phi_{i,k}^h\big),$$

where the stored sum $\sum_{h=1}^{q_i} \nabla f_i^h(\phi_{i,k}^h)$ can be maintained with the recursion

$$\sum_{h=1}^{q_i}\nabla f_i^h\big(\phi_{i,k+1}^h\big) = \sum_{h=1}^{q_i}\nabla f_i^h\big(\phi_{i,k}^h\big) + \nabla f_i^{s_{i,k}}(x_{i,k}) - \nabla f_i^{s_{i,k}}\big(\phi_{i,k}^{s_{i,k}}\big),$$

so that only one new sub-gradient is evaluated per iteration. Let $\mathcal{F}_k$ denote the $\sigma$-algebra generated by the random sample selections up to iteration $k$; then

$$\mathbb{E}\left[g_{i,k} \mid \mathcal{F}_k\right] = \frac{1}{q_i}\nabla f_i(x_{i,k}), \tag{8}$$

i.e., $g_{i,k}$ is an unbiased estimate of the scaled local full gradient.
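The unbiasedness relation (8) can be checked empirically for a SAGA-style estimator of this form (the scalar sample gradients below are illustrative assumptions): averaging the estimator over all possible sample choices recovers the local average gradient exactly.

```python
import numpy as np

rng = np.random.default_rng(3)
q = 6
a, c = rng.standard_normal(q), rng.standard_normal(q)
grad_h = lambda h, v: a[h] * (a[h] * v - c[h])   # sub-function gradients

x = 0.7
phi = rng.standard_normal(q)                     # arbitrary anchor points phi^h
table = np.array([grad_h(h, phi[h]) for h in range(q)])
full = np.mean([grad_h(h, x) for h in range(q)]) # (1/q) * sum_h grad f^h(x)

# Exact conditional expectation over the uniformly chosen sample s:
expected = np.mean([grad_h(s, x) - table[s] + table.mean() for s in range(q)])
assert abs(expected - full) < 1e-12              # the estimator is unbiased
```

The anchor terms cancel in expectation, which is why the estimator stays unbiased no matter how stale the table entries are.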
As a preferred technical solution of the present invention, the convergence analysis process in step 4 is as follows.
The following definition is first made:
Definition 2: for $0 < \alpha < 1/\lambda_{\max}(L)$, a positive semi-definite weighting matrix $P$ is defined in terms of $W = I - \alpha\mathbf{L}$ (which is positive definite), and the distance of the iterates to the optimum is measured by $\|U_k - U^*\|_P$, where the vectors $U_k = [x_k^T, y_k^T, w_k^T]^T$ and $U^* = [(x^*)^T, (y^*)^T, (w^*)^T]^T$.
Then, combining Assumptions 1-3 and Definitions 1-2 yields the following result: consider the distributed projection algorithm (3) based on the variance reduction technique under Assumptions 1-3, with $U_k$ and $U^*$ as in Definition 2. If the parameters $\eta$, $\phi$, and $\xi$ satisfy

$$\eta > \frac{2L_f\left[L_f q_{\max} + q_{\min}(L_f - \mu)\right]}{\lambda_{\min}(B^T B)\, q_{\min}}, \tag{21a}$$
$$0 < \phi < 2\mu, \tag{21b}$$
$$0 < \xi < \frac{2\mu - \phi}{1 + \beta}, \tag{21c}$$

and the constant step size $\alpha$ and the algorithm parameter $\beta$ satisfy

$$0 < \alpha < \frac{1}{\lambda_{\max}(L)}, \qquad \frac{4\alpha q_{\max} L_f}{\eta} \leq c, \tag{22a}$$
$$0 < \beta < 1, \tag{22b}$$

then the sequence $\{U_k\}_{k \geq 0}$ is bounded and convergent, and the sequence $\{x_k\}_{k \geq 0}$ converges to the unique optimal solution $x^*$.
The invention has the following beneficial effects:
1. The proposed algorithm forms an unbiased estimate of the local full gradient by means of the local stochastic average gradient, so the computation cost of the agents in the network can be greatly reduced, the communication and computation pressure of the whole multi-agent system is reduced, less gradient computation is spent to reach the same convergence accuracy, and fewer communication rounds are required;
2. compared with existing distributed stochastic gradient optimization algorithms, the proposed algorithm can solve a more complex optimization problem, namely a convex optimization problem with both local set constraints and local equality constraints;
3. compared with most existing optimization algorithms that consider communication delay, the proposed algorithm also preserves the privacy of each agent's local information while allowing communication delay, and therefore has high practical value.
Drawings
FIG. 1 is an undirected network connectivity diagram;
FIG. 2 is a graph comparing the performance of the algorithm of the present invention with that of the prior art;
FIG. 3 shows the instantaneous behavior of the agents without communication delay according to the present invention;
FIG. 4 shows the instantaneous behavior of the agents in the presence of communication delay according to the present invention.
Detailed Description
The invention will now be described in further detail with reference to the drawings and examples.
First, the symbols used in the following formulas are defined:
$\mathbb{R}$ denotes the set of real numbers, $\mathbb{R}^n$ the set of $n$-dimensional real column vectors, and $\mathbb{R}^{m \times n}$ the set of $m \times n$ real matrices;
the identity matrix is denoted by $I$, with dimensions determined by context;
for a real symmetric matrix $A$, $\lambda_2(A)$ denotes the minimum non-zero eigenvalue of a positive semi-definite matrix, and $\lambda_{\max}(A)$ and $\lambda_{\min}(A)$ denote the maximum and minimum eigenvalues, respectively;
$[A]_{i,:}$ and $[A]_{:,i}$ denote the $i$-th row and the $i$-th column of the matrix $A$;
$\otimes$ is the Kronecker product notation;
$x^T$ and $A^T$ denote the transpose of the vector $x$ and of the matrix $A$;
the Euclidean norm of a vector and the spectral norm of a matrix are both denoted by $\|\cdot\|$;
for a positive semi-definite matrix $A \in \mathbb{R}^{n \times n}$ and vectors $x, y \in \mathbb{R}^n$, the scalar product $\langle x, y \rangle_A = \langle x, Ay \rangle$ is defined, and $\|x\|_A = \sqrt{x^T A x}$ denotes the $A$-weighted norm of the vector $x$;
$\mathbb{E}[x]$ denotes the expectation of a random variable $x$;
the projection of a vector $x \in \mathbb{R}^n$ onto a closed convex set $X \subseteq \mathbb{R}^n$ is denoted $P_X[x]$, namely: $P_X[x] = \arg\min_{v \in X} \|v - x\|$.
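For a box constraint such as the $X_i = [-1_n, 1_n]$ used in the embodiment, the projection $P_X[x] = \arg\min_{v \in X}\|v - x\|$ reduces to componentwise clipping; a minimal sketch:

```python
import numpy as np

def project_box(x, lo=-1.0, hi=1.0):
    """Euclidean projection of x onto the box X = [lo, hi]^n."""
    return np.clip(x, lo, hi)

x = np.array([0.3, -2.5, 1.7])
p = project_box(x)
assert np.allclose(p, [0.3, -1.0, 1.0])   # only out-of-range entries move

# Nonexpansiveness of the projection: ||P_X[x] - P_X[y]|| <= ||x - y||
y = np.array([-0.4, 0.2, 3.0])
assert np.linalg.norm(project_box(x) - project_box(y)) <= np.linalg.norm(x - y) + 1e-12
```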
In the embodiment described below, steps 1 to 3 of the method are carried out exactly as specified above.
The convergence analysis process in step 4 is as follows. In practical application, the convergence analysis of this embodiment adopts the following seven lemmas.
Lemma 1: for any non-empty closed convex set $X \subseteq \mathbb{R}^n$, the following two inequalities hold:

$$(P_X[x] - x)^T (P_X[x] - y) \leq 0, \quad \forall x \in \mathbb{R}^n, \; y \in X,$$
$$\|P_X[x] - P_X[y]\| \leq \|x - y\|, \quad \forall x, y \in \mathbb{R}^n,$$

where $P_X[\cdot]$ is the projection operator.
Lemma 2: if there exist multiplier vectors $y^*$ and $w^*$, then under Assumption 1 the globally optimal solution $x^*$ of the constrained convex optimization problem (2) exists uniquely and satisfies the corresponding projection fixed-point equation, where the constant step size $\alpha > 0$ and the parameter $\beta > 0$.
Lemma 3: under Assumptions 1-2, consider the sequences $\{x_k\}_{k \geq 0}$ and $\{g_k\}_{k \geq 0}$ generated by the distributed projection algorithm (3) based on the variance reduction technique; for every $k \geq 0$, the conditional second moment of the gradient estimate is bounded in terms of the auxiliary sequence $\{p_k\}_{k \geq 0}$ defined in (13), and the sequence $\{p_k\}_{k \geq 0}$ is non-negative under Assumption 1.
Lemma 4: considering the distributed projection algorithm (3) based on the variance reduction technique and the sequence (13), under Assumption 1 the drift $\mathbb{E}[p_{k+1} \mid \mathcal{F}_k] - p_k$ of the auxiliary sequence is bounded for every $k \geq 0$.
And (5) introduction: consider a global vector v under the condition that assumption 3 holdsk=[(v1,k)T,...,(vm,k)T]TAnd its delayed version vk[i]The method comprises the following steps:
Figure BDA0002563395080000121
wherein
Figure BDA0002563395080000122
For a given sequence vt}t≥0We give
Figure BDA0002563395080000123
Where l and d are two non-negative scalars; then, will
Figure BDA0002563395080000124
The superposition with respect to k from 0 to n can be obtained
Figure BDA0002563395080000125
And (6) introduction: considering a distributed projection algorithm (3) based on variance reduction technique under the condition that 1-3 are assumed to be true, the following inequality is true
Figure BDA0002563395080000126
Wherein
Figure BDA0002563395080000127
And W ═ I- α L, Φ, η are positive constants;
The specific demonstration process of the above conclusions is as follows. According to Definition 1, the distributed projection algorithm (3) based on the variance reduction technique can be written in the compact form

$$x_{k+1} = P_{\mathcal{X}}[v_k], \tag{9a}$$
$$y_{k+1} = y_k + B x_{k+1} - b, \tag{9b}$$
$$w_{k+1} = w_k + \beta x_{k+1}, \tag{9c}$$

where $v_k = [(v_{1,k})^T, \ldots, (v_{m,k})^T]^T$ and each $v_{i,k}$, defined in (10), is formed from agent $i$'s own state, its local stochastic average gradient $g_{i,k}$, and the delayed neighbor information $x_k[i]$ and $w_k[i]$.
Starting from (9a), the distance $\|x_{k+1} - x^*\|^2$ is expanded, and the resulting inequality uses the following facts: (i) since $x_{k+1} = P_{\mathcal{X}}[v_k]$, the projection inequalities of Lemma 1 apply; (ii) similar to reference [12], a bound on the delayed terms holds. Continuing the analysis, the first inequality applies the Young inequality with the positive constants $\eta$ and $\phi$, and the second applies the strong convexity of $f$ and the Lipschitz continuity of its gradient; substituting (27) into (24) gives an intermediate estimate. The inner product $2\alpha (x_{k+1} - x^*)^T B^T B (x_{k+1} - x_k)$ is then processed, and substituting the result of (29) into (28) yields (30). From formula (8), the conditional mean of $g_k$ is the scaled full gradient; the conditional variance of $g_k$ is therefore handled using the standard variance decomposition $\mathbb{E}[\|a - \mathbb{E}[a \mid \mathcal{F}_k]\|^2 \mid \mathcal{F}_k] = \mathbb{E}[\|a\|^2 \mid \mathcal{F}_k] - \|\mathbb{E}[a \mid \mathcal{F}_k]\|^2$, together with the strong convexity of $f$ and the Lipschitz continuity of the sub-gradients $\nabla f_i^h$; here $p_k$ is the auxiliary sequence defined in (13). Substituting the conclusion of (31) into (30) yields (32). Finally, the important relation

$$2\langle a - b,\, a - c \rangle_V = \|a - b\|_V^2 + \|a - c\|_V^2 - \|b - c\|_V^2,$$

valid for any positive semi-definite matrix $V$, yields three equations, and substituting the result of (33) into (32) completes the proof of Lemma 6.
And (3) introduction 7: under the condition that the assumption 3 is established, the following two inequalities are established
Figure BDA0002563395080000172
Figure BDA0002563395080000173
In which ξ1,ξ2Are two arbitrary positive constants; it is noted that when there is no network determination,
Figure BDA0002563395080000174
and is thus determined.
The above conclusion is specifically demonstrated as follows:
we first demonstrated (19a) in lemma 7
Figure BDA0002563395080000175
The second inequality uses the lemma 5, the last inequality uses the young inequality, and xi1Is a positive constant; (19b) the certification process of (19a) is similar to that of (19a), and thus will not be described in detail;
Next, for the convenience of analysis, the following definition is made:
Definition 2: for $0 < \alpha < 1/\lambda_{\max}(L)$, a positive semi-definite weighting matrix $P$ is defined in terms of $W = I - \alpha\mathbf{L}$ (which is positive definite), and the distance of the iterates to the optimum is measured by $\|U_k - U^*\|_P$, where the vectors $U_k = [x_k^T, y_k^T, w_k^T]^T$ and $U^* = [(x^*)^T, (y^*)^T, (w^*)^T]^T$.
Then, combining Assumptions 1-3 and Definitions 1-2 yields the following conclusion: consider the distributed projection algorithm (3) based on the variance reduction technique under Assumptions 1-3, with $U_k$ and $U^*$ as in Definition 2. If the parameters $\eta$, $\phi$, and $\xi$ satisfy

$$\eta > \frac{2L_f\left[L_f q_{\max} + q_{\min}(L_f - \mu)\right]}{\lambda_{\min}(B^T B)\, q_{\min}}, \tag{21a}$$
$$0 < \phi < 2\mu, \tag{21b}$$
$$0 < \xi < \frac{2\mu - \phi}{1 + \beta}, \tag{21c}$$

and the constant step size $\alpha$ and the algorithm parameter $\beta$ satisfy

$$0 < \alpha < \frac{1}{\lambda_{\max}(L)}, \qquad \frac{4\alpha q_{\max} L_f}{\eta} \leq c, \tag{22a}$$
$$0 < \beta < 1, \tag{22b}$$

then the sequence $\{U_k\}_{k \geq 0}$ is bounded and convergent, and the sequence $\{x_k\}_{k \geq 0}$ converges to the unique optimal solution $x^*$.
The specific demonstration process is as follows. For $\alpha > 0$ and $\beta > 0$, substituting the result of Lemma 7 into Lemma 6 yields inequality (35), in which the constant $c$ is defined in Lemma 6. Next, according to Lemma 4, the term $c\left(\mathbb{E}[p_{k+1} \mid \mathcal{F}_k] - p_k\right)$ is added to both ends of (35), giving (36). According to Lemma 3, the sequence $p_k \geq 0$; hence, if $\eta > 2L_f\left[L_f q_{\max} + q_{\min}(L_f - \mu)\right]/\left(\lambda_{\min}(B^T B)\, q_{\min}\right)$ and $4\alpha q_{\max} L_f/\eta \leq c$, then inequality (36) can be rewritten as (37). According to Definition 2, if $0 < \alpha < 1/\lambda_{\max}(L)$ and $0 < \beta < 1$, inequality (38) follows. To handle the first term on the right-hand side of inequality (38), set $\xi_1 = \xi_2 = \xi$ with $0 < \xi < 2\mu$ and $0 < \xi < (2\mu - \phi)/(1 + \beta)$, and define a corresponding non-negative constant; based on this definition, (38) can be rewritten as (39). Summing (39) over $k$ from $0$ to $n$ yields (40). Under conditions (21) and (22), a positive semi-definite matrix can be defined so that inequality (40) can be rewritten as (41). Letting $n$ tend to infinity shows that the right-hand side of (39) is summable. Therefore the sequence $\{U_k\}_{k \geq 0}$ is Fejér monotone with respect to the inner product $\langle \cdot, \cdot \rangle_P$; it follows directly that the sequence $\{\|U_k - U^*\|_P\}_{k \geq 0}$ is bounded and convergent, hence the sequence $\{U_k\}_{k \geq 0}$ is bounded and convergent, and finally the sequence $\{x_k\}_{k \geq 0}$ converges to $x^*$. Under Assumption 1, the globally optimal solution $x^*$ is unique.
Detailed description of the preferred embodiments example 1
To demonstrate the effectiveness of the proposed algorithm, we consider a multi-agent network with m = 10 agents solving the following least-squares optimization problem:
Figure BDA0002563395080000214
wherein
Figure BDA0002563395080000215
And is
Figure BDA0002563395080000216
The abscissa represents one full pass of computation over all samples. We set n = 10, pi = 1, and the total number of samples to Q = 1000; the samples are randomly and evenly distributed among the agents in the network, so each agent i ∈ V processes qi = Q/m samples. The local parameters
Figure BDA0002563395080000217
And
Figure BDA0002563395080000218
are randomly selected from [−1, 1] and [−n, n], respectively; the equality constraint is defined as
Figure BDA0002563395080000219
with the j-th entry of Bi equal to 1 when j = i and 0 otherwise; bi is always 1. The local set constraint of agent i is defined as Xi = [−1n, 1n], where 1n denotes the n-dimensional column vector of all ones.
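The experimental setup above can be sketched in a few lines of NumPy. This is only a sketch: the variable names, the random seed, and the data-generation details are assumptions, since the patent does not give any code.

```python
import numpy as np

# Sketch of the experimental setup described above; names and the seed are
# illustrative assumptions, not taken from the patent.
rng = np.random.default_rng(0)
m, n, Q = 10, 10, 1000            # agents, decision dimension, total samples
q = Q // m                        # q_i = Q/m = 100 samples per agent

# Local least-squares data: u_{i,h} drawn from [-1, 1]^n, v_{i,h} from [-n, n]
U = rng.uniform(-1.0, 1.0, size=(m, q, n))
V = rng.uniform(-float(n), float(n), size=(m, q))

# Local equality constraint B_i x = b_i with p_i = 1: B_i is a 1 x n row
# vector whose j-th entry is 1 when j = i and 0 otherwise; b_i = 1.
B = np.zeros((m, 1, n))
for i in range(m):
    B[i, 0, i] = 1.0
b = np.ones((m, 1))

# Local set constraint X_i = [-1_n, 1_n]: the projection is an entrywise clip.
def project_box(x):
    return np.clip(x, -1.0, 1.0)
```

With this data in hand, each agent i holds (U[i], V[i], B[i], b[i]) and its projection operator, matching the problem dimensions stated above.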
The experimental network and the results of this embodiment are shown in Figs. 1 to 4, specifically:
Fig. 1 shows the experimental communication network, in which the communication rate of the network is 0.5;
Fig. 2 compares the performance of the algorithm of the present invention with a prior-art algorithm, namely the one disclosed in Q. Liu, S. Yang, and Y. Hong, "Constrained consensus algorithms with fixed step size for distributed convex optimization over multiagent networks," IEEE Transactions on Automatic Control, vol. 62, no. 8, pp. 4259-4265, 2017. Fig. 2 makes apparent that the algorithm proposed by the present invention performs best, that is, it converges fastest;
Fig. 3 shows the transient behavior of agents 2, 4, 6, 8, and 10 without communication delay;
Fig. 4 shows the transient behavior of agents 2, 4, 6, 8, and 10 under communication delay, with a maximum delay of 10 per iteration;
Comparing Figs. 3 and 4 shows that communication delay has a significant influence on the transient behavior of the agents.
Finally, it should be noted that these embodiments merely illustrate the present invention and do not limit its scope. It will be apparent to those skilled in the art that various other changes and modifications can be made based on the above description; it is neither necessary nor possible to list all embodiments exhaustively. Obvious variations or modifications derived therefrom remain within the scope of the invention.

Claims (6)

1. A distributed projection method based on a variance reduction technique and considering communication delay, comprising the following steps:
step 1, formulating an original optimization problem model (1) for a multi-agent system subject to both local set constraints and local equality constraints;
step 2, equivalently converting the original optimization problem model (1) obtained in step 1 into a convex optimization problem model (2) suitable for distributed processing;
step 3, proposing a distributed projection algorithm (3) based on a variance reduction technique to solve the constrained convex optimization problem model (2); namely, a local stochastic average gradient serves as an unbiased estimate of the local full gradient, relieving the heavy computational burden of evaluating the full gradients of all local objective functions at every iteration;
and step 4, performing convergence analysis on the distributed projection algorithm (3) based on the variance reduction technique proposed in step 3.
2. The distributed projection method based on variance reduction technology considering communication delay as claimed in claim 1, wherein:
the specific construction process and form of the original optimization problem model (1) in the step 1 are as follows:
First, define the agent set V = {1, …, m}, the communication network edge set
Figure RE-FDA0002752674200000011
And adjacency matrix
Figure RE-FDA0002752674200000012
the undirected communication network
Figure RE-FDA0002752674200000013
and the simple network G has no self-loops;
when edge (i, j) ∈ E, aij = aji > 0; otherwise aij = aji = 0;
The degree of agent i is denoted as
Figure RE-FDA0002752674200000014
Given the diagonal degree matrix D = diag{d1, d2, …, dm}, the Laplacian matrix of the undirected network G is defined as L = D − A;
if the undirected network G is connected, then the Laplacian matrix L is symmetric and positive semidefinite;
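The Laplacian construction and the properties just stated can be checked numerically. The following is an illustrative sketch on a small hand-picked graph (the adjacency matrix is an assumption chosen for the example, not taken from the patent):

```python
import numpy as np

# Build the Laplacian L = D - A of a small undirected network and verify
# the properties stated above (symmetry, positive semidefiniteness, and a
# simple zero eigenvalue when the graph is connected).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)   # symmetric adjacency, no self-loops
D = np.diag(A.sum(axis=1))                  # diagonal degree matrix
L = D - A                                   # graph Laplacian

assert np.allclose(L, L.T)                  # symmetric
eigs = np.linalg.eigvalsh(L)
assert eigs.min() > -1e-10                  # positive semidefinite
assert np.sum(eigs < 1e-10) == 1            # connected graph: simple zero eigenvalue
```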
Second, the original optimization problem model (1) takes the following specific form:
Figure RE-FDA0002752674200000017
In the above formula, the objective function
Figure RE-FDA0002752674200000018
represents the samples of the real problem to be processed, the
Figure RE-FDA0002752674200000019
represents the decision vector, and qi represents the total number of local samples assigned to agent i;
while in the above equation the local objective function is further decomposed into
Figure RE-FDA0002752674200000021
Wherein
Figure RE-FDA0002752674200000022
is the h-th sub-function of the local objective function;
Based on the above formula, define
Figure RE-FDA0002752674200000023
as closed convex sets whose intersection X is non-empty, and define the column-full-rank matrix
Figure RE-FDA0002752674200000024
And
Figure RE-FDA0002752674200000025
and define the optimal solution of the constrained convex optimization problem (1) as
Figure RE-FDA0002752674200000026
3. The distributed projection method based on variance reduction technology considering communication delay as claimed in claim 2, wherein:
the concrete form of the convex optimization problem model (2) in the step 2 is as follows:
Figure FDA0002563395070000027
wherein xi is agent i's estimate of the decision vector
Figure FDA0002563395070000028
;
define the matrix B as a column-full-rank block-diagonal matrix with diagonal blocks {B1, …, Bm}, i.e.
Figure FDA0002563395070000029
Stacked vector
Figure FDA00025633950700000210
Let
Figure FDA00025633950700000211
be the Cartesian product; denote
Figure FDA00025633950700000212
wherein
Figure FDA00025633950700000213
denotes the Kronecker product; the maximum and minimum values of qi are denoted qmax and qmin, respectively (where qmin ≥ 1, i.e., each agent processes at least one sample); from the above, it follows that λmin(BTB)qmin > 0;
Based on the convex optimization problem model (2), the following assumptions and definitions are made:
Assumption 1: each local sub-objective function fih is strongly convex and has a Lipschitz continuous gradient; namely, for all i ∈ V, h ∈ {1, …, qi}, and
Figure FDA00025633950700000214
the following formulas hold:
Figure FDA00025633950700000215
Figure FDA00025633950700000216
where 0 < μ ≤ Lf;
Under Assumption 1, the global optimal solution of the constrained convex optimization problem (2) is unique and is denoted
Figure FDA00025633950700000217
Assume 2: the undirected network G is connected;
Assumption 3: for
Figure FDA0002563395070000031
And
Figure FDA0002563395070000032
there exists
Figure FDA0002563395070000033
where B0 is a positive integer.
Definition 1: define global vectors that collect the local variables xi,k, yi,k, wi,k, gi,k, and
Figure FDA0002563395070000034
as follows:
Figure FDA0002563395070000035
Figure FDA0002563395070000036
Figure FDA0002563395070000037
Figure FDA0002563395070000038
Figure FDA0002563395070000039
and the locally delayed versions of the global vectors xk and wk:
Figure FDA00025633950700000310
Figure FDA00025633950700000311
Then, at the k-th iteration, the communication delay
Figure FDA00025633950700000312
is determined jointly by agent i and agent j; thus, the delayed global vectors xk[i] and wk[i] are held only by agent i.
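The bounded-delay reads described above (delays of at most B0 iterations, per Assumption 3) can be mimicked with a small per-link history buffer. This is an illustrative sketch only: the class name, buffering scheme, and the uniform delay distribution are assumptions, not taken from the patent.

```python
import numpy as np
from collections import deque

# Illustrative sketch of bounded communication delay: agent i reads a past
# state of neighbor j that is at most B0 iterations old.
B0 = 10                           # maximum delay bound (a positive integer)
rng = np.random.default_rng(3)

class DelayedLink:
    """Holds the last B0+1 states sent over a link; reads return a delayed copy."""
    def __init__(self, x0):
        self.history = deque([x0.copy()], maxlen=B0 + 1)

    def send(self, x):
        self.history.append(x.copy())

    def recv(self):
        # Delay tau_{ij}^k is modeled here as uniform over the stored history.
        tau = rng.integers(0, len(self.history))
        return self.history[-1 - tau]

link = DelayedLink(np.zeros(3))
for k in range(20):
    link.send(np.full(3, float(k)))
delayed = link.recv()
# The received state is one of the last B0+1 states (values 9.0 .. 19.0).
assert 9.0 <= delayed[0] <= 19.0
```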
4. The distributed projection method based on variance reduction technology considering communication delay as claimed in claim 3, wherein:
the specific iterative process of the distributed projection algorithm (3) based on the variance reduction technology in the step 3 is as follows:
Initialization: for all agents i ∈ V, initialize
Figure FDA00025633950700000313
Set k = 0
For each agent i = 1, …, m:
1: from the set {1, …, qi}, randomly select a sample
Figure FDA00025633950700000314
2: compute the local stochastic average gradient as follows
Figure FDA00025633950700000315
3: set
Figure FDA00025633950700000316
And store
Figure FDA00025633950700000317
4: update the variable xi,k+1 as follows
Figure FDA0002563395070000041
5: update the variable yi,k+1 as follows
yi,k+1=yi,k+Bixi,k+1-bi
6: update the variable wi,k+1 as follows
wi,k+1=wi,k+βxi,k+1
End loop
Set k = k + 1 and repeat the loop until a stopping condition is met;
wherein
Figure FDA0002563395070000042
is the iterate, at the k-th iteration, associated with the sub-function
Figure FDA0002563395070000043
of the local objective function, and
Figure FDA0002563395070000044
denotes an n-dimensional real column vector.
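The iteration above (steps 1-6) can be sketched for a single agent on the least-squares example of the embodiment. Note the caveat: the exact x-update of step 4 appears only as an equation image in the patent, so a generic projected-gradient form is used here as a stand-in, and all variable names and parameter values are assumptions.

```python
import numpy as np

# Single-agent sketch of steps 1-6: SAGA-style gradient (steps 1-3),
# projected primal update (step 4, stand-in form), and the y/w updates
# exactly as printed in steps 5-6.
rng = np.random.default_rng(1)
n, q = 10, 100
Ui = rng.uniform(-1, 1, size=(q, n))       # local samples u_{i,h}
Vi = rng.uniform(-n, n, size=q)            # local targets v_{i,h}
Bi = np.zeros((1, n)); Bi[0, 0] = 1.0      # local equality constraint B_i x = b_i
bi = np.array([1.0])
alpha, beta = 0.01, 0.5                    # step size / parameter (assumed values)

def grad_h(x, h):                          # gradient of f_i^h(x) = 0.5*(u_h^T x - v_h)^2
    u, v = Ui[h], Vi[h]
    return u * (u @ x - v)

x = np.zeros(n); y = np.zeros(1); w = np.zeros(n)
table = np.array([grad_h(x, h) for h in range(q)])   # stored gradients

for k in range(200):
    h = rng.integers(q)                              # step 1: pick a sample
    g = grad_h(x, h) - table[h] + table.mean(axis=0) # step 2: SAGA-style gradient
    table[h] = grad_h(x, h)                          # step 3: refresh stored gradient
    # step 4 (stand-in): projected primal update onto X_i = [-1, 1]^n
    x = np.clip(x - alpha * (g + Bi.T @ y + w), -1.0, 1.0)
    y = y + Bi @ x - bi                              # step 5
    w = w + beta * x                                 # step 6 (as printed above)

assert np.all(np.abs(x) <= 1.0)                      # iterate stays in X_i
```

Only one gradient of one sub-function is evaluated per iteration, which is the computational saving claimed in step 3 of claim 1.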
5. The distributed projection method based on variance reduction technology considering communication delay as claimed in claim 4, wherein:
the above-mentioned
Figure FDA0002563395070000045
The iteration rule of (1) is as follows:
Figure FDA0002563395070000046
At iteration k, for agent i, the local stochastic average gradient is defined as:
Figure FDA0002563395070000047
wherein
Figure FDA0002563395070000048
can be computed by the following iteration:
Figure FDA0002563395070000049
Let Fk denote the σ-algebra generated by the local stochastic average gradients up to iteration k; then the following holds:
Figure FDA00025633950700000410
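The conditional-expectation property just stated, that the stochastic average gradient is an unbiased estimate of the local full gradient, can be verified numerically for a SAGA-style estimator. The sketch below uses synthetic least-squares data; all names are assumptions.

```python
import numpy as np

# Numerical check that the SAGA-style estimator is conditionally unbiased:
# E[g | F_k] = grad f_i(x), where the expectation is over the uniformly
# chosen sample index h.
rng = np.random.default_rng(2)
n, q = 5, 50
U = rng.normal(size=(q, n)); V = rng.normal(size=q)

def grad_h(x, h):
    return U[h] * (U[h] @ x - V[h])

x = rng.normal(size=n)
# Stored gradients are deliberately stale (evaluated at random past points).
table = np.array([grad_h(rng.normal(size=n), h) for h in range(q)])

full_grad = np.mean([grad_h(x, h) for h in range(q)], axis=0)
# Average the estimator over all q equally likely sample choices:
expected_g = np.mean([grad_h(x, h) - table[h] for h in range(q)], axis=0) \
             + table.mean(axis=0)
assert np.allclose(expected_g, full_grad)   # unbiased regardless of staleness
```

The stale terms cancel in expectation, which is exactly why the estimator is unbiased however old the stored gradients are; variance reduction comes from the stored gradients tracking the iterate over time.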
6. the distributed projection method based on variance reduction technology considering communication delay as claimed in claim 5, wherein:
the convergence analysis process in step 4 is as follows:
the following definitions are first made:
Definition 2: for 0 < α < 1/λmax(L), define the positive semidefinite matrix P as:
Figure FDA0002563395070000051
where W = I − αL is a positive definite matrix; then:
Figure FDA0002563395070000052
wherein the vector
Figure FDA0002563395070000053
and U* = [(x*)T, (y*)T, (w*)T]T;
Then, combining Assumptions 1-3 and Definitions 1-2 yields the following:
Consider the distributed projection algorithm (3) based on the variance reduction technique, with Uk and U* as in Definition 2, under Assumptions 1-3. If the parameters η, φ, and ξ satisfy:
Figure FDA0002563395070000054
0 < φ < 2μ (21b)
Figure FDA0002563395070000055
and the constant step size α and the algorithm parameter β satisfy:
Figure FDA0002563395070000056
Figure FDA0002563395070000057
then the sequence {Uk}k≥0 is bounded and convergent, and the sequence {xk}k≥0 converges to the unique optimal solution x*.
CN202010614853.XA 2020-06-30 2020-06-30 Distributed projection method based on variance reduction technology and considering communication time delay Active CN112069631B (en)

Publications (2)

Publication Number Publication Date
CN112069631A true CN112069631A (en) 2020-12-11
CN112069631B CN112069631B (en) 2024-05-24

