CN111950194B

CN111950194B - Newton momentum-based distributed acceleration composite optimization method and system

Info

Publication number: CN111950194B
Application number: CN202010709580.7A
Authority: CN
Inventors: 李华青; 郑李逢; 夏大文; 严羽; 吕庆国; 王政; 胡锦辉; 程胡强; 冉亮; 丁文韬; 苏恩冰
Original assignee: Southwest University
Current assignee: Southwest University
Priority date: 2020-07-22
Filing date: 2020-07-22
Publication date: 2023-04-07
Anticipated expiration: 2040-07-22
Also published as: CN111950194A

Abstract

The invention discloses a Newton momentum-based distributed acceleration composite optimization method and system. With a plurality of agents connected into an undirected network, an objective function is established that combines a smooth structure and a non-smooth structure, so that a wider range of problems can be handled and the established model is more accurate. The method converges to the global optimal solution at a linear rate. By introducing a momentum acceleration term and a gradient tracking term, it converges faster than comparable methods and can effectively improve the data processing speed of large-scale intelligent automation equipment.

Description

Newton momentum-based distributed acceleration composite optimization method and system

Technical Field

The invention relates to the technical field of computers, in particular to a Newton momentum-based distributed acceleration composite optimization method and system.

Background

Optimization problems arise in fields such as machine learning, statistical learning, unmanned aerial vehicle formation navigation, and non-inductive sensor networks. When such a problem is simple, a single agent can solve it. However, as information technology develops, obtaining more accurate solutions requires processing larger volumes of data and building more accurate problem models. These models can no longer be represented by simple smooth functions and may involve non-smooth terms.

Given the limited computing resources of existing computers, a single agent cannot readily handle large-scale optimization problems in composite form (smooth + non-smooth), which results in slow data processing for large numbers of intelligent automation devices.

Disclosure of Invention

The technical problem to be solved by the invention is to address the shortcomings of the prior art by providing a Newton momentum-based distributed acceleration composite optimization method and system that can effectively improve the data processing speed of large-scale intelligent automation equipment.

The technical solution adopted to solve the above technical problem is as follows: a Newton momentum-based distributed acceleration composite optimization method, comprising the following steps:

S1, connecting a plurality of agents into an undirected communication network, and establishing, based on the plurality of agents, an objective function combining a smooth structure and a non-smooth structure:

where f_i is a smooth local objective function known only to agent i, the non-smooth local function is likewise known only to agent i, χ is the set of feasible solutions, and m is the number of agents;

S2, each agent calculates its own local estimate and sends it to a first neighbor agent, where the first neighbor agent is a neighbor agent of that agent, and two agents are neighbor agents of each other if they communicate directly;

S3, the first neighbor agent calculates a momentum acceleration term from the received local estimates and sends the momentum acceleration term to a second neighbor agent, where the second neighbor agent is a neighbor agent of the first neighbor agent;

S4, the second neighbor agent calculates a gradient tracking term from the momentum acceleration term and sends the gradient tracking term to a third neighbor agent, where the third neighbor agent is a neighbor agent of the second neighbor agent;

S5, repeating S2 to S4 until a preset condition is met, and then terminating the loop.

The beneficial effects of the invention are as follows: with a plurality of agents connected into an undirected network, establishing an objective function that combines a smooth structure and a non-smooth structure widens the range of problems that can be handled and makes the established model more accurate. The method converges to the global optimal solution at a linear rate. By introducing the momentum acceleration term and the gradient tracking term, it converges faster than comparable methods and can effectively improve the data processing speed of large-scale intelligent automation equipment.

Further, the local estimate in S2 is calculated as follows:

S201, each agent calculates its own local optimal solution according to the following formula:

S202, the local estimate is calculated from the local optimal solution according to the following formula:

where the surrogate function is a successive convex approximation of f_i, and α is a positive constant step size.

The beneficial effect of this further scheme is that, under the distributed optimization strategy, variables are updated using a successive convex approximation surrogate of the objective function rather than the objective function itself. As a result, the method can still find a fixed point when the target problem is non-convex, and, when the step size α is positive and smaller than a given upper bound, it converges to the global optimal solution at a linear rate for problems that can be modeled as convex functions.

Further, the momentum acceleration term in S3 is calculated as follows:

S301, computing a weighted average of the local estimates to obtain a local average estimate, according to the following formula:

S302, calculating the momentum acceleration term from the local average estimate according to the following formula:

where w_ij is a weight satisfying 0 ≤ w_ij < 1, and β is the momentum parameter.

The beneficial effect of this further scheme is that the gradient is calculated in S301 and S302 using the Newton momentum method. When the update direction is the same as at the previous step, this accelerates convergence to some extent and adjusts the update direction of the gradient, which improves the stability of the distributed optimization method and reduces the time needed to find the global optimal solution. The ordinary momentum method is a comparable alternative, but it is prone to large fluctuations of the variable values during iteration, which makes the system unstable.
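The contrast drawn above between the ordinary momentum method and the momentum method used here can be illustrated with a small single-agent sketch. It assumes that "Newton momentum" refers to a look-ahead scheme in which the gradient is evaluated after the momentum step, which is an interpretation rather than something the text states. The quadratic test function, the function names, and the parameter values are all hypothetical and are not the patent's formulas.

```python
def grad(x):
    """Gradient of the illustrative quadratic f(x) = 0.5 * (x[0]**2 + 10 * x[1]**2)."""
    return [x[0], 10.0 * x[1]]


def ordinary_momentum(x0, alpha, beta, steps):
    """Ordinary momentum: the gradient is evaluated at the current point."""
    x, v = list(x0), [0.0] * len(x0)
    for _ in range(steps):
        g = grad(x)
        v = [beta * vi - alpha * gi for vi, gi in zip(v, g)]
        x = [xi + vi for xi, vi in zip(x, v)]
    return x


def lookahead_momentum(x0, alpha, beta, steps):
    """Look-ahead momentum (assumed reading): gradient taken after the momentum step."""
    x, v = list(x0), [0.0] * len(x0)
    for _ in range(steps):
        ahead = [xi + beta * vi for xi, vi in zip(x, v)]
        g = grad(ahead)
        v = [beta * vi - alpha * gi for vi, gi in zip(v, g)]
        x = [xi + vi for xi, vi in zip(x, v)]
    return x


if __name__ == "__main__":
    for method in (ordinary_momentum, lookahead_momentum):
        print(method.__name__, method([1.0, 1.0], alpha=0.05, beta=0.5, steps=200))
```

Both variants drive this toy problem to its minimizer at the origin. The sketch only shows where the two updates differ and makes no claim about their relative speed in the distributed setting.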

Further, the gradient tracking term in S4 is calculated according to the following formula:

where ∇f_i(·) is the gradient of the function f_i.

The beneficial effect of this further scheme is that, through gradient tracking, each local agent can also track the global gradient, which prevents an agent that only has local information from becoming trapped in solving for a local optimal solution.

Further, the values of w_ij are determined by the following rule:

defining an undirected graph G, where V is the agent set, ε is the edge set, and the weighted adjacency matrix has entries w_ij, in which the weight w_ij of the edge (i, j) satisfies the following conditions: if (i, j) ∈ ε, then w_ij > 0; otherwise w_ij = 0;

where d_i is the number of neighbor agents of agent i.
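As an illustration of a weight rule that depends on the neighbor counts d_i, the sketch below builds Metropolis-style weights. This is a common choice in the distributed optimization literature and is used here only as an assumption. The formula actually used in this document is not reproduced in the text, and the function name `metropolis_weights` is hypothetical.

```python
def metropolis_weights(m, edges):
    """Build an m x m weight matrix from an undirected edge list (assumed Metropolis rule).

    w_ij = 1 / (1 + max(d_i, d_j)) for each edge (i, j), w_ij = 0 for non-edges,
    and w_ii takes the remaining mass so that every row sums to 1.
    """
    neighbors = [set() for _ in range(m)]
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)
    degree = [len(n) for n in neighbors]  # d_i: number of neighbor agents of agent i

    W = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in neighbors[i]:
            W[i][j] = 1.0 / (1 + max(degree[i], degree[j]))
        W[i][i] = 1.0 - sum(W[i])  # diagonal is still 0 here, so this sums the off-diagonals
    return W


if __name__ == "__main__":
    ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
    for row in metropolis_weights(4, ring):
        print(row)
```

Under this assumed rule the matrix is symmetric, its rows sum to 1, and w_ij > 0 exactly on the edges, which matches the conditions stated above.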

A Newton momentum-based distributed acceleration composite optimization system comprises an objective function establishing module and a plurality of agents connected into an undirected communication network;

the objective function establishing module is used for establishing, according to the plurality of agents, an objective function combining a smooth structure and a non-smooth structure:

where f_i is a smooth local objective function known only to agent i, the non-smooth local function is likewise known only to agent i, χ is the set of feasible solutions, and m is the number of agents;

the plurality of agents are used for calculating their own local estimates and sending the local estimates to a first neighbor agent, where the first neighbor agent is a neighbor agent of the sending agent, and two agents are neighbor agents of each other if they communicate directly;

the first neighbor agent is used for calculating a momentum acceleration term from the received local estimates and sending the momentum acceleration term to a second neighbor agent, where the second neighbor agent is a neighbor agent of the first neighbor agent;

the second neighbor agent is used for calculating a gradient tracking term from the momentum acceleration term and sending the gradient tracking term to a third neighbor agent, where the third neighbor agent is a neighbor agent of the second neighbor agent;

the plurality of agents are further used for repeatedly computing the local estimates, the momentum acceleration term, and the gradient tracking term until a preset condition is met.

Further, the local estimate is calculated as follows:

S201, each agent calculates its own local optimal solution according to the following formula:

S202, the local estimate is calculated from the local optimal solution according to the following formula:

where the surrogate function is a successive convex approximation of f_i, and α is a positive constant step size.

The beneficial effect of this further scheme is that, with a plurality of agents connected into an undirected network, establishing an objective function that combines a smooth structure and a non-smooth structure widens the range of problems that can be handled and makes the established model more accurate. The system converges to the global optimal solution at a linear rate. By introducing the momentum acceleration term and the gradient tracking term, it converges faster than comparable methods and can effectively improve the data processing speed of large-scale intelligent automation equipment.

Further, the momentum acceleration term is calculated as follows:

S301, computing a weighted average of the local estimates to obtain a local average estimate, according to the following formula:

S302, calculating the momentum acceleration term from the local average estimate according to the following formula:

where w_ij is a weight satisfying 0 ≤ w_ij < 1, and β is the momentum parameter.

The beneficial effect of this further scheme is that, under the distributed optimization strategy, variables are updated using a successive convex approximation surrogate of the objective function rather than the objective function itself. As a result, a fixed point can still be found when the target problem is non-convex, and, when the step size α is positive and smaller than a given upper bound, problems that can be modeled as convex functions converge to the global optimal solution at a linear rate.

Further, the gradient tracking term is calculated according to the following formula:

where ∇f_i(·) is the gradient of the function f_i.

The beneficial effect of this further scheme is that, through gradient tracking, each local agent can also track the global gradient, which prevents an agent that only has local information from becoming trapped in solving for a local optimal solution.

Further, the values of w_ij are determined by the following rule:

defining an undirected graph G, where V is the agent set, ε is the edge set, and the weighted adjacency matrix has entries w_ij, in which the weight w_ij of the edge (i, j) satisfies the following conditions: if (i, j) ∈ ε, then w_ij > 0; otherwise w_ij = 0;

where d_i is the number of neighbor agents of agent i.

Reference 1: W. Shi, Q. Ling, G. Wu, and W. Yin, "A proximal gradient algorithm for decentralized composite optimization," IEEE Transactions on Signal Processing, vol. 63, no. 22, pp. 6013-6023, 2015.

Drawings

FIG. 1 is a graph comparing the convergence of the present invention with that of PG-EXTRA;

FIG. 2 is a graph comparing the test accuracy of the present invention with that of PG-EXTRA;

FIG. 3 is a structural diagram of the four types of networks in one embodiment;

FIG. 4 is a graph comparing the performance of the present invention on the four types of networks.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived from the embodiments of the present invention by a person skilled in the art, are within the scope of the present invention.

Example 1

A Newton momentum-based distributed acceleration composite optimization method comprises the following steps:

S1, connecting a plurality of agents into an undirected communication network, and establishing, based on the plurality of agents, an objective function combining a smooth structure and a non-smooth structure:

where f_i is a smooth local objective function known only to agent i.

With a plurality of agents connected into an undirected network, combining a smooth structure and a non-smooth structure in the objective function widens the range of problems that can be handled and makes the established model more accurate. The method converges to the global optimal solution at a linear rate. By introducing a momentum acceleration term and a gradient tracking term, it converges faster than comparable methods and can effectively improve the data processing speed of large-scale intelligent automation equipment.

An agent is a device with computing, storage, and communication capabilities, such as a computer, a server, an unmanned aerial vehicle, or an automobile. The corresponding neighbor agent should be understood as follows: since every agent calculates its own local estimate in S2, every agent sends a local estimate at the same time, and every agent has its own neighbor agents, namely its first neighbor agents. An undirected network should be understood as a connection mode in which the agents can send information to and receive information from one another. The preset condition includes: the number of iterations, the running time, or the value of the target problem falling within a preset interval, and the like. A smooth function is a function that is infinitely continuously differentiable on its domain. A non-smooth function is a function that is not infinitely differentiable on its domain.

The local estimate in S2 is calculated as follows:

S201, each agent calculates its own local optimal solution according to the following formula:

S202, the local estimate is calculated from the local optimal solution according to the following formula:

where the surrogate function is a successive convex approximation of f_i, and α is a positive constant step size.

The advantage is that a fixed point can still be found when the target problem is non-convex, and, when the step size α is positive and smaller than a given upper bound, problems that can be modeled as convex functions converge to the global optimal solution at a linear rate.

The momentum acceleration term in S3 is calculated as follows:

S301, computing a weighted average of the local estimates to obtain a local average estimate, according to the following formula:

S302, calculating the momentum acceleration term from the local average estimate according to the following formula:

where w_ij is a weight satisfying 0 ≤ w_ij < 1, and β is the momentum parameter.

In S301 and S302 the gradient is calculated using the Newton momentum method. When the update direction is the same as at the previous step, this accelerates convergence to some extent and adjusts the update direction of the gradient, which improves the stability of the distributed optimization method and reduces the time needed to find the global optimal solution. The ordinary momentum method is a comparable alternative, but it is prone to large fluctuations of the variable values during iteration, which makes the system unstable.

In this embodiment:

an undirected graph G is defined, where V is the agent set, ε is the edge set, and the weighted adjacency matrix has entries w_ij, in which the weight w_ij of the edge (i, j) satisfies the following conditions: if (i, j) ∈ ε, then w_ij > 0; otherwise w_ij = 0;

where d_i is the number of neighbor agents of agent i. Every agent has a self-loop, i.e., (i, i) ∈ ε.

Agents i and j can communicate directly if and only if there is an edge (i, j) ∈ ε.

The gradient tracking term in S4 is calculated according to the following formula:

where ∇f_i(·) is the gradient of the function f_i.

Through gradient tracking, each local agent can also track the global gradient, which prevents an agent that only has local information from becoming trapped in solving for a local optimal solution.
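The idea that a local agent can follow the global gradient can be illustrated with the standard gradient tracking recursion from the literature, in which each tracker is averaged with its neighbors and corrected by the change in the local gradient. The sketch below is an assumption made for illustration: it omits the momentum term and the non-smooth part, uses scalar quadratic local functions, and the function name `gradient_tracking` is hypothetical. It is not the formula of this document, which is not reproduced in the text.

```python
def gradient_tracking(W, b, alpha, steps):
    """Run x <- W x - alpha * y and y <- W y + grad_new - grad_old for scalar agents.

    Agent i holds f_i(x) = 0.5 * (x - b[i])**2, so its local gradient is x - b[i].
    W is assumed symmetric with rows summing to 1.
    """
    m = len(b)
    x = [0.0] * m
    grad_old = [x[i] - b[i] for i in range(m)]
    y = list(grad_old)  # trackers start at the local gradients
    for _ in range(steps):
        x = [sum(W[i][j] * x[j] for j in range(m)) - alpha * y[i] for i in range(m)]
        grad_new = [x[i] - b[i] for i in range(m)]
        y = [sum(W[i][j] * y[j] for j in range(m)) + grad_new[i] - grad_old[i]
             for i in range(m)]
        grad_old = grad_new
    return x, y, grad_old


if __name__ == "__main__":
    W = [[0.5, 0.25, 0.25], [0.25, 0.5, 0.25], [0.25, 0.25, 0.5]]
    x, y, g = gradient_tracking(W, [1.0, 2.0, 6.0], alpha=0.1, steps=300)
    print("local estimates:", x)
    print("mean tracker:", sum(y) / 3, "mean local gradient:", sum(g) / 3)
```

In this recursion the average of the trackers equals the average of the local gradients at every iteration, so each agent steers by an estimate of the global gradient rather than by its own gradient alone.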

To verify the convergence of the present invention, the following assumptions are made:

Assumption 1: (i) the set χ is closed and convex; (ii) each local objective function f_i is continuously differentiable on an open set, and its gradient is L_i-Lipschitz continuous; (iii) the non-smooth term of the objective is convex and need not be smooth; (iv) the function U is bounded below.

Assumption 2: The function F is μ-strongly convex on the set χ. Strong convexity is widely used in optimization; in particular, it is one of the conditions that guarantee a linear convergence rate for many algorithms based on gradient descent. It is defined as follows:

Note that strong convexity does not require the function to be differentiable everywhere; when the function is non-smooth, the gradient is replaced by a subgradient. Compared with an ordinary convex function, a strongly convex function must additionally satisfy a quadratic term.

This strong convexity property is important. Viewed intuitively for a one-dimensional function, convexity only requires the function curve to lie above its tangent, with no requirement on how far above. The curve may therefore hug the tangent indefinitely, as long as it stays above it. In optimization, and in gradient-based optimization in particular, such weak gradient changes make fast progress difficult, and convergence may not be reached within a limited number of iterations. Accepting any solution that is close to the minimum is also problematic, because "close" is only a qualitative notion: two points may have nearly identical objective values while the decision variables differ greatly. Adding a quadratic term guarantees a quadratic lower bound, rules out the case in which the curve hugs the tangent, and makes the optimization easier.
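For reference, the standard textbook form of the two conditions contrasted above is written out below. This is the usual definition of μ-strong convexity for a differentiable F and is given as an assumption about what the missing formula states; for a non-smooth function the gradient is replaced by any subgradient.

```latex
% Ordinary convexity: the graph lies above every tangent
F(y) \;\ge\; F(x) + \nabla F(x)^{\top}(y - x), \qquad \forall\, x, y \in \chi

% \mu-strong convexity: the same bound plus a quadratic term with \mu > 0
F(y) \;\ge\; F(x) + \nabla F(x)^{\top}(y - x) + \frac{\mu}{2}\,\lVert y - x \rVert^{2}, \qquad \forall\, x, y \in \chi
```

The extra term (μ/2)‖y − x‖² is the quadratic lower bound referred to in the discussion: it forces the function to move away from its tangent at least quadratically.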

Assumption 3: The undirected graph G is connected.

Definition 1: Consider a function f_i with a continuous first-order gradient, and let the set χ be closed and convex. Suppose a surrogate function is continuous and satisfies the following conditions: (i) for all x ∈ χ,

(ii) its gradient is Lipschitz continuous; (iii) it is strongly convex on the set. Then this function is called a smooth, strongly convex successive convex approximation surrogate of f_i, where the gradient of the surrogate denotes its partial derivative in the arguments (x, y).

Assumption 4: The surrogate function is a smooth and strongly convex successive convex approximation surrogate of f_i.

Convergence analysis:

Lemma 1: Let Assumptions 1-4 hold. Then, for all k ≥ 0,

$p_{k+1} \le \sigma(\alpha,\beta)\,p_k + \eta(\alpha,\beta)\,\lVert \delta_k \rVert^2 \qquad (4)$

where the parameters σ(α, β) and η(α, β) are defined as follows:

Note that 0 < β < 1.

Proof: According to the proposed method and the definition of p_k, we have

where 0 < β < 1.

Using Lipschitz continuity,

and using the first-order optimality condition, we derive

Combining (8) and (9) gives

This means that

and from (7) and (11) we obtain

where

Next, a lower bound is determined. Recall that the quantity can be defined by

Using the μ-strong convexity of the function F, it can be shown that the following holds:

Rearranging (13) gives

and hence

To determine an upper bound on p_{k+1}, we analyze

and obtain

Combining (15) and (16) gives

which is equivalent to

This completes the proof of the lemma.

Lemma 2: Let Assumptions 1-3 hold. Then, for all k ≥ 0, the following holds:

where L_max = max{L_i}, i ∈ V.

Proof: According to the definition of ||δ_k||^2 in Lemma 1, we have

Because the gradient is Lipschitz continuous, analysis gives

This completes the proof of the lemma.

Lemma 3: Let Assumption 3 hold. Then, for all k ≥ 0, the following holds:

where ε_s > 0.

Proof: Recall that the quantity can be given by

Thus we have

where ε_s > 0. This completes the proof of the lemma.

Lemma 4: Let Assumptions 1-4 hold. Then the following holds:

where ε_y > 0.

Proof: Consider the definition

Thus we obtain

where ε_y > 0. This completes the proof of the lemma.

Lemma 5: Let Assumptions 1-4 hold. Then the following holds:

Proof: By the strong convexity property, we obtain

Thus, analysis gives

Using the global optimality of x^* and the convexity of G(·), we obtain

Combining (26) and (27) gives

Further, using strong convexity and the optimality of the solution, we obtain

Thus, analysis gives

This means that

and hence

This completes the proof of the lemma.

Lemma 6: For a sequence {s_k}, define, for all k ≥ 0,

and

where z ∈ (0, 1). If S(z) is bounded, then ||s_k|| = O(z^k).

To analyze the linear convergence rate of the present invention using Lemma 6, the following variables are defined:

Next, Lemmas 1 and 3-6 are applied to the sequences, including {p_k} and {||d_k||}, to establish linear convergence.

Main results:

Proposition 1: Let Assumptions 1-4 hold. Consider σ(α, β), η(α, β), and two free variables ε_s > 0 and ε_y > 0. For any

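$z \in \left(\max\{\sigma(\alpha,\beta),\ (1+\varepsilon_s)((1-\beta)\rho+\beta)^2,\ (1+\varepsilon_y)\rho^2\},\ 1\right) \qquad (29)$

the following inequalities hold:

$S_K(z) \le G_S(\alpha,\beta,z)\,D_K(z) + R_S \qquad (31)$

$Y_K(z) \le G_Y(\beta,z)\left(8\,S_K(z) + 2\alpha^2 D_K(z)\right) + R_Y \qquad (32)$

$D_K(z) \le C_1 P_K(z) + C_2 Y_K(z) \qquad (33)$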

where

Proof: Using Lemma 1 and considering a positive sequence s_k and z ∈ (0, 1), we have

and obtain

When z ∈ (σ(α, β), 1), expression (42) holds. By an analysis similar to that for (30), (31) and (32) hold.

Considering Lemma 5 and the definitions of P_K(z) and Y_K(z), we have

This completes the proof.

Theorem 1: Let Assumptions 1-4 hold. If α and β satisfy

and 0 < β < 1, then the objective function

converges at the rate

When α ∈ [min{α^*, α_max}, α_max),

and when α ∈ (0, min{α^*, α_max}), z = 1 − α(1 − β)M.

Proof: According to Proposition 1,

$D_K(z) \le \Omega(\alpha,\beta,z)\,D_K(z) + R \qquad (44)$

where

and

Using Lemma 6, if there exist parameters such that

that is, Ω(α, β, z) < 1, then

converges to 0 at the linear rate O(z^k). To this end, suitable parameters are chosen to minimize G_P(α, β, z), G_S(α, β, z), and G_Y(β, z).

Since z > σ(α, β), there exists a parameter θ > 0 such that

Further analysis shows that if

then

attains its minimum at

In other words,

If the step size is selected as

then it follows from the above that

Selecting

gives

where z > ((1 − β)ρ + β)^2. A similar analysis gives

and z > ρ^2. Based on the preceding analysis, three suitable variables ε_opt, ε_s, and ε_y are selected so that a sufficient condition for Ω(α, β, z) < 1 becomes

In addition, since

we obtain

where

Summarizing the above analysis and letting

it follows that

where

To ensure that the admissible range of z is non-empty, α should satisfy

The range of z is then analyzed. If

then

Thus, if α ∈ [min{α^*, α_max}, α_max), then

and if α ∈ (0, min{α^*, α_max}), then z = 1 − α(1 − β)M. This completes the proof.

In this embodiment, a logistic regression simulation experiment is carried out on the breast cancer data provided by the UCI Machine Learning Repository to verify the effectiveness of the method. The features of this data set, computed from digitized images of breast masses, include the radius, texture, perimeter, area, and smoothness of the cell nucleus, among others. The experiment aims to predict, from the sample values in the data set, whether a patient's condition is malignant. The prediction probability can be expressed as

where c and l are the data and the label of a sample, respectively. Of the 683 samples in the data set, N = 200 are assigned to the m networked agents for training,

and the remaining 483 samples are used for testing. The h-th data sample and label of agent i are c_i,h and l_i,h ∈ {−1, 1}, respectively, where

Based on this model, the maximum log-likelihood estimate of the classifier with respect to the sample data (c_i,h, l_i,h) is the optimal solution of the following optimization problem:

where one regularization term is used to avoid overfitting and the other is used to increase the sparsity of the solution. The residual in the following simulation is defined as

In this example, the convergence of the PG-EXTRA method of Reference 1 is compared with that of the proposed method. The initial values are defined as

and

The step size is set to α = 0.01 and the momentum coefficient to β = 0.5. The preset condition is the number of iterations, which is set to 70; it should be understood that the number of iterations differs for different data samples and is set according to actual requirements. An undirected network of m = 10 agents is generated randomly, with each pair of agents able to communicate directly with probability 70%. The evolution of the residual for the different methods is shown in FIG. 1, and the test accuracy is shown in FIG. 2. As can be seen from FIG. 1, when α = 0.01 the proposed method converges faster than the method of Reference 1, and the data processing speed is greatly improved.
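The random network described in this experiment (m = 10 agents, each pair connected with probability 70%) could be generated along the following lines. This is a minimal sketch under stated assumptions: the function names and the seed are hypothetical, and redrawing until the graph is connected is an added step chosen so that the result satisfies Assumption 3, not something the text specifies.

```python
import random
from collections import deque


def random_undirected_edges(m, p, rng):
    """Include each unordered pair (i, j), i < j, independently with probability p."""
    return {(i, j) for i in range(m) for j in range(i + 1, m) if rng.random() < p}


def is_connected(m, edges):
    """Breadth-first search from agent 0; connected if every agent is reached."""
    neighbors = {i: set() for i in range(m)}
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)
    seen, queue = {0}, deque([0])
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return len(seen) == m


def connected_random_network(m=10, p=0.7, seed=0):
    """Redraw the random graph until it is connected (assumed handling of Assumption 3)."""
    rng = random.Random(seed)
    while True:
        edges = random_undirected_edges(m, p, rng)
        if is_connected(m, edges):
            return edges


if __name__ == "__main__":
    edges = connected_random_network()
    print(len(edges), "edges out of", 10 * 9 // 2, "possible")
```

With p = 0.7 and ten agents a disconnected draw is rare, so the redraw loop normally finishes on the first attempt.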

It should be noted that the disclosure in reference 1 is mainly used for comparison with the present invention, and does not disclose the technical contents of the present invention, nor suggest the technical problems and technical solutions solved by the present invention.

In this embodiment, the four types of networks shown in FIG. 3 are also studied: a star network (a), a ring network (b), a tree network (c), and a fully connected network (d). The initial values are set to

and

and the step size α = 0.01 and momentum parameter β = 0.5 are used. The performance of the proposed method on each type of network is shown in FIG. 4. The results show that the denser the network, the faster the convergence and the higher the data processing speed.
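To make the notion of network density in this comparison concrete, the sketch below builds edge sets for the four network types and reports the fraction of possible links that each contains. The exact topologies of FIG. 3 are not reproduced in the text, so the layouts here are assumptions; in particular the tree is taken to be a binary tree, and all function names are hypothetical.

```python
def star_edges(m):
    """Agent 0 is the hub and is linked to every other agent."""
    return {(0, i) for i in range(1, m)}


def ring_edges(m):
    """Each agent is linked to its successor, wrapping around (assumes m >= 3)."""
    return {(min(i, (i + 1) % m), max(i, (i + 1) % m)) for i in range(m)}


def tree_edges(m):
    """Binary tree (assumed shape): agent k is linked to its parent (k - 1) // 2."""
    return {((k - 1) // 2, k) for k in range(1, m)}


def full_edges(m):
    """Every pair of agents is linked."""
    return {(i, j) for i in range(m) for j in range(i + 1, m)}


def density(m, edges):
    """Fraction of the m * (m - 1) / 2 possible links that are present."""
    return len(edges) / (m * (m - 1) / 2)


if __name__ == "__main__":
    m = 10
    for build in (star_edges, ring_edges, tree_edges, full_edges):
        edges = build(m)
        print(build.__name__, len(edges), "edges, density", round(density(m, edges), 3))
```

Under these assumed layouts the fully connected network is by far the densest, which is consistent with the qualitative observation reported for FIG. 4.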

Example 2

On the basis of Example 1, a Newton momentum-based distributed acceleration composite optimization system comprises an objective function establishing module and a plurality of agents connected into an undirected communication network;

the objective function establishing module is used for establishing, according to the plurality of agents, an objective function combining a smooth structure and a non-smooth structure:

where f_i is a smooth local objective function known only to agent i;

the plurality of agents are used for calculating their own local estimates and sending the local estimates to a first neighbor agent;

the first neighbor agent is used for calculating a momentum acceleration term from the received local estimates and sending the momentum acceleration term to a second neighbor agent, where the second neighbor agent is a neighbor agent of the first neighbor agent;

In this embodiment, a single agent is an unmanned aerial vehicle with communication, computing, and storage capabilities, and an undirected network formed by a plurality of agents means that the agents can communicate with one another. The first, second, and third neighbor agents are all among the plurality of agents, and the objective function is solved through the cooperation of the plurality of agents. The preset condition includes: the number of iterations, the running time, or the value of the target problem falling within a preset interval, and the like.
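The three kinds of preset condition listed above (iteration count, running time, objective value within a preset interval) can be combined into a single termination check. The sketch below is one possible arrangement; the function name `should_stop` and its parameters are hypothetical, and the choice that any single satisfied condition ends the loop is an assumption.

```python
import time


def should_stop(iteration, started_at, objective_value,
                max_iterations=None, max_seconds=None, target_interval=None):
    """Return True as soon as any configured preset condition is met.

    started_at is a time.monotonic() reading taken when the loop began.
    Conditions left as None are ignored.
    """
    if max_iterations is not None and iteration >= max_iterations:
        return True
    if max_seconds is not None and time.monotonic() - started_at >= max_seconds:
        return True
    if target_interval is not None:
        low, high = target_interval
        if low <= objective_value <= high:
            return True
    return False


if __name__ == "__main__":
    start = time.monotonic()
    k = 0
    # Placeholder loop: a real run would execute S2 to S4 on each pass.
    while not should_stop(k, start, objective_value=1.0, max_iterations=70):
        k += 1
    print("stopped after", k, "iterations")
```

The value 70 used in the usage example mirrors the iteration count chosen in the simulation of Example 1.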

The local estimate is calculated as follows:

S201, each agent calculates its own local optimal solution according to the following formula:

S202, the local estimate is calculated from the local optimal solution according to the following formula:

where the surrogate function is a successive convex approximation of f_i, and α is a positive constant step size.

With a plurality of agents connected into an undirected network, establishing an objective function that combines a smooth structure and a non-smooth structure widens the range of problems that can be handled and makes the established model more accurate. The system converges to the global optimal solution at a linear rate. By introducing a momentum acceleration term and a gradient tracking term, it converges faster than comparable methods and can effectively improve the data processing speed of large-scale intelligent automation equipment.

The momentum acceleration term is calculated as follows:

S301, carrying out a weighted average of the local estimates to obtain the local average estimate.

The calculation formula is as follows: [formula omitted]

S302, calculating the momentum acceleration term according to the local average estimate: [formula omitted]

wherein w_ij is a weight, 0 ≤ w_ij < 1, and [condition omitted]; β is the momentum term parameter.
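The two steps can be sketched as follows. S301 is a plain weighted average and follows directly from the description. The form of S302 is an assumption: the sketch uses a Nesterov-style extrapolation along the change in the local average, scaled by `beta`, because the source formula is not reproduced in the text. The function names are hypothetical.

```python
def local_average(weights_row, estimates):
    """S301: weighted average of the local estimates.

    weights_row[j] plays the role of w_ij for a fixed agent i, and
    estimates[j] is the local estimate received from agent j (a list of floats).
    """
    dim = len(estimates[0])
    return [sum(w * est[d] for w, est in zip(weights_row, estimates))
            for d in range(dim)]


def momentum_step(avg_new, avg_old, beta):
    """S302 (assumed form): extrapolate along the change in the local average."""
    return [new + beta * (new - old) for new, old in zip(avg_new, avg_old)]
```

With `beta = 0` the momentum step reduces to the plain local average, which makes it easy to compare runs with and without acceleration.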

Variables are updated using a distributed optimization strategy in which a successive convex approximation of the objective function is used in place of the objective function itself. The advantage is that a stationary point of the target problem can still be found when the problem is non-convex, and, when the introduced step size α is positive and smaller than a given upper bound, a problem that can be modeled as a convex function converges to the global optimal solution at a linear rate.

The specific calculation formula of the gradient tracking term is as follows: [formula omitted]

wherein ∇f_i(·) is the gradient of the function f_i(·).

Through gradient tracking, each local agent can also track the global gradient value, which avoids ending up at a merely local optimal solution because an agent has access only to local information.
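The tracking idea can be illustrated with the update that is widely used in the gradient tracking literature: each agent mixes its neighbors' trackers and then adds the change in its own local gradient. Whether the patented formula has exactly this form is an assumption, since the formula is not reproduced in the text; the function name is hypothetical.

```python
def gradient_tracking_update(W, y, grad_new, grad_old):
    """Standard gradient tracking step (assumed, not taken from the source).

    W        : m x m weight matrix with entries w_ij
    y        : current trackers, one list of floats per agent
    grad_new : local gradients at the new iterates
    grad_old : local gradients at the previous iterates
    """
    m = len(y)
    dim = len(y[0])
    return [
        [sum(W[i][j] * y[j][d] for j in range(m))
         + grad_new[i][d] - grad_old[i][d]
         for d in range(dim)]
        for i in range(m)
    ]
```

When the columns of W sum to one and the trackers are initialized with the local gradients, the sum of all trackers always equals the sum of all current local gradients, which is why each agent's tracker can follow the global gradient.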

The value rule of w_ij is as follows:

an undirected graph is defined, consisting of the agent set and the edge set, [formula omitted]

wherein d_i is the number of neighbor agents of agent i.
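The weight formula itself is not reproduced in the text. One common rule that depends only on the neighbor counts d_i is the Metropolis rule, shown below as an assumption rather than as the patented rule. The function name and the edge-list input format are hypothetical.

```python
def metropolis_weights(num_agents, edges):
    """Build a weight matrix from an undirected edge list.

    Assumed rule (Metropolis): w_ij = 1 / (1 + max(d_i, d_j)) for each edge
    (i, j), w_ij = 0 for non-neighbors, and w_ii takes the remainder so that
    every row sums to one.
    """
    neighbors = [set() for _ in range(num_agents)]
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)
    degree = [len(n) for n in neighbors]  # d_i: number of neighbor agents

    W = [[0.0] * num_agents for _ in range(num_agents)]
    for i in range(num_agents):
        for j in neighbors[i]:
            W[i][j] = 1.0 / (1 + max(degree[i], degree[j]))
        # The diagonal entry is still 0.0 here, so the sum covers neighbors only.
        W[i][i] = 1.0 - sum(W[i])
    return W
```

For a connected graph this yields a symmetric matrix whose rows and columns sum to one and whose entries lie in [0, 1), which is consistent with the conditions on w_ij stated in the description.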

In this embodiment, a plurality of unmanned aerial vehicles are used to solve a target localization problem, and each unmanned aerial vehicle can be regarded as an agent. The specific implementation process is as follows:

A sound source/energy source is first set up to emit signals continuously. Because the signal strength attenuates gradually as the distance increases, the unmanned aerial vehicles establish an objective function relating distance and signal strength according to the strength they receive; the unmanned aerial vehicles then communicate with each other and perform calculations to finally obtain the target position, thereby achieving fast localization.
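A local objective of this kind can be sketched as follows. The attenuation model is an assumption: the sketch uses an inverse-square law, so that the received strength equals `source_power` divided by the squared distance, and it penalizes the mismatch between the squared distance implied by the measurement and the squared distance to a candidate position. The source does not specify the model, and all names are hypothetical.

```python
def local_objective(x, drone_pos, received, source_power=1.0):
    """Local objective of one drone for a candidate target position x.

    Assumes received = source_power / distance**2 (inverse-square attenuation).
    The value is zero when x is consistent with this drone's measurement.
    """
    dist_sq = sum((xi - pi) ** 2 for xi, pi in zip(x, drone_pos))
    implied_dist_sq = source_power / received
    return (dist_sq - implied_dist_sq) ** 2
```

A single drone's objective is zero on a whole circle around the drone, so the target position is only pinned down by the sum of the objectives of several drones, which is what the cooperative solution computes.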

Example 3

On the basis of Example 1, a resource allocation problem is solved using a plurality of microprocessor-controlled intelligent generators, with each microprocessor acting as an agent:

For example, assume that there are several different generators that generate power from coal. The amount of power generated is positively correlated with the amount of coal used, and the generators differ in how efficiently they use coal: some have a high utilization rate and some have a low one. How to make effective use of a limited amount of coal is the problem addressed in this case.

According to the performance of the different generators, a mathematical model relating the generated energy to the coal consumption is established, giving an objective function of the generated energy whose function value is the coal consumption. Each microprocessor takes into account the specific conditions of its corresponding generator, the microprocessors communicate with each other and perform calculations, and the coal consumption of each generator is finally obtained.

The technical solutions provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present invention, and the descriptions of the embodiments are intended only to help in understanding those principles. Those skilled in the art may make changes to the embodiments described above without departing from the principles and spirit of the invention, the scope of which is defined by the appended claims.

Claims

1. A Newton momentum-based distributed acceleration composite optimization method is characterized by comprising the following steps:

S1, connecting a plurality of agents into an undirected communication network, and establishing, based on the agents, an objective function combining a smooth structure and a non-smooth structure: [formula omitted]

wherein f_i is a smooth local objective function known only to agent i, [symbol omitted] is a non-smooth local function known only to agent i, [symbol omitted] is the set of feasible solutions, and m is the number of agents;

2. The method of claim 1, wherein the local estimation in S2 is calculated by:

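S201, each agent calculates its own local optimal solution.

The calculation formula is as follows: [formula omitted]

The calculation formula is as follows: [formula omitted]

wherein [symbol omitted] is the successive convex approximation form of [symbol omitted], [symbol omitted] is associated with f_i at [symbol omitted], and α is a positive constant step size.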

3. The method of claim 2, wherein the momentum acceleration term in S3 is calculated by:

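S301, carrying out a weighted average of the local estimates to obtain the local average estimate.

The calculation formula is as follows: [formula omitted]

S302, calculating the momentum acceleration term according to the local average estimate: [formula omitted]

wherein w_ij is a weight, 0 ≤ w_ij < 1, and [condition omitted]; β is the momentum term parameter.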

4. The method according to any one of claims 1 to 3, wherein the gradient tracking term in S4 is calculated by the following formula:

[formula omitted]

wherein ∇f_i(·) is the gradient of the function f_i(·).

5. The method of claim 4, wherein the value rule of w_ij is as follows: an undirected graph is defined, consisting of an agent set, an edge set and a weighted adjacency matrix, in which the weight w_ij for the edge (i, j) satisfies the following conditions: if (i, j) belongs to the edge set, then w_ij > 0, otherwise w_ij = 0, [formula omitted]

wherein d_i is the number of neighbor agents of agent i.

6. A Newton momentum-based distributed acceleration composite optimization system, characterized by comprising an objective function establishing module and a plurality of agents connected into an undirected communication network;

wherein f_i is a smooth local objective function known only to agent i, [symbol omitted] is a non-smooth local function known only to agent i, [symbol omitted] is the set of feasible solutions, and m is the number of agents;

the plurality of agents are further configured to repeat the calculation of the local estimates, the momentum acceleration term and the gradient tracking term in a loop until a preset condition is met, and then to terminate the loop.

7. The system of claim 6, wherein the local estimate is calculated by:

S201, each agent calculates its own local optimal solution