CN110768841A - Accelerated distributed online optimization method based on conditional gradient - Google Patents

Accelerated distributed online optimization method based on conditional gradient

Info

Publication number
CN110768841A
Authority
CN
China
Prior art keywords
individual
optimization
local
information
gradient
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911045411.1A
Other languages
Chinese (zh)
Inventor
申修宇
李德权
董翘
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University of Science and Technology
Original Assignee
Anhui University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui University of Science and Technology filed Critical Anhui University of Science and Technology
Priority to CN201911045411.1A
Publication of CN110768841A
Priority to LU102143A (published as LU102143B1)
Legal status: Pending

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 - Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 - Configuration management of networks or network elements
    • H04L41/0803 - Configuration setting
    • H04L41/0823 - Configuration setting characterised by the purposes of a change of settings, e.g. optimising configuration for enhancing reliability
    • H04L41/14 - Network analysis or design
    • H04L41/142 - Network analysis or design using statistical or mathematical methods
    • H04L41/145 - Network analysis or design involving simulating, designing, planning or modelling of a network
    • H04L41/16 - Arrangements for maintenance, administration or management of data switching networks using machine learning or artificial intelligence
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 - Administration; Management
    • G06Q10/04 - Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Physics (AREA)
  • Mathematical Optimization (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Algebra (AREA)
  • Game Theory and Decision Science (AREA)
  • General Business, Economics & Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Development Economics (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

An accelerated distributed online conditional gradient optimization method is provided that effectively addresses the high time complexity of distributed online optimization algorithms. The method decomposes the network-wide objective function into a sum of local objective functions, one per node (individual); each individual knows only its own objective function, and the optimization problem is solved cooperatively through information exchange between adjacent individuals. A regularization term is added to each individual's local cost, constructing a new time-varying cost function and overcoming the insensitivity of the traditional conditional gradient algorithm to the gradient magnitude. The method further replaces the projection operation with a local linear optimization step, accelerating the convergence of the algorithm. Finally, experiments on a variety of tasks show that the method performs well in practical applications and compares favorably with existing optimization methods.

Description

Accelerated distributed online optimization method based on conditional gradient
Technical Field
The invention relates to an accelerated distributed online optimization method based on the conditional gradient, and belongs to the field of machine learning.
Background
Distributed convex optimization has attracted great interest from researchers in many fields. The classical problems of distributed tracking, estimation, and detection are optimization problems in nature. In distributed optimization, a global optimization task is assigned across the nodes of a network. Since each node has limited resources or only partial information about the task, the nodes cooperate to collect data and update their local estimates by sharing the collected information. Distributed optimization imposes a lower computational burden on each node, and the network remains robust even when individual nodes experience local failures, so it effectively overcomes the shortcomings of the single information processing unit in a centralized scenario.
Distributed optimization has been widely applied to time-invariant cost functions. In practice, however, distributed network systems often operate in dynamic and uncertain environments. Consider, for example, the problem of tracking a moving object, where the goal is to track the object's position, velocity, and acceleration. Such problems have been a main focus of online learning in the field of machine learning. Combining online optimization with distributed optimization, where an arbitrarily varying cost function represents the uncertainty of the multi-agent network system, makes it possible to process the dynamic data streams at the network nodes in real time.
With the rapid development of distributed online optimization, many conventional optimization algorithms have been extended to the distributed online setting. In recent years, traditional algorithms such as gradient descent and dual averaging have been widely applied to distributed online optimization. The conditional gradient method (also called the Frank-Wolfe method, FW) is essentially a first-order optimization method; its theoretical convergence rate is lower than that of other effective optimization algorithms, but many practical optimization problems today involve a large number of variables and are high-dimensional, so using second-order information or other super-linear operations is practically infeasible. Furthermore, the FW method has proven to be a powerful tool for solving large-scale optimization problems, since it effectively avoids key difficulties of first-order optimization methods such as computing orthogonal projections. Therefore, a local linear optimization step is introduced into the original conditional gradient algorithm, an accelerated distributed online conditional gradient algorithm is proposed, and the conditional gradient online optimization algorithm is extended to the distributed setting.
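To make the projection-free idea concrete, the following is a minimal sketch (illustrative only, not code from the patent) of the FW linear optimization step over an l1-ball constraint; the constraint set, the function names, and the classic 2/(t+2) step size are assumptions chosen for illustration. The point is that the linear oracle has a cheap closed form where an orthogonal projection would require more work.

```python
import numpy as np

def linear_oracle_l1(grad, radius=1.0):
    # argmin_{||v||_1 <= radius} <grad, v>: put all the mass on the
    # coordinate with the largest |gradient| entry -- a closed form,
    # no projection required.
    v = np.zeros_like(grad)
    i = int(np.argmax(np.abs(grad)))
    v[i] = -radius * np.sign(grad[i])
    return v

def frank_wolfe_step(x, grad, t, radius=1.0):
    # One conditional-gradient (FW) update with the classic 2/(t+2) step.
    v = linear_oracle_l1(grad, radius)
    gamma = 2.0 / (t + 2.0)
    return x + gamma * (v - x)
```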
Disclosure of Invention
The technical problem to be solved by the invention is as follows: an accelerated distributed online optimization method based on the conditional gradient is provided, with the aim of accelerating the convergence of a model in a distributed network.
In order to solve the technical problems, the invention adopts the following technical scheme:
in the distributed online convex optimization setting, each node represents an individual. In each iteration, the individual generates decision information, submits it independently, and obtains a corresponding cost function. Each individual carries a degree of importance in information exchange relative to the other individuals; by giving higher weights to the more important individuals in a weighted average, their more valuable information is exploited and the error of the whole distributed system is reduced (a sketch of this step follows below). A local linear optimization step is introduced into the original conditional gradient algorithm, accelerating the convergence of the whole network model.
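As an illustration of this weighted-average communication (a sketch under the standard assumption of a row-stochastic weight matrix W, whose name and exact form are not specified in this text), each individual replaces its decision with an importance-weighted average of its neighbors' decisions:

```python
import numpy as np

def consensus_step(X, W):
    # X: (n, d) array; row i holds individual i's current decision.
    # W: (n, n) row-stochastic weight matrix; W[i, j] > 0 only if j is a
    #    neighbor of i, and a larger entry marks a more important neighbor.
    assert np.allclose(W.sum(axis=1), 1.0), "W must be row-stochastic"
    return W @ X  # row i becomes the weighted average over i's neighborhood
```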
Drawings
FIG. 1 is a convergence plot of the method of the present invention on the L1-regularized logistic regression model.
FIG. 2 is a convergence plot of the method of the present invention on the L2-regularized logistic regression model.
Detailed Description
The invention solves the distributed optimization problem on a connected undirected network and avoids the drawback of the single information processing unit in a centralized scenario, namely the excessive communication cost at the central node.
The method comprises the following specific steps:
step 1: revealing a loss function ft(t)=fi,t(t)
Step 3: calculating the sub-gradient of the individual-generated information, git∈fi,t(xi,t)
For each individual:
[per-individual update formulas: image Figure BDA0002254004240000021 in the original filing, not reproduced here]
In a distributed network, the passing of information between the agents is performed by weighted averaging (the third line in Step 4) to ensure that the information of the important agents is fully utilized. Our method also introduces a local linear optimization step
[local linear optimization formula: image Figure BDA0002254004240000022 in the original filing, not reproduced here]
whose objective is to accelerate the convergence of the entire network model; ρ is a parameter of the local linear optimization step and α is the learning rate.
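Since the update formulas above survive only as images in the original filing, the following is a hedged sketch of one round, assembled from the listed steps and the abstract's description (subgradient of the revealed loss, weighted-average communication, local linear optimization step with parameter ρ and learning rate α). The regularized time-varying cost F_{i,t}(x) = ⟨Σ_s g_{i,s}, x⟩ + ρ‖x − x_1‖² and all function and variable names are assumptions for illustration, not the patent's exact formulas.

```python
import numpy as np

def adocg_round(X, G_sum, subgrads, W, x1, rho, alpha, oracle):
    # X        : (n, d) current decisions x_{i,t}
    # G_sum    : (n, d) running sums of past subgradients g_{i,s}, s < t
    # subgrads : (n, d) subgradients g_{i,t} of the freshly revealed losses
    # W        : (n, n) row-stochastic communication weights
    # x1       : (d,)  common initial point used by the assumed regularizer
    # oracle   : linear-optimization oracle over the constraint set
    G_sum = G_sum + subgrads                   # accumulate gradient information
    Y = W @ X                                  # weighted-average communication
    X_next = np.empty_like(X)
    for i in range(X.shape[0]):
        # Gradient of the assumed regularized cost F_{i,t} at y_{i,t}.
        grad_F = G_sum[i] + 2.0 * rho * (Y[i] - x1)
        v = oracle(grad_F)                     # local linear optimization step
        X_next[i] = Y[i] + alpha * (v - Y[i])  # conditional-gradient update
    return X_next, G_sum
```

With the l1 linear oracle sketched earlier passed in as oracle, this round would be iterated over t = 1, 2, …; ρ and α play the roles named in the text above.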
The invention is further described below with reference to the accompanying drawings.
FIG. 1 is a convergence plot of the method of the present invention on the L1-regularized logistic regression model. Consider an online distributed learning environment: our goal is to solve the L1-regularized logistic regression problem, with numerical results on a synthetic dataset shown in FIG. 1. The accelerated distributed online conditional gradient algorithm outperforms the other algorithms, and FIG. 1 also shows that its convergence is significantly faster than theirs in the early iterations.
FIG. 2 is a convergence plot of the method of the present invention on the L2-regularized logistic regression model. Experiments were performed on a real dataset with satisfactory results. As FIG. 2 shows, the proposed algorithm achieves the intended acceleration: its loss quickly reaches a small level and its performance exceeds that of the other algorithms, suggesting it may be better suited to practical applications.
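For completeness, here is a minimal sketch of the local cost presumably used in these experiments; the patent names the L1- and L2-regularized logistic regression models but does not spell out the formulas in this text, so the standard formulation below is an assumption.

```python
import numpy as np

def logistic_loss(x, A, b, lam, penalty="l1"):
    # Assumed per-round local cost:
    #   f(x) = mean(log(1 + exp(-b * (A @ x)))) + lam * R(x),
    # with R(x) = ||x||_1 for the L1 model or R(x) = ||x||_2^2 for the L2 model.
    # A: (m, d) feature matrix; b: (m,) labels in {-1, +1}.
    margins = -b * (A @ x)
    data_term = np.mean(np.logaddexp(0.0, margins))  # stable log(1 + e^z)
    reg = np.abs(x).sum() if penalty == "l1" else float(x @ x)
    return data_term + lam * reg
```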

Claims (3)

1. An accelerated distributed online optimization method based on the conditional gradient, characterized in that individuals in a distributed network independently submit local information and then obtain a local cost function; the individuals communicate with one another by a weighted averaging method; and after the communication between individuals, the direction of the next iteration is found by a local linear optimization step.
2. The method of claim 1, wherein individuals in the distributed network independently submit local information and then obtain a local cost function, wherein: in the distributed online convex optimization setting, each node represents an individual; in each iteration, the individual generates decision information, submits it independently, and obtains a corresponding cost function.
3. The method of claim 1, wherein the individuals communicate with one another by weighted averaging, wherein: each individual carries a degree of importance in information exchange relative to the other individuals, and giving higher weights to the more important individuals in the weighted average makes fuller use of their more valuable information, thereby reducing the error of the whole distributed system.
CN201911045411.1A 2019-10-30 2019-10-30 Acceleration distributed online optimization method based on condition gradient Pending CN110768841A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201911045411.1A CN110768841A (en) 2019-10-30 2019-10-30 Acceleration distributed online optimization method based on condition gradient
LU102143A LU102143B1 (en) 2019-10-30 2020-10-16 Conditional gradient based method for accelerated distributed online optimization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911045411.1A CN110768841A (en) 2019-10-30 2019-10-30 Acceleration distributed online optimization method based on condition gradient

Publications (1)

Publication Number Publication Date
CN110768841A (en) 2020-02-07

Family

ID=69334538

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911045411.1A Pending CN110768841A (en) 2019-10-30 2019-10-30 Acceleration distributed online optimization method based on condition gradient

Country Status (2)

Country Link
CN (1) CN110768841A (en)
LU (1) LU102143B1 (en)



Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109149568A (en) * 2018-09-10 2019-01-04 上海交通大学 A kind of interconnection micro-capacitance sensor and scheduling Price optimization method based on distributed agent

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DAN GARBER, ELAD HAZAN: "A Linearly Convergent Conditional Gradient Algorithm with Applications to Online and Stochastic Optimization", arXiv *
李德权, 董翘, 周跃进: "Distributed online conditional gradient optimization algorithm", Computer Science (《计算机科学》) *
王俊雅: "Distributed online random projection optimization", Journal of Fuyang Normal University (Natural Science Edition) *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111580962A (en) * 2020-04-29 2020-08-25 安徽理工大学 Distributed self-adaptive online learning method with weight attenuation

Also Published As

Publication number Publication date
LU102143B1 (en) 2021-04-16

Similar Documents

Publication Publication Date Title
Neglia et al. The role of network topology for distributed machine learning
CN102915347A (en) Distributed data stream clustering method and system
US11281232B2 (en) Systems and methods for multi-agent system control using consensus and saturation constraints
Yuan et al. Removing data heterogeneity influence enhances network topology dependence of decentralized sgd
Najafi et al. Outlier-Robust Multi-Aspect Streaming Tensor Completion and Factorization.
CN113298222A (en) Parameter updating method based on neural network and distributed training platform system
CN105205052A (en) Method and device for mining data
CN111178261A (en) Face detection acceleration method based on video coding technology
Boukhdhir et al. An improved MapReduce Design of Kmeans for clustering very large datasets
Li et al. Model-distributed dnn training for memory-constrained edge computing devices
CN110768841A (en) Acceleration distributed online optimization method based on condition gradient
CN111460368A (en) Parallel Bayesian optimization method
CN111681264A (en) Real-time multi-target tracking method for monitoring scene
Dong et al. A single-loop algorithm for decentralized bilevel optimization
Zhang et al. On the Communication Complexity of Decentralized Bilevel Optimization
CN113342313A (en) Method for asynchronously updating linear classification model parameters in Spark MLlib based on parameter server
Wu et al. Decentralized state feedback adaptive tracking for a class of stochastic nonlinear large-scale systems
Pan et al. GSP Distributed Deep Learning Used for the Monitoring System
Li et al. Chunk Dynamic Updating for Group Lasso with ODEs
Zhu et al. Efficient Gaussian Kernel Microcluster Real-Time Clustering Method for Industrial Internet of Things (IIoT) Streams
Yu et al. Finite time estimation and containment control of second order perturbed directed networks
CN102682100B (en) Task execution sequence optimization method based on teleoperation
Yoon et al. GDFed: Dynamic Federated Learning for Heterogenous Device Using Graph Neural Network
WO2022170654A1 (en) Data encryption learning method suitable for dynamic distributed internet of things system
Wattanakitrungroj et al. Memory-less unsupervised clustering for data streaming by versatile ellipsoidal function

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200207