CN109508785A - An asynchronous parallel optimization method for neural network training - Google Patents
An asynchronous parallel optimization method for neural network training
- Publication number
- CN109508785A (publication) · CN201811265027.8A (application)
- Authority
- CN
- China
- Prior art keywords
- computer
- neural network
- computers
- data
- parameter
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The present invention proposes an asynchronous parallel optimization method for neural network training, belonging to the field of deep learning. The method first distributes the data pairs in the data set used to train the neural network among n computers; the communication topology of the n computers is determined, giving each computer the sets of computers it sends data to and receives data from; each computer initializes its own neural network and related parameters and then iteratively trains its network, sending its updated weighted parameters, weighted consistency variable, and total step-length variable after each iteration to all computers that communicate with it; after all computers finish iterative training, the final neural network parameters on any computer are the trained parameters, and the neural network optimization is complete. The invention is simple to implement, trains networks quickly, and scales well to large data sets and computer clusters.
Description
Technical field
The invention belongs to the field of deep learning, and in particular relates to an asynchronous parallel optimization method for neural network training.
Background technique
In recent years, artificial intelligence and related fields such as machine learning and deep learning have received widespread attention, and their applications in face recognition, face detection, natural language processing, speech recognition, and other areas have demonstrated powerful capabilities. A very common and important technique in artificial intelligence and deep learning is the artificial neural network (neural network for short). A neural network is composed of several neurons; a neuron can be regarded as a unit that receives several input signals and outputs one signal according to a certain rule, where this rule corresponds to a function with some adjustable parameters. Connecting neurons in series and in parallel according to certain rules constitutes a neural network; common network structures include fully connected neural networks, convolutional neural networks (CNNs), and so on. Fig. 1 is a schematic diagram of a simple three-layer neural network, in which the input layer and the hidden layer each have three neurons, the output layer has one neuron, and adjacent layers are fully connected. This network receives three input signals and outputs one signal. In general, a neural network can be regarded externally as a function that receives several input signals and outputs several signals according to a certain rule; this rule is controlled by the parameters of all neurons inside the network, which are also called the parameters of the neural network. Two neural networks with the same structure but different parameters usually compute different functions, i.e., they may output different signals for the same input signal. Training a neural network refers to the process of continually adjusting its parameters in some way so that the output of the network meets expectations.
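The three-layer network of Fig. 1 (three inputs, a three-neuron hidden layer, one fully connected output neuron) can be sketched as a parameterized function. This is a minimal illustrative sketch; the tanh activation and the random parameter values are assumptions, not specified by the text:

```python
import numpy as np

def forward(x, params):
    """Forward pass of a 3-3-1 fully connected network.
    params holds the adjustable parameters of every neuron."""
    W1, b1, W2, b2 = params          # hidden layer: 3x3 weights; output: 1x3 weights
    h = np.tanh(W1 @ x + b1)         # three hidden neurons (tanh is an illustrative choice)
    return W2 @ h + b2               # one output neuron

rng = np.random.default_rng(0)
params = (rng.normal(size=(3, 3)), rng.normal(size=3),
          rng.normal(size=(1, 3)), rng.normal(size=1))
y = forward(np.array([0.5, -1.0, 2.0]), params)  # three input signals -> one output signal
```

Changing the parameters changes the function the network computes, which is exactly what training adjusts.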
The most commonly used neural network training method at present is the stochastic gradient method based on a data set. A data set is a set of data with given input-output relationships and is related to the problem to be solved. For example, in an image classification problem, each item in the data set typically comprises a picture (the input) and the class of that picture (the output). Data sets are usually collected from real life and require some manual labeling. The purpose of training a neural network on a data set is to adjust the parameters of the network so that, for a given specific input (e.g., a picture), the output of the network is as close as possible to the corresponding output in the data set (e.g., the class of that picture). The stochastic gradient method is a widely used training method. At each step of training, a certain number of data items are first taken at random from the data set; these data are then used to compute the value of an objective function defined on the neural network and the gradient of the objective function with respect to the network parameters, and the parameters of the network are adjusted according to this gradient. This step is repeated during training until the output of the neural network is satisfactory. Thus, "stochastic" in the stochastic gradient method refers to randomly taking a portion of the data from the data set during training, and "gradient" refers to updating the neural network parameters using a gradient method.
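The stochastic gradient step described above can be sketched as follows. This is a minimal NumPy sketch using a hypothetical one-parameter least-squares model; the model, loss, and step size are illustrative assumptions, not the method of the invention:

```python
import numpy as np

def sgd_step(w, data, batch_size, step_size, grad_loss, rng):
    """One step of the stochastic gradient method: randomly take a certain
    number of data items, average their gradients, and descend."""
    idx = rng.choice(len(data), size=batch_size, replace=False)
    g = np.mean([grad_loss(w, data[j]) for j in idx], axis=0)
    return w - step_size * g

# Hypothetical problem: fit y = a*x with a single scalar parameter w.
rng = np.random.default_rng(0)
xs = rng.normal(size=200)
ys = 3.0 * xs + 0.01 * rng.normal(size=200)
data = list(zip(xs, ys))

def grad_loss(w, pair):
    x, y = pair
    return 2.0 * (w * x - y) * x   # gradient of (w*x - y)^2 with respect to w

w = 0.0
for _ in range(500):               # repeat until the output is satisfactory
    w = sgd_step(w, data, batch_size=16, step_size=0.05, grad_loss=grad_loss, rng=rng)
```

After training, `w` is close to the true coefficient 3.0, illustrating the random-minibatch plus gradient-update loop.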
Since the data sets used for training in real problems are usually very large and the corresponding neural networks are also large, training a neural network with the stochastic gradient method on a single computer or device requires a great deal of time to obtain satisfactory results. To accelerate training, one possible approach is to use multiple computers or devices to train the network simultaneously; we call this a distributed method. The main problem that a distributed method must solve is the exchange and synchronization of information between different computers, and two approaches are currently common. The first is to choose one of the computers as the host and use the rest as slaves; at each step of training, every slave sends its computed results (such as gradients) to the host, which then integrates the data obtained from the different slaves and updates the neural network. This method requires every slave to communicate with the host, so the computation speed is constrained by the processing speed and communication bandwidth of the host.
The second method is for every computer to keep a copy of the neural network. At each step of training, each computer first updates its own copy according to the data set stored on it, then sends the copy to several neighboring computers and receives their copies in turn, and finally fuses its own copy with the received copies according to some strategy. In this method there is no distinction between host and slaves, and the communication load of all computers during training is roughly the same, so the training speed is not easily limited by any single computer.
In this second method, although the overall training speed is not limited by the bandwidth of a single computer, all computers must update with the same clock and frequency, i.e., the training processes of the different computers are required to be synchronized. This means that a computer cannot update according to its own clock; it must wait until it has received the information of the other computers before updating. When the computing capabilities of the computers differ, this method forces the stronger, faster-updating computers to wait for the weaker, slower-updating computers, wasting resources.
Summary of the invention
The purpose of the invention is to overcome the shortcomings of existing neural network training methods and improve training efficiency by proposing an asynchronous parallel optimization method for neural network training. The invention is simple to implement, trains networks quickly, and scales well to large data sets and computer clusters.
The present invention proposes an asynchronous parallel optimization method for neural network training, characterized by comprising the following steps:
(1) Distribute the data pairs in the data set D used to train the neural network among n >= 1 computers, where the data set D comprises N input-output data pairs; the sub-data-set assigned to the i-th computer is denoted D_i, and the number of data pairs in D_i is N_i;
(2) determine the communication topology of the n computers, obtaining for each computer the set of computers that receive the data it sends (its receiving set, which includes the computer itself and whose size is recorded) and the set of computers that send data to it;
(3) each computer initializes its own neural network and related parameters; the specific steps are as follows:
(3-1) each computer initializes a neural network according to the network structure defined in advance; the initial parameters of the network on the i-th computer are denoted w_i(0), the iteration counter k_i of each computer is set to 0, and the initial values of each computer's consistency variable and total step-length variable are set to z_i(0) = 1 and l_i(0) = 1;
(3-2) each computer establishes three buffers W_rec,i, Z_rec,i, and L_rec,i, initialized as empty, which respectively store the weighted parameters, weighted consistency variables, and total step-length variables this computer receives from other computers;
(3-3) each computer defines its own trigger event;
(3-4) each computer i computes its initial weighted parameters and initial weighted consistency variable, and then sends them together with l_i(0) to all computers in its receiving set;
(4) each computer iteratively trains its own neural network until the optimization is complete; the specific steps are as follows:
(4-1) before its trigger event arrives, each computer i receives the weighted parameters, weighted consistency variables, and total step-length variables of the other computers and stores them in the corresponding buffers W_rec,i, Z_rec,i, and L_rec,i;
(4-2) when the trigger event of computer i arrives, the computer performs the following operations:
(4-2-1) computer i updates each variable according to the update formulas, where rho(t) is a step-size sequence, t is the index of the sequence, and the stochastic gradient of the loss function L(D, w) with respect to the network parameters w_i is used;
(4-2-2) computer i updates its weighted parameters and weighted consistency variable, and then sends them together with l_i(k_i + 1) to all computers in its receiving set;
(4-2-3) computer i increases its iteration counter k_i by 1 and returns to step (4-2-1); when computer i meets its iteration termination condition, it ends iterative training;
(4-3) after all computers have finished iterative training, the final neural network parameters on any computer are the trained parameters of the neural network, and the optimization is complete.
The features and beneficial effects of the present invention are as follows:
(1) In the present invention the neural network is trained by multiple computers simultaneously, so for the same network structure the training speed is improved. In addition, the distributed nature of the method makes it possible to use larger networks, i.e., the design of the neural network is not limited by the computing power of a single computer. Compared with the widely used single-computer training methods, this method can use a cluster of computers to significantly improve training speed.
(2) In the present invention all computers are equal: the communication volume of each computer at each step is roughly the same, each computer only needs to communicate with a small fraction of the other computers, and no host or similar central computer communicating with every other computer is needed. Compared with the commonly used training method in which one host communicates with all slaves, this method prevents the communication volume of any single computer from becoming the bottleneck of the whole system, and it is better suited to cases where the number of computers is large.
(3) In the present invention each computer updates its own neural network copy according to its own clock or trigger events, without keeping synchronized with the other computers; the method is an asynchronous algorithm. In an asynchronous algorithm each computer updates the network according to its own clock and frequency, and communication between computers is allowed to be delayed, so the computing power of every computer can be fully utilized and no computer idles waiting for synchronization. This accelerates the training process and makes the method easier to scale to a large number of computers.
(4) The invention can in the future be applied to image classification, object detection, reinforcement learning, and other applications of neural network technology, and therefore has high application value.
Description of the drawings
Fig. 1 is a three-layer neural network schematic diagram.
Fig. 2 is a schematic diagram of how the loss function error changes over time for the method of the present invention and for the distributed stochastic gradient method.
Specific embodiment
The present invention proposes an asynchronous parallel optimization method for neural network training, described in further detail below with reference to the accompanying drawings and a specific embodiment.
The present invention proposes an asynchronous parallel optimization method for neural network training, taking n computers training one neural network as an example. Let (x_i, y_i) represent an input-output data pair, where x_i is the input feature vector and y_i is the true output corresponding to x_i. Let D = {(x_i, y_i), i = 1, ..., N} denote a data set of N data pairs. A neural network can be regarded as a function that takes x_i as input and outputs an estimate of the true output y_i, i.e., y^_i = f(x_i, w), where w are the parameters of the neural network to be adjusted. Networks of different structures correspond to different functions f(x_i, w). The purpose of training the neural network is to make the gap between y^_i and y_i as small as possible; to this end, the specific objective of training is to minimize a loss function, here taken as the squared loss L(D, w) = (1/N) * sum_{i=1..N} ||f(x_i, w) - y_i||^2. Other alternative loss functions include the cross-entropy loss and the losses obtained by adding a regularization term to either of these loss functions, e.g., L(D, w) + lambda * ||w||^2, where lambda is a fixed positive number whose size is chosen according to the practical problem.
The present invention proposes an asynchronous parallel optimization method for neural network training, comprising the following steps:
(1) Distribute the data set used to train the neural network among n computers.
According to the computing power, storage capacity, or other practical factors of the different computers, distribute the data pairs in the data set D, which has N data pairs, among n >= 1 computers (general-purpose computers with a CPU, memory, and hard disk); in practical problems N is usually much larger than n. Any allocation scheme may be used; for example, the data can be distributed evenly, in which case every computer obtains N/n data pairs. The sub-data-set assigned to the i-th computer is denoted D_i, and its number of data pairs is N_i. During training each computer can only access the data assigned to it, i.e., the i-th computer can only access the data set D_i.
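Step (1) can be sketched as follows. This is a minimal Python sketch of the even split mentioned above; the shuffle before splitting is an illustrative choice, since the text allows any allocation scheme:

```python
import numpy as np

def distribute(dataset, n, seed=0):
    """Distribute N data pairs among n computers (step (1)).
    Even split after a shuffle; any allocation scheme is allowed."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(dataset))
    # D_1 .. D_n: each computer can later access only its own sub-data-set.
    return [[dataset[j] for j in part] for part in np.array_split(order, n)]

subsets = distribute(list(range(100)), n=4)  # 4 computers, 25 pairs each
```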
(2) Determine the communication topology of the n computers, obtaining for each computer the set of computers it can send data to and the set of computers it receives data from.
Determine the communication pattern of the n computers according to the actual situation, i.e., which computers each computer can send data to or receive data from. Ideally, every computer could send data to all other computers, but bandwidth is limited in practice and doing so may lower overall efficiency, so each computer may only send data to a few computers. For convenience of notation, the computers are first numbered 1, ..., n. For each computer i, the set of computers that can receive the data sent by the i-th computer (its receiving set) is then defined, with the stipulation that this set includes the i-th computer itself; the number of computers in this set is also recorded. Likewise, the set of computers that send data to the i-th computer is defined.
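Step (2) can be sketched as follows: given each computer's outgoing links, build the receiving set (stipulated to include the computer itself) and the corresponding sending set. The ring topology in the example is a hypothetical choice:

```python
def neighbor_sets(send_to):
    """Given, for each computer i, the list of computers it sends data to,
    build its receiving set (including i itself, per the stipulation)
    and, for each computer j, the set of computers that send data to j."""
    n = len(send_to)
    receives_from_i = [set(dests) | {i} for i, dests in enumerate(send_to)]
    sends_to_j = [set() for _ in range(n)]
    for i, dests in enumerate(receives_from_i):
        for j in dests:
            sends_to_j[j].add(i)
    return receives_from_i, sends_to_j

# Hypothetical topology: a ring of 4 computers, each sending to the next one.
recv, send = neighbor_sets([[1], [2], [3], [0]])
```

Because bandwidth is limited, each computer here sends to only one neighbor instead of to all others.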
(3) Each computer initializes its own neural network and related parameters; the specific steps are as follows:
(3-1) Each computer initializes a neural network according to the network structure defined in advance, where the parameters of the network may be chosen randomly at initialization. The initial parameters of the network on the i-th computer are denoted w_i(0); here and below the subscript i indicates the i-th computer. The network structures on different computers are identical, but the parameters may differ. In addition, each computer sets its iteration counter k_i to 0 and then sets the initial values of its consistency variable and total step-length variable to z_i(0) = 1 and l_i(0) = 1.
(3-2) Each computer establishes three buffers W_rec,i, Z_rec,i, and L_rec,i in its own memory, which are used to store the weighted parameters, weighted consistency variables, and total step-length variables received from other computers.
(3-3) Each computer also defines its own trigger event according to the actual situation (the trigger events of different computers may differ); the operations in step (4) are executed on every trigger. Trigger events can be defined in different ways depending on the situation, e.g., triggering once per second, or triggering once each time data is received.
(3-4) Each computer i computes its initial weighted parameters and initial weighted consistency variable, and then sends them together with l_i(0) to all computers in its receiving set. Here the 0 in parentheses denotes the initial value of the corresponding variable.
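The state each computer holds after step (3) can be sketched as follows. This is a minimal Python sketch; the dictionary layout and the normal-distribution initialization are illustrative assumptions:

```python
import numpy as np

def init_computer(i, shape, seed=None):
    """State of computer i after step (3): random network parameters w_i(0),
    iteration counter k_i = 0, consistency variable z_i(0) = 1, total
    step-length variable l_i(0) = 1, and three empty receive buffers."""
    rng = np.random.default_rng(seed)
    return {
        "i": i,
        "w": rng.normal(size=shape),            # w_i(0), chosen randomly
        "k": 0,                                  # iteration counter k_i
        "z": 1.0,                                # consistency variable z_i(0)
        "l": 1.0,                                # total step-length variable l_i(0)
        "W_rec": [], "Z_rec": [], "L_rec": [],   # buffers, initialized as empty
    }

state = init_computer(1, shape=(3,), seed=0)
```

Every computer runs the same initialization with the same network structure, but its random parameters w_i(0) may differ.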
(4) Each computer iteratively trains its own neural network until the optimization is complete; the specific steps are as follows:
(4-1) Each computer i waits for the arrival of its predefined trigger event. During this period, each computer receives the weighted parameters, weighted consistency variables, and total step-length variables of the other computers that send to it, and stores them in the corresponding buffers W_rec,i, Z_rec,i, and L_rec,i (each computer's first transmission consists of its initial values, so what is first received is also initial values). Before a computer's first trigger event arrives, its three parameters do not change; after the first trigger, the computer updates these three parameters according to the following steps, and they change after every subsequent trigger. The sending frequency of each computer is not fixed and can be chosen according to the actual situation, although in practice it is recommended that the sending frequencies of all computers be kept as consistent as possible, which gives better results. Each computer only needs to keep one group of parameters, but this group of parameters changes continually during optimization.
For example, before the next trigger event of computer i arrives, if computer i receives data sent twice by computer j, then the two received weighted parameters are stored in W_rec,i, the two weighted consistency variables are stored in Z_rec,i, and l_j(5) and l_j(6) are stored in L_rec,i. The trigger events of different computers are mutually independent; before a computer (say j) is triggered, all parameters on that computer remain unchanged and it sends no data, possibly only waiting for data to arrive.
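The buffering behavior between triggers can be sketched as follows. This is a minimal Python sketch of steps (4-1)/(4-2) message handling only; the fusion formulas applied at a trigger are the patent's update equations and are outside this sketch, and the string placeholders stand in for the actual arrays:

```python
def receive(state, message):
    """Between triggers, store an incoming (weighted parameters, weighted
    consistency variable, total step-length variable) triple in the buffers."""
    w, z, l = message
    state["W_rec"].append(w)
    state["Z_rec"].append(z)
    state["L_rec"].append(l)

def drain_buffers(state):
    """At a trigger, take everything received since the last trigger
    and clear the buffers for the next period."""
    received = (state["W_rec"], state["Z_rec"], state["L_rec"])
    state["W_rec"], state["Z_rec"], state["L_rec"] = [], [], []
    return received

# Example from the text: computer i receives two messages from computer j
# (its iterations 5 and 6) before i's next trigger event.
state = {"W_rec": [], "Z_rec": [], "L_rec": []}
receive(state, ("w_j(5)", "z_j(5)", 5.0))
receive(state, ("w_j(6)", "z_j(6)", 6.0))
ws, zs, ls = drain_buffers(state)
```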
(4-2) Once computer i is triggered, it performs the following operations:
(4-2-1) Computer i computes and updates each variable according to the update formulas, where rho(t) is a predefined step-size sequence, such as a constant sequence rho(t) = rho or a decreasing sequence, and t is the index of the sequence; the stochastic gradient of the loss function L(D, w) with respect to the neural network parameters w_i is used. The quantities c_i(k_i + 1), m_i(k_i + 1), and alpha_i(k_i + 1) appearing in the formulas are temporary variables defined to simplify the formulas when updating the neural network parameters w_i(k_i + 1), the consistency variable z_i(k_i + 1), and the total step-length variable l_i(k_i + 1); they need not be sent to other computers and have no specific physical meaning.
When the loss function is the squared loss, the i-th computer computes its stochastic gradient as follows. First, p data pairs are randomly selected from its sub-data-set D_i and denoted (x_{i,1}, y_{i,1}), (x_{i,2}, y_{i,2}), ..., (x_{i,p}, y_{i,p}), where p is a positive integer specified in advance, such as 16 or 32. Then the outputs of the neural network corresponding to these data are computed, i.e., y^_{i,j} = f(x_{i,j}, m_i(k_i + 1)) for j = 1, ..., p. Finally the stochastic gradient is computed by averaging, over the p selected data pairs, the gradient of the loss with respect to w; here the gradient of the network output with respect to w is evaluated at input x_{i,j} and parameter value m_i(k_i + 1). When another loss function is used, the stochastic gradient is correspondingly changed to the stochastic gradient of that loss function over the same p data pairs; for example, when the cross-entropy loss or a regularized loss is used as the loss function, the corresponding gradient expression is adopted.
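The stochastic-gradient computation just described (sample p pairs from D_i, evaluate the network at parameter value m, average the per-pair gradients of the squared loss) can be sketched as follows. The scalar model f(x, w) = w * x is a hypothetical stand-in for the neural network:

```python
import numpy as np

def stochastic_gradient(D_i, m, p, f, grad_f, seed=0):
    """Stochastic gradient of the squared loss on computer i: draw p data
    pairs from the sub-data-set D_i, evaluate the network at parameter
    value m, and average the per-pair gradients."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(D_i), size=p, replace=False)
    g = np.zeros_like(m)
    for j in idx:
        x, y = D_i[j]
        g += 2.0 * (f(x, m) - y) * grad_f(x, m)  # chain rule for ||f(x, m) - y||^2
    return g / p

# Hypothetical scalar network f(x, w) = w * x, so grad_w f(x, w) = x.
f = lambda x, w: w * x
grad_f = lambda x, w: x
D_i = [(float(x), 3.0 * x) for x in range(1, 21)]  # true relation y = 3x
g = stochastic_gradient(D_i, m=np.array(0.0), p=16, f=f, grad_f=grad_f)
```

Since m = 0 underestimates the true coefficient 3, the gradient is negative, pushing the subsequent update toward larger parameter values.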
(4-2-2) Computer i updates its weighted parameters and weighted consistency variable, and then sends them, together with the updated total step-length variable l_i(k_i + 1), to all computers in its receiving set.
(4-2-3) Computer i increases its iteration counter k_i by 1 and returns to step (4-2-1). When the termination condition defined in advance by computer i is met, the computer ends iterative training. The termination condition can be, for example, that the training time exceeds a predetermined time, or that l_i(k_i + 1) exceeds a number specified in advance.
(4-3) After the iterative training on all computers is finished, the final neural network parameters w_i(k_i + 1) on any computer can serve as the trained parameters of the neural network, and the optimization is complete.
In the present invention, if a computer reaches its predefined stopping condition first, it may end training first. In practical operation, the times at which different computers end training do not differ much and can approximately be regarded as simultaneous.
Fig. 2 compares, for one embodiment of the invention, the loss function error over time of this method and of the commonly used distributed stochastic gradient method. The abscissa is the training time in seconds, and the ordinate is the value of the loss function during training on a logarithmic scale. The solid line shows the performance of the proposed method in this example, and the dotted line shows the performance of the distributed stochastic gradient method. It can be seen from the figure that the convergence speed of the proposed algorithm is much faster than that of the distributed stochastic gradient method, demonstrating its effectiveness; its disadvantage is that the error oscillates more during training.
Claims (1)
1. An asynchronous parallel optimization method for neural network training, characterized by comprising the following steps:
(1) distribute the data pairs in the data set D used to train the neural network among n >= 1 computers, where the data set D comprises N input-output data pairs; the sub-data-set assigned to the i-th computer is denoted D_i, and the number of data pairs in D_i is N_i;
(2) determine the communication topology of the n computers, obtaining for each computer the set of computers that receive the data it sends (its receiving set, which includes the computer itself and whose size is recorded) and the set of computers that send data to it;
(3) each computer initializes its own neural network and related parameters; the specific steps are as follows:
(3-1) each computer initializes a neural network according to the network structure defined in advance; the initial parameters of the network on the i-th computer are denoted w_i(0), the iteration counter k_i of each computer is set to 0, and the initial values of each computer's consistency variable and total step-length variable are set to z_i(0) = 1 and l_i(0) = 1;
(3-2) each computer establishes three buffers W_rec,i, Z_rec,i, and L_rec,i, initialized as empty, which respectively store the weighted parameters, weighted consistency variables, and total step-length variables received from other computers;
(3-3) each computer defines its own trigger event;
(3-4) each computer i computes its initial weighted parameters and initial weighted consistency variable, and then sends them together with l_i(0) to all computers in its receiving set;
(4) each computer iteratively trains its own neural network until the optimization is complete; the specific steps are as follows:
(4-1) before its trigger event arrives, each computer i receives the weighted parameters, weighted consistency variables, and total step-length variables of the other computers and stores them in the corresponding buffers W_rec,i, Z_rec,i, and L_rec,i;
(4-2) when the trigger event of computer i arrives, the computer performs the following operations:
(4-2-1) computer i updates each variable according to the update formulas, where rho(t) is a step-size sequence, t is the index of the sequence, and the stochastic gradient of the loss function L(D, w) with respect to the network parameters w_i is used;
(4-2-2) computer i updates its weighted parameters and weighted consistency variable, and then sends them together with l_i(k_i + 1) to all computers in its receiving set;
(4-2-3) computer i increases its iteration counter k_i by 1 and returns to step (4-2-1); when computer i meets its iteration termination condition, it ends iterative training;
(4-3) after all computers have finished iterative training, the final neural network parameters on any computer are the trained parameters of the neural network, and the optimization is complete.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811265027.8A CN109508785A (en) | 2018-10-29 | 2018-10-29 | An asynchronous parallel optimization method for neural network training |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109508785A true CN109508785A (en) | 2019-03-22 |
Family
ID=65746922
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811265027.8A Pending CN109508785A (en) | 2018-10-29 | 2018-10-29 | An asynchronous parallel optimization method for neural network training |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109508785A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110175680A (en) * | 2019-04-03 | 2019-08-27 | Xidian University | Internet of things data analysis method using distributed asynchronously-updated online machine learning |
CN111582494A (en) * | 2020-04-17 | 2020-08-25 | 浙江大学 | Hybrid distributed machine learning updating method based on delay processing |
CN112633480A (en) * | 2020-12-31 | 2021-04-09 | 中山大学 | Calculation optimization method and system of semi-asynchronous parallel neural network |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150302295A1 (en) * | 2012-07-31 | 2015-10-22 | International Business Machines Corporation | Globally asynchronous and locally synchronous (GALS) neuromorphic network |
CN106293942A (en) * | 2016-08-10 | 2017-01-04 | Suzhou Institute of USTC | Neural network load-balancing optimization method and system based on multiple machines and multiple cards |
CN107018184A (en) * | 2017-03-28 | 2017-08-04 | Huazhong University of Science and Technology | Grouped synchronization optimization method and system for distributed deep neural network clusters |
CN107209872A (en) * | 2015-02-06 | 2017-09-26 | Google | Distributed training of reinforcement learning systems |
CN107784364A (en) * | 2016-08-25 | 2018-03-09 | Microsoft Technology Licensing LLC | Asynchronous training of machine learning models |
CN108073986A (en) * | 2016-11-16 | 2018-05-25 | Beijing Sogou Technology Development Co. | Neural network model training method, apparatus, and electronic device |
CN108460457A (en) * | 2018-03-30 | 2018-08-28 | Suzhou Nazhi Tiandi Intelligent Technology Co. | Multi-machine multi-card hybrid-parallel asynchronous training method for convolutional neural networks |
US20180293492A1 (en) * | 2017-04-10 | 2018-10-11 | Intel Corporation | Abstraction library to enable scalable distributed machine learning |
Non-Patent Citations (4)
Title |
---|
JIAQI ZHANG et al.: "AsySPA: An Exact Asynchronous Algorithm for Convex Optimization Over Digraphs", arXiv * |
TIANYU WU et al.: "Decentralized Consensus Optimization with Asynchrony and Delays", arXiv * |
WILLIAM CHAN: "Distributed Asynchronous Optimization of Convolutional Neural Networks", Interspeech * |
XIE Pei: "Research progress on networked distributed convex optimization algorithms", Control Theory & Applications * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110175680A (en) * | 2019-04-03 | 2019-08-27 | Xidian University | Internet of Things data analysis method using distributed asynchronous-update online machine learning |
CN110175680B (en) * | 2019-04-03 | 2024-01-23 | Xidian University | Internet of Things data analysis method using distributed asynchronous-update online machine learning |
CN111582494A (en) * | 2020-04-17 | 2020-08-25 | Zhejiang University | Hybrid distributed machine learning updating method based on delay processing |
CN111582494B (en) * | 2020-04-17 | 2023-07-07 | Zhejiang University | Hybrid distributed machine learning updating method based on delay processing |
CN112633480A (en) * | 2020-12-31 | 2021-04-09 | Sun Yat-sen University | Computation optimization method and system for semi-asynchronous parallel neural networks |
CN112633480B (en) * | 2020-12-31 | 2024-01-23 | Sun Yat-sen University | Computation optimization method and system for semi-asynchronous parallel neural networks |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Zhong et al. | Practical block-wise neural network architecture generation | |
CN109948029B (en) | Neural-network adaptive deep-hashing image retrieval method | |
Zhang et al. | Adaptive federated learning on non-iid data with resource constraint | |
CN110134636B (en) | Model training method, server, and computer-readable storage medium | |
CN106297774B (en) | Distributed parallel training method and system for neural network acoustic models | |
CN109508785A (en) | Asynchronous parallel optimization method for neural network training | |
CN111858009A (en) | Task scheduling method of mobile edge computing system based on migration and reinforcement learning | |
Cai et al. | Dynamic sample selection for federated learning with heterogeneous data in fog computing | |
CN114584581B (en) | Federated learning system and federated learning training method for smart-city Internet of Things (IoT) information fusion | |
CN108446770B (en) | Distributed machine learning slow node processing system and method based on sampling | |
CN110362380A (en) | Multi-objective optimization virtual machine deployment method for cyber ranges | |
Zhan et al. | Pipe-torch: Pipeline-based distributed deep learning in a gpu cluster with heterogeneous networking | |
CN109063041A (en) | Method and device for relational network graph embedding | |
Tanaka et al. | Automatic graph partitioning for very large-scale deep learning | |
Nie et al. | HetuMoE: An efficient trillion-scale mixture-of-expert distributed training system | |
CN114399018B (en) | EfficientNet-based ceramic fragment classification method using sparrow search optimization with a rotation control strategy | |
Hu et al. | Improved methods of BP neural network algorithm and its limitation | |
CN109636709A (en) | Graph computation method suitable for heterogeneous platforms | |
CN116702925A (en) | Distributed random gradient optimization method and system based on event triggering mechanism | |
Ho et al. | Adaptive communication for distributed deep learning on commodity GPU cluster | |
Cheng et al. | Bandwidth reduction using importance weighted pruning on ring allreduce | |
CN114995157A (en) | Anti-synchronization optimization control method of multi-agent system under cooperative competition relationship | |
CN113672684A (en) | Hierarchical user training management system and method for non-IID data | |
Lu et al. | Adaptive asynchronous federated learning | |
CN108334939B (en) | Convolutional neural network acceleration device and method based on multi-FPGA annular communication |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 2019-03-22 |