CN109508785A - An asynchronous parallel optimization method for neural network training - Google Patents

An asynchronous parallel optimization method for neural network training

Info

Publication number
CN109508785A
CN109508785A (application CN201811265027.8A)
Authority
CN
China
Prior art keywords
computer
neural network
computers
data
parameter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811265027.8A
Other languages
Chinese (zh)
Inventor
游科友
张家绮
宋士吉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN201811265027.8A
Publication of CN109508785A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/06: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Abstract

The present invention proposes an asynchronous parallel optimization method for neural network training, belonging to the field of deep learning. The method first distributes the data pairs in the data set used to train the neural network across n computers. A communication topology of the n computers is then determined, giving each computer the set of computers it sends data to and the set it receives data from. Each computer initializes its own neural network and the related parameters and then trains its network iteratively; after each iteration it sends its updated weighted parameters, weighted consistency variable and total step-size variable to all computers it communicates with. After all computers have finished their iterative training, the final neural network parameters on any one of the computers are the trained parameters, and the optimization of the neural network is complete. The invention is simple to implement, trains networks quickly, and scales well to large data sets and computer clusters.

Description

An asynchronous parallel optimization method for neural network training
Technical field
The invention belongs to the field of deep learning, and in particular relates to an asynchronous parallel optimization method for neural network training.
Background technique
In recent years, artificial intelligence and the related fields of machine learning and deep learning have received widespread attention. Their applications in face recognition, face detection, natural language processing, speech recognition and other fields have also demonstrated powerful capabilities.
A very common and important technique in artificial intelligence and deep learning is the artificial neural network (neural network for short). A neural network consists of several neurons; a neuron can be viewed as a unit that receives several input signals and outputs one signal according to some rule, and this rule corresponds to a number of adjustable parameters. Connecting neurons in series and in parallel according to certain rules yields a neural network; common structures include the fully connected neural network and the convolutional neural network (CNN). Fig. 1 is a schematic diagram of a simple three-layer neural network in which the input layer and the hidden layer each have three neurons, the output layer has one neuron, and adjacent layers are fully connected. The network receives three input signals and outputs one signal. In general, a neural network can be viewed externally as a function that receives several input signals and outputs several signals according to some rule; this rule is controlled by the parameters of all the neurons inside the network, which are also called the parameters of the neural network. Two networks with the same structure but different parameters generally realize different functions, i.e., they may produce different outputs for the same input signal. Training a neural network refers to the process of repeatedly adjusting its parameters in some way so that the output of the network meets expectations.
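As an illustration only (this example is not part of the patent), the three-layer network of Fig. 1 can be written as a parameterized function f(x, w); the tanh activation, the parameter layout and all names below are assumptions made for this sketch.

```python
import numpy as np

# Minimal sketch of the Fig. 1 network: 3 inputs -> 3 hidden neurons -> 1 output.
# The activation function and parameter layout are assumptions for illustration.
def init_params(rng, n_in=3, n_hidden=3, n_out=1):
    return {
        "W1": 0.1 * rng.standard_normal((n_hidden, n_in)),
        "b1": np.zeros(n_hidden),
        "W2": 0.1 * rng.standard_normal((n_out, n_hidden)),
        "b2": np.zeros(n_out),
    }

def forward(x, w):
    """The network viewed as a function f(x, w): three inputs in, one output out."""
    h = np.tanh(w["W1"] @ x + w["b1"])   # fully connected hidden layer
    return w["W2"] @ h + w["b2"]         # fully connected output layer

rng = np.random.default_rng(0)
w = init_params(rng)
print(forward(np.array([0.2, -1.0, 0.5]), w))   # one output signal for three inputs
```

Changing the parameters w changes the function the network realizes, which is exactly what training adjusts.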
The most commonly used training method for neural networks is the stochastic gradient method based on a data set. A data set is a given collection of input-output data pairs and is tied to the problem to be solved. In an image classification problem, for example, each item in the data set typically contains a picture (the input) and the class of that picture (the output). Data sets are usually collected from real life and require some manual labeling. The goal of data-set-based training of a neural network is to adjust the parameters of the network so that, for a given input (such as a picture), the output of the network is as close as possible to the corresponding output in the data set (such as the class of that picture). The stochastic gradient method is a widely used training method. At each step it randomly draws a certain number of data pairs from the data set, uses them to compute the value of an objective function defined on the neural network and the gradient of that objective with respect to the network parameters, and then adjusts the parameters according to the gradient. This step is repeated during training until the output of the neural network is satisfactory. The word "stochastic" therefore refers to drawing a random subset of the data at each step, and "gradient" refers to updating the network parameters with a gradient method.
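The following minimal sketch, again not taken from the patent, illustrates the "stochastic" and "gradient" parts of the method; a linear model with a squared loss is used so the gradient has a closed form, whereas a neural network would obtain it by backpropagation.

```python
import numpy as np

# Illustrative minibatch stochastic gradient descent on a toy linear model.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 3))                       # inputs, N = 1000
y = X @ np.array([1.0, -2.0, 0.5]) + 0.01 * rng.standard_normal(1000)

w = np.zeros(3)                                          # parameters to be trained
batch_size, step_size = 32, 0.05
for step in range(500):
    idx = rng.choice(len(X), batch_size, replace=False)  # "stochastic": random subset of the data
    xb, yb = X[idx], y[idx]
    grad = 2.0 / batch_size * xb.T @ (xb @ w - yb)       # gradient of the minibatch squared loss
    w -= step_size * grad                                # "gradient": gradient-descent update
print(w)   # close to [1.0, -2.0, 0.5]
```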
Because the data sets used in real problems are generally very large, and the corresponding neural networks are also large, training a neural network with the stochastic gradient method on a single computer or device requires a large amount of time to obtain a satisfactory result. One possible way to accelerate training is to use multiple computers or devices to train the network simultaneously, which we call a distributed method. The main problem a distributed method must solve is the exchange and synchronization of information between the different computers, and two approaches are mainly used at present. The first chooses one of the computers as the host and uses the remaining computers as slaves: at each step of training every slave sends its computed results, such as gradients, to the host, which integrates the data obtained from the different slaves and then updates the neural network. This approach requires every slave to be able to communicate with the host, so the training speed is limited by the processing speed and communication bandwidth of the host.
In the second approach, every computer keeps a copy of the neural network. At each step of training, every computer first updates its own copy using the data assigned to it, then sends the copy to several neighboring computers, receives their copies in return, and finally fuses its own copy with the received copies according to some strategy. There is no distinction between host and slave in this approach, and the communication load of all computers during training is essentially the same, so the training speed is not easily limited by any single computer.
In the second approach, although the overall training speed is not limited by the bandwidth of a single computer, all computers must update with the same clock and frequency, i.e., the training processes of the different computers are required to be synchronized. This prevents each computer from updating according to its own clock: it must wait until the information from the other computers has been received before updating. When the computing capabilities of the computers differ, the fast-updating, more capable computers are forced to wait for the slower, less capable ones, wasting resources.
Summary of the invention
The purpose of the invention is to overcome the shortcomings of existing neural network training methods and improve training efficiency by proposing an asynchronous parallel optimization method for neural network training. The invention is simple to implement, trains networks quickly, and scales well to large data sets and computer clusters.
The present invention proposes an asynchronous parallel optimization method for neural network training, characterized by comprising the following steps:
(1) Distribute the data pairs in the data set D used to train the neural network across n ≥ 1 computers, where the data set D contains N input-output data pairs; the sub-data-set assigned to the i-th computer is denoted D_i, and the number of data pairs in D_i is N_i;
(2) Determine the communication topology of the n computers, obtaining for every computer the set of computers it receives data from and the set of computers it sends data to; the set of computers that receive the data sent by the i-th computer is its out-neighbor set, which includes the i-th computer itself and whose size is the out-degree of the i-th computer; the set of computers that send data to the i-th computer is its in-neighbor set;
(3) Every computer initializes its own neural network and the related parameters, specifically as follows:
(3-1) Every computer initializes a neural network according to the predefined network structure; the initial parameters of the network on the i-th computer are denoted w_i(0), the iteration counter k_i of every computer is set to 0, and the consistency variable and total step-size variable of every computer are initialized to z_i(0) = 1 and l_i(0) = 1;
(3-2) Every computer establishes three buffers, W_rec,i, Z_rec,i and L_rec,i, initialized to be empty; the three buffers respectively store the weighted parameters, weighted consistency variables and total step-size variables this computer receives from other computers;
(3-3) Every computer defines its own trigger event;
(3-4) Every computer i computes its initial weighted parameters and initial weighted consistency variable, and then sends them, together with l_i(0), to all of its out-neighbors;
(4) Every computer trains its own neural network iteratively until the optimization of the neural network is complete, specifically as follows:
(4-1) Before its trigger event arrives, every computer i receives the weighted parameters, weighted consistency variables and total step-size variables of its in-neighbors and stores them in the corresponding buffers W_rec,i, Z_rec,i and L_rec,i;
(4-2) When the trigger event of computer i arrives, this computer performs the following operations:
(4-2-1) Computer i updates each of its variables according to the update formulas of the method, where ρ(t) is a step-size sequence, t is the index into the sequence, and the stochastic gradient of the loss function L(D, w) with respect to the neural network parameters w_i is used in the update;
(4-2-2) Computer i updates its weighted parameters and weighted consistency variable, and then sends them, together with l_i(k_i + 1), to all of its out-neighbors;
(4-2-3) Computer i increments the iteration counter k_i by 1 and returns to step (4-2-1); when computer i meets its iteration termination condition, this computer ends its iterative training;
(4-3) After all computers have ended their iterative training, the final neural network parameters on any one of the computers are the trained parameters of the neural network, and the optimization of the neural network is complete.
The features and beneficial effects of the present invention are:
(1) In the present invention the neural network is trained by multiple computers simultaneously, which, for the same network structure, increases training speed. In addition, the distributed nature of the method makes it possible to use a larger network, i.e., the design of the neural network is not limited by the computing capability of a single computer. Compared with the widely used single-computer training methods, this method can use a cluster of computers to significantly increase training speed.
(2) In the present invention all computers are equal: the amount of communication per computer at each step is roughly the same, and every computer only needs to communicate with a small fraction of the computers; no host or similar central computer is needed to communicate with every other computer. Compared with the commonly used scheme in which one computer is the host and the rest are slaves that all communicate with the host, this method prevents the communication volume of any single computer from becoming the bottleneck of the whole system, and it is better suited to cases where the number of computers is especially large.
(3) In the present invention every computer updates its own copy of the neural network according to its own clock or trigger events, without keeping synchronized with the other computers, i.e., the algorithm is asynchronous. In an asynchronous algorithm every computer updates the network at its own clock and frequency, and communication delays between computers are allowed, so the computing capability of every computer can be fully exploited, no computer sits idle waiting for synchronization, the training process is accelerated, and the method is easier to scale to a large number of computers.
(4) The present invention can in the future be applied to image classification, object detection, reinforcement learning and other applications that use neural network techniques, and therefore has high application value.
Detailed description of the invention
Fig. 1 is a schematic diagram of a three-layer neural network.
Fig. 2 is a schematic diagram of how the loss function error changes over time for the method of the present invention and for the distributed stochastic gradient method.
Specific embodiment
The present invention proposes an asynchronous parallel optimization method for neural network training, which is further described below with reference to the accompanying drawings and a specific embodiment.
The present invention proposes an asynchronous parallel optimization method for neural network training; as an example, consider training one neural network with n computers. Let (x_i, y_i) denote one input-output data pair, where x_i is the input feature vector and y_i is the true output corresponding to x_i. Let D = {(x_i, y_i), i = 1, ..., N} denote a data set of N data pairs. The neural network can be viewed as a function whose input is x_i and whose output is an estimate f(x_i, w) of the true output y_i, where w is the parameter of the neural network to be adjusted; networks of different structures correspond to different functions f(x_i, w). The purpose of training the neural network is to make the gap between f(x_i, w) and y_i as small as possible. To this end, the specific objective of training is to minimize a loss function, for example a squared-error loss between the network outputs and the true outputs over the data set; other common choices are the cross-entropy loss and either of these two losses with an added regularization term, where the regularization weight λ is a fixed positive number whose size is chosen according to the practical problem.
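The loss functions above appear as figures in the patent; the sketch below shows the standard forms the surrounding text appears to describe, namely a mean-squared-error loss, a cross-entropy loss, and either loss with an added λ-weighted regularization term. The normalization choices are assumptions of this example.

```python
import numpy as np

# Hedged sketch of the loss functions described in the text (the patent gives
# them as figures): mean-squared error, cross-entropy, and either loss plus a
# lambda * ||w||^2 regularization term.
def mse_loss(y_hat, y):
    """Mean-squared-error loss over the data pairs."""
    return np.mean(np.sum((y_hat - y) ** 2, axis=-1))

def cross_entropy_loss(y_hat, y, eps=1e-12):
    """Cross-entropy loss; y are one-hot labels, y_hat are predicted probabilities."""
    return -np.mean(np.sum(y * np.log(y_hat + eps), axis=-1))

def l2_penalty(w, lam):
    """Regularization term: lambda times the squared norm of all parameters."""
    return lam * sum(np.sum(p ** 2) for p in w.values())

def regularized_loss(base_loss, y_hat, y, w, lam):
    """Either of the two losses above with the regularization term added."""
    return base_loss(y_hat, y) + l2_penalty(w, lam)
```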
The asynchronous parallel optimization method for neural network training proposed by the present invention comprises the following steps:
(1) data set for being used to train neural network is assigned to n platform computer;
According to the computing capability, storage capacity or other practical factors of the different computers, the data in the data set D, which contains N data pairs, is distributed across n ≥ 1 computers (general-purpose computers, typically with a CPU, memory and hard disk); in practical problems N is usually much larger than n. Any distribution scheme can be used; for example, the data can be distributed evenly across the n computers so that every computer receives N/n data pairs. The sub-data-set assigned to the i-th computer is denoted D_i and contains N_i data pairs. During training, every computer can only access the data assigned to it, i.e., the i-th computer can only access D_i.
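A minimal sketch of step (1), assuming an even split of the data (the patent allows any distribution scheme); the function and variable names are introduced only for this example.

```python
import numpy as np

# Minimal sketch of step (1): split the N data pairs of D evenly across n computers.
def partition_dataset(X, y, n, rng):
    """Return a list of n sub-data-sets [(X_1, y_1), ..., (X_n, y_n)]."""
    chunks = np.array_split(rng.permutation(len(X)), n)
    return [(X[idx], y[idx]) for idx in chunks]

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 3))           # N = 1000 input feature vectors
y = rng.standard_normal((1000, 1))           # corresponding true outputs
subsets = partition_dataset(X, y, n=4, rng=rng)
print([len(Xi) for Xi, _ in subsets])        # the sizes N_i of D_1, ..., D_4
```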
(2) Determine the communication topology of the n computers, obtaining for every computer the set of computers it can receive data from and the set it sends data to.
The communication pattern of the n computers is determined according to the actual situation, i.e., which computers each computer can send data to and which computers it can receive data from. Ideally, every computer could send data to every other computer, but in practice bandwidth is limited and full communication may lower the overall efficiency, so every computer may only send data to a few other computers. For convenience, the computers are first numbered 1, ..., n. For the i-th computer, the set of computers that can receive the data it sends is its out-neighbor set, which by convention includes the i-th computer itself; the number of computers in this set is its out-degree. The set of computers that send data to the i-th computer is its in-neighbor set.
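A minimal sketch of step (2), assuming a directed-ring topology; the patent leaves the topology to the implementer, and the names out_neighbors and in_neighbors are introduced only for this example.

```python
# Minimal sketch of step (2): build, for each computer i, the set of computers
# that receive its data (out-neighbors, including i itself) and the set that
# send data to it (in-neighbors). The directed ring is an illustrative choice.
def ring_topology(n):
    out_neighbors = {i: {i, (i + 1) % n} for i in range(n)}   # i sends to itself and to i+1
    in_neighbors = {i: set() for i in range(n)}
    for i, outs in out_neighbors.items():
        for j in outs:
            in_neighbors[j].add(i)
    return out_neighbors, in_neighbors

out_nb, in_nb = ring_topology(4)
print(out_nb)   # e.g. computer 0 sends to {0, 1}
print(in_nb)    # e.g. computer 0 receives from {0, 3}
```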
(3) Every computer initializes its own neural network and the related parameters, specifically as follows:
(3-1) Every computer initializes a neural network according to the predefined network structure; the parameters of each network can be chosen randomly at initialization. The initial parameters of the network on the i-th computer are denoted w_i(0); the subscript i here and below refers to the i-th computer. The network structure is the same on all computers, but the parameters may differ. In addition, every computer sets its iteration counter k_i to 0 and initializes its consistency variable and total step-size variable to z_i(0) = 1 and l_i(0) = 1. (3-2) Every computer establishes three buffers in its own memory, W_rec,i, Z_rec,i and L_rec,i, which respectively store the weighted parameters, weighted consistency variables and total step-size variables received from the other computers.
(3-3) Every computer also defines its own trigger event according to the actual situation (the trigger events of different computers may differ); the operations in step (4) are executed every time the event is triggered. Trigger events can be defined in different ways depending on the situation, for example triggering once every second, or triggering once each time data is received.
(3-4) Every computer i computes its initial weighted parameters and initial weighted consistency variable, and then sends them, together with l_i(0), to all of its out-neighbors; the 0 in parentheses denotes the initial value of the corresponding quantity.
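A hedged sketch of the per-computer state of step (3) and the initial send of step (3-4). The patent gives the initial weighting formulas as figures; dividing w_i(0) and z_i(0) by the out-degree, in the style of push-sum methods, is an assumption of this sketch, as are all names used.

```python
import numpy as np

# Hedged sketch of steps (3-1) to (3-4): per-computer state, empty buffers,
# and the initial weighted send. The division by the out-degree is an assumption.
class ComputerState:
    def __init__(self, i, w0, out_neighbors):
        self.i = i
        self.w = w0                       # w_i(0): initial network parameters
        self.z = 1.0                      # z_i(0) = 1: consistency variable
        self.l = 1                        # l_i(0) = 1: total step-size variable
        self.k = 0                        # iteration counter k_i
        self.out = out_neighbors          # out-neighbor set (includes i itself)
        self.W_rec, self.Z_rec, self.L_rec = [], [], []   # the three buffers, initially empty

    def message(self):
        d = len(self.out)                 # out-degree of computer i
        return self.w / d, self.z / d, self.l   # weighted parameters, weighted z, and l_i

def send(msg, receivers, mailboxes):
    """Deliver (weighted w, weighted z, l) to the mailbox of every out-neighbor."""
    for j in receivers:
        mailboxes[j].append(msg)

mailboxes = {j: [] for j in range(4)}
s0 = ComputerState(0, np.zeros(3), out_neighbors={0, 1})
send(s0.message(), s0.out, mailboxes)     # the initial send of step (3-4)
```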
(4) Every computer trains its own neural network iteratively until the optimization of the neural network is complete, specifically as follows:
(4-1) Every computer i waits for its predefined trigger event to arrive. During this period, every computer stores the weighted parameters, weighted consistency variables and total step-size variables it receives from its in-neighbors in the corresponding buffers W_rec,i, Z_rec,i and L_rec,i (the first transmission of every computer contains the initial values, so the values received first are also initial values). Before the first trigger event of a computer arrives, its three quantities, the weighted parameters, the weighted consistency variable and l_i(0), do not change; the computer updates them for the first time after the first trigger, and they change after every subsequent trigger. The sending frequency of each computer is not fixed and can be chosen freely according to the actual situation, although in practice it is recommended that all computers keep roughly the same sending frequency, which gives better results. Each computer only needs to keep one copy of its parameters, consistency variable and total step-size variable, but these quantities change continuously during the optimization.
For example, if computer i receives two transmissions from computer j before its next trigger event, containing the weighted parameters, weighted consistency variables and total step-size variables l_j(5) and l_j(6) of iterations 5 and 6 of computer j, then computer i stores the two weighted parameter values in W_rec,i, the two weighted consistency values in Z_rec,i, and l_j(5) and l_j(6) in L_rec,i. The trigger events of different computers are independent of each other: before a computer (say j) is triggered, all of its quantities (its weighted parameters, weighted consistency variable and l_j) remain unchanged, and it sends no data; it may only wait for data to arrive.
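Continuing the sketch above, step (4-1) amounts to appending everything that arrives between two trigger events to the three buffers; the message layout is again an assumption of the example.

```python
# Hedged sketch of step (4-1), continuing the ComputerState sketch above.
def receive_pending(state, mailbox):
    """Store every message that arrived since the last trigger into the three buffers."""
    while mailbox:
        w_weighted, z_weighted, l = mailbox.pop(0)
        state.W_rec.append(w_weighted)    # goes into W_rec,i
        state.Z_rec.append(z_weighted)    # goes into Z_rec,i
        state.L_rec.append(l)             # goes into L_rec,i
```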
(4-2) Once computer i is triggered, it performs the following operations:
(4-2-1) Computer i computes and updates each of its variables according to the update formulas of the method, where ρ(t) is a predefined step-size sequence, for example a constant sequence ρ(t) = ρ, and t is the index into the sequence; the stochastic gradient of the loss function L(D, w) with respect to the neural network parameters w_i is used in the update. The quantities c_i(k_i+1), z_i(k_i+1), m_i(k_i+1) and α_i(k_i+1) appearing in the formulas are intermediate variables introduced to simplify the formulas for updating the neural network parameters w_i(k_i+1), the consistency variable z_i(k_i+1) and the total step-size variable l_i(k_i+1); they have no particular physical meaning and do not need to be sent to the other computers.
When the loss function is the squared-error loss above, the i-th computer computes its stochastic gradient as follows: it first draws p data pairs at random from its sub-data-set D_i, denoted (x_i,1, y_i,1), (x_i,2, y_i,2), ..., (x_i,p, y_i,p), where p is a positive integer specified in advance, for example 16 or 32. It then uses the neural network to compute the outputs corresponding to these data pairs and finally computes the stochastic gradient from them; here ∇f_w(x_i,j, m_i(k_i+1)) denotes the gradient of the network output with respect to w at input x_i,j, evaluated at the parameter value m_i(k_i+1). When another loss function is used, the stochastic gradient is replaced accordingly by the stochastic gradient of that loss function on the same p data pairs, for example the stochastic gradient of the cross-entropy loss or of one of the regularized losses described above.
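A sketch of the minibatch stochastic gradient just described, for the squared-error loss and the small network of the earlier Fig. 1 sketch; the 2/p scaling and the backpropagation details are standard choices assumed here, since the patent's own gradient formula is given as a figure.

```python
import numpy as np

# Hedged sketch of the minibatch stochastic gradient for the squared-error loss.
def forward(x, w):
    """Variant of the earlier forward() that also returns the hidden activations."""
    h = np.tanh(w["W1"] @ x + w["b1"])
    return w["W2"] @ h + w["b2"], h

def stochastic_gradient(Xi, yi, w, p, rng):
    """Draw p pairs from the local sub-data-set D_i and average the per-sample gradients."""
    idx = rng.choice(len(Xi), p, replace=False)
    g = {k: np.zeros_like(v) for k, v in w.items()}
    for x, y in zip(Xi[idx], yi[idx]):
        y_hat, h = forward(x, w)
        e = 2.0 * (y_hat - y)                      # d(loss)/d(y_hat) for the squared error
        g["W2"] += np.outer(e, h)
        g["b2"] += e
        dh = (w["W2"].T @ e) * (1.0 - h ** 2)      # backpropagate through the tanh layer
        g["W1"] += np.outer(dh, x)
        g["b1"] += dh
    return {k: v / p for k, v in g.items()}        # average over the p samples
```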
(4-2-2) Computer i updates its weighted parameters and its weighted consistency variable, and then sends them, together with the updated total step-size variable l_i(k_i+1), to all of its out-neighbors.
(4-2-3) Computer i increments its iteration counter k_i by 1 and returns to step (4-2-1); when the termination condition defined in advance by computer i is met, this computer ends its iterative training. The termination condition can be, for example, that the training time exceeds a preset duration, or that l_i(k_i+1) exceeds a number specified in advance.
(4-3) After the iterative training on all computers has finished, the final neural network parameters w_i(k_i+1) on any one of the computers can serve as the trained parameters of the neural network, and the optimization of the neural network is complete.
In the present invention, a computer that reaches its predefined stopping condition first may end its training first. In practice the times at which different computers finish training do not differ by much, and they can approximately be regarded as finishing simultaneously. An illustrative sketch that combines the steps above into one triggered update is given below.
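The update formulas of step (4-2-1) appear as figures in the patent and are not reproduced in the text above. The sketch below therefore only follows the structure suggested by the surrounding description, namely parameters and consistency variables weighted by the out-degree, a consensus estimate m_i at which the stochastic gradient is evaluated, and a step size accumulated from the sequence ρ(t) through the total step-size variable, in the spirit of the AsySPA algorithm cited in the non-patent references. Every formula in it is an assumption made for illustration, not the patent's exact update, and the n computers are simulated in a single process.

```python
import numpy as np

def local_gradient(w, Xi, yi, p, rng):
    """Minibatch stochastic gradient of a least-squares loss on the local sub-data-set."""
    idx = rng.choice(len(Xi), p, replace=False)
    xb, yb = Xi[idx], yi[idx]
    return 2.0 / p * xb.T @ (xb @ w - yb)

def rho(t):
    return 0.1 / np.sqrt(t)   # assumed diminishing step-size sequence rho(t)

def run(n=4, dim=3, iters=200, p=8, seed=0):
    rng = np.random.default_rng(seed)
    # step (1): synthetic data, evenly partitioned across the n simulated computers
    w_true = rng.standard_normal(dim)
    X = rng.standard_normal((400, dim))
    y = X @ w_true
    data = [(X[idx], y[idx]) for idx in np.array_split(np.arange(400), n)]

    # step (2): directed ring; out-neighbors include the computer itself
    out_nb = [{i, (i + 1) % n} for i in range(n)]

    # step (3): per-computer state, empty buffers, and the initial weighted send
    w = [rng.standard_normal(dim) for _ in range(n)]
    z = [1.0] * n
    l = [1] * n
    boxes = [{"W": [], "Z": [], "L": []} for _ in range(n)]

    def send(i):
        d = len(out_nb[i])                       # out-degree of computer i
        for j in out_nb[i]:
            boxes[j]["W"].append(w[i] / d)       # weighted parameters
            boxes[j]["Z"].append(z[i] / d)       # weighted consistency variable
            boxes[j]["L"].append(l[i])           # total step-size variable

    for i in range(n):
        send(i)

    # step (4): at each tick a randomly chosen computer is "triggered"
    for _ in range(iters * n):
        i = int(rng.integers(n))
        if not boxes[i]["W"]:
            continue                             # nothing received yet, keep waiting
        c_i = np.sum(boxes[i]["W"], axis=0)      # aggregate the weighted parameters
        z[i] = float(np.sum(boxes[i]["Z"]))      # aggregate the weighted consistency variables
        m_i = c_i / z[i]                         # consensus estimate used for the gradient
        l_new = max(boxes[i]["L"]) + 1           # advance the total step-size variable
        alpha = sum(rho(t) for t in range(l[i], l_new))   # accumulated step size
        g = local_gradient(m_i, *data[i], p, rng)
        w[i] = c_i - alpha * g                   # gradient step on the aggregated parameters
        l[i] = l_new
        boxes[i] = {"W": [], "Z": [], "L": []}   # the buffers are consumed by the update
        send(i)                                  # send the new weighted quantities

    return [float(np.linalg.norm(w[i] / z[i] - w_true)) for i in range(n)]

print(run())   # per-computer estimation errors after training
```

In a real deployment each computer would run this triggered update in its own process and exchange the three quantities over the network instead of through in-memory mailboxes.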
Fig. 2 compares, for one embodiment of the invention, how the loss function error changes over time when using this method and when using the currently common distributed stochastic gradient method. The abscissa is the training time in seconds and the ordinate is the value of the loss function during training, on a logarithmic scale. The solid line shows the performance of the proposed method in this example and the dotted line shows the performance of the distributed stochastic gradient method. The figure shows that the convergence speed of the proposed algorithm is far faster than that of the distributed stochastic gradient method, which demonstrates its effectiveness; its drawback is that the error oscillates more strongly during training.

Claims (1)

1. An asynchronous parallel optimization method for neural network training, characterized by comprising the following steps:
(1) distribute the data pairs in the data set D used to train the neural network across n ≥ 1 computers, where the data set D contains N input-output data pairs;
the sub-data-set assigned to the i-th computer is denoted D_i, and the number of data pairs in D_i is N_i;
(2) determine the communication topology of the n computers, obtaining for every computer the set of computers it receives data from and the set of computers it sends data to; the set of computers that receive the data sent by the i-th computer is its out-neighbor set, which includes the i-th computer itself and whose size is the out-degree of the i-th computer; the set of computers that send data to the i-th computer is its in-neighbor set;
(3) every computer initializes its own neural network and the related parameters, specifically as follows:
(3-1) every computer initializes a neural network according to the predefined network structure; the initial parameters of the network on the i-th computer are denoted w_i(0), the iteration counter k_i of every computer is set to 0, and the consistency variable and total step-size variable of every computer are initialized to z_i(0) = 1 and l_i(0) = 1;
(3-2) every computer establishes three buffers, W_rec,i, Z_rec,i and L_rec,i, initialized to be empty; the three buffers respectively store the weighted parameters, weighted consistency variables and total step-size variables this computer receives from other computers;
(3-3) every computer defines its own trigger event;
(3-4) every computer i computes its initial weighted parameters and initial weighted consistency variable, and then sends them, together with l_i(0), to all of its out-neighbors;
(4) every computer trains its own neural network iteratively until the optimization of the neural network is complete, specifically as follows:
(4-1) before its trigger event arrives, every computer i receives the weighted parameters, weighted consistency variables and total step-size variables of its in-neighbors and stores them in the corresponding buffers W_rec,i, Z_rec,i and L_rec,i;
(4-2) when the trigger event of computer i arrives, this computer performs the following operations:
(4-2-1) computer i updates each of its variables according to the update formulas of the method, where ρ(t) is a step-size sequence, t is the index into the sequence, and the stochastic gradient of the loss function L(D, w) with respect to the neural network parameters w_i is used in the update;
(4-2-2) computer i updates its weighted parameters and weighted consistency variable, and then sends them, together with l_i(k_i + 1), to all of its out-neighbors;
(4-2-3) computer i increments the iteration counter k_i by 1 and returns to step (4-2-1); when computer i meets its iteration termination condition, this computer ends its iterative training;
(4-3) after all computers have ended their iterative training, the final neural network parameters on any one of the computers are the trained parameters of the neural network, and the optimization of the neural network is complete.
CN201811265027.8A 2018-10-29 2018-10-29 An asynchronous parallel optimization method for neural network training Pending CN109508785A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811265027.8A CN109508785A (en) 2018-10-29 2018-10-29 An asynchronous parallel optimization method for neural network training

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811265027.8A CN109508785A (en) 2018-10-29 2018-10-29 An asynchronous parallel optimization method for neural network training

Publications (1)

Publication Number Publication Date
CN109508785A (en) 2019-03-22

Family

ID=65746922

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811265027.8A Pending CN109508785A (en) 2018-10-29 2018-10-29 An asynchronous parallel optimization method for neural network training

Country Status (1)

Country Link
CN (1) CN109508785A (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150302295A1 (en) * 2012-07-31 2015-10-22 International Business Machines Corporation Globally asynchronous and locally synchronous (gals) neuromorphic network
CN107209872A (en) * 2015-02-06 2017-09-26 谷歌公司 The distributed training of reinforcement learning system
CN106293942A (en) * 2016-08-10 2017-01-04 中国科学技术大学苏州研究院 Neutral net load balance optimization method based on the many cards of multimachine and system
CN107784364A (en) * 2016-08-25 2018-03-09 微软技术许可有限责任公司 The asynchronous training of machine learning model
CN108073986A (en) * 2016-11-16 2018-05-25 北京搜狗科技发展有限公司 A kind of neural network model training method, device and electronic equipment
CN107018184A (en) * 2017-03-28 2017-08-04 华中科技大学 Distributed deep neural network cluster packet synchronization optimization method and system
US20180293492A1 (en) * 2017-04-10 2018-10-11 Intel Corporation Abstraction library to enable scalable distributed machine learning
CN108460457A (en) * 2018-03-30 2018-08-28 苏州纳智天地智能科技有限公司 A kind of more asynchronous training methods of card hybrid parallel of multimachine towards convolutional neural networks

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
JIAQI ZHANG et al.: "AsySPA: An Exact Asynchronous Algorithm for Convex Optimization Over Digraphs", arXiv *
TIANYU WU et al.: "Decentralized Consensus Optimization with Asynchrony and Delays", arXiv *
WILLIAM CHAN: "Distributed Asynchronous Optimization of Convolutional Neural Networks", INTERSPEECH *
谢佩: "Research progress of networked distributed convex optimization algorithms", Control Theory & Applications *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110175680A (en) * 2019-04-03 2019-08-27 西安电子科技大学 Internet of things data analysis method utilizing distributed asynchronous update online machine learning
CN110175680B (en) * 2019-04-03 2024-01-23 西安电子科技大学 Internet of things data analysis method utilizing distributed asynchronous update online machine learning
CN111582494A (en) * 2020-04-17 2020-08-25 浙江大学 Hybrid distributed machine learning updating method based on delay processing
CN111582494B (en) * 2020-04-17 2023-07-07 浙江大学 Mixed distributed machine learning updating method based on delay processing
CN112633480A (en) * 2020-12-31 2021-04-09 中山大学 Calculation optimization method and system of semi-asynchronous parallel neural network
CN112633480B (en) * 2020-12-31 2024-01-23 中山大学 Calculation optimization method and system of semi-asynchronous parallel neural network

Similar Documents

Publication Publication Date Title
Zhong et al. Practical block-wise neural network architecture generation
CN109948029B (en) Neural network self-adaptive depth Hash image searching method
Zhang et al. Adaptive federated learning on non-iid data with resource constraint
CN110134636B (en) Model training method, server, and computer-readable storage medium
CN106297774B (en) A kind of the distributed parallel training method and system of neural network acoustic model
CN109508785A (en) An asynchronous parallel optimization method for neural network training
CN111858009A (en) Task scheduling method of mobile edge computing system based on migration and reinforcement learning
Cai et al. Dynamic sample selection for federated learning with heterogeneous data in fog computing
CN114584581B (en) Federal learning system and federal learning training method for intelligent city internet of things (IOT) letter fusion
CN108446770B (en) Distributed machine learning slow node processing system and method based on sampling
CN110362380A (en) A kind of multiple-objection optimization virtual machine deployment method in network-oriented target range
Zhan et al. Pipe-torch: Pipeline-based distributed deep learning in a gpu cluster with heterogeneous networking
CN109063041A (en) The method and device of relational network figure insertion
Tanaka et al. Automatic graph partitioning for very large-scale deep learning
Nie et al. HetuMoE: An efficient trillion-scale mixture-of-expert distributed training system
CN114399018B EfficientNet ceramic fragment classification method based on sparrow optimization of rotary control strategy
Hu et al. Improved methods of BP neural network algorithm and its limitation
CN109636709A (en) A kind of figure calculation method suitable for heterogeneous platform
CN116702925A (en) Distributed random gradient optimization method and system based on event triggering mechanism
Ho et al. Adaptive communication for distributed deep learning on commodity GPU cluster
Cheng et al. Bandwidth reduction using importance weighted pruning on ring allreduce
CN114995157A (en) Anti-synchronization optimization control method of multi-agent system under cooperative competition relationship
CN113672684A (en) Layered user training management system and method for non-independent same-distribution data
Lu et al. Adaptive asynchronous federated learning
CN108334939B (en) Convolutional neural network acceleration device and method based on multi-FPGA annular communication

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
WD01: Invention patent application deemed withdrawn after publication

Application publication date: 20190322