CN107958284A - Neural network training method and device, and computing device - Google Patents

Neural network training method and device, and computing device

Info

Publication number
CN107958284A
CN107958284A (application CN201711157050.0A)
Authority
CN
China
Prior art keywords
layer
output data
intermediate layer
second neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711157050.0A
Other languages
Chinese (zh)
Inventor
董健
韩玉刚
颜水成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd filed Critical Beijing Qihoo Technology Co Ltd
Priority to CN201711157050.0A priority Critical patent/CN107958284A/en
Publication of CN107958284A publication Critical patent/CN107958284A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Abstract

The invention discloses a neural network training method and device, and a computing device. The method includes: inputting input data into a trained first neural network to obtain the output data of at least one first intermediate layer of the first neural network; inputting the input data into a second neural network to be trained to obtain the output data of at least one second intermediate layer of the second neural network and its final output data, where the at least one second intermediate layer has a correspondence with the at least one first intermediate layer; and training the second neural network using the loss between the output data of the at least one second intermediate layer and the output data of the at least one first intermediate layer, together with the loss between the final output data and pre-labeled output data. The invention can greatly improve the performance of the second neural network while keeping its amount of computation unchanged.

Description

Neural network training method and device, and computing device
Technical field
The present invention relates to the field of deep learning, and in particular to a neural network training method and device, and a computing device.
Background technology
Deep learning originates from research on artificial neural networks. It discovers distributed feature representations of data by combining low-level features into more abstract high-level representations of attribute categories or features. Deep learning is a class of machine learning methods based on representation learning from data: it builds neural networks that simulate the analytical learning of the human brain and interprets data by imitating the brain's mechanisms. Deep learning can replace hand-crafted feature engineering with efficient unsupervised or semi-supervised feature-learning and hierarchical feature-extraction algorithms. It can be applied in many applications such as face detection, face recognition, and scene analysis, and with its development, its applications have become increasingly widespread.
The faster a deep learning network runs and the higher its accuracy, the better its performance. When deep learning is performed with a deep network (e.g., on a cloud server), the environment can provide good support for deep learning and the fitting ability is good. However, when deep learning is performed with a shallow network (e.g., on a mobile device), it is constrained by its environment: computing power is limited, the fitting ability is poor, and good performance usually cannot be obtained.
Therefore, a neural network training method is needed to improve the performance of deep learning when a shallow network is used.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a neural network training method and device, and a computing device, which overcome the above problems or at least partially solve them.
According to an aspect of the present invention, a neural network training method is provided, which includes:
inputting input data into a trained first neural network to obtain the output data of at least one first intermediate layer of the first neural network;
inputting the input data into a second neural network to be trained to obtain the output data of at least one second intermediate layer of the second neural network and its final output data, where the at least one second intermediate layer has a correspondence with the at least one first intermediate layer; and
training the second neural network using the loss between the output data of the at least one second intermediate layer and the output data of the at least one first intermediate layer, and the loss between the final output data and pre-labeled output data.
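The combined objective just described can be sketched as follows. This is a minimal illustration with made-up numbers and mean-squared-error losses (the patent does not fix a particular loss function); `teacher_mid` stands for the output of a first intermediate layer of the trained first network, and `student_mid` for the output of the corresponding second intermediate layer of the network being trained:

```python
import numpy as np

def hint_loss(student_feat, teacher_feat):
    """Loss between corresponding intermediate-layer outputs (MSE here)."""
    return np.mean((student_feat - teacher_feat) ** 2)

def task_loss(final_output, label):
    """Loss between the final output and the pre-labeled output data."""
    return np.mean((final_output - label) ** 2)

# Hypothetical outputs: teacher (first network) and student (second network)
teacher_mid = np.array([[0.2, 0.8], [0.5, 0.5]])   # first intermediate layer
student_mid = np.array([[0.3, 0.7], [0.4, 0.6]])   # corresponding second intermediate layer
student_out = np.array([0.9, 0.1])                 # final output of the second network
label       = np.array([1.0, 0.0])                 # pre-labeled output data

# The second network is trained on both losses together
total = hint_loss(student_mid, teacher_mid) + task_loss(student_out, label)
```

Both terms drive the second network: the first pulls its intermediate representations toward those of the first network, the second keeps its final output accurate.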
Optionally, the first neural network has more layers than the second neural network.
Optionally, the at least one first intermediate layer includes the bottleneck layer of the first neural network, and the at least one second intermediate layer includes the bottleneck layer of the second neural network.
Optionally, training the second neural network using the loss between the output data of the at least one second intermediate layer and the output data of the at least one first intermediate layer, and the loss between the final output data and the pre-labeled output data, further comprises:
updating the weight parameters of the second neural network according to the loss between the output data of the at least one second intermediate layer and the output data of the at least one first intermediate layer, and updating the weight parameters of the second neural network according to the loss between the final output data and the pre-labeled output data, so as to train the second neural network.
Optionally, before inputting the input data into the second neural network to be trained to obtain the output data of the at least one second intermediate layer of the second neural network and its final output data, the method further includes:
performing down-sampling on the input data, and using the processed data as the input data of the second neural network.
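As a minimal sketch of the down-sampling step, assuming the input is a single-channel image and using simple block averaging (the patent does not specify a down-sampling method):

```python
import numpy as np

def downsample(image, factor=2):
    """Reduce resolution by averaging non-overlapping factor x factor blocks.
    Assumes both image dimensions are divisible by the factor."""
    h, w = image.shape
    return image.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

img = np.arange(16, dtype=float).reshape(4, 4)  # a hypothetical 4x4 "image"
small = downsample(img)                         # 2x2 input for the second network
```

The high-resolution `img` would feed the first network, while the low-resolution `small` becomes the input data of the second network.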
Optionally, training the second neural network using the loss between the output data of the at least one second intermediate layer and the output data of the at least one first intermediate layer, and the loss between the final output data and the pre-labeled output data, further comprises:
training the second neural network using the loss between the output data of the at least one second intermediate layer and the output data of the at least one first intermediate layer, and the loss between the final output data and the pre-labeled output data of the down-sampled data.
According to another aspect of the present invention, a neural network training device is provided, which includes:
a first output module, adapted to input input data into a trained first neural network to obtain the output data of at least one first intermediate layer of the first neural network;
a second output module, adapted to input the input data into a second neural network to be trained to obtain the output data of at least one second intermediate layer of the second neural network and its final output data, where the at least one second intermediate layer has a correspondence with the at least one first intermediate layer; and
a training module, adapted to train the second neural network using the loss between the output data of the at least one second intermediate layer and the output data of the at least one first intermediate layer, and the loss between the final output data and pre-labeled output data.
Optionally, the first neural network has more layers than the second neural network.
Optionally, the at least one first intermediate layer includes the bottleneck layer of the first neural network, and the at least one second intermediate layer includes the bottleneck layer of the second neural network.
Optionally, the training module is further adapted to:
update the weight parameters of the second neural network according to the loss between the output data of the at least one second intermediate layer and the output data of the at least one first intermediate layer, and update the weight parameters of the second neural network according to the loss between the final output data and the pre-labeled output data, so as to train the second neural network.
Optionally, the device further includes:
a down-sampling module, adapted to down-sample the input data and use the processed data as the input data of the second neural network.
Optionally, the training module is further adapted to:
train the second neural network using the loss between the output data of the at least one second intermediate layer and the output data of the at least one first intermediate layer, and the loss between the final output data and the pre-labeled output data of the down-sampled data.
According to yet another aspect of the present invention, a computing device is provided, including a processor, a memory, a communication interface, and a communication bus, where the processor, the memory, and the communication interface communicate with one another through the communication bus;
the memory is used to store at least one executable instruction, and the executable instruction causes the processor to perform the operations corresponding to the above neural network training method.
According to a further aspect of the present invention, a computer storage medium is provided, in which at least one executable instruction is stored, and the executable instruction causes a processor to perform the operations corresponding to the above neural network training method.
According to the neural network training method and device and the computing device provided by the present invention, input data is input into a trained first neural network to obtain the output data of at least one first intermediate layer of the first neural network; the input data is input into a second neural network to be trained to obtain the output data of at least one second intermediate layer of the second neural network and its final output data, where the at least one second intermediate layer has a correspondence with the at least one first intermediate layer; and the second neural network is trained using the loss between the output data of the at least one second intermediate layer and the output data of the at least one first intermediate layer, and the loss between the final output data and pre-labeled output data. By training the output data of the at least one second intermediate layer of the second neural network against the output data of the corresponding at least one first intermediate layer of the first neural network, the present invention can greatly improve the performance of the second neural network while keeping its amount of computation unchanged, effectively reduce the training time of the second neural network, and improve its training efficiency.
The above description is only an overview of the technical solution of the present invention. In order to understand the technical means of the present invention more clearly so that it can be implemented according to the content of the specification, and to make the above and other objects, features, and advantages of the present invention more apparent, specific embodiments of the present invention are set forth below.
Brief description of the drawings
By reading the following detailed description of the preferred embodiments, various other advantages and benefits will become clear to those of ordinary skill in the art. The drawings are only for the purpose of showing the preferred embodiments and are not to be considered a limitation of the present invention. Throughout the drawings, the same reference numerals denote the same components. In the drawings:
Fig. 1 shows a flowchart of a neural network training method according to an embodiment of the present invention;
Fig. 2 shows a flowchart of a neural network training method according to another embodiment of the present invention;
Fig. 3 shows a functional block diagram of a neural network training device according to an embodiment of the present invention;
Fig. 4 shows a functional block diagram of a neural network training device according to another embodiment of the present invention;
Fig. 5 shows a schematic structural diagram of a computing device according to an embodiment of the present invention.
Detailed description of the embodiments
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the present disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present disclosure can be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.
Fig. 1 shows a flowchart of a neural network training method according to an embodiment of the present invention. As shown in Fig. 1, the method specifically includes the following steps:
Step S101: input data is input into a trained first neural network to obtain the output data of at least one first intermediate layer of the first neural network.
The first neural network is a neural network that has been trained and fixed in advance. Specifically, the first neural network is trained with multiple samples related to the input data, so that after training it is well suited to the input data. The first neural network is preferably a deep neural network, such as a neural network deployed on a cloud server: its performance is good, its amount of computation is large, its accuracy is high, and its speed may be slow. The first neural network can output the output data of multiple first intermediate layers. For example, the first neural network may contain 4 first intermediate layers, namely the 4th, 3rd, 2nd, and 1st first intermediate layers, where the 1st first intermediate layer is the bottleneck layer of the first neural network.
The input data is input into the first neural network to obtain the output data of at least one first intermediate layer of the first neural network. Here, the output data of only one first intermediate layer may be obtained, the output data of several adjacent first intermediate layers may be obtained, or the output data of several spaced first intermediate layers may be obtained; this is configured according to the actual situation of the implementation and is not limited here.
Step S102: the input data is input into a second neural network to be trained to obtain the output data of at least one second intermediate layer of the second neural network and its final output data.
The second neural network is the neural network to be trained. It is a shallow neural network, such as a neural network deployed on a mobile terminal: its computing power is limited and its performance is poor. The first neural network has more layers than the second neural network. For example, the first neural network has 4 layers, namely the 4th, 3rd, 2nd, and 1st first intermediate layers, while the second neural network has 2 layers, namely the 2nd and 1st second intermediate layers.
The input data is input into the second neural network to be trained to obtain the output data of at least one second intermediate layer of the second neural network, where the at least one second intermediate layer has a correspondence with the at least one first intermediate layer. For example, the 1st first intermediate layer of the first neural network corresponds to the 1st second intermediate layer of the second neural network, and the 2nd first intermediate layer of the first neural network corresponds to the 2nd second intermediate layer of the second neural network.
The output data of the second intermediate layers of the second neural network must correspond to the output data of the first intermediate layers obtained from the first neural network: if the output data of two first intermediate layers of the first neural network is obtained, the output data of two second intermediate layers of the second neural network also needs to be obtained. For example, if the output data of the 1st and 2nd first intermediate layers of the first neural network is obtained, the output data of the 1st and 2nd second intermediate layers of the second neural network is obtained correspondingly.
Preferably, the at least one first intermediate layer may include the bottleneck layer of the first neural network, i.e., its 1st first intermediate layer, and the at least one second intermediate layer may include the bottleneck layer of the second neural network, i.e., its 1st second intermediate layer. The bottleneck layer is the topmost hidden layer of the neural network, i.e., the intermediate layer whose output vector has the smallest dimension. Using the bottleneck layer ensures that the subsequent training makes the final output data more accurate and yields a better training result.
When the input data is input into the second neural network to be trained, in addition to the output data of the at least one second intermediate layer, the final output data of the second neural network also needs to be obtained, so that the loss can be computed from the final output data to train the second neural network.
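A minimal sketch of how the bottleneck layer could be identified under the definition above, assuming hypothetical per-layer output dimensions:

```python
# Hypothetical output dimensions of each intermediate layer, listed from
# the input side to the output side of a network.
layer_dims = [512, 256, 64, 256]

# The bottleneck layer is the intermediate layer whose output vector
# has the smallest dimension.
bottleneck_index = min(range(len(layer_dims)), key=lambda i: layer_dims[i])
```

In this sketch the third layer (index 2, dimension 64) would serve as the bottleneck layer matched between the two networks.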
Step S103: the second neural network is trained using the loss between the output data of the at least one second intermediate layer and the output data of the at least one first intermediate layer, and the loss between the final output data and the pre-labeled output data.
According to the loss between the output data of the at least one second intermediate layer and the output data of the at least one first intermediate layer, the weight parameters of the second neural network can be updated so that the output data of the at least one second intermediate layer of the second neural network approaches the output data of the at least one first intermediate layer of the first neural network as closely as possible. Meanwhile, according to the loss between the final output data of the second neural network and the pre-labeled output data, the weight parameters of the second neural network can be updated so that its final output data approaches the pre-labeled output data as closely as possible, which ensures the accuracy of the final output data of the second neural network. In the above manner, the training of the second neural network is completed.
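The two weight updates described above can be illustrated with a toy two-layer linear "second network" trained by gradient descent on the combined loss. All shapes and values are hypothetical, and numerical finite-difference gradients stand in for backpropagation purely for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)            # input data
teacher_mid = rng.normal(size=4)  # output of the matching first intermediate layer
label = rng.normal(size=2)        # pre-labeled output data

W1 = rng.normal(size=(4, 3)) * 0.1  # weights producing the second intermediate layer
W2 = rng.normal(size=(2, 4)) * 0.1  # weights producing the final output

def loss(W1, W2):
    hidden = W1 @ x   # second intermediate layer output
    out = W2 @ hidden # final output of the second network
    # intermediate-layer loss + final-output loss
    return np.mean((hidden - teacher_mid) ** 2) + np.mean((out - label) ** 2)

def grad(f, W, eps=1e-6):
    """Numerical gradient of f with respect to W (central finite differences)."""
    g = np.zeros_like(W)
    for i in np.ndindex(*W.shape):
        Wp, Wm = W.copy(), W.copy()
        Wp[i] += eps
        Wm[i] -= eps
        g[i] = (f(Wp) - f(Wm)) / (2 * eps)
    return g

before = loss(W1, W2)
lr = 0.1
for _ in range(50):  # update the weight parameters against both losses
    W1 -= lr * grad(lambda W: loss(W, W2), W1)
    W2 -= lr * grad(lambda W: loss(W1, W), W2)
after = loss(W1, W2)  # combined loss decreases as training proceeds
```

The updates pull the intermediate output toward `teacher_mid` and the final output toward `label` at the same time, which is the training scheme of this step.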
According to the neural network training method provided by this embodiment, input data is input into a trained first neural network to obtain the output data of at least one first intermediate layer of the first neural network; the input data is input into a second neural network to be trained to obtain the output data of at least one second intermediate layer of the second neural network and its final output data, where the at least one second intermediate layer has a correspondence with the at least one first intermediate layer; and the second neural network is trained using the loss between the output data of the at least one second intermediate layer and the output data of the at least one first intermediate layer, and the loss between the final output data and pre-labeled output data. By training the output data of the corresponding at least one second intermediate layer of the second neural network against the output data of the at least one first intermediate layer of the first neural network, the present invention can greatly improve the performance of the second neural network while keeping its amount of computation unchanged, effectively reduce the training time of the second neural network, and improve its training efficiency.
Fig. 2 shows a flowchart of a neural network training method according to another embodiment of the present invention. As shown in Fig. 2, the method specifically includes the following steps:
Step S201: input data is input into a trained first neural network to obtain the output data of at least one first intermediate layer of the first neural network.
This step refers to step S101 of the embodiment of Fig. 1 and is not described again here.
Step S202: the input data is down-sampled, and the processed data is used as the input data of the second neural network.
Considering that the second neural network is a shallow neural network, when the input data is large, using it directly would affect the operation speed of the second neural network. The input data can therefore first be down-sampled. For example, when the input data is a picture, down-sampling can first reduce the picture resolution, and the processed data is used as the input data of the second neural network. In this way, the second neural network is trained with the low-resolution input data obtained by down-sampling, while the first neural network is trained with the high-resolution input data; when the output data of the two neural networks is used for training, the second neural network can obtain a high-resolution output result even from low-resolution input data.
Step S203: the input data is input into the second neural network to be trained to obtain the output data of at least one second intermediate layer of the second neural network and its final output data.
The first neural network has more layers than the second neural network. For example, the first neural network has 4 layers, namely the 4th, 3rd, 2nd, and 1st first intermediate layers, while the second neural network has 2 layers, namely the 2nd and 1st second intermediate layers.
The down-sampled input data is input into the second neural network to be trained to obtain the output data of at least one second intermediate layer of the second neural network, where the at least one second intermediate layer has a correspondence with the at least one first intermediate layer. For example, the 1st first intermediate layer of the first neural network corresponds to the 1st second intermediate layer of the second neural network, and the 2nd first intermediate layer of the first neural network corresponds to the 2nd second intermediate layer of the second neural network.
The output data of the second intermediate layers of the second neural network must correspond to the output data of the first intermediate layers obtained from the first neural network: if the output data of two first intermediate layers of the first neural network is obtained, the output data of two second intermediate layers of the second neural network also needs to be obtained. For example, if the output data of the 1st and 2nd first intermediate layers of the first neural network is obtained, the output data of the 1st and 2nd second intermediate layers of the second neural network is obtained correspondingly.
Preferably, the at least one first intermediate layer may include the bottleneck layer of the first neural network, i.e., its 1st first intermediate layer, and the at least one second intermediate layer may include the bottleneck layer of the second neural network, i.e., its 1st second intermediate layer. The bottleneck layer is the topmost hidden layer of the neural network, i.e., the intermediate layer whose output vector has the smallest dimension. Using the bottleneck layer ensures that the subsequent training makes the final output data more accurate and yields a better training result.
When the down-sampled input data is input into the second neural network to be trained, in addition to the output data of the at least one second intermediate layer of the second neural network, the final output data of the second neural network also needs to be obtained, so that the loss can be computed from the final output data to train the second neural network.
Step S204: the second neural network is trained using the loss between the output data of the at least one second intermediate layer and the output data of the at least one first intermediate layer, and the loss between the final output data and the pre-labeled output data of the down-sampled data.
According to the loss between the output data of the at least one second intermediate layer and the output data of the at least one first intermediate layer, the weight parameters of the second neural network can be updated so that the output data of the at least one second intermediate layer of the second neural network approaches the output data of the at least one first intermediate layer of the first neural network as closely as possible.
Since the input data used by the second neural network is the down-sampled input data, the down-sampled input data also needs to be pre-labeled, to obtain the pre-labeled output data of the down-sampled data. According to the loss between the final output data of the second neural network and the pre-labeled output data of the down-sampled data, the weight parameters of the second neural network can be updated so that its final output data approaches the pre-labeled output data of the down-sampled data as closely as possible, which ensures the accuracy of the final output data of the second neural network. In the above manner, the training of the second neural network is completed.
According to the neural network training method provided by this embodiment, in view of the actual situation of the second neural network, the input data is first down-sampled, and the down-sampled data is used as the input data of the second neural network, which avoids affecting the operation speed of the second neural network. Meanwhile, the second neural network is trained with the low-resolution input data obtained by down-sampling, while the first neural network is trained with the high-resolution input data; when the output data of the two neural networks is used for training, the second neural network can obtain a high-resolution output result even from low-resolution input data, which greatly improves the training performance of the second neural network.
Fig. 3 shows a functional block diagram of a neural network training device according to an embodiment of the present invention. As shown in Fig. 3, the device includes the following modules:
a first output module 310, adapted to input input data into a trained first neural network to obtain the output data of at least one first intermediate layer of the first neural network.
The first neural network is a neural network that has been trained and fixed in advance. Specifically, the first neural network is trained with multiple samples related to the input data, so that after training it is well suited to the input data. The first neural network is preferably a deep neural network, such as a neural network deployed on a cloud server: its performance is good, its amount of computation is large, its accuracy is high, and its speed may be slow. The first neural network can output the output data of multiple first intermediate layers. For example, the first neural network may contain 4 first intermediate layers, namely the 4th, 3rd, 2nd, and 1st first intermediate layers, where the 1st first intermediate layer is the bottleneck layer of the first neural network.
The first output module 310 inputs the input data into the first neural network to obtain the output data of at least one first intermediate layer of the first neural network. Here, the first output module 310 may obtain the output data of only one first intermediate layer, the output data of several adjacent first intermediate layers, or the output data of several spaced first intermediate layers; this is configured according to the actual situation of the implementation and is not limited here.
Second output module 320, suitable for inputting input data into nervus opticus network to be trained, obtains the second god The output data and final output data in the second intermediate layer of at least one layer through network.
Nervus opticus network is this neutral net to be trained, and is shallow-layer neutral net, as applied to mobile terminal Neutral net, its computing capability is limited, and performance is bad.The number of plies of first nerves network is more than nervus opticus network.Such as the first god The number of plies through network is 4 layers, is respectively the 4th layer of the first intermediate layer, the 3rd layer of the first intermediate layer, the 2nd layer of the first intermediate layer and the 1st The first intermediate layer of layer;The number of plies of nervus opticus network is 2 layers, is respectively the 2nd layer of the second intermediate layer and the 1st layer of the second intermediate layer.
The second output module 320 inputs the input data into the second neural network to be trained and obtains the output data of at least one second intermediate layer of the second neural network. The at least one second intermediate layer has a correspondence with the at least one first intermediate layer. For example, the 1st first intermediate layer of the first neural network corresponds to the 1st second intermediate layer of the second neural network, and the 2nd first intermediate layer of the first neural network corresponds to the 2nd second intermediate layer of the second neural network.
The second-intermediate-layer output data obtained by the second output module 320 must correspond to the first-intermediate-layer output data that was obtained: if the first output module 310 obtains the output data of two first intermediate layers of the first neural network, the second output module 320 also needs to obtain the output data of two second intermediate layers of the second neural network. For example, if the first output module 310 obtains the output data of the 1st and 2nd first intermediate layers of the first neural network, the second output module 320 correspondingly obtains the output data of the 1st and 2nd second intermediate layers of the second neural network.
Preferably, the at least one first intermediate layer includes the bottleneck layer of the first neural network, i.e. its 1st first intermediate layer, and the at least one second intermediate layer includes the bottleneck layer of the second neural network, i.e. its 1st second intermediate layer. The bottleneck layer is the topmost hidden layer of a neural network, that is, the intermediate layer whose output vector has the lowest dimension. Using the bottleneck layer ensures that the subsequent training performed by the training module 330 makes the final output data more accurate and yields a better training result.
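Given a list of intermediate-layer outputs, the bottleneck layer in the sense just defined (lowest output vector dimension) can be located mechanically; a small illustrative helper, with all names assumed:

```python
import numpy as np

def bottleneck_index(activations):
    """Return the index of the intermediate layer whose output vector has
    the lowest dimension, i.e. the bottleneck layer as defined above."""
    return min(range(len(activations)), key=lambda i: activations[i].shape[-1])
```

For instance, among layers with output widths 16, 4 and 8, the second (width 4) is the bottleneck.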
When inputting the input data into the second neural network to be trained, in addition to obtaining the output data of at least one second intermediate layer of the second neural network, the second output module 320 also needs to obtain the final output data of the second neural network, so that the training module 330 can use the final output data to compute a loss and train the second neural network.
The training module 330 is adapted to train the second neural network using the loss between the output data of the at least one second intermediate layer and the output data of the at least one first intermediate layer, and the loss between the final output data and the pre-labeled output data.
According to the loss between the output data of the at least one second intermediate layer and the output data of the at least one first intermediate layer, the training module 330 can update the weight parameters of the second neural network so that the output data of the at least one second intermediate layer of the second neural network approaches, as closely as possible, the output data of the at least one first intermediate layer of the first neural network. Meanwhile, according to the loss between the final output data of the second neural network and the pre-labeled output data, the training module 330 can update the weight parameters of the second neural network so that its final output data approaches the pre-labeled output data as closely as possible, ensuring the accuracy of the final output data. In this manner the training module 330 completes the training of the second neural network.
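A minimal numeric sketch of this combined objective, assuming mean-squared-error losses and an equal weighting `alpha`, neither of which is fixed by the embodiment:

```python
import numpy as np

def mse(a, b):
    """Mean-squared error between two arrays."""
    return float(np.mean((a - b) ** 2))

def combined_loss(student_mids, teacher_mids, student_out, labels, alpha=0.5):
    """Weighted sum of (1) the per-layer losses between corresponding
    second and first intermediate layers and (2) the loss between the
    final output data and the pre-labeled output data. The MSE form and
    the weighting `alpha` are assumptions; the embodiment only requires
    both losses to drive the weight-parameter update."""
    mid_loss = sum(mse(s, t) for s, t in zip(student_mids, teacher_mids))
    out_loss = mse(student_out, labels)
    return alpha * mid_loss + (1.0 - alpha) * out_loss
```

Minimising this value with respect to the second neural network's weight parameters pulls its intermediate outputs toward the first network's and its final output toward the labels, exactly the two update directions described above.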
According to the training device for a neural network provided by the present invention, the input data is input into the first neural network obtained by training, and the output data of at least one first intermediate layer of the first neural network is obtained; the input data is input into the second neural network to be trained, and the output data of at least one second intermediate layer of the second neural network and the final output data are obtained, the at least one second intermediate layer having a correspondence with the at least one first intermediate layer; and the second neural network is trained using the loss between the output data of the at least one second intermediate layer and the output data of the at least one first intermediate layer, and the loss between the final output data and the pre-labeled output data. By training the corresponding at least one second intermediate layer of the second neural network against the output data of at least one first intermediate layer of the first neural network, the present invention can greatly improve the performance of the second neural network while keeping its computation amount unchanged, effectively reducing the training time of the second neural network and improving its training efficiency.
Fig. 4 shows a functional block diagram of a training device for a neural network according to another embodiment of the present invention. As shown in Fig. 4, the difference from Fig. 3 is that the training device further includes:
a down-sampling module 340, adapted to down-sample the input data and use the processed data as the input data of the second neural network.
Since the second neural network is a shallow neural network, directly using the input data when it is large would affect the computing speed of the second neural network. The down-sampling module 340 can therefore first down-sample the input data; for example, when the input data is a picture, the down-sampling first reduces the picture's resolution, and the processed data serves as the input data of the second neural network. With this processing, the second neural network is trained on the low-resolution, down-sampled input data while the first neural network is trained on the high-resolution input data, and when the output data of the two neural networks are used for training, the second neural network can produce a high-resolution output result even from a low-resolution input.
The training module 330 is further adapted to train the second neural network using the loss between the output data of the at least one second intermediate layer and the output data of the at least one first intermediate layer, and the loss between the final output data and the output data pre-labeled for the down-sampled data.
According to the loss between the output data of the at least one second intermediate layer and the output data of the at least one first intermediate layer, the training module 330 can update the weight parameters of the second neural network so that the output data of the at least one second intermediate layer of the second neural network approaches, as closely as possible, the output data of the at least one first intermediate layer of the first neural network.
The input data used by the second neural network is the down-sampled input data, so the down-sampled input data also needs to be labeled in advance, yielding pre-labeled output data for the down-sampled data. According to the loss between the final output data of the second neural network and the output data pre-labeled for the down-sampled data, the training module 330 can update the weight parameters of the second neural network so that its final output data approaches the pre-labeled output data as closely as possible, ensuring the accuracy of the second neural network's final output data. In this manner the training module 330 completes the training of the second neural network.
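One simple way to realise the down-sampling module 340 for picture input is 2x2 average pooling, which halves the resolution; the embodiment does not fix a particular down-sampling method, so this is only an illustrative sketch:

```python
import numpy as np

def downsample_2x(img):
    """Halve a single-channel image's resolution by 2x2 average pooling,
    one possible realisation of the down-sampling module."""
    h, w = (img.shape[0] // 2) * 2, (img.shape[1] // 2) * 2
    img = img[:h, :w]                      # crop to even dimensions
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
```

The result would be fed to the second neural network, while the original full-resolution image is fed to the first neural network.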
According to the training device for a neural network provided by the present invention, in light of the actual circumstances of the second neural network, the input data is first down-sampled and the down-sampled data is used as the input data of the second neural network, avoiding any impact on the computing speed of the second neural network. Meanwhile, the second neural network is trained on the low-resolution down-sampled input data, the first neural network is trained on the high-resolution input data, and the output data of the two neural networks are used for training, so that the second neural network can produce a high-resolution output result even from a low-resolution input, greatly improving the trained performance of the second neural network.
The present invention also provides a non-volatile computer storage medium storing at least one executable instruction, and the computer-executable instruction can execute the training method of a neural network in any of the above method embodiments.
Fig. 5 shows a structural diagram of a computing device according to an embodiment of the present invention; the specific embodiments of the present invention do not limit the concrete implementation of the computing device.
As shown in Fig. 5, the computing device may include: a processor (processor) 502, a communications interface (Communications Interface) 504, a memory (memory) 506 and a communication bus 508.
Wherein:
The processor 502, the communication interface 504 and the memory 506 communicate with one another through the communication bus 508.
The communication interface 504 is used to communicate with network elements of other devices, such as clients or other servers.
The processor 502 is used to execute a program 510, and may specifically perform the relevant steps in the above embodiments of the training method of a neural network.
Specifically, the program 510 may include program code, and the program code includes computer operation instructions.
The processor 502 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC, Application Specific Integrated Circuit), or one or more integrated circuits configured to implement embodiments of the present invention. The one or more processors included in the computing device may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs together with one or more ASICs.
The memory 506 is used to store the program 510. The memory 506 may include a high-speed RAM and may further include a non-volatile memory (non-volatile memory), for example at least one disk memory.
The program 510 may specifically be used to cause the processor 502 to perform the training method of a neural network in any of the above method embodiments. For the specific implementation of each step in the program 510, reference may be made to the corresponding descriptions of the corresponding steps and units in the above training embodiments, which are not repeated here. Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the devices and modules described above may be found in the corresponding process descriptions of the foregoing method embodiments and are not described here again.
The algorithms and displays provided herein are not inherently related to any particular computer, virtual system or other device. Various general-purpose systems may also be used with the teaching based hereon. The structure required to construct such a system is apparent from the description above. Moreover, the present invention is not directed to any particular programming language. It should be understood that the content of the invention described herein can be realised with various programming languages, and the above description of a specific language is made in order to disclose the best mode of the invention.
In the specification provided here, numerous specific details are set forth. It is understood, however, that embodiments of the present invention may be practised without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail so as not to obscure the understanding of this description.
Similarly, it should be appreciated that in order to streamline the disclosure and aid the understanding of one or more of the various inventive aspects, in the above description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof. However, the disclosed method should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules in the device of an embodiment may be adaptively changed and arranged in one or more devices different from the embodiment. The modules, units or components in an embodiment may be combined into one module, unit or component, and in addition they may be divided into a plurality of sub-modules, sub-units or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) may be replaced by an alternative feature serving the same, equivalent or similar purpose.
Furthermore, those skilled in the art will understand that although some embodiments described herein include some features included in other embodiments rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any one of the claimed embodiments may be used in any combination.
The various component embodiments of the present invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the training device of a neural network according to embodiments of the present invention. The present invention may also be implemented as device or apparatus programs (for example, computer programs and computer program products) for performing some or all of the methods described herein. Such programs implementing the present invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above-described embodiments illustrate rather than limit the invention, and those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices can be embodied by one and the same item of hardware. The use of the words first, second and third, etc. does not indicate any order. These words may be interpreted as names.

Claims (10)

1. A training method of a neural network, comprising:
inputting input data into a first neural network obtained by training, and obtaining output data of at least one first intermediate layer of the first neural network;
inputting the input data into a second neural network to be trained, and obtaining output data of at least one second intermediate layer of the second neural network and final output data, the at least one second intermediate layer having a correspondence with the at least one first intermediate layer;
training the second neural network using a loss between the output data of the at least one second intermediate layer and the output data of the at least one first intermediate layer, and a loss between the final output data and pre-labeled output data.
2. The method according to claim 1, wherein the number of layers of the first neural network is greater than that of the second neural network.
3. The method according to claim 1 or 2, wherein the at least one first intermediate layer comprises a bottleneck layer of the first neural network, and the at least one second intermediate layer comprises a bottleneck layer of the second neural network.
4. The method according to any one of claims 1-3, wherein the training the second neural network using the loss between the output data of the at least one second intermediate layer and the output data of the at least one first intermediate layer, and the loss between the final output data and the pre-labeled output data further comprises:
updating weight parameters of the second neural network according to the loss between the output data of the at least one second intermediate layer and the output data of the at least one first intermediate layer, and updating the weight parameters of the second neural network according to the loss between the final output data and the pre-labeled output data, so as to train the second neural network.
5. The method according to any one of claims 1-4, wherein before the inputting the input data into the second neural network to be trained and obtaining the output data of the at least one second intermediate layer of the second neural network and the final output data, the method further comprises:
down-sampling the input data and using the processed data as the input data of the second neural network.
6. The method according to claim 5, wherein the training the second neural network using the loss between the output data of the at least one second intermediate layer and the output data of the at least one first intermediate layer, and the loss between the final output data and the pre-labeled output data further comprises:
training the second neural network using the loss between the output data of the at least one second intermediate layer and the output data of the at least one first intermediate layer, and a loss between the final output data and output data pre-labeled for the down-sampled data.
7. A training device of a neural network, comprising:
a first output module, adapted to input input data into a first neural network obtained by training and obtain output data of at least one first intermediate layer of the first neural network;
a second output module, adapted to input the input data into a second neural network to be trained and obtain output data of at least one second intermediate layer of the second neural network and final output data, the at least one second intermediate layer having a correspondence with the at least one first intermediate layer;
a training module, adapted to train the second neural network using a loss between the output data of the at least one second intermediate layer and the output data of the at least one first intermediate layer, and a loss between the final output data and pre-labeled output data.
8. The device according to claim 7, wherein the number of layers of the first neural network is greater than that of the second neural network.
9. A computing device, comprising: a processor, a memory, a communication interface and a communication bus, the processor, the memory and the communication interface communicating with one another through the communication bus;
the memory being used to store at least one executable instruction, the executable instruction causing the processor to perform operations corresponding to the training method of a neural network according to any one of claims 1-6.
10. A computer storage medium having stored therein at least one executable instruction, the executable instruction causing a processor to perform operations corresponding to the training method of a neural network according to any one of claims 1-6.
CN201711157050.0A 2017-11-20 2017-11-20 Training method and device of a neural network, and computing device Pending CN107958284A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711157050.0A CN107958284A (en) 2017-11-20 2017-11-20 The training method and device of neutral net, computing device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711157050.0A CN107958284A (en) 2017-11-20 2017-11-20 The training method and device of neutral net, computing device

Publications (1)

Publication Number Publication Date
CN107958284A true CN107958284A (en) 2018-04-24

Family

ID=61964919

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711157050.0A Pending CN107958284A (en) 2017-11-20 2017-11-20 The training method and device of neutral net, computing device

Country Status (1)

Country Link
CN (1) CN107958284A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110598504A (en) * 2018-06-12 2019-12-20 北京市商汤科技开发有限公司 Image recognition method and device, electronic equipment and storage medium
CN110781659A (en) * 2018-07-11 2020-02-11 株式会社Ntt都科摩 Text processing method and text processing device based on neural network
CN111311646A (en) * 2018-12-12 2020-06-19 杭州海康威视数字技术股份有限公司 Optical flow neural network training method and device
CN111311646B (en) * 2018-12-12 2023-04-07 杭州海康威视数字技术股份有限公司 Optical flow neural network training method and device
CN109871791A (en) * 2019-01-31 2019-06-11 北京字节跳动网络技术有限公司 Image processing method and device
CN112599141A (en) * 2020-11-26 2021-04-02 北京百度网讯科技有限公司 Neural network vocoder training method and device, electronic equipment and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20180424