CN109086871A - Training method, device, electronic equipment and the computer-readable medium of neural network - Google Patents
- Publication number
- CN109086871A CN109086871A CN201810847796.2A CN201810847796A CN109086871A CN 109086871 A CN109086871 A CN 109086871A CN 201810847796 A CN201810847796 A CN 201810847796A CN 109086871 A CN109086871 A CN 109086871A
- Authority
- CN
- China
- Prior art keywords
- euclidean distance
- trained
- network
- distance matrix
- european
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- General Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Computational Linguistics (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Neurology (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The present invention provides a training method for a neural network, together with a corresponding device, electronic equipment and computer-readable medium, and relates to the technical field of artificial intelligence. The method comprises: obtaining networks to be trained, wherein the networks to be trained are multiple; calculating the Euclidean distance matrix corresponding to the prediction data output by each network to be trained, to obtain multiple Euclidean distance matrices; determining one or more Euclidean loss functions using the multiple Euclidean distance matrices; and training each network to be trained by combining the one or more Euclidean loss functions with the loss function of each network to be trained. By this method, a neural network with low power consumption and good performance can be used on an intelligent mobile terminal.
Description
Technical field
The present invention relates to the field of artificial intelligence, and in particular to a training method for a neural network, and to a corresponding device, electronic equipment and computer-readable medium.
Background technique
With the rapid development of artificial intelligence technology, artificial intelligence has begun to be applied in a variety of physical products, for example photographic devices and image processing devices. As intelligent mobile terminals (for example, smart phones) become more intelligent, artificial intelligence technology is applied on them in an increasingly wide range of scenarios.
Taking convolutional neural networks as an example, prediction with a convolutional neural network involves many complex convolution and matrix operations inside the network, which places certain demands on the computing resources and energy consumption of the device. In practical applications, obtaining a better scene-recognition effect requires a more complex network model, and a more complex, larger model consumes more computing resources and generates higher power consumption. On large servers one generally focuses on the performance of the model, but the computing resources of an intelligent mobile terminal are limited, and its limited battery makes it more sensitive to power consumption. An intelligent mobile terminal therefore differs from a computer terminal or a server, and the contradiction between performance and power consumption is sharper on an intelligent mobile terminal.
When applying a neural network, an existing intelligent mobile terminal either sacrifices battery life by choosing a structurally more complex network model with better expressive effect but higher energy consumption, or chooses a small neural network structure, obtaining low energy consumption and long battery life at the cost of reduced deep-learning performance. Both choices have certain disadvantages.
Summary of the invention
In view of this, the purpose of the present invention is to provide a training method for a neural network, together with a device, electronic equipment and a computer-readable medium, by which a neural network with low power consumption and good performance can be used on an intelligent mobile terminal.
In a first aspect, an embodiment of the present invention provides a training method for a neural network, comprising: obtaining networks to be trained, wherein the networks to be trained are multiple; calculating the Euclidean distance matrix corresponding to the prediction data output by each network to be trained, to obtain multiple Euclidean distance matrices; determining one or more Euclidean loss functions using the multiple Euclidean distance matrices; and training each network to be trained by combining the one or more Euclidean loss functions with the loss function of each network to be trained.
Further, training each network to be trained by combining the one or more Euclidean loss functions with the loss function of each network to be trained comprises: summing the one or more Euclidean loss functions and the loss functions of the multiple networks to be trained to obtain a target loss function; and training each network to be trained using the target loss function.
Further, determining one or more Euclidean loss functions using the multiple Euclidean distance matrices comprises: determining, according to association relationships between the multiple networks to be trained, at least one Euclidean distance matrix group from the multiple Euclidean distance matrices, wherein each Euclidean distance matrix group includes multiple Euclidean distance matrices to be calculated; and determining the one or more Euclidean loss functions using the at least one Euclidean distance matrix group.
Further, determining the one or more Euclidean loss functions using the at least one Euclidean distance matrix group comprises: calculating a target Euclidean distance matrix based on the Euclidean distance matrices to be calculated in each Euclidean distance matrix group; obtaining a weight of the target Euclidean distance matrix; and taking the product of the weight and the target Euclidean distance matrix as the Euclidean loss function.
Further, the multiple Euclidean distance matrices to be calculated include a first Euclidean distance matrix and a second Euclidean distance matrix, and calculating the Euclidean distance between the Euclidean distance matrices to be calculated in each Euclidean distance matrix group to obtain the target Euclidean distance matrix comprises: successively calculating the Euclidean distance between a first element da_kj and a second element db_kj, to obtain the element dm_kj of the target Euclidean distance matrix, wherein the first element da_kj is the element in row k, column j of the first Euclidean distance matrix, the second element db_kj is the element in row k, column j of the second Euclidean distance matrix, and k and j each run from 1 to m.
Further, calculating the Euclidean distance between the prediction data output by each network to be trained comprises: calculating the Euclidean distance between any two row vectors in each set of prediction data, to obtain the multiple Euclidean distance matrices.
Further, calculating the Euclidean distance between any two row vectors in each set of prediction data comprises: calculating, by the formula da_kj = sqrt( Σ_{i=1..n} (a_ki − a_ji)² ), the Euclidean distance between any two row vectors in the prediction data, to obtain the Euclidean distance matrix Da = [da_kj] (k, j = 1, …, m), wherein da_kj is the element in row k, column j of the Euclidean distance matrix, a_ki is the element in row k, column i of the prediction data, and a_ji is the element in row j, column i of the prediction data.
In a second aspect, an embodiment of the present invention further provides a training device for a neural network, comprising: an obtaining unit, configured to obtain networks to be trained, wherein the networks to be trained are multiple; a calculation unit, configured to calculate the Euclidean distance matrix corresponding to the prediction data output by each network to be trained, to obtain multiple Euclidean distance matrices; a determination unit, configured to determine one or more Euclidean loss functions using the multiple Euclidean distance matrices; and a training unit, configured to train each network to be trained by combining the one or more Euclidean loss functions with the loss function of each network to be trained.
In a third aspect, an embodiment of the present invention provides electronic equipment, including a memory, a processor, and a computer program stored on the memory and runnable on the processor, wherein the processor, when executing the computer program, implements the method of any one of the above first aspect.
In a fourth aspect, an embodiment of the present invention provides a computer-readable medium having non-volatile program code executable by a processor, wherein the program code causes the processor to execute the method of any one of the above first aspect.
In the embodiments of the present invention, networks to be trained are obtained first; then the Euclidean distance matrix corresponding to the prediction data output by each network to be trained is calculated, to obtain multiple Euclidean distance matrices; next, one or more Euclidean loss functions are determined using the multiple Euclidean distance matrices; finally, each network to be trained is trained by combining the one or more Euclidean loss functions with the loss function of each network to be trained.
As can be seen from the above description, in this embodiment each network to be trained is trained with both the Euclidean loss functions, which characterize the differences between the networks to be trained, and the loss function of each network to be trained. This guides the networks to learn from each other during the training stage, improving the performance of each network through mutual learning, so that the trained networks have low energy consumption and high speed while their accuracy rivals that of large neural network models. In this way, the technical problem that existing intelligent mobile terminals, with their limited battery and computing resources, can apply deep-learning technology only to a limited extent is better solved.
Other features and advantages of the present invention will be set forth in the following description, and will in part become apparent from the description or be understood by implementing the present invention. The objectives and other advantages of the invention are realized and attained by the structure particularly pointed out in the description, the claims and the accompanying drawings.
To make the above objectives, features and advantages of the present invention clearer and more comprehensible, preferred embodiments are described in detail below in conjunction with the accompanying drawings.
Detailed description of the invention
In order to illustrate the specific embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the specific embodiments or the prior art are briefly introduced below. Apparently, the drawings described below show some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative work.
Fig. 1 is a schematic diagram of electronic equipment according to an embodiment of the present invention;
Fig. 2 is a flow chart of a training method for a neural network according to an embodiment of the present invention;
Fig. 3 is a flow chart of another training method for a neural network according to an embodiment of the present invention;
Fig. 4 is a schematic flow chart of optionally training a neural network A and a neural network B according to an embodiment of the present invention;
Fig. 5 is a schematic flow chart of optionally training a neural network A, a neural network B and a neural network C according to an embodiment of the present invention;
Fig. 6 is a schematic diagram of a training device for a neural network according to an embodiment of the present invention.
Specific embodiment
To make the objectives, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the present invention are described below clearly and completely in conjunction with the drawings. Apparently, the described embodiments are some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative work shall fall within the protection scope of the present invention.
First, an electronic equipment 100 for implementing an embodiment of the present invention is described with reference to Fig. 1; this electronic equipment can be used to run the neural network training method of the embodiments of the present invention.
As shown in Fig. 1, the electronic equipment 100 includes one or more processors 102, one or more memories 104, an input device 106, an output device 108 and a data collector 110, which are interconnected by a bus system 112 and/or another form of connection mechanism (not shown). It should be noted that the components and structure of the electronic equipment 100 shown in Fig. 1 are only exemplary rather than restrictive, and the electronic equipment may have other components and structures as needed.
The processor 102 may be implemented in at least one hardware form of a digital signal processor (DSP), a field programmable gate array (FPGA), a programmable logic array (PLA) and an ASIC (Application Specific Integrated Circuit). The processor 102 may be a central processing unit (CPU), a graphics processing unit (GPU) or another form of processing unit with data processing capability and/or instruction execution capability, and may control other components in the electronic equipment 100 to perform desired functions.
The memory 104 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, or flash memory. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 102 may run the program instructions to implement the client functions (implemented by the processor) of the embodiments of the present invention described below and/or other desired functions. Various application programs and various data, such as the data used and/or produced by the application programs, may also be stored in the computer-readable storage medium.
The input device 106 may be a device used by a user to input instructions, and may include one or more of a keyboard, a mouse, a microphone, a touch screen and the like.
The output device 108 may output various information (for example, images or sound) to the outside (for example, a user), and may include one or more of a display, a loudspeaker and the like.
The data collector 110 is configured to collect data, and may also store the collected data in the memory 104 for use by other components.
Illustratively, the electronic equipment for implementing the neural network training method according to an embodiment of the present invention may be implemented as an intelligent terminal such as a video camera, a capture machine, a smart phone or a tablet computer.
According to an embodiment of the present invention, an embodiment of a training method for a neural network is provided. It should be noted that the steps illustrated in the flow charts of the drawings may be executed in a computer system such as a set of computer-executable instructions, and, although a logical order is shown in the flow charts, in some cases the steps shown or described may be executed in an order different from that given here.
Fig. 2 is a flow chart of a training method for a neural network according to an embodiment of the present invention. As shown in Fig. 2, the method includes the following steps:
Step S202, obtaining networks to be trained, wherein the networks to be trained are multiple;
In this embodiment, the number of networks to be trained is at least 2. A network to be trained may have a relatively simple structure, for example a neural network with fewer convolutional-layer channels; a network to be trained may also be a more complex neural network, for example one with more convolutional-layer channels. Here, a more complex network to be trained means a network with more convolutional-layer channels, or a network with a deeper number of layers.
Under normal circumstances, a more complex neural network has better data-characterization ability and can show higher accuracy on some tasks.
Step S204, calculating the Euclidean distance matrix corresponding to the prediction data output by each network to be trained (described below as the Euclidean distance matrix corresponding to the network to be trained), to obtain multiple Euclidean distance matrices;
Step S206, determining one or more Euclidean loss functions using the multiple Euclidean distance matrices, wherein the Euclidean loss functions are used to characterize the differences between the networks to be trained;
Step S208, training each network to be trained by combining the one or more Euclidean loss functions with the loss function of each network to be trained.
For example, one Euclidean loss function may be determined from the Euclidean distance matrices corresponding to two networks to be trained, and this Euclidean loss function is used to characterize the difference between these two networks to be trained.
As can be seen from the above description, in this embodiment each network to be trained is trained with both the Euclidean loss functions, which characterize the differences between the networks to be trained, and the loss function of each network to be trained. This guides the networks to learn from each other during the training stage, improving the performance of each network through mutual learning, so that the trained networks have low energy consumption and high speed while their accuracy rivals that of large neural network models. In this way, the technical problem that existing intelligent mobile terminals, with their limited battery and computing resources, can apply deep-learning technology only to a limited extent is better solved.
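As a rough illustration, steps S202 to S208 can be sketched end to end in plain Python. The toy prediction values, the weight value W, the element-wise absolute difference as the target matrix, the summation that reduces that matrix to a scalar, and the placeholder task losses are all assumptions for illustration; the patent does not fix these details.

```python
import math

def dist_matrix(pred):
    # Step S204: row-pairwise Euclidean distances of one network's predictions.
    m = len(pred)
    return [[math.sqrt(sum((pred[k][i] - pred[j][i]) ** 2
                           for i in range(len(pred[k]))))
             for j in range(m)] for k in range(m)]

# Step S202: prediction data from two networks to be trained (toy values).
pred_a = [[0.9, 0.1], [0.2, 0.8]]
pred_b = [[0.7, 0.3], [0.4, 0.6]]

da, db = dist_matrix(pred_a), dist_matrix(pred_b)

# Step S206: one Euclidean loss from the pair of distance matrices; the
# reduction to a scalar (summation) and the weight W = 0.5 are assumptions.
W = 0.5
euclid_loss = W * sum(abs(da[k][j] - db[k][j])
                      for k in range(len(da)) for j in range(len(da)))

# Step S208: combine with each network's own (here: placeholder) task loss.
task_loss_a, task_loss_b = 0.35, 0.20
target = euclid_loss + task_loss_a + task_loss_b
```

In a real setting `pred_a` and `pred_b` would be the batched outputs of the two networks on the same inputs, and `target` would be back-propagated through both.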
As can be seen from the above description, in this embodiment the networks to be trained are obtained first; then the Euclidean distance matrix corresponding to the prediction data output by each network to be trained is calculated, and one or more Euclidean loss functions are determined using the multiple Euclidean distance matrices.
In this embodiment, in step S204, calculating the Euclidean distance between the prediction data output by each network to be trained comprises: calculating the Euclidean distance between any two row vectors in each set of prediction data, to obtain the Euclidean distance matrix. The specific calculation formula of the Euclidean distance matrix will be introduced in the following embodiments.
In this embodiment, step S206, determining one or more Euclidean loss functions using the multiple Euclidean distance matrices, includes steps S2061 and S2062.
Step S2061, determining, according to the association relationships between the multiple networks to be trained, at least one Euclidean distance matrix group from the multiple Euclidean distance matrices, wherein each Euclidean distance matrix group includes multiple (for example, two) Euclidean distance matrices to be calculated. The above association relationships are preset. When there are two networks to be trained, the two networks have an association relationship, as shown in Fig. 4 below. When there are more than two networks to be trained, for example three as shown in Fig. 5 below, the network A to be trained and the network B to be trained have an association relationship, the network A to be trained and the network C to be trained have an association relationship, and the network C to be trained and the network B to be trained have an association relationship.
Step S2062, determining the one or more Euclidean loss functions using the at least one Euclidean distance matrix group. One Euclidean loss function can be determined from each Euclidean distance matrix group, and that Euclidean loss function is used to characterize the difference between the networks to be trained corresponding to the Euclidean distance matrix group. That is, for multiple networks to be trained that have an association relationship, the difference between them is characterized by calculating one Euclidean loss function.
Optionally, step S2062, determining the one or more Euclidean loss functions using the at least one Euclidean distance matrix group, comprises: calculating a target Euclidean distance matrix based on the Euclidean distance matrices to be calculated in each Euclidean distance matrix group; obtaining a weight of the target Euclidean distance matrix; and taking the product of the weight and the target Euclidean distance matrix as the Euclidean loss function.
Specifically, in this embodiment, the target Euclidean distance matrix is first calculated based on the Euclidean distance matrices to be calculated in each Euclidean distance matrix group. If there are two Euclidean distance matrices to be calculated, the Euclidean distance between these two matrices is calculated as the target Euclidean distance matrix. If there are more than two, for example three, the Euclidean distance between any two of the three matrices is first calculated to obtain an intermediate Euclidean distance matrix, and then the Euclidean distance between this intermediate matrix and the remaining matrix of the three is calculated to obtain the target Euclidean distance matrix. The specific formula for the Euclidean distance between any two Euclidean distance matrices to be calculated will be introduced in the following embodiments.
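The chaining just described (distance of any two of the three matrices first, then distance to the remaining one) can be sketched as follows. Treating the matrix-to-matrix "Euclidean distance" as the element-wise absolute difference follows the element formula given later in this document; the 2x2 values are illustrative only.

```python
def pairwise_abs(d1, d2):
    # Element-wise dm_kj = sqrt((d1_kj - d2_kj)^2) = |d1_kj - d2_kj|.
    return [[abs(d1[k][j] - d2[k][j]) for j in range(len(d1))]
            for k in range(len(d1))]

# Three Euclidean distance matrices to be calculated (toy 2x2 values).
d1 = [[0.0, 2.0], [2.0, 0.0]]
d2 = [[0.0, 1.5], [1.5, 0.0]]
d3 = [[0.0, 0.2], [0.2, 0.0]]

intermediate = pairwise_abs(d1, d2)      # distance of any two of the three
target = pairwise_abs(intermediate, d3)  # then against the remaining matrix
```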
In another example, in step S206 the Euclidean distance between the Euclidean distance matrices corresponding to every two networks to be trained among the multiple networks to be trained may also be calculated, to obtain target Euclidean distance matrices.
Afterwards, the weight of the target Euclidean distance matrix is obtained. Finally, the product of the weight and the target Euclidean distance matrix is taken as the Euclidean loss function; when there are multiple Euclidean loss functions, the weights corresponding to any two of them may be the same or different. For example, if the weight of the target Euclidean distance matrix is W, the Euclidean loss function can be expressed as: Euclidean Loss = W * (D_Mutual), where Euclidean Loss is the Euclidean loss function and D_Mutual is the target Euclidean distance matrix. It should be noted that W is given empirically or is an optimum value obtained by parameter search.
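A minimal sketch of the weighted loss just expressed; reducing W * (D_Mutual) to a scalar by summing the matrix entries is an assumption here, since the text does not spell out the reduction.

```python
def euclidean_loss(d_mutual, weight):
    """Euclidean Loss = W * (D_Mutual); the weighted target Euclidean
    distance matrix is reduced to a scalar by summation (an assumption)."""
    return weight * sum(sum(row) for row in d_mutual)

# Toy target Euclidean distance matrix and an empirically chosen W.
loss = euclidean_loss([[0.0, 0.4], [0.4, 0.0]], weight=0.5)
```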
After the Euclidean loss function Euclidean Loss is obtained by the method described above, each network to be trained can be trained by combining at least one Euclidean loss function with the loss function of each network to be trained.
In an optional embodiment, step S208, training each network to be trained by combining the one or more Euclidean loss functions with the loss function of each network to be trained, includes the following steps:
Step S2081, summing the one or more Euclidean loss functions and the loss functions of the multiple networks to be trained to obtain a target loss function;
Step S2082, training each network to be trained using the target loss function.
In this embodiment, after the one or more Euclidean loss functions are obtained, the target loss function is determined using the one or more Euclidean loss functions and the loss function of each network to be trained. For example, the one or more Euclidean loss functions and the loss functions of the networks to be trained may be summed to obtain the target loss function. After the target loss function is obtained, each network to be trained can be trained using it.
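The summation just described amounts to the following one-liner; the example loss values are placeholders for illustration.

```python
def target_loss(euclidean_losses, network_losses):
    # Target loss = sum of all Euclidean losses plus each network's own loss.
    return sum(euclidean_losses) + sum(network_losses)

total = target_loss(euclidean_losses=[0.4], network_losses=[1.2, 0.9])
```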
As can be seen from the above description, the Euclidean loss functions are calculated from the prediction data output by each network to be trained during training, and are used to characterize the differences between the networks to be trained. When the one or more Euclidean loss functions are added to the respective loss functions of the networks to be trained as the final optimization target loss function, and each network to be trained is trained with this target loss function, the networks are not only trained but also guided to learn each other's advantages and improve together, so that the performance of each network to be trained is improved.
For example, suppose the network model includes two networks to be trained: one structurally simple network to be trained and one complex network to be trained. When the structurally simple network and the complex network are trained by the method described above, the two networks to be trained can learn each other's advantages. If the advantages of the complex network are good performance and high accuracy, then the structurally simple network will learn these advantages during training, so that it likewise attains higher performance and accuracy.
Practice has shown that, with sufficient training and a reasonable setting of the weight W, the method provided by the embodiments of the present invention enables each network to be trained to attain higher accuracy than when each is trained individually.
In this embodiment, after the training of the above networks to be trained is completed, one of them (for example, a structurally simpler network to be trained) can be taken out for use on the intelligent mobile terminal, according to the intelligent mobile terminal's requirements on model power consumption, speed and accuracy.
Under normal circumstances, each branch group includes two networks to be trained. In this case, as shown in Fig. 3, step S204, calculating the Euclidean distance matrix corresponding to the prediction data output by each network to be trained to obtain multiple Euclidean distance matrices, includes the following steps:
Step S301, obtaining first prediction data and second prediction data to be calculated, wherein the first prediction data and the second prediction data are the prediction data output by the two networks to be trained;
Step S302, calculating the Euclidean distance between any two row vectors in the first prediction data, to obtain a first Euclidean distance matrix;
Step S303, calculating the Euclidean distance between any two row vectors in the second prediction data, to obtain a second Euclidean distance matrix;
Step S304, calculating the Euclidean loss function based on the first Euclidean distance matrix and the second Euclidean distance matrix.
In this embodiment, the first prediction data and the second prediction data to be calculated are first obtained. Then, the Euclidean distance between any two row vectors in the first prediction data is calculated to obtain the first Euclidean distance matrix. Optionally, the Euclidean distance between any two row vectors in the first prediction data can be calculated by the formula da_kj = sqrt( Σ_{i=1..n} (a_ki − a_ji)² ), giving the first Euclidean distance matrix Da = [da_kj] (k, j = 1, …, m), wherein da_kj is the element in row k, column j of the first Euclidean distance matrix, a_ki is the element in row k, column i of the first prediction data, and a_ji is the element in row j, column i of the first prediction data.
Afterwards, the Euclidean distance between any two row vectors in the second prediction data is calculated to obtain the second Euclidean distance matrix. Optionally, the Euclidean distance between any two row vectors in the second prediction data can be calculated by the formula db_kj = sqrt( Σ_{i=1..n} (b_ki − b_ji)² ), giving the second Euclidean distance matrix Db = [db_kj] (k, j = 1, …, m), wherein db_kj is the element in row k, column j of the second Euclidean distance matrix, b_ki is the element in row k, column i of the second prediction data, and b_ji is the element in row j, column i of the second prediction data.
Finally, the Euclidean loss function can be calculated based on the first Euclidean distance matrix and the second Euclidean distance matrix, so that one or more Euclidean loss functions are obtained, and each network to be trained is trained by combining the one or more Euclidean loss functions with the loss function of each network to be trained.
In an optional embodiment, as shown in Fig. 4, if the multiple Euclidean distance matrices to be calculated include a first Euclidean distance matrix and a second Euclidean distance matrix, calculating the target Euclidean distance matrix based on the Euclidean distance matrices to be calculated in each Euclidean distance matrix group includes the following step: successively calculating the Euclidean distance between the first element da_kj and the second element db_kj, to obtain the element dm_kj of the target Euclidean distance matrix, wherein the first element da_kj is the element in row k, column j of the first Euclidean distance matrix, the second element db_kj is the element in row k, column j of the second Euclidean distance matrix, and k and j each run from 1 to m.
Specifically, in the present embodiment, the formula dm_kj = sqrt((da_kj - db_kj)^2) can be used to calculate the Euclidean distance between the first Euclidean distance matrix and the second Euclidean distance matrix, obtaining the target Euclidean distance matrix, which can be expressed as Dm = (dm_kj), where dm_kj is the element in row k, column j of the target Euclidean distance matrix, the first element da_kj is the element in row k, column j of the first Euclidean distance matrix, the second element db_kj is the element in row k, column j of the second Euclidean distance matrix, and k and j each take the values 1 to m in turn.
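Since dm_kj = sqrt((da_kj - db_kj)^2) equals the absolute difference |da_kj - db_kj|, the element-wise step above reduces to an absolute difference of matrices. A minimal sketch, assuming NumPy and hypothetical names:

```python
import numpy as np

def target_distance_matrix(Da, Db):
    """dm_kj = sqrt((da_kj - db_kj)^2) = |da_kj - db_kj|, element-wise."""
    return np.abs(Da - Db)

# Hypothetical 2x2 first and second Euclidean distance matrices.
Da = np.array([[0.0, 5.0], [5.0, 0.0]])
Db = np.array([[0.0, 3.0], [3.0, 0.0]])
Dm = target_distance_matrix(Da, Db)
```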
Embodiment two:
For ease of understanding, this embodiment provides a specific application example to introduce the neural network training method described in the above embodiment.
Scenario 1: the networks to be trained comprise two networks to be trained.
The two networks to be trained are neural network A and neural network B, where neural network A is a neural network with a relatively simple structure and neural network B is a neural network with a more complex structure. Figure 4 shows an optional flow chart for training neural network A and neural network B. As can be seen from Figure 4, neural network A comprises a convolutional stage CNN-A, a fully connected stage FC-A and a softmax layer, where CNN-A contains one or more convolutional layers and FC-A contains one or more fully connected layers. Neural network B comprises a convolutional stage CNN-B, a fully connected stage FC-B and a softmax layer, where CNN-B contains one or more convolutional layers and FC-B contains one or more fully connected layers.
In the present embodiment, first, the prediction data A and prediction data B output by neural network A and neural network B during training are obtained, where prediction data A is the prediction data output by the softmax layer of neural network A and prediction data B is the prediction data output by the softmax layer of neural network B. It should be understood that, in another example, prediction data A and prediction data B may also be the data output by the fully connected layers of the neural networks.
Then, the formula da_kj = sqrt(Σ_i (a_ki - a_ji)^2) is used to calculate the Euclidean distance between any two row vectors in prediction data A (that is, the first prediction data), obtaining Euclidean distance matrix A (the first Euclidean distance matrix) Da = (da_kj), where da_kj is the element in row k, column j of the first Euclidean distance matrix, a_ki is the element in row k, column i of the first prediction data, and a_ji is the element in row j, column i of the first prediction data. Likewise, the formula db_kj = sqrt(Σ_i (b_ki - b_ji)^2) is used to calculate the Euclidean distance between any two row vectors in prediction data B (that is, the second prediction data), obtaining Euclidean distance matrix B (that is, the second Euclidean distance matrix) Db = (db_kj), where db_kj is the element in row k, column j of the second Euclidean distance matrix, b_ki is the element in row k, column i of the second prediction data, and b_ji is the element in row j, column i of the second prediction data.
Afterwards, the Euclidean distance matrix between Euclidean distance matrix A and Euclidean distance matrix B is calculated to obtain the target Euclidean distance matrix, and the weight M (or hyperparameter M) of the target Euclidean distance matrix is obtained. Finally, the product of the weight M and the target Euclidean distance matrix is taken as the Euclidean loss function.
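The weighting step can be sketched as follows. This is an illustration under assumptions: the patent only specifies the product of M and the target matrix, so the final reduction of that weighted matrix to a scalar by summation is an assumption made here for concreteness.

```python
import numpy as np

def euclidean_loss(Da, Db, M=1.0):
    """Euclidean loss: weight M times the target Euclidean distance matrix.

    The summation to a scalar is an assumed reduction, not specified in
    the patent text.
    """
    Dm = np.abs(Da - Db)  # target Euclidean distance matrix
    return (M * Dm).sum()

Da = np.array([[0.0, 5.0], [5.0, 0.0]])
Db = np.array([[0.0, 3.0], [3.0, 0.0]])
loss = euclidean_loss(Da, Db, M=0.5)
```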
After the Euclidean loss function is obtained, a summation operation can be performed on the Euclidean loss function, the loss function A of neural network A and the loss function B of neural network B to obtain the target loss function. Neural network A and neural network B are then trained using the target loss function, finally yielding the trained neural network A and neural network B.
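The whole scenario-1 computation (softmax predictions, distance matrices, Euclidean loss, target loss) can be sketched end to end. This is a NumPy illustration under assumptions: the cross-entropy task losses, the softmax implementation and all function names are hypothetical stand-ins, and the Euclidean term is reduced to a scalar by summation.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    # Hypothetical stand-in for each network's own loss function.
    return -np.log(probs[np.arange(len(labels)), labels]).mean()

def row_distances(p):
    d = p[:, None, :] - p[None, :, :]
    return np.sqrt((d ** 2).sum(axis=-1))

def target_loss(logits_a, logits_b, labels, M=1.0):
    """loss A + loss B + M * (target Euclidean distance matrix, summed)."""
    pa, pb = softmax(logits_a), softmax(logits_b)
    euc = M * np.abs(row_distances(pa) - row_distances(pb)).sum()
    return cross_entropy(pa, labels) + cross_entropy(pb, labels) + euc

# When both networks predict identically, the Euclidean term vanishes and
# the target loss is just the sum of the two task losses.
z = np.array([[2.0, 0.0], [0.0, 2.0]])
labels = np.array([0, 1])
loss_same = target_loss(z, z, labels)
```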
As can be seen from the above description, neural network A has a relatively simple network structure and fewer parameters, while neural network B has a larger structure. After training with the method provided by the present embodiment, the performance of neural network B can be improved, and neural network A, despite its simple structure, can exceed the performance that the more complex neural network B achieves when trained alone. A neural network A combining the advantages of low power consumption, high speed and high accuracy is thereby obtained, which, for use on intelligent mobile terminals, solves the problem that power consumption, speed and model accuracy are difficult to satisfy simultaneously.
Scenario 2: the networks to be trained comprise more than two networks to be trained, for example three networks to be trained, namely neural network A, neural network B and neural network C, as shown in Figure 5.
Neural network A is a neural network with a relatively simple structure, while neural network B and neural network C are neural networks with more complex structures. Neural network A comprises a convolutional stage CNN-A, a fully connected stage FC-A and a softmax layer, where CNN-A contains one or more convolutional layers and FC-A contains one or more fully connected layers. Neural network B comprises a convolutional stage CNN-B, a fully connected stage FC-B and a softmax layer, where CNN-B contains one or more convolutional layers and FC-B contains one or more fully connected layers. Neural network C comprises a convolutional stage CNN-C, a fully connected stage FC-C and a softmax layer, where CNN-C contains one or more convolutional layers and FC-C contains one or more fully connected layers.
In the present embodiment, first, the prediction data A, prediction data B and prediction data C output by neural network A, neural network B and neural network C during training are obtained, where prediction data A is the prediction data output by the softmax layer of neural network A, prediction data B is the prediction data output by the softmax layer of neural network B, and prediction data C is the prediction data output by the softmax layer of neural network C. Then, the Euclidean distances between any two row vectors in prediction data A, prediction data B and prediction data C are calculated separately, obtaining Euclidean distance matrix A, Euclidean distance matrix B and Euclidean distance matrix C respectively.
Afterwards, the Euclidean distance matrix between Euclidean distance matrix A and Euclidean distance matrix B is calculated to obtain target Euclidean distance matrix 1; the Euclidean distance matrix between Euclidean distance matrix A and Euclidean distance matrix C is calculated to obtain target Euclidean distance matrix 2; and the Euclidean distance matrix between Euclidean distance matrix C and Euclidean distance matrix B is calculated to obtain target Euclidean distance matrix 3.
Next, the weight M1 of target Euclidean distance matrix 1 is obtained, the weight M2 of target Euclidean distance matrix 2 is obtained, and the weight M3 of target Euclidean distance matrix 3 is obtained, where M1, M2 and M3 may be the same or different.
Finally, the product of weight M1 and target Euclidean distance matrix 1 is taken as Euclidean loss function 1; the product of weight M2 and target Euclidean distance matrix 2 is taken as Euclidean loss function 2; and the product of weight M3 and target Euclidean distance matrix 3 is taken as Euclidean loss function 3.
After the above three Euclidean loss functions are obtained, a summation operation can be performed on the above three Euclidean loss functions, the loss function A of neural network A, the loss function B of neural network B and the loss function C of neural network C to obtain the target loss function. Neural network A, neural network B and neural network C are then trained using the target loss function, finally yielding the trained neural network A, neural network B and neural network C.
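The pairwise construction of scenario 2 generalizes to any number of branches: one Euclidean loss is formed per pair of distance matrices. A sketch assuming NumPy, with hypothetical names, and again assuming a scalar reduction by summation:

```python
from itertools import combinations
import numpy as np

def pairwise_euclidean_losses(dist_mats, weights=None):
    """One Euclidean loss per pair of Euclidean distance matrices.

    dist_mats: dict mapping a branch name to its (m, m) distance matrix.
    weights:   optional dict mapping a (name, name) pair to its weight M
               (M1, M2, M3, ... in the text); defaults to 1.0 per pair.
    """
    weights = weights or {}
    losses = {}
    for (na, Da), (nb, Db) in combinations(sorted(dist_mats.items()), 2):
        M = weights.get((na, nb), 1.0)
        losses[(na, nb)] = M * np.abs(Da - Db).sum()
    return losses

mats = {
    "A": np.zeros((2, 2)),
    "B": np.ones((2, 2)),
    "C": np.ones((2, 2)),
}
losses = pairwise_euclidean_losses(mats, weights={("A", "B"): 2.0})
```

For three branches this yields exactly the three pairwise losses of the embodiment; the target loss is then their sum plus each branch's own loss function.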
An even more complex structure can also be set up; for example, 3-branch or 4-branch models can be merged for training, and the model whose performance and power consumption reach the best balance point is finally selected for use on the mobile phone side. The structure with 3 branches is as shown in the figure below.
In conclusion advantage of the invention is that the following:
1. by a kind of preferable training method of versatility, train low in energy consumption, speed is fast, but accuracy matches in excellence or beauty large-sized model
Miniature neural network model, preferably solve that cell phone electricity is low, computing resource is limited answers depth learning technology
Limitation.
2. The training method of the present embodiment is simple and easy to popularize. The loss function used in the present embodiment is fairly general and places few requirements on the model structure, so it can be used between different model structures and for different deep learning tasks, giving it good transferability.
3. Without needing to change the structure of the model, the task potential of the basic model can be developed to the greatest extent, reducing the cost and investment of applying deep learning technology on the mobile side.
Embodiment four:
An embodiment of the present invention further provides a training device of a neural network. The training device of the neural network is mainly used to execute the training method of a neural network provided by the above content of the embodiments of the present invention. The training device of the neural network provided by the embodiment of the present invention is specifically introduced below.
Fig. 6 is a schematic diagram of a training device of a neural network according to an embodiment of the present invention. As shown in Fig. 6, the training device of the neural network mainly comprises an acquiring unit 10, a computing unit 20, a determination unit 30 and a training unit 40, in which:
the acquiring unit 10 is used for obtaining networks to be trained, wherein there are multiple networks to be trained;
the computing unit 20 is used for calculating the Euclidean distance matrix corresponding to the prediction data output by each network to be trained, obtaining multiple Euclidean distance matrices;
the determination unit 30 is used for determining one or more Euclidean loss functions using the multiple Euclidean distance matrices;
the training unit 40 is used for training each network to be trained in combination with the one or more Euclidean loss functions and the loss function of each network to be trained.
As can be seen from the above description, in the present embodiment, each network to be trained is trained with the Euclidean loss function, which characterizes the difference between the networks to be trained, together with the loss function of each network to be trained. This can guide the networks to be trained to learn from each other during the training stage and improve the performance of each network to be trained through mutual learning, so that a neural network model that is low in power consumption and fast in speed, with accuracy rivaling a large-scale model, can be trained. In this way, the technical problem that existing intelligent mobile terminals, due to limited battery capacity and limited computing resources, face certain limitations in applying deep learning technology can be better solved.
Optionally, the training unit 40 is used for: performing a summation operation on the one or more Euclidean loss functions and the loss functions of the multiple networks to be trained to obtain a target loss function; and training each network to be trained using the target loss function.
Optionally, the determination unit comprises: a first determining module, used for determining at least one Euclidean distance matrix group from the multiple Euclidean distance matrices according to the association relationship between the multiple networks to be trained, wherein each Euclidean distance matrix group contains multiple Euclidean distance matrices to be calculated; and a second determining module, used for determining the one or more Euclidean loss functions using the at least one Euclidean distance matrix group.
Optionally, the second determining module is used for: calculating the target Euclidean distance matrix based on the Euclidean distance matrices to be calculated in each Euclidean distance matrix group; obtaining the weight of the target Euclidean distance matrix; and taking the product of the weight and the target Euclidean distance matrix as the Euclidean loss function.
Optionally, the second determining module is further used for: in the case where the multiple Euclidean distance matrices to be calculated include the first Euclidean distance matrix and the second Euclidean distance matrix, successively calculating the Euclidean distance between the first element da_kj and the second element db_kj to obtain the element dm_kj in the target Euclidean distance matrix, where the first element da_kj is the element in row k, column j of the first Euclidean distance matrix, the second element db_kj is the element in row k, column j of the second Euclidean distance matrix, and k and j each take the values 1 to m in turn.
Optionally, the computing unit 20 is used for: calculating the Euclidean distance between any two row vectors in each set of prediction data, obtaining the multiple Euclidean distance matrices.
Optionally, the computing unit 20 is further used for: using the formula da_kj = sqrt(Σ_i (a_ki - a_ji)^2) to calculate the Euclidean distance between any two row vectors in the prediction data, obtaining the Euclidean distance matrix Da = (da_kj), where da_kj is the element in row k, column j of the Euclidean distance matrix, a_ki is the element in row k, column i of the prediction data, and a_ji is the element in row j, column i of the prediction data.
The technical effect and realization principle of the device provided by the embodiment of the present invention are the same as those of the preceding method embodiments. For brevity, where the device embodiment section does not mention something, reference can be made to the corresponding content in the preceding method embodiments.
The present embodiment further provides a computer-readable medium having non-volatile program code executable by a processor, the program code causing the processor to execute the training method of the above neural network.
In addition, in the description of the embodiments of the present invention, unless otherwise specifically defined or limited, the terms "installation", "connected" and "connection" shall be understood in a broad sense; for example, a connection may be a fixed connection, a detachable connection or an integral connection; it may be a mechanical connection or an electrical connection; it may be a direct connection, an indirect connection through an intermediary, or an internal connection between two elements. For those of ordinary skill in the art, the concrete meaning of the above terms in the present invention can be understood according to the concrete circumstances.
In the description of the present invention, it should be noted that orientation or positional relationships indicated by terms such as "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner" and "outer" are based on the orientation or positional relationships shown in the drawings, and are merely for the convenience of describing the present invention and simplifying the description, rather than indicating or implying that the device or element referred to must have a particular orientation or be constructed and operated in a specific orientation; they are therefore not to be understood as limiting the present invention. In addition, the terms "first", "second" and "third" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance.
It is apparent to those skilled in the art that, for convenience and simplicity of description, the specific working processes of the systems, devices and units described above can refer to the corresponding processes in the foregoing method embodiments and are not described again here.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, devices and methods can be realized in other ways. The device embodiments described above are merely illustrative; for example, the division of the units is only a division by logical function, and there may be other division manners in actual implementation; for another example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. As another point, the mutual coupling, direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some communication interfaces, devices or units, and may be electrical, mechanical or in other forms.
The units illustrated as separate components may or may not be physically separated, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the various embodiments of the present invention may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit.
If the function is realized in the form of a software functional unit and sold or used as an independent product, it can be stored in a non-volatile computer-readable storage medium executable by a processor. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the existing technology, or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash disk, a mobile hard disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk or an optical disk.
Finally, it should be noted that the embodiments described above are only specific embodiments of the present invention, used to illustrate the technical solution of the present invention rather than to limit it, and the protection scope of the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that any person skilled in the art, within the technical scope disclosed by the present invention, can still modify the technical solutions recorded in the foregoing embodiments, easily conceive of variations, or make equivalent replacements of some of the technical features therein; and such modifications, variations or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all be covered within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (10)
1. A training method of a neural network, characterized by comprising:
obtaining networks to be trained, wherein there are multiple networks to be trained;
calculating the Euclidean distance matrix corresponding to the prediction data output by each network to be trained, obtaining multiple Euclidean distance matrices;
determining one or more Euclidean loss functions using the multiple Euclidean distance matrices;
training each network to be trained in combination with the one or more Euclidean loss functions and the loss function of each network to be trained.
2. The method according to claim 1, characterized in that training each network to be trained in combination with the one or more Euclidean loss functions and the loss function of each network to be trained comprises:
performing a summation operation on the one or more Euclidean loss functions and the loss functions of the multiple networks to be trained to obtain a target loss function;
training each network to be trained using the target loss function.
3. The method according to claim 1 or 2, characterized in that determining one or more Euclidean loss functions using the multiple Euclidean distance matrices comprises:
determining at least one Euclidean distance matrix group from the multiple Euclidean distance matrices according to the association relationship between the multiple networks to be trained, wherein each Euclidean distance matrix group contains multiple Euclidean distance matrices to be calculated;
determining the one or more Euclidean loss functions using the at least one Euclidean distance matrix group.
4. The method according to claim 3, characterized in that determining the one or more Euclidean loss functions using the at least one Euclidean distance matrix group comprises:
calculating the target Euclidean distance matrix based on the Euclidean distance matrices to be calculated in each Euclidean distance matrix group;
obtaining the weight of the target Euclidean distance matrix;
taking the product of the weight and the target Euclidean distance matrix as the Euclidean loss function.
5. The method according to claim 4, characterized in that the multiple Euclidean distance matrices to be calculated include: a first Euclidean distance matrix and a second Euclidean distance matrix;
calculating the target Euclidean distance matrix based on the Euclidean distance matrices to be calculated in each Euclidean distance matrix group comprises:
successively calculating the Euclidean distance between a first element da_kj and a second element db_kj to obtain the element dm_kj in the target Euclidean distance matrix, where the first element da_kj is the element in row k, column j of the first Euclidean distance matrix, the second element db_kj is the element in row k, column j of the second Euclidean distance matrix, and k and j each take the values 1 to m in turn.
6. The method according to claim 1 or 2, characterized in that calculating the Euclidean distance between the prediction data output by each network to be trained comprises:
calculating the Euclidean distance between any two row vectors in each set of prediction data, obtaining the multiple Euclidean distance matrices.
7. The method according to claim 6, characterized in that calculating the Euclidean distance between any two row vectors in each set of prediction data comprises:
using the formula da_kj = sqrt(Σ_i (a_ki - a_ji)^2) to calculate the Euclidean distance between any two row vectors in the prediction data, obtaining the Euclidean distance matrix Da = (da_kj), where da_kj is the element in row k, column j of the Euclidean distance matrix, a_ki is the element in row k, column i of the prediction data, and a_ji is the element in row j, column i of the prediction data.
8. A training device of a neural network, characterized by comprising:
an acquiring unit, used for obtaining networks to be trained, wherein there are multiple networks to be trained;
a computing unit, used for calculating the Euclidean distance matrix corresponding to the prediction data output by each network to be trained, obtaining multiple Euclidean distance matrices;
a determination unit, used for determining one or more Euclidean loss functions using the multiple Euclidean distance matrices;
a training unit, used for training each network to be trained in combination with the one or more Euclidean loss functions and the loss function of each network to be trained.
9. An electronic equipment, comprising a memory, a processor and a computer program stored on the memory and runnable on the processor, characterized in that the processor, when executing the computer program, realizes the method according to any one of claims 1 to 7.
10. A computer-readable medium having non-volatile program code executable by a processor, characterized in that the program code causes the processor to execute the method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810847796.2A CN109086871A (en) | 2018-07-27 | 2018-07-27 | Training method, device, electronic equipment and the computer-readable medium of neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810847796.2A CN109086871A (en) | 2018-07-27 | 2018-07-27 | Training method, device, electronic equipment and the computer-readable medium of neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109086871A true CN109086871A (en) | 2018-12-25 |
Family
ID=64833335
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810847796.2A Pending CN109086871A (en) | 2018-07-27 | 2018-07-27 | Training method, device, electronic equipment and the computer-readable medium of neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109086871A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110298240A (en) * | 2019-05-21 | 2019-10-01 | 北京迈格威科技有限公司 | A kind of user vehicle recognition methods, device, system and storage medium |
CN110310629A (en) * | 2019-07-16 | 2019-10-08 | 湖南检信智能科技有限公司 | Speech recognition control system based on text emotion classification |
CN111667066A (en) * | 2020-04-23 | 2020-09-15 | 北京旷视科技有限公司 | Network model training and character recognition method and device and electronic equipment |
CN112085041A (en) * | 2019-06-12 | 2020-12-15 | 北京地平线机器人技术研发有限公司 | Training method and training device for neural network and electronic equipment |
CN112489732A (en) * | 2019-09-12 | 2021-03-12 | 罗伯特·博世有限公司 | Graphic transducer neural network force field for predicting atomic force and energy in MD simulation |
CN114120289A (en) * | 2022-01-25 | 2022-03-01 | 中科视语(北京)科技有限公司 | Method and system for identifying driving area and lane line |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110298240A (en) * | 2019-05-21 | 2019-10-01 | 北京迈格威科技有限公司 | A kind of user vehicle recognition methods, device, system and storage medium |
CN110298240B (en) * | 2019-05-21 | 2022-05-06 | 北京迈格威科技有限公司 | Automobile user identification method, device, system and storage medium |
CN112085041A (en) * | 2019-06-12 | 2020-12-15 | 北京地平线机器人技术研发有限公司 | Training method and training device for neural network and electronic equipment |
CN112085041B (en) * | 2019-06-12 | 2024-07-12 | 北京地平线机器人技术研发有限公司 | Training method and training device of neural network and electronic equipment |
CN110310629A (en) * | 2019-07-16 | 2019-10-08 | 湖南检信智能科技有限公司 | Speech recognition control system based on text emotion classification |
CN112489732A (en) * | 2019-09-12 | 2021-03-12 | 罗伯特·博世有限公司 | Graphic transducer neural network force field for predicting atomic force and energy in MD simulation |
CN111667066A (en) * | 2020-04-23 | 2020-09-15 | 北京旷视科技有限公司 | Network model training and character recognition method and device and electronic equipment |
CN111667066B (en) * | 2020-04-23 | 2024-06-11 | 北京旷视科技有限公司 | Training method and device of network model, character recognition method and device and electronic equipment |
CN114120289A (en) * | 2022-01-25 | 2022-03-01 | 中科视语(北京)科技有限公司 | Method and system for identifying driving area and lane line |
CN114120289B (en) * | 2022-01-25 | 2022-05-03 | 中科视语(北京)科技有限公司 | Method and system for identifying driving area and lane line |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11710041B2 (en) | Feature map and weight selection method and accelerating device | |
CN109086871A (en) | Training method, device, electronic equipment and the computer-readable medium of neural network | |
CN110473141B (en) | Image processing method, device, storage medium and electronic equipment | |
CN109543832B (en) | Computing device and board card | |
CN109376852B (en) | Arithmetic device and arithmetic method | |
CN110163358A (en) | A kind of computing device and method | |
CN108764466A (en) | Convolutional neural networks hardware based on field programmable gate array and its accelerated method | |
CN110263909A (en) | Image-recognizing method and device | |
CN111047022B (en) | Computing device and related product | |
CN109670581B (en) | Computing device and board card | |
CN109754084B (en) | Network structure processing method and device and related products | |
CN114416260B (en) | Image processing method, device, electronic equipment and storage medium | |
CN110163350A (en) | A kind of computing device and method | |
CN106373112A (en) | Image processing method, image processing device and electronic equipment | |
CN109711540B (en) | Computing device and board card | |
CN113240128B (en) | Collaborative training method and device for data unbalance, electronic equipment and storage medium | |
CN114360018A (en) | Rendering method and device of three-dimensional facial expression, storage medium and electronic device | |
CN109635706A (en) | Gesture identification method, equipment, storage medium and device neural network based | |
CN109711538B (en) | Operation method, device and related product | |
CN109740730B (en) | Operation method, device and related product | |
CN108960420A (en) | Processing method and accelerator | |
CN109740729B (en) | Operation method, device and related product | |
CN112801276B (en) | Data processing method, processor and electronic equipment | |
CN109146069B (en) | Arithmetic device, arithmetic method, and chip | |
CN112990370B (en) | Image data processing method and device, storage medium and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20181225 |