CN109086871A - Training method, device, electronic equipment and computer-readable medium for a neural network - Google Patents

Training method, device, electronic equipment and computer-readable medium for a neural network

Info

Publication number
CN109086871A
CN109086871A (application CN201810847796.2A)
Authority
CN
China
Prior art keywords
Euclidean distance
trained
network
distance matrix
Euclidean
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810847796.2A
Other languages
Chinese (zh)
Inventor
黄鼎
朱星宇
张�诚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Megvii Technology Co Ltd
Beijing Maigewei Technology Co Ltd
Original Assignee
Beijing Maigewei Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Maigewei Technology Co Ltd filed Critical Beijing Maigewei Technology Co Ltd
Priority to CN201810847796.2A
Publication of CN109086871A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/06: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention provides a training method and apparatus for a neural network, an electronic device, and a computer-readable medium, relating to the technical field of artificial intelligence. The method comprises: obtaining networks to be trained, wherein there are a plurality of networks to be trained; computing the Euclidean distance matrix corresponding to the prediction data output by each network to be trained, obtaining a plurality of Euclidean distance matrices; determining one or more Euclidean loss functions from the plurality of Euclidean distance matrices; and training each network to be trained by combining the one or more Euclidean loss functions with the loss function of each network to be trained. By this method, neural networks with low energy consumption and good performance can be used on intelligent mobile terminals.

Description

Training method, device, electronic equipment and computer-readable medium for a neural network
Technical field
The present invention relates to the field of artificial intelligence, and in particular to a training method and apparatus for a neural network, an electronic device, and a computer-readable medium.
Background technique
With the rapid development of artificial intelligence technology, it has begun to be applied in various physical products, for example, photographic devices and image processing devices. As intelligent mobile terminals become more intelligent, artificial intelligence technology is applied in an increasingly wide range of scenarios on intelligent mobile terminals (for example, smartphones).
Taking convolutional neural networks as an example, during prediction a convolutional neural network performs various complex convolution operations and matrix operations internally, which places certain demands on the computing resources and energy consumption of the device. In practical applications, achieving a better scene-recognition effect requires a more complex network model, and a more complex, larger model necessarily consumes more computing resources and generates greater power consumption. On large servers, attention is generally focused on the performance of the model, but the computing resources of an intelligent mobile terminal are limited, and its limited battery capacity makes it more sensitive to power consumption. Therefore, an intelligent mobile terminal differs from computer terminals and servers, and the contradiction between performance and power consumption is more acute on intelligent mobile terminals.
When applying neural networks, existing intelligent mobile terminals either sacrifice battery life by choosing a structurally more complex network model that performs better but consumes more energy, or choose a small neural network structure, obtaining low energy consumption and long battery life at the cost of reduced deep-learning performance. Both choices have certain disadvantages.
Summary of the invention
In view of this, the purpose of the present invention is to provide a training method and apparatus for a neural network, an electronic device, and a computer-readable medium, by which neural networks with low energy consumption and good performance can be used on intelligent mobile terminals.
In a first aspect, an embodiment of the present invention provides a training method of a neural network, comprising: obtaining networks to be trained, wherein there are a plurality of networks to be trained; computing the Euclidean distance matrix corresponding to the prediction data output by each network to be trained, obtaining a plurality of Euclidean distance matrices; determining one or more Euclidean loss functions from the plurality of Euclidean distance matrices; and training each network to be trained by combining the one or more Euclidean loss functions with the loss function of each network to be trained.
Further, training each network to be trained by combining the one or more Euclidean loss functions with the loss function of each network to be trained comprises: performing a summation operation on the one or more Euclidean loss functions and the loss functions of the plurality of networks to be trained, obtaining a target loss function; and training each network to be trained using the target loss function.
Further, determining one or more Euclidean loss functions from the plurality of Euclidean distance matrices comprises: according to the association relationships among the plurality of networks to be trained, determining at least one Euclidean distance matrix group from the plurality of Euclidean distance matrices, wherein each Euclidean distance matrix group includes a plurality of Euclidean distance matrices to be computed; and determining the one or more Euclidean loss functions using the at least one Euclidean distance matrix group.
Further, determining the one or more Euclidean loss functions using the at least one Euclidean distance matrix group comprises: computing a target Euclidean distance matrix based on the Euclidean distance matrices to be computed in each Euclidean distance matrix group; obtaining the weight of the target Euclidean distance matrix; and taking the product of the weight and the target Euclidean distance matrix as the Euclidean loss function.
Further, the plurality of Euclidean distance matrices to be computed comprises a first Euclidean distance matrix and a second Euclidean distance matrix, and computing the Euclidean distance between the Euclidean distance matrices to be computed in each Euclidean distance matrix group to obtain the target Euclidean distance matrix comprises: successively computing the Euclidean distance between a first element da_kj and a second element db_kj, obtaining the element dm_kj in the target Euclidean distance matrix, wherein the first element da_kj is the element in row k and column j of the first Euclidean distance matrix, the second element db_kj is the element in row k and column j of the second Euclidean distance matrix, and k and j each range from 1 to m.
Further, computing the Euclidean distance between the prediction data output by each network to be trained comprises: computing the Euclidean distance between any two row vectors in each set of prediction data, obtaining the plurality of Euclidean distance matrices.
Further, computing the Euclidean distance between any two row vectors in each set of prediction data comprises: computing, by the formula da_kj = sqrt( Σ_i (a_ki − a_ji)^2 ), the Euclidean distance between any two row vectors in the prediction data, obtaining the Euclidean distance matrix Da = (da_kj), k, j = 1, ..., m, wherein da_kj is the element in row k and column j of the Euclidean distance matrix, a_ki is the element in row k and column i of the prediction data, and a_ji is the element in row j and column i of the prediction data.
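As an editorial illustration (not part of the original disclosure), the following NumPy sketch computes the pairwise row-vector distances defined by the formula above; the function name and the toy data are assumptions.

```python
import numpy as np

def euclidean_distance_matrix(a):
    """Pairwise Euclidean distances between the row vectors of `a`.

    Implements da_kj = sqrt(sum_i (a_ki - a_ji)^2) from the text:
    element (k, j) is the distance between row k and row j, giving an
    m x m symmetric matrix with a zero diagonal.
    """
    a = np.asarray(a, dtype=float)
    diff = a[:, None, :] - a[None, :, :]      # shape (m, m, n)
    return np.sqrt((diff ** 2).sum(axis=-1))  # shape (m, m)

# Toy prediction data: m = 3 samples, n = 2 output dimensions.
pred = np.array([[0.0, 0.0],
                 [3.0, 4.0],
                 [0.0, 4.0]])
D = euclidean_distance_matrix(pred)  # D[0, 1] == 5.0, D[1, 2] == 3.0
```

The same matrix would be computed once per network to be trained, on the prediction data that network outputs for a common batch.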
In a second aspect, an embodiment of the present invention further provides a training apparatus for a neural network, comprising: an obtaining unit for obtaining networks to be trained, wherein there are a plurality of networks to be trained; a computing unit for computing the Euclidean distance matrix corresponding to the prediction data output by each network to be trained, obtaining a plurality of Euclidean distance matrices; a determining unit for determining one or more Euclidean loss functions from the plurality of Euclidean distance matrices; and a training unit for training each network to be trained by combining the one or more Euclidean loss functions with the loss function of each network to be trained.
In a third aspect, an embodiment of the present invention provides an electronic device, comprising a memory, a processor, and a computer program stored on the memory and runnable on the processor, wherein the processor, when executing the computer program, implements the method described in any one of the above first aspect.
In a fourth aspect, an embodiment of the present invention provides a computer-readable medium having non-volatile program code executable by a processor, wherein the program code causes the processor to execute the method described in any one of the above first aspect.
In embodiments of the present invention, networks to be trained are obtained first; then the Euclidean distance matrix corresponding to the prediction data output by each network to be trained is computed, obtaining a plurality of Euclidean distance matrices; next, one or more Euclidean loss functions are determined from the plurality of Euclidean distance matrices; finally, each neural network to be trained is trained by combining the one or more Euclidean loss functions with the loss function of each neural network to be trained.
As can be seen from the above description, in this embodiment each network to be trained is trained with both the Euclidean loss functions, which characterize the differences between the networks to be trained, and the loss function of each network to be trained. This guides the networks to learn from one another during the training stage and improves the performance of each network through mutual learning, so that the trained models are low in energy consumption and fast while their accuracy is comparable to that of large-scale neural network models. In this way, the technical problem that existing intelligent mobile terminals, whose battery capacity and computing resources are limited, are restricted in applying deep-learning technology can be better solved.
Other features and advantages of the present invention will be described in the following specification and will partly become apparent from the specification or be understood through implementation of the present invention. The objectives and other advantages of the invention are realized and obtained by the structures particularly pointed out in the specification, the claims, and the accompanying drawings.
To make the above objects, features and advantages of the present invention clearer and more comprehensible, preferred embodiments are described in detail below in conjunction with the accompanying drawings.
Brief description of the drawings
To illustrate the specific embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the specific embodiments or the prior art are briefly introduced below. Apparently, the drawings described below show some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative labor.
Fig. 1 is a schematic diagram of an electronic device according to an embodiment of the present invention;
Fig. 2 is a flow chart of a training method of a neural network according to an embodiment of the present invention;
Fig. 3 is a flow chart of another training method of a neural network according to an embodiment of the present invention;
Fig. 4 is a schematic flow chart of optionally training a neural network A and a neural network B according to an embodiment of the present invention;
Fig. 5 is a schematic flow chart of optionally training a neural network A, a neural network B and a neural network C according to an embodiment of the present invention;
Fig. 6 is a schematic diagram of a training apparatus of a neural network according to an embodiment of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the present invention are described clearly and completely below in conjunction with the accompanying drawings. Apparently, the described embodiments are some, rather than all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.
First, an electronic device 100 for implementing an embodiment of the present invention is described with reference to Fig. 1. This electronic device can be used to run the training method of the neural network of the various embodiments of the present invention.
As shown in Fig. 1, the electronic device 100 includes one or more processors 102, one or more memories 104, an input device 106, an output device 108, and a data collector 110; these components are interconnected through a bus system 112 and/or other forms of connection mechanisms (not shown). It should be noted that the components and structure of the electronic device 100 shown in Fig. 1 are only exemplary rather than restrictive; as needed, the electronic device may also have other components and structures.
The processor 102 can be realized in at least one hardware form of a digital signal processor (DSP), a field-programmable gate array (FPGA), a programmable logic array (PLA), and an application-specific integrated circuit (ASIC). The processor 102 can be a central processing unit (CPU), a graphics processing unit (GPU), or a processing unit of another form with data processing capability and/or instruction execution capability, and can control the other components in the electronic device 100 to perform desired functions.
The memory 104 may include one or more computer program products, and the computer program products may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), hard disks, and flash memory. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 102 may run the program instructions to realize the client functionality (realized by the processor) in the embodiments of the present invention described below and/or other desired functions. Various application programs and various data, such as the data used and/or generated by the application programs, may also be stored in the computer-readable storage medium.
The input device 106 may be a device used by a user to input instructions, and may include one or more of a keyboard, a mouse, a microphone, a touch screen, and the like.
The output device 108 may output various information (for example, images or sounds) to the outside (for example, a user), and may include one or more of a display, a loudspeaker, and the like.
The data collector 110 is used for data acquisition, and may also store the collected data in the memory 104 for use by other components.
Illustratively, the electronic device for realizing the training method of the neural network according to an embodiment of the present invention can be implemented as an intelligent terminal such as a video camera, a capture machine, a smartphone, or a tablet computer.
According to an embodiment of the present invention, an embodiment of a training method of a neural network is provided. It should be noted that the steps illustrated in the flow charts of the drawings can be executed in a computer system such as a set of computer-executable instructions, and, although a logical order is shown in the flow charts, in some cases the steps shown or described can be executed in an order different from the one herein.
Fig. 2 is a flow chart of a training method of a neural network according to an embodiment of the present invention. As shown in Fig. 2, the method includes the following steps:
Step S202: obtain networks to be trained, wherein there are a plurality of networks to be trained;
In this embodiment, the number of networks to be trained is at least two. A network to be trained can have a relatively simple network structure, for example, a neural network with fewer convolutional-layer channels; it can also be a more complex neural network, for example, one with more convolutional-layer channels. Here, a more complex network to be trained refers to a network with more convolutional-layer channels, or a network to be trained with a deeper stack of layers.
Under normal circumstances, a more complex neural network has better data characterization ability and can show higher accuracy on some tasks.
Step S204 calculates the corresponding Euclidean distance matrix of prediction data of each network output to be trained (hereafter It is described as the corresponding Euclidean distance matrix of network to be trained), obtain multiple Euclidean distance matrixes;
Step S206 determines one or more European loss functions using the multiple Euclidean distance matrix, wherein described European loss function is used to characterize the otherness between network to be trained;
Step S208, in conjunction with one or more of European loss functions and each loss function training of network to be trained The network each to be trained.
For example, an European loss function can be determined according to the corresponding Euclidean distance matrix of two networks to be trained, it should European loss function is used to characterize the otherness between the two networks to be trained.
As can be seen from the above description, in this embodiment each network to be trained is trained with both the Euclidean loss functions, which characterize the differences between the networks to be trained, and the loss function of each network to be trained. This guides the networks to learn from one another during the training stage and improves the performance of each network through mutual learning, so that the trained models are low in energy consumption and fast while their accuracy is comparable to that of large-scale neural network models. In this way, the technical problem that existing intelligent mobile terminals, whose battery capacity and computing resources are limited, are restricted in applying deep-learning technology can be better solved.
As can be seen from the above description, in this embodiment the networks to be trained are obtained first; then the Euclidean distance matrix corresponding to the prediction data output by each network to be trained is computed, and one or more Euclidean loss functions are determined from the plurality of Euclidean distance matrices.
In this embodiment, Step S204, computing the Euclidean distance between the prediction data output by each network to be trained, includes: computing the Euclidean distance between any two row vectors in each set of prediction data, obtaining the Euclidean distance matrix. The specific calculation formula for the Euclidean distance matrix will be introduced in the following embodiments.
In this embodiment, Step S206, determining one or more Euclidean loss functions from the plurality of Euclidean distance matrices, includes Steps S2061 and S2062.
Step S2061: according to the association relationships among the plurality of networks to be trained, determine at least one Euclidean distance matrix group from the plurality of Euclidean distance matrices, wherein each Euclidean distance matrix group includes a plurality of (for example, two) Euclidean distance matrices to be computed. The above association relationships are pre-set. When there are two networks to be trained, the two networks to be trained have an association relationship, as shown in Fig. 4. When there are more than two networks to be trained, for example the three shown in Fig. 5, network A to be trained and network B to be trained have an association relationship, network A to be trained and network C to be trained have an association relationship, and network C to be trained and network B to be trained have an association relationship.
Step S2062: determine the one or more Euclidean loss functions using the at least one Euclidean distance matrix group. One Euclidean loss function can be determined from each Euclidean distance matrix group; that Euclidean loss function characterizes the difference between the networks to be trained corresponding to the Euclidean distance matrix group. That is, for a plurality of networks to be trained that have an association relationship, the difference between them is characterized by calculating one Euclidean loss function.
Optionally, Step S2062, determining the one or more Euclidean loss functions using the at least one Euclidean distance matrix group, comprises: computing a target Euclidean distance matrix based on the Euclidean distance matrices to be computed in each Euclidean distance matrix group; obtaining the weight of the target Euclidean distance matrix; and taking the product of the weight and the target Euclidean distance matrix as the Euclidean loss function.
Specifically, in this embodiment, the target Euclidean distance matrix is first computed based on the Euclidean distance matrices to be computed in each Euclidean distance matrix group. If there are two Euclidean distance matrices to be computed, the Euclidean distance between these two matrices is computed as the target Euclidean distance matrix. If there are more than two, for example three, the Euclidean distance between any two of the three matrices to be computed is computed first, obtaining an intermediate Euclidean distance matrix; the Euclidean distance between this intermediate matrix and the remaining matrix to be computed is then computed, obtaining the target Euclidean distance matrix. The specific calculation formula for the Euclidean distance between any two Euclidean distance matrices to be computed will be introduced in the following embodiments.
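As an editorial sketch of the reduction just described, assuming the element-wise distance sqrt((x_kj − y_kj)^2) = |x_kj − y_kj| used elsewhere in the text, a matrix group can be folded into a single target matrix; the fold order and the function name are our assumptions.

```python
import numpy as np

def reduce_distance_matrices(matrices):
    """Fold a group of Euclidean distance matrices into one target matrix.

    For two matrices this is the element-wise distance |x_kj - y_kj|;
    for three or more, the running result plays the role of the
    intermediate Euclidean distance matrix described in the text.
    """
    result = np.asarray(matrices[0], dtype=float)
    for m in matrices[1:]:
        result = np.abs(result - np.asarray(m, dtype=float))
    return result

# Toy group of three 2 x 2 distance matrices.
group = [np.full((2, 2), 5.0), np.full((2, 2), 3.0), np.full((2, 2), 1.0)]
target = reduce_distance_matrices(group)  # |(|5 - 3|) - 1| = 1 everywhere
```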
In another example, in Step S206 the Euclidean distance between the Euclidean distance matrices corresponding to every two networks to be trained among the plurality of networks to be trained can also be computed, obtaining the target Euclidean distance matrices.
Then, the weight of the target Euclidean distance matrix is obtained.
Finally, the product of the weight and the target Euclidean distance matrix is taken as the Euclidean loss function, wherein, when there are multiple Euclidean loss functions, the weights corresponding to any two Euclidean loss functions may be the same or different. For example, if the weight of the target Euclidean distance matrix is W, the Euclidean loss function can be expressed as EuclideanLoss = W * D_mutual, where EuclideanLoss is the Euclidean loss function and D_mutual is the target Euclidean distance matrix. It should be noted that W is set empirically or is the optimum value obtained by parameter search.
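As an editorial sketch of EuclideanLoss = W * D_mutual: the text leaves the reduction of the weighted matrix to a scalar implicit, so the summation below is an assumption, as are the names.

```python
import numpy as np

def euclidean_loss(target_distance_matrix, weight):
    """EuclideanLoss = W * D_mutual.

    Reducing the weighted target Euclidean distance matrix to a scalar
    by summing its entries is an assumption; the text only specifies
    the product of the weight and the matrix.
    """
    return weight * float(np.asarray(target_distance_matrix).sum())

d_mutual = np.array([[0.0, 1.0],
                     [1.0, 0.0]])
loss = euclidean_loss(d_mutual, weight=0.5)  # 0.5 * (0 + 1 + 1 + 0) = 1.0
```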
After the Euclidean loss function EuclideanLoss is obtained by the method described above, each network to be trained can be trained by combining at least the one or more Euclidean loss functions with the loss function of each network to be trained.
In an optional embodiment, Step S208, training each network to be trained by combining the one or more Euclidean loss functions with the loss function of each network to be trained, includes the following steps:
Step S2081: perform a summation operation on the one or more Euclidean loss functions and the loss functions of the plurality of networks to be trained, obtaining a target loss function;
Step S2082: train each network to be trained using the target loss function.
In this embodiment, after the one or more Euclidean loss functions are obtained, the target loss function is determined using the one or more Euclidean loss functions and the loss function of each network to be trained. For example, the one or more Euclidean loss functions and the loss function of each network to be trained can be summed to obtain the target loss function. After the target loss function is obtained, each network to be trained can be trained with it.
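The summation described above can be sketched in a few lines of Python; the loss values are placeholder numbers, not from the disclosure.

```python
def target_loss(euclidean_losses, network_losses):
    """Target loss = sum of the Euclidean loss functions plus the sum
    of the per-network loss functions, as described above."""
    return sum(euclidean_losses) + sum(network_losses)

# Placeholder values: one Euclidean loss and two per-network losses.
total = target_loss(euclidean_losses=[0.2], network_losses=[1.1, 0.7])
```

In a training loop each term would be recomputed per batch, and the gradient of this single scalar would update every network to be trained at once.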
As can be seen from the above description, the Euclidean loss functions are calculated from the prediction data output during training by each of the networks to be trained, and are used to characterize the differences between the networks to be trained. When the one or more Euclidean loss functions are added to the respective loss functions of the networks to be trained to form the final target loss function for optimization, and each network to be trained is trained with this target loss function, the method not only trains each network but also guides the networks to learn each other's strengths and improve together, thereby raising the performance of each network to be trained.
For example, suppose the network model includes two networks to be trained: one with a simple structure and one that is complex. Training the structurally simple network and the complex network together with the method described above enables the two networks to be trained to learn each other's strengths. If the strength of the complex network is good performance and high accuracy, the structurally simple network will learn this strength during training, so that it likewise achieves higher performance and accuracy.
Practice has shown that, when training is sufficient and the weight W is set reasonably, the method provided by the embodiment of the present invention can make each network to be trained achieve higher accuracy than when each network is trained individually.
In this embodiment, after the training of the above networks to be trained is completed, one of them (for example, a network to be trained with a relatively simple structure) can be taken out, after weighing the trade-offs, and used on an intelligent mobile terminal according to the terminal's requirements for model power consumption, speed, and accuracy.
Under normal circumstances, each branch group includes two networks to be trained. In this case, as shown in Fig. 3, Step S204, computing the Euclidean distance matrix corresponding to the prediction data output by each network to be trained to obtain a plurality of Euclidean distance matrices, includes the following steps:
Step S301: obtain first prediction data and second prediction data to be computed, wherein the first prediction data and the second prediction data are the prediction data output by the two networks to be trained;
Step S302: compute the Euclidean distance between any two row vectors in the first prediction data, obtaining a first Euclidean distance matrix;
Step S303: compute the Euclidean distance between any two row vectors in the second prediction data, obtaining a second Euclidean distance matrix;
Step S304: compute the Euclidean loss function based on the first Euclidean distance matrix and the second Euclidean distance matrix.
In the present embodiment, firstly, obtaining the first prediction data and the second prediction data to be calculated.
Then, the Euclidean distance in the first prediction data between any two row vector is calculated, the first Euclidean distance is obtained Matrix.It is alternatively possible to pass through formulaCalculate in the first prediction data any two row vector it Between Euclidean distance, obtain the first Euclidean distance matrixWherein, dakjFor the first Europe The element that row k jth arranges in formula distance matrix, akiFor the element that row k i-th in the first prediction data arranges, ajiFor the first prediction The element that jth row i-th arranges in data.
Afterwards, the Euclidean distance between every two row vectors in the second prediction data is calculated to obtain the second Euclidean distance matrix. Optionally, the Euclidean distance between any two row vectors in the second prediction data can be calculated by the formula db_kj = sqrt(sum_i (b_ki - b_ji)^2), obtaining the second Euclidean distance matrix DB = (db_kj), an m × m matrix, where db_kj is the element in row k, column j of the second Euclidean distance matrix, b_ki is the element in row k, column i of the second prediction data, and b_ji is the element in row j, column i of the second prediction data.
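As a non-authoritative sketch (not part of the patent itself), the row-vector distance computation in steps S302 and S303 can be written in a few lines of NumPy; the function name and the use of NumPy are assumptions for illustration only:

```python
import numpy as np

def pairwise_row_distances(pred):
    """Return the m x m matrix whose (k, j) entry is the Euclidean
    distance between row k and row j of the m x n prediction data.
    Illustrative sketch; the patent does not prescribe any library."""
    # diff[k, j, i] = pred[k, i] - pred[j, i], via broadcasting
    diff = pred[:, None, :] - pred[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Example: rows (0, 0) and (3, 4) are Euclidean distance 5 apart.
pred_a = np.array([[0.0, 0.0], [3.0, 4.0]])
dist_a = pairwise_row_distances(pred_a)  # [[0, 5], [5, 0]]
```

The diagonal of the resulting matrix is always zero, since each row has distance zero to itself.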
Finally, the Euclidean loss function can be calculated based on the first Euclidean distance matrix and the second Euclidean distance matrix, thereby obtaining one or more Euclidean loss functions, and each network to be trained is trained by combining the one or more Euclidean loss functions with the loss function of each network to be trained.
In an optional embodiment, as shown in Fig. 4, if the multiple Euclidean distance matrices to be calculated include the first Euclidean distance matrix and the second Euclidean distance matrix, then calculating the target Euclidean distance matrix based on the Euclidean distance matrices to be calculated in each Euclidean distance matrix group includes the following steps:
Successively calculate the Euclidean distance between the first element da_kj and the second element db_kj to obtain the element dm_kj in the target Euclidean distance matrix, where the first element da_kj is the element in row k, column j of the first Euclidean distance matrix, the second element db_kj is the element in row k, column j of the second Euclidean distance matrix, and k and j each run from 1 to m.
Specifically, in this embodiment, the Euclidean distance between the first Euclidean distance matrix and the second Euclidean distance matrix can be calculated by the formula dm_kj = sqrt((da_kj - db_kj)^2) to obtain the target Euclidean distance matrix, which is expressed as DM = (dm_kj), an m × m matrix, where dm_kj is the element in row k, column j of the target Euclidean distance matrix, the first element da_kj is the element in row k, column j of the first Euclidean distance matrix, the second element db_kj is the element in row k, column j of the second Euclidean distance matrix, and k and j each run from 1 to m.
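Since each element of the two matrices is a scalar, the per-element formula dm_kj = sqrt((da_kj - db_kj)^2) reduces to the absolute difference |da_kj - db_kj|. A minimal sketch (function name assumed for illustration):

```python
import numpy as np

def target_distance_matrix(da, db):
    """Element-wise distance between two m x m Euclidean distance
    matrices: dm_kj = sqrt((da_kj - db_kj)**2) = |da_kj - db_kj|.
    Illustrative sketch of the step described in the embodiment."""
    return np.sqrt((da - db) ** 2)

da = np.array([[0.0, 2.0], [2.0, 0.0]])
db = np.array([[0.0, 5.0], [5.0, 0.0]])
dm = target_distance_matrix(da, db)  # [[0, 3], [3, 0]]
```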
Embodiment 2:
For ease of understanding, this embodiment provides a specific application example to introduce the neural network training method described in the above embodiment.
Scenario 1: the networks to be trained include two networks to be trained.
The two networks to be trained are neural network A and neural network B, where neural network A is a neural network with a relatively simple structure, and neural network B is a neural network with a more complex structure. Fig. 4 is an optional schematic flowchart of training neural network A and neural network B. As can be seen from Fig. 4, neural network A includes a convolutional part CNN-A, a fully connected part FC-A, and a softmax layer, where CNN-A includes one or more convolutional layers and FC-A includes one or more fully connected layers. Neural network B includes a convolutional part CNN-B, a fully connected part FC-B, and a softmax layer, where CNN-B includes one or more convolutional layers and FC-B includes one or more fully connected layers.
In this embodiment, first, the prediction data A and prediction data B output by neural network A and neural network B during training are obtained, where prediction data A is the prediction data output by the softmax layer of neural network A, and prediction data B is the prediction data output by the softmax layer of neural network B. It should be understood that in another example, prediction data A and prediction data B may also be the data output by the fully connected layers of the neural networks.
Then, the Euclidean distance between any two row vectors in prediction data A (that is, the first prediction data) is calculated by the formula da_kj = sqrt(sum_i (a_ki - a_ji)^2) to obtain Euclidean distance matrix A (the first Euclidean distance matrix) DA = (da_kj), where da_kj is the element in row k, column j of the first Euclidean distance matrix, a_ki is the element in row k, column i of the first prediction data, and a_ji is the element in row j, column i of the first prediction data. Likewise, the Euclidean distance between any two row vectors in prediction data B (that is, the second prediction data) is calculated by the formula db_kj = sqrt(sum_i (b_ki - b_ji)^2) to obtain Euclidean distance matrix B (that is, the second Euclidean distance matrix) DB = (db_kj), where db_kj is the element in row k, column j of the second Euclidean distance matrix, b_ki is the element in row k, column i of the second prediction data, and b_ji is the element in row j, column i of the second prediction data.
Afterwards, the Euclidean distance matrix between Euclidean distance matrix A and Euclidean distance matrix B is calculated to obtain the target Euclidean distance matrix, and the weight M (or hyperparameter M) of the target Euclidean distance matrix is obtained; finally, the product of the weight M and the target Euclidean distance matrix is taken as the Euclidean loss function.
After the Euclidean loss function is obtained, it can be summed with the loss function A of neural network A and the loss function B of neural network B to obtain the target loss function. The target loss function is then used to train neural network A and neural network B, finally yielding the trained neural network A and neural network B.
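A sketch of how the target loss for the two-network scenario might be assembled. Note two assumptions not stated in the patent: the weighted target matrix is reduced to a scalar by taking its mean (the text only says the Euclidean loss is the product of the weight M and the target Euclidean distance matrix), and all function and variable names are illustrative:

```python
import numpy as np

def two_network_target_loss(loss_a, loss_b, dist_mat_a, dist_mat_b, weight_m=1.0):
    """Target loss = loss_A + loss_B + Euclidean loss, where the
    Euclidean loss is the weight M times the target Euclidean distance
    matrix, reduced here to a scalar via the mean (an assumption)."""
    dm = np.sqrt((dist_mat_a - dist_mat_b) ** 2)  # target Euclidean distance matrix
    euclidean_loss = weight_m * dm.mean()
    return loss_a + loss_b + euclidean_loss

# dm = [[0, 2], [2, 0]] -> mean 1.0; total = 0.5 + 0.5 + 1.0 = 2.0
total = two_network_target_loss(
    0.5, 0.5,
    np.array([[0.0, 2.0], [2.0, 0.0]]),
    np.array([[0.0, 4.0], [4.0, 0.0]]),
)
```

In an actual training loop, loss_a and loss_b would be the task losses (e.g. cross-entropy) of the two networks on the current batch, and the combined scalar would be backpropagated through both networks.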
As can be seen from the above description, neural network A has a relatively simple network structure and fewer parameters, while neural network B has a larger structure. After training with the method provided by this embodiment, the performance of neural network B can be improved, and neural network A, despite its simple structure, can exceed the performance that the more complex neural network B achieves when trained alone. A neural network A that combines the advantages of low power consumption, high speed, and high accuracy is thereby obtained, which solves the problem that power consumption, speed, and model accuracy are difficult to satisfy simultaneously when the model is used on an intelligent mobile terminal.
Scenario 2: the networks to be trained include more than two networks to be trained, for example, three networks to be trained, namely neural network A, neural network B, and neural network C, as shown in Fig. 5.
Neural network A is a neural network with a relatively simple structure, while neural network B and neural network C are neural networks with more complex structures. Neural network A includes a convolutional part CNN-A, a fully connected part FC-A, and a softmax layer, where CNN-A includes one or more convolutional layers and FC-A includes one or more fully connected layers. Neural network B includes a convolutional part CNN-B, a fully connected part FC-B, and a softmax layer, where CNN-B includes one or more convolutional layers and FC-B includes one or more fully connected layers. Neural network C includes a convolutional part CNN-C, a fully connected part FC-C, and a softmax layer, where CNN-C includes one or more convolutional layers and FC-C includes one or more fully connected layers.
In this embodiment, first, the prediction data A, prediction data B, and prediction data C output by neural network A, neural network B, and neural network C during training are obtained, where prediction data A is the prediction data output by the softmax layer of neural network A, prediction data B is the prediction data output by the softmax layer of neural network B, and prediction data C is the prediction data output by the softmax layer of neural network C. Then, the Euclidean distances between every two row vectors in prediction data A, prediction data B, and prediction data C are calculated separately, yielding Euclidean distance matrix A, Euclidean distance matrix B, and Euclidean distance matrix C, respectively.
Afterwards, the Euclidean distance matrix between Euclidean distance matrix A and Euclidean distance matrix B is calculated to obtain target Euclidean distance matrix 1; the Euclidean distance matrix between Euclidean distance matrix A and Euclidean distance matrix C is calculated to obtain target Euclidean distance matrix 2; and the Euclidean distance matrix between Euclidean distance matrix C and Euclidean distance matrix B is calculated to obtain target Euclidean distance matrix 3.
Next, the weight M1 of target Euclidean distance matrix 1, the weight M2 of target Euclidean distance matrix 2, and the weight M3 of target Euclidean distance matrix 3 are obtained, where M1, M2, and M3 may be the same or different.
Finally, the product of weight M1 and target Euclidean distance matrix 1 is taken as Euclidean loss function 1; the product of weight M2 and target Euclidean distance matrix 2 is taken as Euclidean loss function 2; and the product of weight M3 and target Euclidean distance matrix 3 is taken as Euclidean loss function 3.
After the above three Euclidean loss functions are obtained, they can be summed with loss function A of neural network A, loss function B of neural network B, and loss function C of neural network C to obtain the target loss function. The target loss function is then used to train neural network A, neural network B, and neural network C, finally yielding the trained neural network A, neural network B, and neural network C.
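For the three-branch scenario, the three pairwise Euclidean losses can be produced in one loop over branch pairs; an illustrative sketch under the same scalar-reduction assumption as before, with hypothetical names:

```python
import numpy as np
from itertools import combinations

def pairwise_euclidean_losses(distance_matrices, weights):
    """One weighted Euclidean loss per pair of branch distance matrices
    (pairs A-B, A-C, B-C in the three-branch example). Reducing each
    target matrix to a scalar via the mean is an assumption, not
    something the patent specifies."""
    pairs = list(combinations(range(len(distance_matrices)), 2))
    losses = []
    for (i, j), m in zip(pairs, weights):
        dm = np.abs(distance_matrices[i] - distance_matrices[j])  # target matrix
        losses.append(m * dm.mean())
    return losses  # to be summed with loss_A + loss_B + loss_C afterwards

mats = [np.array([[0.0, 1.0], [1.0, 0.0]]),
        np.array([[0.0, 3.0], [3.0, 0.0]]),
        np.array([[0.0, 2.0], [2.0, 0.0]])]
losses = pairwise_euclidean_losses(mats, weights=[1.0, 1.0, 1.0])
```

With n branches this yields n * (n - 1) / 2 Euclidean losses, matching the three target matrices described above for n = 3.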
More complex structures can also be built, for example by merging 3-branch or 4-branch models into the training and finally selecting the model whose performance and power consumption are at the optimal balance point for use on the mobile phone terminal. The structure with 3 branches is shown in the following figure:
In conclusion advantage of the invention is that the following:
1. by a kind of preferable training method of versatility, train low in energy consumption, speed is fast, but accuracy matches in excellence or beauty large-sized model Miniature neural network model, preferably solve that cell phone electricity is low, computing resource is limited answers depth learning technology Limitation.
2. the training method of the present embodiment is simple, easy to spread.The loss function that the present embodiment uses is more general, to mould Type structural requirement is not high, therefore can use between different model structures, can also be directed to different deep learning tasks It uses, there is preferably transportable property.
3. under the premise of not needing the structure of change model, can maximally develop have basic model task it is latent Power reduces the cost and investment of the application of mobile terminal depth learning technology.
Embodiment 4:
An embodiment of the present invention further provides a training device for a neural network, which is mainly used to execute the training method of the neural network provided by the above content of the embodiments of the present invention. The training device for a neural network provided by the embodiment of the present invention is specifically introduced below.
Fig. 6 is a schematic diagram of a training device for a neural network according to an embodiment of the present invention. As shown in Fig. 6, the training device for a neural network mainly includes an acquiring unit 10, a computing unit 20, a determination unit 30, and a training unit 40, in which:
the acquiring unit 10 is configured to obtain networks to be trained, wherein there are multiple networks to be trained;
the computing unit 20 is configured to calculate the Euclidean distance matrix corresponding to the prediction data output by each network to be trained, to obtain multiple Euclidean distance matrices;
the determination unit 30 is configured to determine one or more Euclidean loss functions using the multiple Euclidean distance matrices;
the training unit 40 is configured to train each network to be trained by combining the one or more Euclidean loss functions with the loss function of each network to be trained.
As can be seen from the above description, in this embodiment, training each network to be trained with the Euclidean loss function, which characterizes the difference between the networks to be trained, together with the loss function of each network to be trained, can guide the networks to be trained to learn from each other in the training stage and improves the performance of each network to be trained through mutual learning, so as to train a neural network model that is low in power consumption, fast, and comparable in accuracy to a large-scale model. In this way, the technical problem that existing intelligent mobile terminals, due to limited battery capacity and limited computing resources, face certain limitations in applying deep learning technology can be better solved.
Optionally, the training unit 40 is configured to: sum the one or more Euclidean loss functions and the loss functions of the multiple networks to be trained to obtain a target loss function; and train each network to be trained using the target loss function.
Optionally, the determination unit includes: a first determining module, configured to determine at least one Euclidean distance matrix group from the multiple Euclidean distance matrices according to the association relationship between the multiple networks to be trained, wherein each Euclidean distance matrix group includes multiple Euclidean distance matrices to be calculated; and a second determining module, configured to determine the one or more Euclidean loss functions using the at least one Euclidean distance matrix group.
Optionally, the second determining module is configured to: calculate a target Euclidean distance matrix based on the Euclidean distance matrices to be calculated in each Euclidean distance matrix group; obtain the weight of the target Euclidean distance matrix; and take the product of the weight and the target Euclidean distance matrix as the Euclidean loss function.
Optionally, the second determining module is further configured to: in the case where the multiple Euclidean distance matrices to be calculated include a first Euclidean distance matrix and a second Euclidean distance matrix, successively calculate the Euclidean distance between the first element da_kj and the second element db_kj to obtain the element dm_kj in the target Euclidean distance matrix, where the first element da_kj is the element in row k, column j of the first Euclidean distance matrix, the second element db_kj is the element in row k, column j of the second Euclidean distance matrix, and k and j each run from 1 to m.
Optionally, the computing unit 20 is configured to: calculate the Euclidean distance between every two row vectors in each set of prediction data to obtain the multiple Euclidean distance matrices.
Optionally, the computing unit 20 is further configured to: calculate the Euclidean distance between any two row vectors in the prediction data by the formula da_kj = sqrt(sum_i (a_ki - a_ji)^2), obtaining the Euclidean distance matrix DA = (da_kj), where da_kj is the element in row k, column j of the Euclidean distance matrix, a_ki is the element in row k, column i of the prediction data, and a_ji is the element in row j, column i of the prediction data.
The device provided by the embodiment of the present invention has the same implementation principle and produces the same technical effects as the foregoing method embodiments. For a brief description, where the device embodiment does not mention something, reference may be made to the corresponding content in the foregoing method embodiments.
This embodiment further provides a computer-readable medium having non-volatile program code executable by a processor, the program code causing the processor to execute the training method of the neural network described above.
In addition, in the description of the embodiments of the present invention, unless otherwise explicitly specified and limited, the terms "installation", "interconnection", and "connection" should be understood in a broad sense; for example, a connection may be a fixed connection, a detachable connection, or an integral connection; it may be a mechanical connection or an electrical connection; and it may be a direct connection, an indirect connection through an intermediate medium, or an internal communication between two elements. For those of ordinary skill in the art, the specific meanings of the above terms in the present invention can be understood according to the specific circumstances.
In the description of the present invention, it should be noted that the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", and the like indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, and are used merely for convenience of describing the present invention and simplifying the description, rather than indicating or implying that the device or element referred to must have a particular orientation or be constructed and operated in a particular orientation; therefore, they are not to be construed as limiting the present invention. In addition, the terms "first", "second", and "third" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance.
It will be apparent to those skilled in the art that, for convenience and brevity of description, for the specific working processes of the systems, devices, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments; details are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. The device embodiments described above are merely illustrative; for example, the division of the units is only a logical functional division, and there may be other division manners in actual implementation; for another example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some communication interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the various embodiments of the present invention may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit.
If the function is realized in the form of a software functional unit and sold or used as an independent product, it may be stored in a processor-executable non-volatile computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the various embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash disk, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that the embodiments described above are only specific implementations of the present invention, used to illustrate the technical solutions of the present invention rather than to limit them, and the protection scope of the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that any person familiar with the technical field can, within the technical scope disclosed by the present invention, still modify the technical solutions described in the foregoing embodiments, readily conceive of variations, or make equivalent replacements of some of the technical features; and these modifications, variations, or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and should all be covered within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A training method for a neural network, characterized by comprising:
obtaining networks to be trained, wherein there are multiple networks to be trained;
calculating the Euclidean distance matrix corresponding to the prediction data output by each network to be trained, to obtain multiple Euclidean distance matrices;
determining one or more Euclidean loss functions using the multiple Euclidean distance matrices;
training each network to be trained by combining the one or more Euclidean loss functions with the loss function of each network to be trained.
2. The method according to claim 1, characterized in that training each network to be trained by combining the one or more Euclidean loss functions with the loss function of each network to be trained comprises:
summing the one or more Euclidean loss functions and the loss functions of the multiple networks to be trained to obtain a target loss function;
training each network to be trained using the target loss function.
3. The method according to claim 1 or 2, characterized in that determining one or more Euclidean loss functions using the multiple Euclidean distance matrices comprises:
determining at least one Euclidean distance matrix group from the multiple Euclidean distance matrices according to the association relationship between the multiple networks to be trained, wherein each Euclidean distance matrix group includes multiple Euclidean distance matrices to be calculated;
determining the one or more Euclidean loss functions using the at least one Euclidean distance matrix group.
4. The method according to claim 3, characterized in that determining the one or more Euclidean loss functions using the at least one Euclidean distance matrix group comprises:
calculating a target Euclidean distance matrix based on the Euclidean distance matrices to be calculated in each Euclidean distance matrix group;
obtaining the weight of the target Euclidean distance matrix;
taking the product of the weight and the target Euclidean distance matrix as the Euclidean loss function.
5. The method according to claim 4, characterized in that the multiple Euclidean distance matrices to be calculated include a first Euclidean distance matrix and a second Euclidean distance matrix;
calculating a target Euclidean distance matrix based on the Euclidean distance matrices to be calculated in each Euclidean distance matrix group comprises:
successively calculating the Euclidean distance between a first element da_kj and a second element db_kj to obtain an element dm_kj in the target Euclidean distance matrix, wherein the first element da_kj is the element in row k, column j of the first Euclidean distance matrix, the second element db_kj is the element in row k, column j of the second Euclidean distance matrix, and k and j each run from 1 to m.
6. The method according to claim 1 or 2, characterized in that calculating the Euclidean distance matrix corresponding to the prediction data output by each network to be trained comprises:
calculating the Euclidean distance between every two row vectors in each set of prediction data, to obtain the multiple Euclidean distance matrices.
7. The method according to claim 6, characterized in that calculating the Euclidean distance between any two row vectors in each set of prediction data comprises:
calculating the Euclidean distance between any two row vectors in the prediction data by the formula da_kj = sqrt(sum_i (a_ki - a_ji)^2) to obtain the Euclidean distance matrix DA = (da_kj), wherein da_kj is the element in row k, column j of the Euclidean distance matrix, a_ki is the element in row k, column i of the prediction data, and a_ji is the element in row j, column i of the prediction data.
8. A training device for a neural network, characterized by comprising:
an acquiring unit, configured to obtain networks to be trained, wherein there are multiple networks to be trained;
a computing unit, configured to calculate the Euclidean distance matrix corresponding to the prediction data output by each network to be trained, to obtain multiple Euclidean distance matrices;
a determination unit, configured to determine one or more Euclidean loss functions using the multiple Euclidean distance matrices;
a training unit, configured to train each network to be trained by combining the one or more Euclidean loss functions with the loss function of each network to be trained.
9. An electronic device, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the method according to any one of claims 1 to 7.
10. A computer-readable medium having non-volatile program code executable by a processor, characterized in that the program code causes the processor to execute the method according to any one of claims 1 to 7.
CN201810847796.2A 2018-07-27 2018-07-27 Training method, device, electronic equipment and the computer-readable medium of neural network Pending CN109086871A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810847796.2A CN109086871A (en) 2018-07-27 2018-07-27 Training method, device, electronic equipment and the computer-readable medium of neural network


Publications (1)

Publication Number Publication Date
CN109086871A true CN109086871A (en) 2018-12-25

Family

ID=64833335

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810847796.2A Pending CN109086871A (en) 2018-07-27 2018-07-27 Training method, device, electronic equipment and the computer-readable medium of neural network

Country Status (1)

Country Link
CN (1) CN109086871A (en)


Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110298240A (en) * 2019-05-21 2019-10-01 北京迈格威科技有限公司 A kind of user vehicle recognition methods, device, system and storage medium
CN110298240B (en) * 2019-05-21 2022-05-06 北京迈格威科技有限公司 Automobile user identification method, device, system and storage medium
CN112085041A (en) * 2019-06-12 2020-12-15 北京地平线机器人技术研发有限公司 Training method and training device for neural network and electronic equipment
CN112085041B (en) * 2019-06-12 2024-07-12 北京地平线机器人技术研发有限公司 Training method and training device of neural network and electronic equipment
CN110310629A (en) * 2019-07-16 2019-10-08 湖南检信智能科技有限公司 Speech recognition control system based on text emotion classification
CN112489732A (en) * 2019-09-12 2021-03-12 罗伯特·博世有限公司 Graphic transducer neural network force field for predicting atomic force and energy in MD simulation
CN111667066A (en) * 2020-04-23 2020-09-15 北京旷视科技有限公司 Network model training and character recognition method and device and electronic equipment
CN111667066B (en) * 2020-04-23 2024-06-11 北京旷视科技有限公司 Training method and device of network model, character recognition method and device and electronic equipment
CN114120289A (en) * 2022-01-25 2022-03-01 中科视语(北京)科技有限公司 Method and system for identifying driving area and lane line
CN114120289B (en) * 2022-01-25 2022-05-03 中科视语(北京)科技有限公司 Method and system for identifying driving area and lane line


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181225