CN107633242A - Training method, device, equipment and storage medium for a network model - Google Patents
- Publication number: CN107633242A (application CN201710993043.8)
- Authority: CN (China)
- Prior art keywords: network model, relay, loss function, loss, update
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Abstract
The embodiments of the invention disclose a training method, device, equipment and storage medium for a network model. The method includes: when a first network model reaches a preset update stop condition, determining a first target network model according to the update result of the first network model, and inserting a relay loss network layer after a preset pooling layer in the first target network model to determine a second network model; determining a relay loss function corresponding to the relay loss network layer according to the second network model and the relay loss network layer; determining a global loss function of the second network model according to the second network model and the relay loss function; and updating the parameters of the second network model using the relay loss function and the global loss function, to obtain the updated second network model. This resolves the overfitting of middle- and high-level features and the underfitting of low-level features during network model training, so that the training of the network model is more thorough and its accuracy is higher.
Description
Technical field
The present invention relates to the field of deep learning, and in particular to a training method, device, equipment and storage medium for a network model.
Background technology
Existing face recognition models are normally trained on the basis of deep learning algorithm models, and the quality of the deep learning model's training affects the face recognition result.
Current deep learning models for face recognition generally add one or more loss function layers after the top (feature) layer of the network, which are used to train and update the parameters of the deep learning network model. The prior art typically trains with one or more loss functions added after the feature layer of the network, but during training the parameter propagation path is too long, causing high-level features to overfit and middle-level features to underfit. As a result, the whole network is trained insufficiently, the parameters of the middle layers of the deep network learning model cannot be updated well, and the trained model consequently performs poorly in real face recognition.
Summary of the invention
The embodiments of the invention provide a training method, device, equipment and storage medium for a network model, which resolve the overfitting of high-level features and the underfitting of low-level features caused by an overlong parameter propagation path during network model training, so that the training of the obtained network model is more thorough and its accuracy is higher.
In a first aspect, an embodiment of the invention provides a training method for a network model, the method including:
when a first network model reaches a preset update stop condition, determining a first target network model according to the update result of the first network model, and inserting a relay loss network layer after a preset pooling layer in the first target network model to determine a second network model;
determining a relay loss function corresponding to the relay loss network layer according to the second network model and the relay loss network layer;
determining a global loss function of the second network model according to the second network model and the relay loss function;
updating the parameters of the second network model using the relay loss function and the global loss function, to obtain the updated second network model.
In a second aspect, an embodiment of the invention further provides a training device for a network model, the device including:
a second network model determining module, configured to, when a first network model reaches a preset update stop condition, determine a first target network model according to the update result of the first network model, and insert a relay loss network layer after a preset pooling layer in the first target network model to determine a second network model;
a relay loss function determining module, configured to determine a relay loss function corresponding to the relay loss network layer according to the second network model and the relay loss network layer;
a global loss function determining module, configured to determine a global loss function of the second network model according to the second network model and the relay loss function;
a second network model update module, configured to update the parameters of the second network model using the relay loss function and the global loss function, to obtain the updated second network model.
In a third aspect, an embodiment of the invention further provides a computer device, including a memory, a processor, and a computer program stored on the memory and runnable on the processor, where the processor, when executing the program, implements the training method for a network model described in any embodiment of the invention.
In a fourth aspect, an embodiment of the invention further provides a computer-readable storage medium on which a computer program is stored, where the program, when executed by a processor, implements the training method for a network model described in any embodiment of the invention.
In the embodiments of the invention, when a first network model reaches a preset update stop condition, a first target network model is determined according to the update result of the first network model; a relay loss network layer is inserted after a preset pooling layer in the first target network model to determine a second network model; a relay loss function corresponding to the relay loss network layer is determined according to the second network model and the relay loss network layer; a global loss function of the second network model is then determined according to the second network model and the relay loss function; and the parameters of the second network model are updated using the relay loss function and the global loss function, to obtain the updated second network model. This resolves the overfitting of high-level features and the underfitting of low-level features caused by an overlong parameter propagation path during network model training, so that the training of the obtained network model is more thorough and its accuracy is higher.
Brief description of the drawings
Fig. 1a is a flowchart of a training method for a network model in Embodiment 1 of the invention;
Fig. 1b is a schematic diagram of the middle layers of a first network model to which Embodiment 1 of the invention is applicable;
Fig. 1c is a schematic diagram of the middle layers of a second network model, combined with a relay loss function, to which Embodiment 1 of the invention is applicable;
Fig. 2a is a flowchart of a training method for a network model in Embodiment 2 of the invention;
Fig. 2b is a schematic diagram of the positions of the key points in a face key point detection to which Embodiment 2 of the invention is applicable;
Fig. 2c is a schematic diagram of the positions of the key points after face key point alignment to which Embodiment 2 of the invention is applicable;
Fig. 3 is a flowchart of a training method for a network model in Embodiment 3 of the invention;
Fig. 4 is a flowchart of a training method for a network model in Embodiment 4 of the invention;
Fig. 5 is a structural schematic diagram of a training device for a network model in Embodiment 5 of the invention;
Fig. 6 is a structural schematic diagram of a computer device in Embodiment 6 of the invention.
Detailed description of the embodiments
The invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the invention, not to limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the invention rather than the entire structure.
Embodiment one
Fig. 1a is a flowchart of a training method for a network model provided by Embodiment 1 of the invention. This embodiment is applicable to the situation where the parameter propagation path in a network model is too long and the network model needs to be optimized. The method may be performed by the training device for a network model provided by an embodiment of the invention, and the device may be implemented in software and/or hardware. Referring to Fig. 1a, the method may specifically include the following steps:
S110. When a first network model reaches a preset update stop condition, determine a first target network model according to the update result of the first network model, and insert a relay loss network layer after a preset pooling layer in the first target network model to determine a second network model.
Specifically, during the training of the first network model, a preset update stop condition is set; after the first network model reaches the set update stop condition, the first network model stops updating. The update process of the first network model updates the model parameters of the first network model according to the difference between the output values and the input values of the first network model; the first network model is updated through its updated parameters, that is, every update of the parameters of a first network model yields a new first network model. After the updating stops, the resulting first network model is the first target network model.
A relay loss network layer is inserted after the preset pooling layer in the first target network model to determine the second network model. Optionally, the preset pooling layer may be the pooling layer in the middle of all the pooling layers; for example, if the first network model has 8 pooling layers in total, the 4th pooling layer may be taken as the preset pooling layer.
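The middle-layer choice above can be sketched as a tiny helper. This is an illustrative assumption, not part of the patent: the function name and the 1-based indexing convention are hypothetical, chosen only to match the "8 pooling layers → 4th layer" example in the text.

```python
def preset_pool_index(num_pool_layers):
    """Pick the preset pooling layer as the middle one (1-based index).

    Hypothetical helper matching the text's example: with 8 pooling
    layers, the 4th is chosen as the preset pooling layer.
    """
    return num_pool_layers // 2

# With 8 pooling layers, the preset pooling layer is the 4th.
print(preset_pool_index(8))  # 4
```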
S120. Determine the relay loss function corresponding to the relay loss network layer according to the second network model and the relay loss network layer.
Specifically, the relay loss function corresponding to the relay loss network layer is determined according to the difference between the output of the relay loss network layer and the input of the second network model. The relay loss function characterizes the degree of loss of the output of the relay loss network layer compared with the input. Optionally, the input of the second network model may be a picture to be trained, etc.
S130. Determine the global loss function of the second network model according to the second network model and the relay loss function.
Specifically, the global loss function reflects the difference between the input and the output of the second network model, and is obtained by superimposing the relay loss function on the original global loss function. Optionally, the number of relay loss functions matches the number of relay loss network layers.
Optionally, determining the global loss function of the second network model according to the second network model and the relay loss function includes: obtaining an initial global loss function in the network model; and combining the relay loss function with the initial global loss function to determine the global loss function of the second network model.
The initial global loss function in the network model is obtained, and the initial global loss function and the relay loss function are superimposed to obtain the global loss function of the second network model. The superposition may be determined according to a preset superposition rule, and a different weight may be given to each relay loss function. In a specific example, if 3 relay loss network layers are added, then, since each relay loss network layer corresponds to one relay loss function, the 3 relay loss network layers have 3 relay loss functions, whose weights may be, respectively, 0.2, 0.2 and 0.3.
In a specific example, if the initial global loss function is Loss = SoftmaxLoss + λ1·CenterLoss and the relay loss function is RelayLoss, then the global loss function of the second network model determined according to the second network model and the relay loss function is Loss = SoftmaxLoss + λ1·CenterLoss + λ2·RelayLoss. Optionally, RelayLoss may be a function of a different type, such as SoftmaxLoss, TripletLoss or ContrastiveLoss; λ1 may take the value 0.008 and λ2 the value 0.1. This choice of functions and parameters improves the precision of the model.
S140. Update the parameters of the second network model using the relay loss function and the global loss function, to obtain the updated second network model.
Specifically, the parameters of the second network model are updated using the relay loss function and the global loss function, and the updated second network model is determined from the updated parameters.
In this embodiment of the invention, when the first network model reaches the preset update stop condition, the first target network model is determined according to the update result of the first network model; a relay loss network layer is inserted after the preset pooling layer in the first target network model to determine the second network model; the relay loss function corresponding to the relay loss network layer is determined according to the second network model and the relay loss network layer; the global loss function of the second network model is then determined according to the second network model and the relay loss function; and the parameters of the second network model are updated using the relay loss function and the global loss function, to obtain the updated second network model. This resolves the overfitting of high-level features and the underfitting of low-level features caused by an overlong parameter propagation path during network model training, so that the training of the obtained network model is more thorough and its accuracy is higher.
Optionally, the first network model and the second network model each include convolutional layers, pooling layers and fully connected layers, where the preset pooling layer is any one of the pooling layers.
The first network model and the second network model include convolutional layers, pooling layers and fully connected layers. A network model is constructed from topological structure information and configuration parameter information: the topological structure information includes the number of convolutional layers, the number of pooling layers, the number of fully connected layers, and the topological connection order between the layers; the configuration parameter information includes the convolution stride, kernel size and kernel number of each convolutional layer, the pooling stride and pooling window size of each pooling layer, and the number of neurons in each fully connected layer. The preset pooling layer is any one of the pooling layers.
Inserting a relay loss network layer after the preset pooling layer in the first target network model to determine the second network model includes: inserting at least one relay loss network layer after the preset pooling layer in the first target network model to determine the second network model.
Specifically, at least one relay loss network layer is inserted after the preset pooling layer in the first target network model to determine the second network model. Optionally, the preset pooling layer may be a pooling layer close to the middle of the whole network model. Since no specific form of network model is required, a single network model used in face recognition is taken as an example for illustration in a specific example. Fig. 1b shows the middle layers of a first network model, and Fig. 1c shows the middle layers of a second network model combined with a relay loss function; owing to space limits, Fig. 1b and Fig. 1c do not show the structure of the whole network model but only the structure of its middle layers. Optionally, in Fig. 1b, c1 may be denoted center_loss_finetune_vgg_face_dataset, s1 may be denoted softmax_loss, and f1 may be denoted fc6_finetune_vgg_face_dataset. The network model used in Fig. 1b exhibits underfitting during training, so that the whole network is trained insufficiently; therefore a relay loss network layer is added in Fig. 1c, here taking the addition of a single relay loss network layer as an example. In Fig. 1c, 150 is the added relay loss network layer, where p_d may be denoted pool4_relay_vgg_face_dataset and rs may be denoted reley_loss. Referring to Figs. 1b and 1c, a relay loss network layer may be inserted after pooling layer 4 (pool4), such as p_d and rs in Fig. 1c; the number of inserted relay loss network layers is not limited, and the second network model is determined after the relay loss network layers are inserted. This realizes the determination of the second network model after the relay loss function is added, and solves the underfitting of the middle layers of the network, so that model training is more thorough and the accuracy of model training is improved after the relay loss function is added.
Embodiment two
Fig. 2a is a flowchart of a training method for a network model provided by Embodiment 2 of the invention. This embodiment is implemented on the basis of the embodiment above. Referring to Fig. 2a, the method may specifically include the following steps:
S210. Input the pictures to be trained into the first network model for training, and update the first network model according to the training results.
The pictures to be trained are input into the first network model for training, and the first network model is updated according to the training results. Optionally, the pictures to be trained may be pictures from a specific scene; the specific scene may be a VTM (Video Teller Machine, a remote teller machine), jewelry shop member recognition, etc. Face photos are collected in the specific scene: video pictures are captured with a camera, and stored in a computer system via network transmission and data cables.
Face detection is performed on the collected face pictures, the face pictures are extracted and stored in the computer equipment, and the face pictures are then annotated. It should be noted that when annotating the face pictures, the detected and extracted face pictures need to be classified and annotated manually, and the face photos belonging to the same person are put together and annotated. In a specific example, assume the total number of people is N and each person has M pictures; in the specific scene, each person's M pictures may be photos taken by the camera of the same person at the same moment from different angles, or photos of the same or different actions at different moments.
A face alignment operation is performed on the face pictures. The face angles and face positions in the face pictures are inconsistent; to ensure that stable features are extracted and a good face recognition result is obtained, a key point alignment operation needs to be performed on the face pictures to remove the influence of the face angle on face recognition. The key points include positions such as the eyes, the nose and the corners of the mouth. Fig. 2b shows the positions of the key points in a face key point detection, and Fig. 2c shows the positions of the key points after face key point alignment. In Fig. 2b, 271 denotes the eye key points, 272 the nose key point and 273 the mouth key points; in Fig. 2c, 281 denotes the aligned eye key points, 282 the aligned nose key point and 283 the aligned mouth key points.
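A simplified version of the key point alignment can be sketched as a rigid 2D transform. This is an assumption about one common way such alignment is done, not the patent's method: it rotates the face so the two eye key points become horizontal and moves their midpoint to a hypothetical canonical position; real pipelines typically fit a similarity transform over all key points.

```python
import math

def align_eyes(left_eye, right_eye, points, target_mid=(48.0, 48.0)):
    """Rotate so the eye key points are level, then translate the eye
    midpoint to target_mid; apply the same transform to all key points."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    angle = math.atan2(dy, dx)            # rotation that levels the eyes
    c, s = math.cos(-angle), math.sin(-angle)
    mx = (left_eye[0] + right_eye[0]) / 2.0
    my = (left_eye[1] + right_eye[1]) / 2.0
    aligned = []
    for (x, y) in points:
        x0, y0 = x - mx, y - my           # center on the eye midpoint
        aligned.append((x0 * c - y0 * s + target_mid[0],
                        x0 * s + y0 * c + target_mid[1]))
    return aligned

# A tilted face: after alignment both eyes share the same y coordinate.
eyes = align_eyes((0.0, 0.0), (2.0, 2.0), [(0.0, 0.0), (2.0, 2.0)])
```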
Training set pictures are extracted: face pictures that have been annotated and face-aligned and contain face identity information are randomly selected for training. Each group of training samples is extracted as follows:
face picture img_1, identity information (class number) of img_1
……
face picture img_N, identity information (class number) of img_N
Here, face picture img_1 refers to the storage path of the 1st face picture, such as C:\Program Files\Adobe; the class number refers to a tag set in advance for each person participating in the experiment. Class numbers generally start from 0, e.g. 0, 1, 2, 3, ……, and a class number only represents the identity of a person.
Optionally, the pictures to be trained are the face pictures processed by the above operations.
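The (picture path, class number) training list described above can be built with a short sketch. The dictionary layout and names are hypothetical; the only properties taken from the text are that each sample pairs a picture path with a class number and that class numbers start from 0.

```python
def build_training_list(people):
    """people: {person_name: [face picture paths]} ->
    list of (path, class_number) pairs, class numbers starting from 0."""
    samples = []
    # Sort for a deterministic assignment of class numbers to people.
    for class_id, (_, paths) in enumerate(sorted(people.items())):
        for path in paths:
            samples.append((path, class_id))
    return samples

people = {"alice": ["a_0.jpg", "a_1.jpg"], "bob": ["b_0.jpg"]}
print(build_training_list(people))
```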
S220. When the first training precisions determined from the training results are all less than a first preset precision threshold for a set number of times, record the last update result of the first network model as the first target network model together with the training results of the pictures to be trained, take the training results of the pictures to be trained as the first training results, and insert a relay loss network layer after the preset pooling layer in the first target network model to determine the second network model.
Specifically, the pictures to be trained are input into the first network model for training, and the first network model is updated according to the training results. The specific update process is as follows. The network model is updated by updating its parameters; if the corresponding parameters of two network models are all the same, the network models are identical. The pictures of the training set are input into the first network model for training, one training result is output per training pass, the data difference between the training result and the input training set pictures is compared, and the first training precision is determined. The parameters of the first network model are then updated according to the training result so as to update the first network model; the training result of the last pass over the training set pictures is input into the updated first network model to obtain a training result again, and the parameters of the first network model continue to be updated according to the training result, with the first training precision determined from the training result. The first training precision is obtained by computing, according to a set standard, on the input training set pictures and the training result; each time the first network model is updated, one training precision is obtained, that is, the training precision changes as the first network model is updated.
When the first training precisions are all less than the first preset precision threshold for the set number of times, the last update result of the first network model is recorded as the first target network model together with the training results of the pictures to be trained, and the training results of the pictures to be trained are taken as the first training results. In a specific example, the set number may be 4; that is, if the training precision is less than the first preset precision threshold 4 times in a row, the training results of the first network model are in a stable state.
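The stop condition above can be captured in a few lines. A minimal sketch, assuming the condition is exactly as the text describes it: the last n recorded training precisions must all fall below the first preset precision threshold; the function and argument names are illustrative.

```python
def should_stop(precisions, threshold, n=4):
    """True when the last n training precisions are all below the
    first preset precision threshold (the text's stable state)."""
    if len(precisions) < n:
        return False
    return all(p < threshold for p in precisions[-n:])

# Four consecutive precisions below 0.2 -> updating stops.
print(should_stop([0.9, 0.1, 0.1, 0.1, 0.1], 0.2))  # True
```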
S230. Determine the relay loss function corresponding to the relay loss network layer according to the second network model and the relay loss network layer.
S240. Determine the global loss function of the second network model according to the second network model and the relay loss function.
S250. Update the parameters of the second network model using the relay loss function and the global loss function, to obtain the updated second network model.
In this embodiment of the invention, the pictures to be trained are input into the first network model for training, and the first network model is updated according to the training results; when the first training precisions determined from the training results are all less than the first preset precision threshold for the set number of times, the last update result of the first network model is recorded as the first target network model together with the training results of the pictures to be trained, and the training results of the pictures to be trained are taken as the first training results. This realizes the determination of the first target network model and the acquisition of the training results of the pictures to be trained.
Optionally, determining the relay loss function corresponding to the relay loss network layer according to the second network model and the relay loss network layer includes: inputting the pictures of the first training results into the second network model for training, and obtaining the output results of the pictures of the first training results at the relay loss network layer; and determining the relay loss function according to the output results.
Specifically, after the first target network model is determined, the training results of the picture training set in the first target network model continue to be input into the second network model for training, the output results of the pictures of the first training results at the relay loss network layer are obtained, and the relay loss function is determined according to the output results. In a specific example, the relay loss network layer follows the preset pooling layer, which may be pool4. The output results of the relay loss network layer are compared with the first training results to determine the relay loss function, where the relay loss function characterizes the difference between the output of the relay loss network layer and the input pictures of the first training results. This realizes the determination of the relay loss function.
Embodiment three
Fig. 3 is a flowchart of a training method for a network model provided by Embodiment 3 of the invention. On the basis of the embodiments above, this embodiment optimizes the step of "updating the parameters of the second network model using the relay loss function and the global loss function, to obtain the updated second network model". Referring to Fig. 3, the method may specifically include the following steps:
S310. When the first network model reaches the preset update stop condition, determine the first target network model according to the update result of the first network model, and insert a relay loss network layer after the preset pooling layer in the first target network model to determine the second network model.
S320. Determine the relay loss function corresponding to the relay loss network layer according to the second network model and the relay loss network layer.
S330. Determine the global loss function of the second network model according to the second network model and the relay loss function.
S340. Update the parameters of the layers before the preset pooling layer in the second network model using the relay loss function, and meanwhile update the parameters of all the layers in the second network model using the global loss function.
Specifically, the parameters of the layers before the preset pooling layer in the second network model are updated using the relay loss function. In a specific example, take a network model with 30 layers in total, where the preset pooling layer pool4 is the 12th layer of the whole network: the relay loss function then updates the parameters of the 12th layer and each layer before it. Meanwhile, the parameters of all the layers in the second network model are updated using the global loss function; that is, the global loss function updates the parameters of every layer of the whole network model — in this specific example, all 30 layers.
Exemplarily, after the global loss function Loss is obtained, the gradient of each parameter in the network model is derived from Loss according to the chain rule of derivation, and the model parameters are then updated by stochastic gradient descent.
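The dual update rule can be sketched as a toy SGD step over a flat list of per-layer parameters. The gradients here are stand-in scalars passed in directly — a real model would obtain them by backpropagating the relay loss and the global loss; the layer index 12 follows the 30-layer example above, and all names are illustrative.

```python
def sgd_update(params, relay_grads, global_grads, preset_idx=12, lr=0.01):
    """One SGD step: every layer receives the global-loss gradient, and
    layers up to the preset pooling layer additionally receive the
    relay-loss gradient."""
    new_params = []
    for i, p in enumerate(params):
        g = global_grads[i]
        if i < preset_idx:           # layers up to the preset pooling layer
            g = g + relay_grads[i]   # also receive the relay-loss gradient
        new_params.append(p - lr * g)
    return new_params

# Two layers, preset_idx=1: only the first layer gets both gradients.
print(sgd_update([1.0, 1.0], [1.0, 1.0], [1.0, 1.0], preset_idx=1, lr=0.1))
```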
S350. Obtain the updated second network model according to the parameter update results of the layers.
Specifically, the updated model parameters of each layer replace the model parameters before the update, to obtain the updated second network model.
In this embodiment of the invention, the parameters of the layers before the preset pooling layer in the second network model are updated using the relay loss function while, at the same time, the parameters of all the layers in the second network model are updated using the global loss function, and the updated second network model is obtained according to the parameter update results of the layers. This realizes the update of the second network model through the update of the model parameters.
Example IV
Fig. 4 is a flowchart of a training method for a network model provided by Embodiment 4 of the invention. This embodiment is implemented on the basis of the embodiments above. Referring to Fig. 4, the method may specifically include the following steps:
S410. When the first network model reaches the preset update stop condition, determine the first target network model according to the update result of the first network model, and insert a relay loss network layer after the preset pooling layer in the first target network model to determine the second network model.
S420. Determine the relay loss function corresponding to the relay loss network layer according to the second network model and the relay loss network layer.
S430. Determine the global loss function of the second network model according to the second network model and the relay loss function.
S440. Update the parameters of the second network model using the relay loss function and the global loss function, to obtain the updated second network model.
S450. Train the updated second network model on the pictures in the picture verification set to obtain a training precision.
Specifically, the updated second network model is trained on the pictures in the picture verification set to obtain a training precision. Optionally, the selection rule of the picture verification set is as follows: N people in total participate in the experiment, of whom K people take part in making the training set; the photos of the remaining N−K people are then used to make the verification set. The verification set consists of randomly selected face photo verification pairs, extracted according to the following rule:
Positive sample pair:The a pictures of n-th people and the b pictures of n-th of people;
……
Negative sample pair:The c pictures of i-th people and the d pictures of j-th of people;
Wherein, any two photos in human face photo of the positive sample to referring to same person, negative sample is to being different
Any two photos in the human face photo of people.In a specific example, according to international standard LFW (Labeled Faces
In the Wild) rule calculates measuring accuracy, take herein positive sample to 3000 with negative sample to 3000, test order is:Will just
Two photos of sample centering are judged as same person, then correct judgment, another xi=1, two photos of negative sample centering are sentenced
Break not to be same person, then correct judgment xi=1, other situations are then considered misjudgment, i.e. xi=0.The table of measuring accuracy
It is as follows up to formula:A=A represents measuring accuracy.
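The accuracy rule above can be sketched in a few lines. The distances and the decision threshold below are made-up illustrative numbers (6 toy pairs instead of the patent's 3000 + 3000): x_i = 1 when a positive pair is judged "same person" or a negative pair is judged "different person", x_i = 0 otherwise, and A is the mean of x_i over all pairs.

```python
# Toy sketch of the LFW-style pair-verification accuracy described above.
def pair_accuracy(distances, is_positive, threshold):
    # judgment "same person" when the feature distance is below the threshold
    x = [int((d < threshold) == pos) for d, pos in zip(distances, is_positive)]
    return sum(x) / len(x)

distances = [0.3, 0.5, 1.2, 1.4, 0.9, 0.2]            # feature distance per pair
is_positive = [True, True, False, False, True, True]  # same-person pairs?
A = pair_accuracy(distances, is_positive, threshold=0.8)
print(A)  # 5 of 6 pairs judged correctly -> 0.8333...
```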
S460: if the training accuracy is greater than the second preset accuracy threshold, stop updating the second network model, and take the last update result as the second target network model.

Specifically, the termination condition for training the second network model is that the training accuracy is greater than the second preset accuracy threshold. In a specific example, the training accuracy can be calculated according to the international standard LFW rule. If the training accuracy is greater than the second preset accuracy threshold, updating of the second network model is stopped, and the last update result is taken as the second target network model.
In the embodiment of the present invention, the updated second network model is trained on the pictures in the picture verification set to obtain the training accuracy; when the training accuracy is greater than the second preset accuracy threshold, updating of the second network model is stopped and the last update result is taken as the second target network model. This realizes determining the second target network model by comparing the training accuracy with the second preset accuracy threshold.
On the basis of the above technical solution, an application scenario of the embodiment of the present invention is the use of a face recognition algorithm in a bank-member identification project: face pictures are collected in the real application scenario, detection and alignment operations are then performed on these face pictures, and the corresponding face training set is made; the face recognition algorithm model, i.e. the second target network model in this solution, is trained using the method described above, so as to obtain a face recognition algorithm with a high recognition rate and good recognition effect in the bank-member identification scenario. This method can better achieve the effect of "reducing the variation within the same person while increasing the difference between different people".
When the second network model is applied in a specific application scenario, the face recognition flow can be carried out by comparing the face features feat-ID using the Euclidean distance. Verification in real scenarios proves that the face recognition algorithm combined with the relay loss network layer (with reference to Fig. 1c) achieves higher accuracy than the face recognition algorithm of the commonly practiced multi-task deep learning network (with reference to Fig. 1b). It should be noted that, although the above takes face recognition as an example, the solution can be extended to image recognition in general.
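The deployment step above, comparing feat-ID features by Euclidean distance, can be sketched as follows. The 3-dimensional feature vectors and the threshold value are illustrative assumptions; real feat-ID vectors would come from the trained second target network model.

```python
import numpy as np

# Toy sketch of Euclidean-distance face verification on feat-ID features:
# two features belong to the same person when their distance is below a
# chosen threshold.
def same_person(feat_a, feat_b, threshold=1.0):
    dist = float(np.linalg.norm(np.asarray(feat_a) - np.asarray(feat_b)))
    return dist < threshold

anchor = [0.1, 0.9, 0.3]
probe_close = [0.2, 0.8, 0.3]   # distance ~0.14 -> judged same person
probe_far = [0.9, 0.1, 0.8]     # distance ~1.24 -> judged different person
print(same_person(anchor, probe_close), same_person(anchor, probe_far))  # True False
```

In practice the threshold would be tuned on the verification pairs described earlier so as to maximize the pair accuracy A.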
Embodiment five
Fig. 5 is a structural diagram of a network model training device provided by Embodiment 5 of the present invention. The device is adapted to perform the training method for a network model provided by the embodiments of the present invention. As shown in Fig. 5, the device may specifically include:
a second network model determining module 510, configured to, when the first network model reaches the preset update stop condition, determine the first target network model according to the update result of the first network model, and insert a relay loss network layer after the preset pooling layer in the first target network model to determine the second network model;
a relay loss function determining module 520, configured to determine the relay loss function corresponding to the relay loss network layer according to the second network model and the relay loss network layer;
a global loss function determining module 530, configured to determine the global loss function of the second network model according to the second network model and the relay loss function;
a second network model update module 540, configured to update the parameters of the second network model using the relay loss function and the global loss function, to obtain the updated second network model.
Further, the preset update stop condition includes: the training accuracy of the first network model being less than the first preset accuracy threshold within each of a set number of times.

The device further includes:
a first network model update module, configured to, before the first network model reaches the preset update stop condition, the first target network model is determined according to the update result of the first network model, and a relay loss network layer is inserted after the preset pooling layer in the first target network model to determine the second network model, input the pictures to be trained into the first network model for training, and update the first network model according to the training result.
The second network model determining module 510 is specifically configured to: when the first training accuracy determined according to the training result is less than the first preset accuracy threshold within each of the set number of times, record the last update result of the first network model as the first target network model together with the training result of the pictures to be trained, taking the training result of the pictures to be trained as the first training result.
Further, the relay loss function determining module 520 is specifically configured to:

input the pictures of the first training result into the second network model for training, and obtain the output result of the pictures of the first training result at the relay loss network layer;

determine the relay loss function according to the output result.
Further, the global loss function determining module 530 is specifically configured to:

obtain the initial global loss function in the network model;

combine the relay loss function with the initial global loss function to determine the global loss function of the second network model.
Further, the second network model update module 540 is specifically configured to:

update the parameters of each layer before the preset pooling layer in the second network model using the relay loss function, while updating the parameters of every layer in the second network model using the global loss function;

obtain the updated second network model according to the updated parameters of each layer.
Further, the device also includes:

a training accuracy acquisition module, configured to, after the parameters of the second network model are updated using the relay loss function and the global loss function to obtain the updated second network model, train the updated second network model on the pictures in the picture verification set to obtain the training accuracy; and, if the training accuracy is greater than the second preset accuracy threshold, stop updating the second network model and take the last update result as the second target network model.
Further, the first network model and the second network model each include convolutional layers, pooling layers and fully connected layers, where the preset pooling layer is any one of the pooling layers.

The second network model determining module 510 is specifically configured to: insert at least one relay loss network layer after the preset pooling layer in the first target network model to determine the second network model.
The network model training device provided by the embodiment of the present invention can perform the training method for a network model provided by any embodiment of the present invention, and has the corresponding functional modules and beneficial effects for performing the method.
Embodiment six
Fig. 6 is a structural diagram of a computer device provided by Embodiment 6 of the present invention. Fig. 6 shows a block diagram of an exemplary computer device 12 suitable for implementing the embodiments of the present invention. The computer device 12 shown in Fig. 6 is only an example and should not impose any restriction on the function or scope of use of the embodiments of the present invention.
As shown in Fig. 6, the computer device 12 takes the form of a general-purpose computing device. The components of the computer device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 connecting the different system components (including the system memory 28 and the processing unit 16).
The bus 18 represents one or more of several classes of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus structures. By way of example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
The computer device 12 typically includes a variety of computer-system-readable media. These media may be any usable media that can be accessed by the computer device 12, including volatile and non-volatile media, and removable and non-removable media.
The system memory 28 may include computer-system-readable media in the form of volatile memory, such as random access memory (RAM) 30 and/or cache memory 32. The computer device 12 may further include other removable/non-removable, volatile/non-volatile computer-system storage media. Merely as an example, the storage system 34 may be used for reading from and writing to a non-removable, non-volatile magnetic medium (not shown in Fig. 6, commonly referred to as a "hard disk drive"). Although not shown in Fig. 6, a disk drive for reading from and writing to a removable non-volatile magnetic disk (such as a "floppy disk"), and an optical disk drive for reading from and writing to a removable non-volatile optical disk (such as a CD-ROM, DVD-ROM or other optical media), may also be provided. In these cases, each drive may be connected to the bus 18 through one or more data media interfaces. The memory 28 may include at least one program product having a set of (for example, at least one) program modules configured to perform the functions of the embodiments of the present invention.
A program/utility 40 having a set of (at least one) program modules 42 may be stored, for example, in the memory 28. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules and program data, and each of these examples, or some combination thereof, may include an implementation of a network environment. The program modules 42 generally perform the functions and/or methods in the embodiments described in the present invention.
The computer device 12 may also communicate with one or more external devices 14 (such as a keyboard, a pointing device, a display 24, etc.), with one or more devices that enable a user to interact with the computer device 12, and/or with any device (such as a network card, a modem, etc.) that enables the computer device 12 to communicate with one or more other computing devices. Such communication may be carried out through an input/output (I/O) interface 22. Moreover, the computer device 12 may also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN) and/or a public network such as the Internet) through a network adapter 20. As shown, the network adapter 20 communicates with the other modules of the computer device 12 through the bus 18. It should be understood that, although not shown in Fig. 6, other hardware and/or software modules may be used in combination with the computer device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, etc.
The processing unit 16 executes various functional applications and data processing by running programs stored in the system memory 28, for example implementing the training method for a network model provided by the embodiments of the present invention.

That is, when the processing unit executes the program, it realizes: when the first network model reaches the preset update stop condition, determining the first target network model according to the update result of the first network model, and inserting a relay loss network layer after the preset pooling layer in the first target network model to determine the second network model; determining the relay loss function corresponding to the relay loss network layer according to the second network model and the relay loss network layer; determining the global loss function of the second network model according to the second network model and the relay loss function; and updating the parameters of the second network model using the relay loss function and the global loss function, to obtain the updated second network model.
Embodiment seven
Embodiment 7 of the present invention provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, it implements the training method for a network model provided by all the inventive embodiments of the present application.

That is, when the program is executed by a processor, it realizes: when the first network model reaches the preset update stop condition, determining the first target network model according to the update result of the first network model, and inserting a relay loss network layer after the preset pooling layer in the first target network model to determine the second network model; determining the relay loss function corresponding to the relay loss network layer according to the second network model and the relay loss network layer; determining the global loss function of the second network model according to the second network model and the relay loss function; and updating the parameters of the second network model using the relay loss function and the global loss function, to obtain the updated second network model.
Any combination of one or more computer-readable media may be employed. The computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. The computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, the computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by, or in combination with, an instruction execution system, apparatus or device.
The computer-readable signal medium may include a data signal propagated in a baseband or as part of a carrier wave, which carries computer-readable program code. Such a propagated data signal may take various forms, including, but not limited to, an electromagnetic signal, an optical signal, or any suitable combination of the above. The computer-readable signal medium may also be any computer-readable medium other than the computer-readable storage medium; such a computer-readable medium can send, propagate or transmit a program for use by, or in combination with, an instruction execution system, apparatus or device.
The program code contained on the computer-readable medium may be transmitted with any appropriate medium, including, but not limited to, wireless, electric wire, optical cable, RF, etc., or any suitable combination of the above.
Computer program code for carrying out the operations of the present invention may be written in one or more programming languages or a combination thereof. The programming languages include object-oriented programming languages such as Java, Smalltalk and C++, and also include conventional procedural programming languages such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as an independent software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will appreciate that the present invention is not limited to the specific embodiments described here, and that various obvious changes, readjustments and substitutions can be made by those skilled in the art without departing from the protection scope of the present invention. Therefore, although the present invention has been described in further detail through the above embodiments, the present invention is not limited to the above embodiments; it may also include more other equivalent embodiments without departing from the inventive concept, and the scope of the present invention is determined by the scope of the appended claims.
Claims (10)
- 1. A training method for a network model, characterized by comprising: when a first network model reaches a preset update stop condition, determining a first target network model according to an update result of the first network model, and inserting a relay loss network layer after a preset pooling layer in the first target network model to determine a second network model; determining a relay loss function corresponding to the relay loss network layer according to the second network model and the relay loss network layer; determining a global loss function of the second network model according to the second network model and the relay loss function; and updating parameters of the second network model using the relay loss function and the global loss function, to obtain an updated second network model.
- 2. The method according to claim 1, characterized in that the preset update stop condition comprises: the training accuracy of the first network model being less than a first preset accuracy threshold within each of a set number of times; before the first network model reaches the preset update stop condition, the first target network model is determined according to the update result of the first network model, and the relay loss network layer is inserted after the preset pooling layer in the first target network model to determine the second network model, the method further comprises: inputting pictures to be trained into the first network model for training, and updating the first network model according to a training result; and the determining the first target network model according to the update result of the first network model when the first network model reaches the preset update stop condition comprises: when a first training accuracy determined according to the training result is less than the first preset accuracy threshold within each of the set number of times, recording a last update result of the first network model as the first target network model together with the training result of the pictures to be trained, and taking the training result of the pictures to be trained as the first training result.
- 3. The method according to claim 2, characterized in that the determining the relay loss function corresponding to the relay loss network layer according to the second network model and the relay loss network layer comprises: inputting the pictures of the first training result into the second network model for training, and obtaining an output result of the pictures of the first training result at the relay loss network layer; and determining the relay loss function according to the output result.
- 4. The method according to claim 1, characterized in that the determining the global loss function of the second network model according to the second network model and the relay loss function comprises: obtaining an initial global loss function in the network model; and combining the relay loss function with the initial global loss function to determine the global loss function of the second network model.
- 5. The method according to claim 1, characterized in that the updating the parameters of the second network model using the relay loss function and the global loss function, to obtain the updated second network model, comprises: updating parameters of each layer before the preset pooling layer in the second network model using the relay loss function, while updating parameters of every layer in the second network model using the global loss function; and obtaining the updated second network model according to the updated parameters of each layer.
- 6. The method according to claim 1, characterized in that, after the updating the parameters of the second network model using the relay loss function and the global loss function to obtain the updated second network model, the method further comprises: training the updated second network model on pictures in a picture verification set to obtain a training accuracy; and if the training accuracy is greater than a second preset accuracy threshold, stopping updating the second network model, and taking a last update result as a second target network model.
- 7. The method according to claim 1, characterized in that the first network model and the second network model each comprise convolutional layers, pooling layers and fully connected layers, wherein the preset pooling layer is any one of the pooling layers; and the inserting the relay loss network layer after the preset pooling layer in the first target network model to determine the second network model comprises: inserting at least one relay loss network layer after the preset pooling layer in the first target network model to determine the second network model.
- 8. A training device for a network model, characterized by comprising: a second network model determining module, configured to, when a first network model reaches a preset update stop condition, determine a first target network model according to an update result of the first network model, and insert a relay loss network layer after a preset pooling layer in the first target network model to determine a second network model; a relay loss function determining module, configured to determine a relay loss function corresponding to the relay loss network layer according to the second network model and the relay loss network layer; a global loss function determining module, configured to determine a global loss function of the second network model according to the second network model and the relay loss function; and a second network model update module, configured to update parameters of the second network model using the relay loss function and the global loss function, to obtain an updated second network model.
- 9. A computer device, comprising a memory, a processor, and a computer program stored on the memory and runnable on the processor, characterized in that, when the processor executes the program, the method according to any one of claims 1-7 is realized.
- 10. A computer-readable storage medium on which a computer program is stored, characterized in that, when the program is executed by a processor, the method according to any one of claims 1-7 is realized.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710993043.8A CN107633242A (en) | 2017-10-23 | 2017-10-23 | Training method, device, equipment and the storage medium of network model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710993043.8A CN107633242A (en) | 2017-10-23 | 2017-10-23 | Training method, device, equipment and the storage medium of network model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107633242A true CN107633242A (en) | 2018-01-26 |
Family
ID=61105785
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710993043.8A Pending CN107633242A (en) | 2017-10-23 | 2017-10-23 | Training method, device, equipment and the storage medium of network model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107633242A (en) |
-
2017
- 2017-10-23 CN CN201710993043.8A patent/CN107633242A/en active Pending
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106778543A (en) * | 2016-11-29 | 2017-05-31 | 北京小米移动软件有限公司 | Single face detecting method, device and terminal |
CN107271925A (en) * | 2017-06-26 | 2017-10-20 | 湘潭大学 | The level converter Fault Locating Method of modularization five based on depth convolutional network |
Non-Patent Citations (1)
Title |
---|
SHENXIAOLU1984: "[Human Pose] Convolutional Pose Machines", CSDN *
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108776768A (en) * | 2018-04-19 | 2018-11-09 | 广州视源电子科技股份有限公司 | Image-recognizing method and device |
WO2019228358A1 (en) * | 2018-05-31 | 2019-12-05 | 华为技术有限公司 | Deep neural network training method and apparatus |
CN111178115B (en) * | 2018-11-12 | 2024-01-12 | 北京深醒科技有限公司 | Training method and system for object recognition network |
CN111178115A (en) * | 2018-11-12 | 2020-05-19 | 北京深醒科技有限公司 | Training method and system of object recognition network |
WO2020125251A1 (en) * | 2018-12-17 | 2020-06-25 | 深圳前海微众银行股份有限公司 | Federated learning-based model parameter training method, device, apparatus, and medium |
CN109766872A (en) * | 2019-01-31 | 2019-05-17 | 广州视源电子科技股份有限公司 | Image-recognizing method and device |
CN109918237A (en) * | 2019-04-01 | 2019-06-21 | 北京中科寒武纪科技有限公司 | Abnormal network layer determines method and Related product |
CN109918237B (en) * | 2019-04-01 | 2022-12-09 | 中科寒武纪科技股份有限公司 | Abnormal network layer determining method and related product |
CN110097188A (en) * | 2019-04-30 | 2019-08-06 | 科大讯飞股份有限公司 | A kind of model training method, working node and parameter update server |
CN110334735B (en) * | 2019-05-31 | 2022-07-08 | 北京奇艺世纪科技有限公司 | Multitask network generation method and device, computer equipment and storage medium |
CN110334735A (en) * | 2019-05-31 | 2019-10-15 | 北京奇艺世纪科技有限公司 | Multitask network generation method, device, computer equipment and storage medium |
CN110263921B (en) * | 2019-06-28 | 2021-06-04 | 深圳前海微众银行股份有限公司 | Method and device for training federated learning model |
CN110263921A (en) * | 2019-06-28 | 2019-09-20 | 深圳前海微众银行股份有限公司 | A kind of training method and device of federation's learning model |
CN113554097A (en) * | 2021-07-26 | 2021-10-26 | 北京市商汤科技开发有限公司 | Model quantization method and device, electronic equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107633242A (en) | Training method, device, equipment and the storage medium of network model | |
CN109766872B (en) | Image recognition method and device | |
CN111709409B (en) | Face living body detection method, device, equipment and medium | |
US20190295223A1 (en) | Aesthetics-guided image enhancement | |
CN111325115B (en) | Cross-modal countervailing pedestrian re-identification method and system with triple constraint loss | |
WO2018028546A1 (en) | Key point positioning method, terminal, and computer storage medium | |
JP6159489B2 (en) | Face authentication method and system | |
CN111723786B (en) | Method and device for detecting wearing of safety helmet based on single model prediction | |
US10726289B2 (en) | Method and system for automatic image caption generation | |
CN110348387B (en) | Image data processing method, device and computer readable storage medium | |
CN107240395A (en) | A kind of acoustic training model method and apparatus, computer equipment, storage medium | |
CN109583501A (en) | Picture classification, the generation method of Classification and Identification model, device, equipment and medium | |
CN109214298B (en) | Asian female color value scoring model method based on deep convolutional network | |
CN108182409A (en) | Biopsy method, device, equipment and storage medium | |
CN106897746A (en) | Data classification model training method and device | |
CN107992807B (en) | Face recognition method and device based on CNN model | |
CN108389224A (en) | Image processing method and device, electronic equipment and storage medium | |
CN110222780A (en) | Object detecting method, device, equipment and storage medium | |
CN106650670A (en) | Method and device for detection of living body face video | |
CN109919252A (en) | The method for generating classifier using a small number of mark images | |
KR102285665B1 (en) | A method, system and apparatus for providing education curriculum | |
CN107609463A (en) | Biopsy method, device, equipment and storage medium | |
US11734570B1 (en) | Training a network to inhibit performance of a secondary task | |
CN113239914B (en) | Classroom student expression recognition and classroom state evaluation method and device | |
CN110222607A (en) | The method, apparatus and system of face critical point detection |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20180126 |