CN107633242A - Training method, apparatus, device and storage medium for a network model - Google Patents

Training method, apparatus, device and storage medium for a network model

Info

Publication number
CN107633242A
CN107633242A (application CN201710993043.8A)
Authority
CN
China
Prior art keywords
network model
relay
loss function
loss
update
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710993043.8A
Other languages
Chinese (zh)
Inventor
张玉兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Shiyuan Electronics Thecnology Co Ltd
Original Assignee
Guangzhou Shiyuan Electronics Thecnology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Shiyuan Electronics Thecnology Co Ltd
Priority to CN201710993043.8A
Publication of CN107633242A
Legal status: Pending

Abstract

The embodiments of the invention disclose a training method, apparatus, device and storage medium for a network model. The method includes: when a first network model reaches a preset update stop condition, determining a first target network model according to the update result of the first network model, and inserting a relay loss network layer after a preset pooling layer in the first target network model to determine a second network model; determining, according to the second network model and the relay loss network layer, the relay loss function corresponding to the relay loss network layer; determining the global loss function of the second network model according to the second network model and the relay loss function; and updating the parameters of the second network model using the relay loss function and the global loss function, to obtain an updated second network model. This resolves the over-fitting of high-level features and under-fitting of low-level features during network model training, so that the network model is trained more thoroughly and achieves higher accuracy.

Description

Training method, apparatus, device and storage medium for a network model
Technical field
The present invention relates to the field of deep learning, and in particular to a training method, apparatus, device and storage medium for a network model.
Background technology
Existing face recognition models are usually obtained by training a deep-learning algorithm model, and the quality of that training directly affects the face recognition results.
Current deep learning models for face recognition generally add one or more loss function layers after the top (feature) layer of the network, which are used to train and update the parameters of the deep learning network model. In the prior art, training is typically performed by adding one or more loss functions after the feature layer of the network. However, because the parameter propagation path is too long, high-level features over-fit while mid-level features under-fit during training, so the whole network is trained insufficiently. As a result, the parameters of the middle layers of the deep network learning model cannot be updated well, and the trained model performs poorly in real face recognition.
Summary of the invention
The embodiments of the present invention provide a training method, apparatus, device and storage medium for a network model, which resolve the over-fitting of high-level features and under-fitting of low-level features caused by an over-long parameter propagation path during network model training, so that the obtained network model is trained more thoroughly and achieves higher accuracy.
In a first aspect, an embodiment of the present invention provides a training method for a network model, the method including:
when a first network model reaches a preset update stop condition, determining a first target network model according to the update result of the first network model, and inserting a relay loss network layer after a preset pooling layer in the first target network model to determine a second network model;
determining, according to the second network model and the relay loss network layer, the relay loss function corresponding to the relay loss network layer;
determining the global loss function of the second network model according to the second network model and the relay loss function;
updating the parameters of the second network model using the relay loss function and the global loss function, to obtain an updated second network model.
In a second aspect, an embodiment of the present invention further provides a training apparatus for a network model, the apparatus including:
a second network model determining module, configured to, when the first network model reaches a preset update stop condition, determine a first target network model according to the update result of the first network model, and insert a relay loss network layer after a preset pooling layer in the first target network model to determine a second network model;
a relay loss function determining module, configured to determine, according to the second network model and the relay loss network layer, the relay loss function corresponding to the relay loss network layer;
a global loss function determining module, configured to determine the global loss function of the second network model according to the second network model and the relay loss function;
a second network model updating module, configured to update the parameters of the second network model using the relay loss function and the global loss function, to obtain an updated second network model.
In a third aspect, an embodiment of the present invention further provides a computer device, including a memory, a processor, and a computer program stored on the memory and runnable on the processor, wherein the processor, when executing the program, implements the training method for a network model according to any embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the training method for a network model according to any embodiment of the present invention.
In the embodiments of the present invention, when the first network model reaches a preset update stop condition, a first target network model is determined according to the update result of the first network model, and a relay loss network layer is inserted after a preset pooling layer in the first target network model to determine a second network model; the relay loss function corresponding to the relay loss network layer is determined according to the second network model and the relay loss network layer; the global loss function of the second network model is then determined according to the second network model and the relay loss function; and the parameters of the second network model are updated using the relay loss function and the global loss function, to obtain an updated second network model. This resolves the over-fitting of high-level features and under-fitting of low-level features caused by an over-long parameter propagation path during network model training, so that the obtained network model is trained more thoroughly and achieves higher accuracy.
Brief description of the drawings
Fig. 1a is a flowchart of a training method for a network model in embodiment one of the present invention;
Fig. 1b is a schematic diagram of the middle layers of a first network model to which embodiment one of the present invention is applicable;
Fig. 1c is a schematic diagram of the middle layers of a second network model incorporating a relay loss function, to which embodiment one of the present invention is applicable;
Fig. 2a is a flowchart of a training method for a network model in embodiment two of the present invention;
Fig. 2b is a schematic diagram of the positions of key points in face key-point detection to which embodiment two of the present invention is applicable;
Fig. 2c is a schematic diagram of the positions of key points after face key-point alignment to which embodiment two of the present invention is applicable;
Fig. 3 is a flowchart of a training method for a network model in embodiment three of the present invention;
Fig. 4 is a flowchart of a training method for a network model in embodiment four of the present invention;
Fig. 5 is a schematic structural diagram of a training apparatus for a network model in embodiment five of the present invention;
Fig. 6 is a schematic structural diagram of a computer device in embodiment six of the present invention.
Detailed description of the embodiments
The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are used only to explain the present invention, not to limit it. It should also be noted that, for ease of description, the accompanying drawings show only the parts relevant to the present invention rather than the entire structure.
Embodiment one
Fig. 1a is a flowchart of a training method for a network model provided by embodiment one of the present invention. This embodiment is applicable to the situation where a network model needs to be optimized because its parameter propagation path is too long. The method may be performed by the training apparatus for a network model provided by an embodiment of the present invention, and the apparatus may be implemented in software and/or hardware. Referring to Fig. 1a, the method may specifically include the following steps:
S110: when the first network model reaches a preset update stop condition, determine a first target network model according to the update result of the first network model, and insert a relay loss network layer after a preset pooling layer in the first target network model to determine a second network model.
Specifically, a preset update stop condition is set during the training of the first network model; once the first network model reaches this condition, it stops being updated. The update process of the first network model adjusts its model parameters according to the difference between the model's output and input values; each time the parameters are updated, a new first network model is obtained. When updating stops, the resulting first network model is the first target network model.
A relay loss network layer is then inserted after a preset pooling layer in the first target network model to determine the second network model. Optionally, the preset pooling layer may be the pooling layer located in the middle of all pooling layers; for example, if the first network model has 8 pooling layers in total, the 4th pooling layer may be used as the preset pooling layer.
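As a minimal illustration (an assumption for clarity, not the patent's actual implementation), choosing the middle pooling layer and inserting a relay loss layer after it can be sketched over a flat list of layer names:

```python
# Hypothetical sketch of S110's insertion step. Layers are modeled as a
# flat list of names; the "preset pooling layer" is taken as the middle
# pooling layer (with 8 pooling layers, the 4th), as in the embodiment.
def insert_relay_loss(layers, relay_name="relay_loss"):
    """Return a new layer list with a relay loss layer inserted after
    the preset (middle) pooling layer."""
    pool_positions = [i for i, name in enumerate(layers) if name.startswith("pool")]
    if not pool_positions:
        return list(layers)  # no pooling layer, nothing to insert after
    preset = pool_positions[(len(pool_positions) - 1) // 2]  # e.g. pool4 of 8
    return layers[:preset + 1] + [relay_name] + layers[preset + 1:]
```

With 8 interleaved conv/pool layers, the relay layer lands directly after pool4, matching the example in the text.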
S120: determine, according to the second network model and the relay loss network layer, the relay loss function corresponding to the relay loss network layer.
Specifically, the relay loss function corresponding to the relay loss network layer is determined from the difference between the output of the relay loss network layer and the input of the second network model. The relay loss function characterizes the degree of loss between the relay loss network layer's output and the input. Optionally, the input of the second network model may be a picture to be trained.
S130: determine the global loss function of the second network model according to the second network model and the relay loss function.
Specifically, the global loss function reflects the difference between the input and the output of the second network model; it is obtained by superimposing the relay loss function onto the original global loss function. Optionally, the number of relay loss functions matches the number of relay loss network layers.
Optionally, determining the global loss function of the second network model according to the second network model and the relay loss function includes: obtaining the initial global loss function of the network model, and combining the relay loss function with the initial global loss function to determine the global loss function of the second network model.
The initial global loss function of the network model is obtained and superimposed with the relay loss function to obtain the global loss function of the second network model. The superposition may follow a preset rule that assigns a different weight to each relay loss function. In a specific example, if 3 relay loss network layers are added, since each relay loss network layer corresponds to one relay loss function, the 3 relay loss network layers yield 3 relay loss functions, whose weights may be 0.2, 0.2 and 0.3 respectively.
In a specific example, if the initial global loss function is Loss = SoftmaxLoss + λ1·CenterLoss and the relay loss function is RelayLoss, then the global loss function of the second network model determined from the second network model and the relay loss function is Loss = SoftmaxLoss + λ1·CenterLoss + λ2·RelayLoss. Optionally, RelayLoss may be a function of different types, such as SoftmaxLoss, TripletLoss or ContrastiveLoss; λ1 may take the value 0.008 and λ2 the value 0.1. Such choices of function and parameters improve the precision of the model.
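The weighted superposition above can be sketched as follows. The function name and signature are assumptions; the default weights (λ1 = 0.008, and 0.1 per relay loss) follow the values given in the embodiment:

```python
# Minimal sketch of combining the initial global loss with weighted relay
# losses: Loss = SoftmaxLoss + lam1*CenterLoss + sum(w_k * RelayLoss_k).
def global_loss(softmax_loss, center_loss, relay_losses,
                lam1=0.008, relay_weights=None):
    """Superimpose relay losses onto the initial global loss."""
    if relay_weights is None:
        relay_weights = [0.1] * len(relay_losses)  # lambda2 per relay layer
    loss = softmax_loss + lam1 * center_loss
    for w, rl in zip(relay_weights, relay_losses):
        loss += w * rl
    return loss
```

With three relay loss layers, explicit weights such as `[0.2, 0.2, 0.3]` can be passed via `relay_weights`, matching the three-layer example in the text.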
S140: update the parameters of the second network model using the relay loss function and the global loss function, to obtain the updated second network model.
Specifically, the parameters of the second network model are updated using the relay loss function and the global loss function, and the updated second network model is determined from the updated parameters.
In this embodiment of the present invention, when the first network model reaches a preset update stop condition, a first target network model is determined according to the update result of the first network model, and a relay loss network layer is inserted after a preset pooling layer in the first target network model to determine a second network model; the relay loss function corresponding to the relay loss network layer is determined according to the second network model and the relay loss network layer; the global loss function of the second network model is then determined according to the second network model and the relay loss function; and the parameters of the second network model are updated using the relay loss function and the global loss function, to obtain an updated second network model. This resolves the over-fitting of high-level features and under-fitting of low-level features caused by an over-long parameter propagation path during network model training, so that the obtained network model is trained more thoroughly and achieves higher accuracy.
Optionally, the first network model and the second network model include convolutional layers, pooling layers and fully connected layers, and the preset pooling layer is any one of the pooling layers.
The first network model and the second network model each include convolutional layers, pooling layers and fully connected layers. A network model is constructed from topology information and configuration parameter information: the topology information includes the number of convolutional layers, the number of pooling layers, the number of fully connected layers, and the topological connection order between the layers; the configuration parameter information includes the convolution stride, kernel size and kernel count of each convolutional layer, the pooling stride and pooling window size of each pooling layer, and the neuron count of each fully connected layer. The preset pooling layer is any one of the pooling layers.
Inserting a relay loss network layer after the preset pooling layer in the first target network model to determine the second network model includes: inserting at least one relay loss network layer after the preset pooling layer in the first target network model to determine the second network model.
Specifically, at least one relay loss network layer is inserted after the preset pooling layer in the first target network model to determine the second network model. Optionally, the preset pooling layer may be the pooling layer near the middle of the whole network model. Since no specific form of network model is required, a single network model used for face recognition is taken as an example: Fig. 1b shows the middle layers of a first network model, and Fig. 1c shows the middle layers of a second network model incorporating a relay loss function. Due to space limitations, Fig. 1b and Fig. 1c show only the middle-layer structure of the network model, not the whole network. Optionally, in Fig. 1b, c1 may denote center_loss_finetune_vgg_face_dataset, s1 may denote softmax_loss, and f1 may denote fc6_finetune_vgg_face_dataset. The network model of Fig. 1b suffers from under-fitting during training, so the whole network is trained insufficiently; therefore a relay loss network layer is added in Fig. 1c (here only one relay loss network layer is added as an example). In Fig. 1c, 150 denotes the added relay loss network layer, where p_d may denote pool4_relay_vgg_face_dataset and rs may denote reley_loss. Referring to Fig. 1b and Fig. 1c, a relay loss network layer may be inserted after pooling layer 4 (pool4), such as p_d and rs in Fig. 1c; the number of inserted relay loss network layers is not limited, and the second network model is determined after the relay loss network layers are inserted. This realizes the determination of the second network model after the relay loss function is added, and solves the under-fitting of the middle layers of the network, so that model training is more thorough and the accuracy of model training is improved after the relay loss function is added.
Embodiment two
Fig. 2a is a flowchart of a training method for a network model provided by embodiment two of the present invention. This embodiment builds on the above embodiment. Referring to Fig. 2a, the method may specifically include the following steps:
S210: input pictures to be trained into the first network model for training, and update the first network model according to the training result.
The pictures to be trained are input into the first network model for training, and the first network model is updated according to the training result. Optionally, the pictures to be trained may be pictures from a specific scene, such as a VTM (Video Teller Machine) or member identification in a jewelry store. Face photos are collected in the specific scene: video pictures are captured with a camera and stored in a computer system via network transmission and data cables.
Face detection is performed on the collected face pictures; the detected face pictures are extracted, stored in the computer device, and then labeled. Note that labeling the face pictures requires manual classification and annotation of the detected and extracted faces: photos belonging to the same person are grouped together and labeled. In a specific example, suppose the total number of persons is N and each person has M pictures; in the specific scene, each person's M pictures may be photos taken by the camera from different angles at the same moment, or photos of identical or different actions at different moments.
A face alignment operation is performed on the face pictures. Because the face angles and face positions in the face pictures are inconsistent, a key-point alignment operation is needed to extract stable features and obtain a good face recognition result, removing the influence of face angle on face recognition. The key points include positions such as the eyes, nose and mouth corners. Fig. 2b shows the positions of the key points in face key-point detection, and Fig. 2c shows the positions of the key points after face key-point alignment. In Fig. 2b, 271 denotes an eye key point, 272 a nose key point and 273 a mouth key point; in Fig. 2c, 281 denotes the aligned eye key point, 282 the aligned nose key point and 283 the aligned mouth key point.
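As a hedged sketch of one ingredient of such an alignment (an assumption, since the patent does not specify the transform), the rotation angle that levels the two eye key points can be computed as:

```python
import math

# Hypothetical helper: the rotation angle (in radians) that would bring
# the line through the two eye key points to horizontal, removing the
# effect of face angle. A full alignment would also scale and translate.
def eye_alignment_angle(left_eye, right_eye):
    """left_eye, right_eye: (x, y) key-point coordinates."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.atan2(dy, dx)
```

Rotating the image by the negative of this angle about the eye midpoint would level the eyes; nose and mouth key points can then be used to verify the alignment.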
Training set pictures are extracted: face pictures containing face identity information are randomly selected from the labeled and aligned photos for training, and each group of training samples is extracted as follows:
face picture img_1, identity information (class number) of img_1
……
face picture img_N, identity information (class number) of img_N
Here, face picture img_1 refers to the storage path of the 1st face picture, such as C:\Program Files\Adobe; the class number refers to the label preset for each person participating in the experiment, typically starting from 0, e.g. 0, 1, 2, 3, ……; the class number only represents the identity of the user.
Optionally, the pictures to be trained are face pictures processed by the above operations.
S220: when the first training precisions determined from the training results are all below a first preset precision threshold for a set number of times, record the last update result of the first network model as the first target network model together with the training result of the pictures to be trained, take the training result of the pictures to be trained as the first training result, and insert a relay loss network layer after the preset pooling layer in the first target network model to determine the second network model.
Specifically, after the pictures to be trained are input into the first network model for training, the first network model is updated according to the training result. The specific update process is as follows: the network model is updated by updating its parameters (if the corresponding parameters of two network models are all identical, the network models are identical). The training set pictures are input into the first network model for training; each training run outputs a training result, and the difference between the training result and the input training set data determines the first training precision. The parameters of the first network model are then updated according to the training result, the training result of the last training set pictures is input into the updated first network model to obtain a new training result, and the parameters of the first network model continue to be updated accordingly, with a first training precision determined from each training result. The first training precision is computed from the input training set pictures and the training result according to a set criterion; each update of the first network model yields one training precision, i.e. the training precision also changes as the first network model is updated.
When the first training precisions are all below the first preset precision threshold for the set number of times, the last update result of the first network model is recorded as the first target network model together with the training result of the pictures to be trained, and the training result of the pictures to be trained is taken as the first training result. In a specific example, the set number may be 4, i.e. if 4 consecutive training precisions are all below the first preset precision threshold, the training result of the first network model is considered to have reached a stable state.
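The stop condition of S220 can be sketched as a simple stability check; the function name and the representation of the precisions as a list are assumptions:

```python
# Hypothetical sketch of the update stop condition: stop when the training
# precision has been below the preset threshold for a set number of
# consecutive updates (4 in the embodiment's example).
def should_stop(precisions, threshold, window=4):
    """precisions: training precisions in update order; True once the
    last `window` values are all below `threshold`."""
    if len(precisions) < window:
        return False
    return all(p < threshold for p in precisions[-window:])
```

When this returns True, the last update result becomes the first target network model.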
S230: determine, according to the second network model and the relay loss network layer, the relay loss function corresponding to the relay loss network layer.
S240: determine the global loss function of the second network model according to the second network model and the relay loss function.
S250: update the parameters of the second network model using the relay loss function and the global loss function, to obtain the updated second network model.
In this embodiment of the present invention, pictures to be trained are input into the first network model for training, and the first network model is updated according to the training result; when the first training precisions determined from the training results are all below the first preset precision threshold for the set number of times, the last update result of the first network model is recorded as the first target network model together with the training result of the pictures to be trained, and the training result of the pictures to be trained is taken as the first training result. This realizes the determination of the first target network model and the acquisition of the training result of the pictures to be trained.
Optionally, determining the relay loss function corresponding to the relay loss network layer according to the second network model and the relay loss network layer includes: inputting the pictures of the first training result into the second network model for training, obtaining the output result of the pictures of the first training result at the relay loss network layer, and determining the relay loss function according to the output result.
Specifically, after the first target network model is determined, the training result of the picture training set on the first target network model continues to be input into the second network model for training; the output result of the pictures of the first training result at the relay loss network layer is obtained, and the relay loss function is determined from this output result. In a specific example, the relay loss network layer follows the preset pooling layer, which may be pool4. The output result of the relay loss network layer is compared with the first training result to determine the relay loss function, which characterizes the difference between the relay loss network layer's output and the input pictures of the first training result. This realizes the determination of the relay loss function.
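Since the embodiment allows RelayLoss to be a SoftmaxLoss, one hedged sketch of computing a relay loss from the relay layer's output logits for a single sample is the following (an illustrative choice, not the patent's mandated form):

```python
import math

# Illustrative relay loss: softmax cross-entropy of one sample's
# relay-layer logits against its class label. TripletLoss or
# ContrastiveLoss would be equally valid choices per the embodiment.
def softmax_relay_loss(logits, label):
    """Numerically stable cross-entropy for one sample."""
    m = max(logits)                      # shift for stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return -math.log(exps[label] / total)
```

For two equal logits the loss is ln 2, the expected value for a maximally uncertain two-class prediction.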
Embodiment three
Fig. 3 is a flowchart of a training method for a network model provided by embodiment three of the present invention. On the basis of the above embodiments, this embodiment optimizes "updating the parameters of the second network model using the relay loss function and the global loss function, to obtain the updated second network model". Referring to Fig. 3, the method may specifically include the following steps:
S310: when the first network model reaches a preset update stop condition, determine a first target network model according to the update result of the first network model, and insert a relay loss network layer after the preset pooling layer in the first target network model to determine a second network model.
S320: determine, according to the second network model and the relay loss network layer, the relay loss function corresponding to the relay loss network layer.
S330: determine the global loss function of the second network model according to the second network model and the relay loss function.
S340: update the parameters of each layer before the preset pooling layer in the second network model using the relay loss function, and at the same time update the parameters of every layer in the second network model using the global loss function.
Specifically, the parameters of the layers before the preset pooling layer in the second network model are updated using the relay loss function. In a specific example, suppose the network model has 30 layers in total and the preset pooling layer is pool4, the 12th layer of the whole network; then the relay loss function updates the parameters of the 12th layer and all layers before it. Meanwhile, the global loss function updates the parameters of every layer in the second network model, i.e. in this specific example, the parameters of all 30 layers.
Illustratively, after the global loss function Loss is obtained, the gradient of each parameter of the network model is computed from Loss according to the chain rule of differentiation, and the model parameters are then updated by stochastic gradient descent.
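The split update of S340 — relay loss gradients reaching only the layers up to the preset pooling layer, global loss gradients reaching all layers — can be sketched with scalar per-layer parameters (a simplifying assumption; real gradients come from the chain rule over tensors):

```python
# Hypothetical sketch of the S340 update rule. params holds one scalar
# parameter per layer; relay_depth marks the preset pooling layer's
# position (layer 12 of 30 in the example). Relay-loss gradients are
# added only for layers up to that depth; global-loss gradients apply
# everywhere. Update is plain stochastic gradient descent.
def sgd_update(params, relay_grads, global_grads, relay_depth=12, lr=0.01):
    updated = []
    for i, p in enumerate(params):
        g = global_grads[i]
        if i < relay_depth:            # layers up to the preset pool layer
            g = g + relay_grads[i]     # relay loss also reaches these layers
        updated.append(p - lr * g)
    return updated
```

Layers above the preset pooling layer thus get a shorter gradient path through the relay loss, which is the mechanism the embodiment uses against middle-layer under-fitting.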
S350: obtain the updated second network model according to the update results of the parameters of each layer.
Specifically, the model parameters of each layer after updating replace the model parameters before updating, so as to obtain the updated second network model.
In this embodiment of the present invention, the parameters of the layers before the preset pooling layer in the second network model are updated using the relay loss function, while the parameters of every layer in the second network model are updated using the global loss function, and the updated second network model is obtained from the update results of the parameters of each layer. This realizes updating the second network model through its model parameters.
Embodiment four
Fig. 4 is a kind of flow chart of the training method for network model that the embodiment of the present invention four provides, and the present embodiment is upper State and realize on the basis of embodiment.With reference to figure 4, this method specifically may include steps of:
S410: when the first network model reaches the preset update stop condition, determining a first target network model according to the update result of the first network model, and inserting a relay loss network layer after the preset pooling layer in the first target network model to determine a second network model.
S420: determining, according to the second network model and the relay loss network layer, the relay loss function corresponding to the relay loss network layer.
S430: determining the global loss function of the second network model according to the second network model and the relay loss function.
S440: updating the parameters of the second network model by applying the relay loss function and the global loss function, so as to obtain the updated second network model.
S450: testing the updated second network model on the pictures in the picture verification set to obtain the training precision.
Specifically, the updated second network model is tested on the pictures in the picture verification set to obtain the training precision. Optionally, the selection rule of the picture verification set is as follows: N people in total participate in the experiment, of whom K people take part in making the training set; the photos of the remaining N-K people are then used to make the verification set. The verification set is composed of randomly selected face-photo verification pairs, and the sampling rule is as follows:
Positive sample pair: the a-th picture of the n-th person and the b-th picture of the n-th person;
……
Negative sample pair: the c-th picture of the i-th person and the d-th picture of the j-th person;
A positive sample pair refers to any two photos from the face photos of the same person, and a negative sample pair refers to any two photos from the face photos of different people. In a specific example, the test precision is calculated according to the international-standard LFW (Labeled Faces in the Wild) protocol: here, 3000 positive pairs and 3000 negative pairs are taken, and the test rule is as follows: if the two photos of a positive pair are judged to be the same person, the judgment is correct and xi=1; if the two photos of a negative pair are judged not to be the same person, the judgment is correct and xi=1; all other cases are considered misjudgments, i.e. xi=0. The test precision is then expressed as A = (Σ xi) / 6000, where A denotes the test precision.
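The pair test above can be sketched as follows. This is an illustrative sketch under assumptions: a pair is predicted "same person" when the Euclidean distance between its two face features falls below a threshold (the text does not fix the decision rule), and the feature vectors and threshold are made-up stand-ins.

```python
# Sketch of the LFW-style pair test: accuracy A is the fraction of correct
# judgments over all positive and negative verification pairs.
import math

THRESHOLD = 1.0   # assumed same-person decision threshold

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def pair_accuracy(positive_pairs, negative_pairs):
    """A = (number of correct judgments xi=1) / (total number of pairs)."""
    correct = 0
    for u, v in positive_pairs:                    # correct if judged "same"
        correct += euclidean(u, v) < THRESHOLD
    for u, v in negative_pairs:                    # correct if judged "different"
        correct += euclidean(u, v) >= THRESHOLD
    return correct / (len(positive_pairs) + len(negative_pairs))

pos = [([0.0, 0.0], [0.1, 0.0]), ([1.0, 1.0], [1.0, 1.2])]   # same-person features
neg = [([0.0, 0.0], [3.0, 4.0]), ([1.0, 0.0], [0.0, 2.0])]   # different-person features
print(pair_accuracy(pos, neg))   # all four pairs judged correctly -> 1.0
```

With 3000 pairs in each list, the returned value corresponds to A = (Σ xi) / 6000 as in the text.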
S460: if the training precision is greater than the second preset precision threshold, stopping the update of the second network model, and taking the last update result as the second target network model.
Specifically, the stop condition for training the second network model is that the training precision exceeds the second preset precision threshold; in a specific example, the training precision can be calculated by applying the international-standard LFW protocol. If the training precision is greater than the second preset precision threshold, the update of the second network model is stopped, and the last update result is taken as the second target network model.
In the embodiment of the present invention, the updated second network model is tested on the pictures in the picture verification set to obtain the training precision; when the training precision is greater than the second preset precision threshold, the update of the second network model is stopped, and the last update result is taken as the second target network model. The second target network model is thus determined by judging the relation between the training precision and the second preset precision threshold.
On the basis of the above technical solutions, an application scenario of the embodiment of the present invention is a bank-staff identification project based on a face recognition algorithm: face pictures are collected in the real application scenario, these face pictures are then detected and aligned, a corresponding face training set is made, and the face recognition algorithm model — the second target network model in this solution — is trained using the method described above, so as to obtain a face recognition algorithm with a high recognition rate and good recognition effect in the bank-staff identification scene. This method better achieves the effect of "reducing the variation within the same person while increasing the difference between different people".
When the second network model is applied in a specific application scenario, the face recognition flow can be carried out by comparing the face feature feat-ID and using the Euclidean distance. As verified in actual scenes, the face recognition algorithm combined with the relay loss network layer (with reference to Fig. 1c) achieves a higher accuracy rate than the face recognition algorithm of a common multitask deep-learning network (with reference to Fig. 1b). It should be noted that although the above examples take face recognition as an example, the scheme can be extended to general image recognition.
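The feat-ID comparison flow can be sketched as a nearest-neighbour lookup over enrolled features. This is an illustrative sketch, not the patent's deployment: the gallery vectors, names, and the open-set rejection threshold are all assumptions introduced for the example.

```python
# Sketch of the recognition flow: each enrolled person has a stored feat-ID
# vector, and a query face is matched to the enrolled identity with the
# smallest Euclidean distance (rejected when even the best match is too far).
import math

REJECT_DISTANCE = 1.0   # assumed open-set rejection threshold

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def identify(query, gallery):
    """Return the enrolled name closest to the query feat-ID, or None."""
    name, dist = min(
        ((n, euclidean(query, feat)) for n, feat in gallery.items()),
        key=lambda item: item[1],
    )
    return name if dist < REJECT_DISTANCE else None

gallery = {"alice": [0.9, 0.1], "bob": [0.1, 0.9]}   # illustrative feat-IDs
print(identify([1.0, 0.0], gallery))   # closest to alice's feat-ID -> alice
print(identify([5.0, 5.0], gallery))   # too far from every enrollee -> None
```

Smaller within-person and larger between-person distances — the stated goal of the relay loss — directly improve the reliability of this distance-based decision.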
Embodiment five
Fig. 5 is a schematic structural diagram of a training apparatus for a network model provided in Embodiment 5 of the present invention; the apparatus is adapted to carry out the training method for a network model provided in the embodiments of the present invention. As shown in Fig. 5, the apparatus may specifically include:
a second network model determining module 510, configured to, when the first network model reaches the preset update stop condition, determine a first target network model according to the update result of the first network model, and insert a relay loss network layer after the preset pooling layer in the first target network model to determine a second network model;
a relay loss function determining module 520, configured to determine, according to the second network model and the relay loss network layer, the relay loss function corresponding to the relay loss network layer;
a global loss function determining module 530, configured to determine the global loss function of the second network model according to the second network model and the relay loss function;
a second network model update module 540, configured to update the parameters of the second network model by applying the relay loss function and the global loss function, so as to obtain the updated second network model.
Further, the preset update stop condition includes: the training precision of the first network model being less than the first preset precision threshold for each of a set number of times;
the apparatus also includes:
a first network model update module, configured to, before the first target network model is determined according to the update result of the first network model when the first network model reaches the preset update stop condition and the relay loss network layer is inserted after the preset pooling layer in the first target network model to determine the second network model, input pictures to be trained into the first network model for training, and update the first network model according to the training result;
the second network model determining module 510 is specifically configured to: when the first training precision determined according to the training result is less than the first preset precision threshold for each of the set number of times, record the last update result of the first network model as the first target network model together with the training result of the pictures to be trained, and take the training result of the pictures to be trained as the first training result.
Further, the relay loss function determining module 520 is specifically configured to:
input the pictures of the first training result into the second network model for training, and obtain the output result of the pictures of the first training result at the relay loss network layer;
determine the relay loss function according to the output result.
Further, the global loss function determining module 530 is specifically configured to:
obtain the initial global loss function in the network model;
combine the relay loss function with the initial global loss function to determine the global loss function of the second network model.
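The combination performed by module 530 can be sketched as follows. This is a sketch under an assumption: the text only says the two losses are "combined", so a weighted sum is used here as one common choice, and the weight `RELAY_WEIGHT` is a hypothetical hyperparameter not given in the text.

```python
# Sketch of module 530: the global loss of the second network model is
# formed by combining the initial global loss with the relay loss.
# The weighted-sum form and the weight value are assumptions.
RELAY_WEIGHT = 0.3   # hypothetical relay-loss weight

def combine_losses(initial_global_loss, relay_loss, relay_weight=RELAY_WEIGHT):
    """Global loss of the second network model as a weighted combination."""
    return initial_global_loss + relay_weight * relay_loss

print(combine_losses(2.0, 1.0))   # 2.0 + 0.3 * 1.0
```

A larger weight pushes more of the optimization pressure onto the shallow layers supervised by the relay loss.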
Further, the second network model update module 540 is specifically configured to:
update the parameters of each layer before the preset pooling layer in the second network model using the relay loss function and, at the same time, update the parameters of every layer in the second network model using the global loss function;
obtain the updated second network model according to the parameter update result of each layer.
Further, the apparatus also includes:
a training precision acquisition module, configured to, after the parameters of the second network model are updated by applying the relay loss function and the global loss function so as to obtain the updated second network model,
test the updated second network model on the pictures in the picture verification set to obtain the training precision;
and, if the training precision is greater than the second preset precision threshold, stop the update of the second network model, and take the last update result as the second target network model.
Further, the first network model and the second network model each include convolutional layers, pooling layers and fully connected layers, wherein the preset pooling layer is any one of the pooling layers;
the second network model determining module 510 is specifically configured to:
insert at least one relay loss network layer after the preset pooling layer in the first target network model to determine the second network model.
The training apparatus for a network model provided in the embodiment of the present invention can perform the training method for a network model provided in any embodiment of the present invention, and possesses the corresponding functional modules and beneficial effects for performing the method.
Embodiment six
Fig. 6 is a schematic structural diagram of a computer device provided in Embodiment 6 of the present invention. Fig. 6 shows a block diagram of an exemplary computer device 12 suitable for implementing embodiments of the present invention. The computer device 12 shown in Fig. 6 is only an example, and should not bring any restriction on the functions and application scope of the embodiments of the present invention.
As shown in Fig. 6, the computer device 12 takes the form of a general-purpose computing device. The components of the computer device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 connecting the different system components (including the system memory 28 and the processing unit 16).
The bus 18 represents one or more of several classes of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus structures. By way of example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus and the Peripheral Component Interconnect (PCI) bus.
The computer device 12 typically comprises a variety of computer-system-readable media. These media can be any usable media accessible by the computer device 12, including volatile and non-volatile media, and removable and non-removable media.
The system memory 28 can include computer-system-readable media in the form of volatile memory, such as a random access memory (RAM) 30 and/or a cache memory 32. The computer device 12 may further include other removable/non-removable, volatile/non-volatile computer-system storage media. By way of example only, a storage system 34 can be used for reading and writing non-removable, non-volatile magnetic media (not shown in Fig. 6, commonly referred to as a "hard disk drive"). Although not shown in Fig. 6, a disk drive for reading and writing a removable non-volatile magnetic disk (such as a "floppy disk"), and an optical disc drive for reading and writing a removable non-volatile optical disc (such as a CD-ROM, DVD-ROM or other optical media), can be provided. In these cases, each drive can be connected to the bus 18 through one or more data media interfaces. The memory 28 can include at least one program product having a group of (for example, at least one) program modules, and these program modules are configured to perform the functions of each embodiment of the present invention.
A program/utility 40 having a group of (at least one) program modules 42 can be stored, for example, in the memory 28; such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules and program data, and each or some combination of these examples may include an implementation of a network environment. The program modules 42 generally perform the functions and/or methods in the embodiments described in the present invention.
The computer device 12 can also communicate with one or more external devices 14 (such as a keyboard, a pointing device, a display 24, etc.), with one or more devices that enable a user to interact with the computer device 12, and/or with any device (such as a network card, a modem, etc.) that enables the computer device 12 to communicate with one or more other computing devices. This communication can be carried out through an input/output (I/O) interface 22. Moreover, the computer device 12 can also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN) and/or a public network such as the Internet) through a network adapter 20. As shown, the network adapter 20 communicates with the other modules of the computer device 12 through the bus 18. It should be understood that, although not shown in Fig. 6, other hardware and/or software modules can be used in combination with the computer device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives and data backup storage systems, etc.
The processing unit 16 runs the programs stored in the system memory 28, thereby performing various functional applications and data processing, for example implementing the training method for a network model provided in the embodiments of the present invention.
That is, when the processing unit executes the program, it implements: when the first network model reaches the preset update stop condition, determining a first target network model according to the update result of the first network model, and inserting a relay loss network layer after the preset pooling layer in the first target network model to determine a second network model; determining, according to the second network model and the relay loss network layer, the relay loss function corresponding to the relay loss network layer; determining the global loss function of the second network model according to the second network model and the relay loss function; and updating the parameters of the second network model by applying the relay loss function and the global loss function, so as to obtain the updated second network model.
Embodiment seven
Embodiment 7 of the present invention provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, it implements the training method for a network model provided in all the inventive embodiments of the present application.
That is, when the program is executed by the processor, it implements: when the first network model reaches the preset update stop condition, determining a first target network model according to the update result of the first network model, and inserting a relay loss network layer after the preset pooling layer in the first target network model to determine a second network model; determining, according to the second network model and the relay loss network layer, the relay loss function corresponding to the relay loss network layer; determining the global loss function of the second network model according to the second network model and the relay loss function; and updating the parameters of the second network model by applying the relay loss function and the global loss function, so as to obtain the updated second network model.
Any combination of one or more computer-readable media can be used. The computer-readable medium can be a computer-readable signal medium or a computer-readable storage medium. The computer-readable storage medium can be, for example, but is not limited to, an electric, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any appropriate combination of the above. In this document, the computer-readable storage medium can be any tangible medium containing or storing a program that can be used by, or in combination with, an instruction execution system, apparatus or device.
The computer-readable signal medium can include a data signal propagated in a baseband or as a part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal can take a variety of forms, including, but not limited to, an electromagnetic signal, an optical signal, or any appropriate combination of the above. The computer-readable signal medium can also be any computer-readable medium other than the computer-readable storage medium; this computer-readable medium can send, propagate or transmit a program for use by, or in combination with, an instruction execution system, apparatus or device.
The program code contained in the computer-readable medium can be transmitted with any appropriate medium, including, but not limited to, wireless, wire, optical cable, RF, etc., or any appropriate combination of the above.
The computer program code for performing the operations of the present invention can be written in one or more programming languages or a combination thereof; the programming languages include object-oriented programming languages such as Java, Smalltalk and C++, and also include conventional procedural programming languages such as the "C" language or similar programming languages. The program code can execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or can be connected to an external computer (for example, through the Internet using an Internet service provider).
Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will appreciate that the present invention is not restricted to the specific embodiments described here; various obvious changes, readjustments and substitutions can be made by those skilled in the art without departing from the protection scope of the present invention. Therefore, although the present invention has been described in further detail through the above embodiments, the present invention is not limited only to the above embodiments; without departing from the inventive concept, more other equivalent embodiments can also be included, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

  1. A training method for a network model, characterized by comprising:
    when a first network model reaches a preset update stop condition, determining a first target network model according to an update result of the first network model, and inserting a relay loss network layer after a preset pooling layer in the first target network model to determine a second network model;
    determining, according to the second network model and the relay loss network layer, a relay loss function corresponding to the relay loss network layer;
    determining a global loss function of the second network model according to the second network model and the relay loss function;
    updating parameters of the second network model by applying the relay loss function and the global loss function, so as to obtain an updated second network model.
  2. The method according to claim 1, characterized in that the preset update stop condition comprises: the training precision of the first network model being less than a first preset precision threshold for each of a set number of times;
    before the determining, when the first network model reaches the preset update stop condition, of the first target network model according to the update result of the first network model, and the inserting of the relay loss network layer after the preset pooling layer in the first target network model to determine the second network model, the method further comprises:
    inputting pictures to be trained into the first network model for training, and updating the first network model according to a training result;
    the determining, when the first network model reaches the preset update stop condition, of the first target network model according to the update result of the first network model comprises:
    when a first training precision determined according to the training result is less than the first preset precision threshold for each of the set number of times, recording a last update result of the first network model as the first target network model together with the training result of the pictures to be trained, and taking the training result of the pictures to be trained as the first training result.
  3. The method according to claim 2, characterized in that the determining, according to the second network model and the relay loss network layer, of the relay loss function corresponding to the relay loss network layer comprises:
    inputting the pictures of the first training result into the second network model for training, and obtaining an output result of the pictures of the first training result at the relay loss network layer;
    determining the relay loss function according to the output result.
  4. The method according to claim 1, characterized in that the determining of the global loss function of the second network model according to the second network model and the relay loss function comprises:
    obtaining an initial global loss function in the network model;
    combining the relay loss function with the initial global loss function to determine the global loss function of the second network model.
  5. The method according to claim 1, characterized in that the updating of the parameters of the second network model by applying the relay loss function and the global loss function, so as to obtain the updated second network model, comprises:
    updating parameters of each layer before the preset pooling layer in the second network model using the relay loss function and, at the same time, updating parameters of every layer in the second network model using the global loss function;
    obtaining the updated second network model according to a parameter update result of each layer.
  6. The method according to claim 1, characterized in that, after the updating of the parameters of the second network model by applying the relay loss function and the global loss function so as to obtain the updated second network model, the method further comprises:
    testing the updated second network model on pictures in a picture verification set to obtain a training precision;
    if the training precision is greater than a second preset precision threshold, stopping the update of the second network model, and taking a last update result as a second target network model.
  7. The method according to claim 1, characterized in that the first network model and the second network model each comprise convolutional layers, pooling layers and fully connected layers, wherein the preset pooling layer is any one of the pooling layers;
    the inserting of the relay loss network layer after the preset pooling layer in the first target network model to determine the second network model comprises:
    inserting at least one relay loss network layer after the preset pooling layer in the first target network model to determine the second network model.
  8. A training apparatus for a network model, characterized by comprising:
    a second network model determining module, configured to, when a first network model reaches a preset update stop condition, determine a first target network model according to an update result of the first network model, and insert a relay loss network layer after a preset pooling layer in the first target network model to determine a second network model;
    a relay loss function determining module, configured to determine, according to the second network model and the relay loss network layer, a relay loss function corresponding to the relay loss network layer;
    a global loss function determining module, configured to determine a global loss function of the second network model according to the second network model and the relay loss function;
    a second network model update module, configured to update parameters of the second network model by applying the relay loss function and the global loss function, so as to obtain an updated second network model.
  9. A computer device comprising a memory, a processor and a computer program stored on the memory and runnable on the processor, characterized in that, when executing the program, the processor implements the method according to any one of claims 1-7.
  10. A computer-readable storage medium on which a computer program is stored, characterized in that, when the program is executed by a processor, the method according to any one of claims 1-7 is implemented.
CN201710993043.8A 2017-10-23 2017-10-23 Training method, device, equipment and the storage medium of network model Pending CN107633242A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710993043.8A CN107633242A (en) 2017-10-23 2017-10-23 Training method, device, equipment and the storage medium of network model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710993043.8A CN107633242A (en) 2017-10-23 2017-10-23 Training method, device, equipment and the storage medium of network model

Publications (1)

Publication Number Publication Date
CN107633242A true CN107633242A (en) 2018-01-26

Family

ID=61105785

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710993043.8A Pending CN107633242A (en) 2017-10-23 2017-10-23 Training method, device, equipment and the storage medium of network model

Country Status (1)

Country Link
CN (1) CN107633242A (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108776768A (en) * 2018-04-19 2018-11-09 广州视源电子科技股份有限公司 Image-recognizing method and device
CN109766872A (en) * 2019-01-31 2019-05-17 广州视源电子科技股份有限公司 Image-recognizing method and device
CN109918237A (en) * 2019-04-01 2019-06-21 北京中科寒武纪科技有限公司 Abnormal network layer determines method and Related product
CN110097188A (en) * 2019-04-30 2019-08-06 科大讯飞股份有限公司 A kind of model training method, working node and parameter update server
CN110263921A (en) * 2019-06-28 2019-09-20 深圳前海微众银行股份有限公司 A kind of training method and device of federation's learning model
CN110334735A (en) * 2019-05-31 2019-10-15 北京奇艺世纪科技有限公司 Multitask network generation method, device, computer equipment and storage medium
WO2019228358A1 (en) * 2018-05-31 2019-12-05 华为技术有限公司 Deep neural network training method and apparatus
CN111178115A (en) * 2018-11-12 2020-05-19 北京深醒科技有限公司 Training method and system of object recognition network
WO2020125251A1 (en) * 2018-12-17 2020-06-25 深圳前海微众银行股份有限公司 Federated learning-based model parameter training method, device, apparatus, and medium
CN113554097A (en) * 2021-07-26 2021-10-26 北京市商汤科技开发有限公司 Model quantization method and device, electronic equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106778543A (en) * 2016-11-29 2017-05-31 北京小米移动软件有限公司 Single face detecting method, device and terminal
CN107271925A (en) * 2017-06-26 2017-10-20 湘潭大学 The level converter Fault Locating Method of modularization five based on depth convolutional network

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106778543A (en) * 2016-11-29 2017-05-31 北京小米移动软件有限公司 Single face detecting method, device and terminal
CN107271925A (en) * 2017-06-26 2017-10-20 湘潭大学 The level converter Fault Locating Method of modularization five based on depth convolutional network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SHENXIAOLU1984: "[Human Pose] Convolutional Pose Machines", CSDN *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108776768A (en) * 2018-04-19 2018-11-09 广州视源电子科技股份有限公司 Image-recognizing method and device
WO2019228358A1 (en) * 2018-05-31 2019-12-05 华为技术有限公司 Deep neural network training method and apparatus
CN111178115B (en) * 2018-11-12 2024-01-12 北京深醒科技有限公司 Training method and system for object recognition network
CN111178115A (en) * 2018-11-12 2020-05-19 北京深醒科技有限公司 Training method and system of object recognition network
WO2020125251A1 (en) * 2018-12-17 2020-06-25 深圳前海微众银行股份有限公司 Federated learning-based model parameter training method, device, apparatus, and medium
CN109766872A (en) * 2019-01-31 2019-05-17 广州视源电子科技股份有限公司 Image-recognizing method and device
CN109918237A (en) * 2019-04-01 2019-06-21 北京中科寒武纪科技有限公司 Abnormal network layer determines method and Related product
CN109918237B (en) * 2019-04-01 2022-12-09 中科寒武纪科技股份有限公司 Abnormal network layer determining method and related product
CN110097188A (en) * 2019-04-30 2019-08-06 科大讯飞股份有限公司 A kind of model training method, working node and parameter update server
CN110334735B (en) * 2019-05-31 2022-07-08 北京奇艺世纪科技有限公司 Multitask network generation method and device, computer equipment and storage medium
CN110334735A (en) * 2019-05-31 2019-10-15 北京奇艺世纪科技有限公司 Multitask network generation method, device, computer equipment and storage medium
CN110263921B (en) * 2019-06-28 2021-06-04 深圳前海微众银行股份有限公司 Method and device for training federated learning model
CN110263921A (en) * 2019-06-28 2019-09-20 深圳前海微众银行股份有限公司 A kind of training method and device of federation's learning model
CN113554097A (en) * 2021-07-26 2021-10-26 北京市商汤科技开发有限公司 Model quantization method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN107633242A (en) Training method, device, equipment and the storage medium of network model
CN109766872B (en) Image recognition method and device
CN111709409B (en) Face living body detection method, device, equipment and medium
US20190295223A1 (en) Aesthetics-guided image enhancement
CN111325115B (en) Cross-modal countervailing pedestrian re-identification method and system with triple constraint loss
WO2018028546A1 (en) Key point positioning method, terminal, and computer storage medium
JP6159489B2 (en) Face authentication method and system
CN111723786B (en) Method and device for detecting wearing of safety helmet based on single model prediction
US10726289B2 (en) Method and system for automatic image caption generation
CN110348387B (en) Image data processing method, device and computer readable storage medium
CN107240395A (en) A kind of acoustic training model method and apparatus, computer equipment, storage medium
CN109583501A (en) Picture classification, the generation method of Classification and Identification model, device, equipment and medium
CN109214298B (en) Asian female color value scoring model method based on deep convolutional network
CN108182409A (en) Biopsy method, device, equipment and storage medium
CN106897746A (en) Data classification model training method and device
CN107992807B (en) Face recognition method and device based on CNN model
CN108389224A (en) Image processing method and device, electronic equipment and storage medium
CN110222780A (en) Object detecting method, device, equipment and storage medium
CN106650670A (en) Method and device for detection of living body face video
CN109919252A (en) The method for generating classifier using a small number of mark images
KR102285665B1 (en) A method, system and apparatus for providing education curriculum
CN107609463A (en) Biopsy method, device, equipment and storage medium
US11734570B1 (en) Training a network to inhibit performance of a secondary task
CN113239914B (en) Classroom student expression recognition and classroom state evaluation method and device
CN110222607A (en) The method, apparatus and system of face critical point detection

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20180126