CN110222842A - Network model training method, device and storage medium - Google Patents

Network model training method, device and storage medium

Info

Publication number
CN110222842A
Authority
CN
China
Prior art keywords
layer
network model
layer structure
variable quantity
weight variable
Prior art date
Legal status
Granted
Application number
CN201910541586.5A
Other languages
Chinese (zh)
Other versions
CN110222842B (en)
Inventor
肖月庭 (Xiao Yueting)
阳光 (Yang Guang)
郑超 (Zheng Chao)
Current Assignee
Shukun Beijing Network Technology Co Ltd
Original Assignee
Digital Kun (beijing) Network Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Digital Kun (beijing) Network Technology Co Ltd
Priority to CN201910541586.5A
Publication of CN110222842A
Application granted
Publication of CN110222842B
Legal status: Active
Anticipated expiration tracked


Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 — Computing arrangements based on biological models
    • G06N 3/02 — Neural networks
    • G06N 3/08 — Learning methods
    • G06N 3/082 — Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application discloses a network model training method, device and storage medium, belonging to the technical field of network models. The network model training method includes: obtaining sample data carrying annotations; training a current network model using the sample data; obtaining the weight variation of each layer structure in the trained current network model; changing the layer structure of the current network model according to the weight variation of each layer structure; and training the current network model after the layer-structure change using the sample data. In the embodiments of the application, when a network model is trained, the network structure and the parameters of each layer are continuously adjusted to determine the network model with the best performance, so that the model achieves a better effect.

Description

Network model training method, device and storage medium
Technical field
The application belongs to the technical field of network models, and in particular relates to a network model training method, device and storage medium.
Background art
Deep learning has developed rapidly in recent years and has been widely applied in fields such as image processing and natural language processing. Network models based on deep learning, such as convolutional neural networks and segmentation networks, are widely used. Before such models are used, they need to be trained in advance on sample data so that they can meet requirements; this involves the processes of determining a model, training the model, and using the model. Current model training methods mostly train a fixed initial network model repeatedly on a large amount of sample data, and are realized by continuously optimizing the network parameters (namely, the weights) of the initial network model. The models produced by this training method are generally not optimal in performance and leave room for improvement. For example, existing image semantic segmentation networks such as FCN (Fully Convolutional Networks) and CRF-RNN (Conditional Random Fields-Recurrent Neural Networks) recognize segmentation-region edges poorly after training, and their semantic segmentation accuracy is low.
Summary of the invention
In view of this, the purpose of the application is to provide a network model training method, device and storage medium, so as to improve the performance of network models and enable the model to achieve a better effect.
The embodiments of the application are implemented as follows:
In a first aspect, an embodiment of the application provides a network model training method, comprising: obtaining sample data carrying annotations; training a current network model using the sample data; obtaining the weight variation of each layer structure in the trained current network model; changing the layer structure of the current network model according to the weight variation of each layer structure; and training the current network model after the layer-structure change using the sample data.
In the embodiments of the application, when a network model is trained, the network model is first trained with the sample data, the weight variation of each layer structure in the trained network model is obtained, the layer structure of the network model is changed based on the obtained weight variations, and the sample data is then used to train the network model after the layer-structure change. By adjusting both the network structure of the network model and the parameters of each layer, a network model with better performance is obtained, so that the model achieves a better effect.
With reference to a possible implementation of the first-aspect embodiment, changing the layer structure of the current network model according to the weight variation of each layer structure comprises: selecting, according to the weight variations, a target layer structure whose weight variation is greater than a first threshold; if the target layer structure is a middle layer, inserting a preset layer structure before and/or after it; if the target layer structure is the first layer, inserting a preset layer structure after it; and if the target layer structure is the last layer, inserting a preset layer structure before it. In the embodiments of the application, the structure of the network model is changed by inserting preset layer structures before and/or after target layer structures whose weight variation is greater than the first threshold; since a large weight variation indicates that the target layer structure has a large influence on the loss value, the structure there can be further optimized to make the performance better.
With reference to a possible implementation of the first-aspect embodiment, selecting the target layer structure whose weight variation is greater than the first threshold according to the weight variation of each layer structure comprises: selecting, according to the weight variations, the layer structures whose weight variation is greater than the first threshold; and determining the layer structure with the largest weight variation among those selected as the target layer structure. In the embodiments of the application, preset layer structures are inserted only before and/or after the target layer structure whose weight variation is the largest and greater than the first threshold; this reduces the number of changed layer structures, which in turn reduces the amount of computation and improves optimization efficiency.
With reference to a possible implementation of the first-aspect embodiment, when the current network model is not the initial network model, changing the layer structure of the current network model according to the weight variation of each layer structure comprises: selecting, according to the weight variations, a newly added layer structure whose weight variation is less than a second threshold to obtain a target new layer structure, wherein a newly added layer structure is a layer structure newly inserted into the initial network model; and deleting the target new layer structure, or replacing the target new layer structure. In the embodiments of the application, if the current network model is not the initial network model, the layer structure of the current network model is changed by deleting or replacing newly added layer structures whose weight variation is less than the second threshold, thereby removing newly added layer structures that have little influence on the performance of the network model and improving training efficiency.
With reference to a possible implementation of the first-aspect embodiment, selecting the newly added layer structure whose weight variation is less than the second threshold to obtain the target new layer structure comprises: selecting, according to the weight variations, the newly added layer structures whose weight variation is less than the second threshold; and determining the one with the smallest weight variation among those selected as the target new layer structure. In the embodiments of the application, if the current network model is not the initial network model, only the newly added layer structure whose weight variation is the smallest and less than the second threshold is deleted or replaced; by changing the layer structure of the network model gradually, repeated insertions, deletions or replacements are avoided, which improves training efficiency.
With reference to a possible implementation of the first-aspect embodiment, after training the current network model whose layer structure has been changed using the sample data, the method further comprises: determining that the trained, layer-structure-changed current network model meets a preset condition; and determining an optimal network model from all the network models obtained by training. In the embodiments of the application, when the trained network model after the layer-structure change is determined to meet the preset condition, the parameters of all the trained network models are compared to determine the optimal network model, thereby ensuring that the finally obtained model has the best performance.
With reference to a possible implementation of the first-aspect embodiment, after training the current network model whose layer structure has been changed using the sample data, the method further comprises: determining that the trained, layer-structure-changed current network model does not meet the preset condition; obtaining the weight variation of each layer structure in that model; changing its layer structure again according to those weight variations; and training the again-changed current network model using the sample data. In the embodiments of the application, after a layer-structure-changed network model is trained, whether the model meets the preset condition is judged; if not, the layer structure is changed again and the model with the changed layer structure is trained again, so that training yields the network model with the best performance.
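The iterative procedure of the first aspect — train, measure per-layer weight variation, change layers, retrain — can be sketched in pure Python. Everything here is a hypothetical stand-in: `train_once` fabricates deterministic weight variations instead of real training, and the preset layer, threshold and round count are illustrative assumptions, not the patent's implementation.

```python
# Sketch of the train -> measure dw -> change layers -> retrain loop.
# All names and values are illustrative assumptions.

def train_once(model):
    """Pretend training step: returns per-layer weight variation dw.
    Each layer's dw is derived from its position so the example is
    deterministic (a real system would train and diff the weights)."""
    return {layer: 0.1 * (i + 1) for i, layer in enumerate(model)}

def change_layers(model, dw, first_threshold):
    """Insert a preset layer after every layer whose weight variation
    exceeds the first threshold (simplified: always insert after)."""
    changed = []
    for layer in model:
        changed.append(layer)
        if dw[layer] > first_threshold:
            changed.append("Res")  # preset layer from the preset set
    return changed

def train_with_structure_search(model, rounds, first_threshold):
    history = [list(model)]
    for _ in range(rounds):
        dw = train_once(model)
        model = change_layers(model, dw, first_threshold)
        history.append(list(model))
    return model, history

final_model, history = train_with_structure_search(
    ["A", "B", "C", "D", "E"], rounds=1, first_threshold=0.35)
print(final_model)  # ['A', 'B', 'C', 'D', 'Res', 'E', 'Res']
```

With the fabricated variations, layers D and E exceed the threshold, so a preset layer is inserted after each; a real run would stop when the preset condition is met rather than after a fixed number of rounds.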
With reference to a possible implementation of the first-aspect embodiment, the current network model is a U-net network, the sample data is volume data from CT scans, and the corresponding annotations are blood-vessel annotations.
In a second aspect, an embodiment of the application also provides a network model training device, comprising: a sample acquisition module, a training module, a weight-variation acquisition module and a change module. The sample acquisition module is used to obtain sample data carrying annotations; the training module is used to train a current network model using the sample data; the weight-variation acquisition module is used to obtain the weight variation of each layer structure in the trained current network model; and the change module is used to change the layer structure of the current network model according to the weight variations. The training module is also used to train the current network model after the layer-structure change using the sample data.
In a third aspect, an embodiment of the application also provides an electronic device, comprising a memory and a processor connected to each other; the memory is used to store a program, and the processor is used to call the program stored in the memory so as to execute the method provided by the first-aspect embodiment and/or any possible implementation of the first-aspect embodiment.
In a fourth aspect, an embodiment of the application also provides a storage medium on which a computer program is stored; when run by a computer, the computer program executes the method provided by the first-aspect embodiment and/or any possible implementation of the first-aspect embodiment.
Other features and advantages of the application will be set forth in the following description, and in part will become apparent from the description or be understood by implementing the embodiments of the application. The objectives and other advantages of the application can be realized and obtained by the structures particularly pointed out in the written description and the drawings.
Brief description of the drawings
In order to explain the technical solutions in the embodiments of the application or in the prior art more clearly, the drawings needed in the embodiments are briefly described below. Obviously, the drawings in the following description are only some embodiments of the application; those of ordinary skill in the art can obtain other drawings from them without creative effort. The above and other objectives, features and advantages of the application will become clearer from the drawings. Identical reference numerals indicate identical parts throughout the drawings. The drawings are deliberately not drawn to scale; the emphasis is on showing the gist of the application.
Fig. 1 shows a schematic flowchart of a network model training method provided by an embodiment of the application.
Fig. 2 shows a schematic diagram of the principle of a network model training method provided by an embodiment of the application.
Fig. 3 shows a module diagram of a network model training device provided by an embodiment of the application.
Fig. 4 shows a structural diagram of an electronic device provided by an embodiment of the application.
Detailed description of the embodiments
The technical solutions in the embodiments of the application are described below with reference to the drawings.
It should be noted that similar labels and letters indicate similar items in the following drawings; therefore, once an item is defined in one drawing, it does not need to be further defined and explained in subsequent drawings. Meanwhile, in the description of the application, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device including a series of elements includes not only those elements, but also other elements not expressly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or device that includes the element.
Furthermore, the term "and/or" in the application merely describes an association relationship between associated objects, indicating that three relationships may exist; for example, A and/or B may indicate three situations: A alone, both A and B, and B alone.
First embodiment
Referring to Fig. 1, which shows a network model training method provided by an embodiment of the application, the steps it comprises are described below with reference to Fig. 1.
Step S101: obtain sample data carrying annotations.
When a network model is trained, the sample data for training the model must first be obtained. Since the training process of a network model is a supervised learning process, the sample data needs to carry annotations (labels) so that the trained network model can meet the requirements. For example, in coronary artery segmentation, the sample data is volume data from CT scans, and the corresponding annotations are blood-vessel annotations.
It should be noted that the corresponding sample data differs under different application scenarios. For coronary artery segmentation, the sample data is volume data from CT scans, the corresponding annotations are blood-vessel annotations, and the corresponding network model may be a segmentation network such as a U-net network; for image classification, the sample data is images, the corresponding annotations are annotations of the object categories in the images, and the corresponding network model may be a multi-label classification model.
The sample data can be divided by purpose into samples for training and samples for verification; that is, after the annotated sample data is obtained, it can be divided in a certain proportion, such as 3:1:1, into a training set, a cross-validation set and a test set, and the network model to be trained is then trained. Cross-validation, as the name implies, reuses the data: the obtained sample data is split and combined into different training sets and test sets, the model is trained with the training set, and the quality of the model's predictions is evaluated with the test set. On this basis, multiple different training and test sets can be obtained, and a sample in one training set may become a sample in the test set next time, hence the "cross".
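The 3:1:1 split described above can be sketched as follows; the shuffling step, the seed and the sample naming are assumptions added for illustration, not requirements of the patent.

```python
import random

def split_samples(samples, ratios=(3, 1, 1), seed=0):
    """Split annotated samples into training, cross-validation and
    test sets in the given proportions (default 3:1:1)."""
    samples = list(samples)
    random.Random(seed).shuffle(samples)  # deterministic shuffle
    total = sum(ratios)
    n_train = len(samples) * ratios[0] // total
    n_cv = len(samples) * ratios[1] // total
    train = samples[:n_train]
    cv = samples[n_train:n_train + n_cv]
    test = samples[n_train + n_cv:]
    return train, cv, test

# 50 annotated samples (hypothetical CT volume / vessel-mask pairs)
data = [(f"volume_{i}", f"vessel_mask_{i}") for i in range(50)]
train, cv, test = split_samples(data)
print(len(train), len(cv), len(test))  # 30 10 10
```

For true cross-validation, this split would be repeated with different shuffles so that a sample in one training set can land in the test set of another round.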
Step S102: train the current network model using the sample data.
After the sample data is obtained, the selected current network model is trained using it. The network model corresponding to different application demands differs; that is, under different application demands, different network models are selected. For example, in coronary artery segmentation the network model may be a U-net network, and in image classification the network model may be a multi-label classification model. The current network model may be the initial network model selected according to the application demand, or it may be a model whose layer structure has been changed on the basis of the selected initial network model.
Step S103: obtain the weight variation of each layer structure in the trained current network model.
After the current network model has been trained using the sample data, the weight variation of each layer structure in the trained current network model is obtained. Since every network model has a corresponding loss function, it also has a loss value; during backpropagation (BP, Back Propagation) the LOSS value is propagated back through the network, the weight of each layer structure can then be obtained, and the weight variation dw can be obtained from the change between the weights before and after. Here Loss = f(O, L), where O is the output of the network, L is the annotation, f is the loss function, and Loss is the calculated loss value.
It should be noted that the weight variation of each layer structure above may be the weight variation after one training epoch (where one training epoch means that all sample data is used once). For example, for layer structure A, suppose there are 30 sample data and the weight of the layer structure is updated once for every 10 sample data used; then 4 weights can be obtained in total (the initial weight a1, the weight a2 updated at the 10th sample, the weight a3 updated at the 20th sample, and the weight a4 updated at the 30th sample). The weight variation can then be the last weight a4 minus the initial weight a1, i.e. (a4 - a1), or it can be (|a2 - a1| + |a3 - a2| + |a4 - a3|)/3. It may of course also be the weight variation after several training epochs. For example, for layer structure A, suppose the initial weight is A0, the weight obtained after the first training epoch is A1, the weight after the second is A2, and the weight after the third is A3; then the weight variation of layer structure A can be (A3 - A0), or (|A1 - A0| + |A2 - A1| + |A3 - A2|)/3. The above uses 3 training epochs only as an example; the number can be values other than 3, such as 2, 4 or 5.
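Both ways of computing the weight variation described above — last weight minus initial weight, and the mean of the absolute step-to-step changes — can be written down directly. The four weights a1..a4 mirror the example for layer structure A; their numeric values are illustrative assumptions.

```python
def dw_end_to_end(weights):
    """Weight variation as last weight minus initial weight,
    e.g. (a4 - a1)."""
    return weights[-1] - weights[0]

def dw_mean_abs_step(weights):
    """Weight variation as the mean of absolute step-to-step changes:
    (|a2-a1| + |a3-a2| + |a4-a3|) / 3 for four weights."""
    steps = [abs(b - a) for a, b in zip(weights, weights[1:])]
    return sum(steps) / len(steps)

# Layer structure A: 30 samples, an update every 10 samples,
# giving 4 weights a1..a4 (illustrative values).
a = [0.50, 0.62, 0.59, 0.71]
print(round(dw_end_to_end(a), 2))     # 0.21
print(round(dw_mean_abs_step(a), 2))  # (0.12 + 0.03 + 0.12) / 3 = 0.09
```

The same two functions apply unchanged to the multi-epoch variant: pass the per-epoch weights A0..A3 instead of the within-epoch weights.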
Step S104: change the layer structure of the current network model according to the weight variation of each layer structure.
After the weight variation of each layer structure in the trained current network model is obtained, the layer structure of the current network model is changed according to those weight variations.
As one implementation, the process of changing the layer structure of the current network model according to the weight variation of each layer structure may be: according to the weight variations, select a target layer structure whose weight variation is greater than a first threshold; if the target layer structure is the first layer, insert a preset layer structure after it; if the target layer structure is the last layer, insert a preset layer structure before it; if the target layer structure is a middle layer, insert a preset layer structure before and/or after it. In one embodiment, selecting the target layer structure whose weight variation is greater than the first threshold may mean that every layer structure whose weight variation is greater than the first threshold is a target layer structure; that is, under this embodiment the number of target layer structures is at least one. In another embodiment, it may mean: select, according to the weight variations, the layer structures whose weight variation is greater than the first threshold, and determine the layer structure with the largest weight variation among those selected as the target layer structure; that is, under this embodiment there is only one target layer structure.
For ease of understanding, an example follows. Suppose the current network model has 5 layer structures, in order: layer A, layer B, layer C, layer D and layer E, where layer A is the first layer, layers B, C and D are middle layers, and layer E is the last layer. Suppose the selected target layer structures whose weight variation is greater than the first threshold are layers A, C and E. Then, when the layer structure of the current network model is changed, a preset layer structure is inserted after layer A (suppose it is layer X), preset layer structures are inserted both before and after layer C (suppose they are layers Y), and a preset layer structure is inserted before layer E (suppose it is layer Z). The changed network model then has 9 layers: A, X, B, Y, C, Y, D, Z, E.
The above example treats every layer structure whose weight variation is greater than the first threshold as a target layer structure. As another implementation, preset layer structures are inserted only around the layer structure whose weight variation is the largest and greater than the first threshold. For example, from layers A, C and E, whose weight variations are greater than the first threshold, the layer structure with the largest weight variation is selected as the target layer structure; suppose it is layer C. Then preset layer structures are inserted only before and/or after layer C, for example a layer Y both before and after it.
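The position-dependent insertion rules (first layer, middle layer, last layer) and the A/C/E example can be sketched as follows; the mapping from each target layer to its inserted preset layer (X, Y, Z) is the one assumed in the example, and the before-and-after choice for middle layers is just one of the variants described.

```python
def insert_preset_layers(model, targets):
    """Insert preset layers around target layers:
    first layer  -> insert after,
    last layer   -> insert before,
    middle layer -> insert before and after (one variant)."""
    changed = []
    last = len(model) - 1
    for i, layer in enumerate(model):
        preset = targets.get(layer)
        if preset is None:
            changed.append(layer)
        elif i == 0:                      # first layer: insert after
            changed += [layer, preset]
        elif i == last:                   # last layer: insert before
            changed += [preset, layer]
        else:                             # middle layer: both sides
            changed += [preset, layer, preset]
    return changed

model = ["A", "B", "C", "D", "E"]
# Target layers whose dw exceeds the first threshold, with the preset
# layer assumed for each, as in the example above.
targets = {"A": "X", "C": "Y", "E": "Z"}
print(insert_preset_layers(model, targets))
# ['A', 'X', 'B', 'Y', 'C', 'Y', 'D', 'Z', 'E']
```

The single-target variant corresponds to passing a one-entry `targets` dict, e.g. `{"C": "Y"}`.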
If the target layer structure is a middle layer, a preset layer structure may be inserted only before it, only after it, or both before and after it.
The inserted preset layer structure comes from a preset layer-structure set; for example, if the preset layer-structure set is {X-ception, Res, BottelNeck}, the inserted preset layer structure may be X-ception, Res or BottelNeck. The preset layer structures inserted for different target layer structures may be the same or different; for example, an X-ception layer structure may be inserted after layer A, a Res layer structure inserted before layer C, a BottelNeck layer structure inserted after layer C, and a Res layer structure inserted before layer E. The number of preset layer structures inserted before and/or after a target layer structure may be one, or two or more. It should be understood that the above preset layer-structure set is only an example shown for ease of understanding and cannot be understood as limiting the application; the preset layer structures in the set are not limited to the above example.
If the current network model is not the initial network model, as another implementation, changing the layer structure of the current network model according to the weight variation of each layer structure may be: according to the weight variations, select a newly added layer structure whose weight variation is less than a second threshold to obtain a target new layer structure; and delete the target new layer structure, or replace the target new layer structure. In one embodiment, the process of obtaining the target new layer structure may be: every newly added layer structure whose weight variation is less than the second threshold is a target new layer structure; that is, the number of target new layer structures obtained in this way is at least one. In another embodiment, the process may be: select, according to the weight variations, the newly added layer structures whose weight variation is less than the second threshold, and determine the one with the smallest weight variation among those selected as the target new layer structure; that is, under this embodiment there is only one target new layer structure.
A newly added layer structure is a layer structure newly inserted into the initial network model. For ease of understanding, suppose the initial network model has 5 layer structures, in order layers A, B, C, D and E, and suppose the current network model has 9 layers: A, X, B, Y, C, Y, D, Z, E. The newly added layer structures are then layer X, the layer Y before layer C, the layer Y after layer C, and layer Z. Suppose the selected target new layer structures whose weight variation is less than the second threshold are layer X and the layer Y after layer C. Changing the target new layer structures may then mean deleting both layer X and the layer Y after layer C; or deleting layer X and replacing the layer Y after layer C; or replacing layer X and deleting the layer Y after layer C; or replacing both layer X and the layer Y after layer C. Suppose layer X is a BottelNeck layer structure; when replacing it, it can be replaced with Res or X-ception. Suppose the layer Y after layer C is a Res layer structure; when replacing it, it can be replaced with BottelNeck or X-ception.
The above example takes every newly-added layer structure whose weight variation is less than the second threshold as a target newly-added layer structure. As another embodiment, only the newly-added layer structure whose weight variation is both the smallest and less than the second threshold is taken as the target newly-added layer structure. For example, from the selected layer X and the layer Y behind layer C, both with weight variation less than the second threshold, the layer structure with the smallest weight variation is chosen as the target newly-added layer structure; assuming this is the layer Y behind layer C, then, when changing the layer structure of the current network model, only the layer Y behind layer C is replaced or deleted.
The second threshold and the first threshold may be determined by preset rules, with the second threshold smaller than the first threshold. For example, the preset rule for determining the first threshold may be a*w, and the preset rule for determining the second threshold may be b*w, where a and b are both coefficients and b is less than a; for example, a is a coefficient greater than 1 and b is a coefficient less than 1. Here w is the average of the weight variations of the layer structures obtained after one training cycle, or the mean of such per-cycle averages obtained over several training cycles. Taking the above layer structures A, B, C, D, and E as an example: if w is the average over one training cycle, and after that cycle the weight variations corresponding to layer structures A, B, C, D, and E are A1, B1, C1, D1, and E1 respectively, then the average is (A1+B1+C1+D1+E1)/5, so in this implementation w is (A1+B1+C1+D1+E1)/5; the corresponding first threshold may be 1.5*(A1+B1+C1+D1+E1)/5, and the second threshold may be 0.8*(A1+B1+C1+D1+E1)/5. If w is the mean of the per-cycle averages over several training cycles, take two training cycles as an example: assume that after the first training cycle the weight variations corresponding to layer structures A, B, C, D, and E are A1, B1, C1, D1, and E1, and that after the second training cycle they are A2, B2, C2, D2, and E2; then w is [(A1+B1+C1+D1+E1)/5 + (A2+B2+C2+D2+E2)/5]/2, the corresponding first threshold may be 1.2*[(A1+B1+C1+D1+E1)/5 + (A2+B2+C2+D2+E2)/5]/2, and the second threshold may be 0.9*[(A1+B1+C1+D1+E1)/5 + (A2+B2+C2+D2+E2)/5]/2. It is not difficult to see from these examples that the first threshold and the second threshold change in real time and are not fixed values.
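The threshold rules above can be sketched in Python. This is an illustrative sketch only: the coefficients and per-cycle weight variations are the example values from the text, not prescribed by the method.

```python
def compute_w(per_cycle_variations):
    """w: mean of the per-cycle averages of the layer weight variations."""
    cycle_means = [sum(cycle) / len(cycle) for cycle in per_cycle_variations]
    return sum(cycle_means) / len(cycle_means)

def thresholds(w, a, b):
    """First threshold a*w and second threshold b*w, with b < a (e.g. b < 1 < a)."""
    assert b < a, "the second threshold must be smaller than the first"
    return a * w, b * w

# One-cycle example: w = (A1+B1+C1+D1+E1)/5, with hypothetical variations.
w1 = compute_w([[0.5, 0.3, 0.2, 0.4, 0.1]])
first, second = thresholds(w1, a=1.5, b=0.8)

# Two-cycle example: w = [(A1+...+E1)/5 + (A2+...+E2)/5] / 2.
w2 = compute_w([[0.5, 0.3, 0.2, 0.4, 0.1],
                [0.4, 0.2, 0.2, 0.3, 0.1]])
```

Because w is recomputed from the latest training cycles, both thresholds track the current magnitude of weight changes rather than staying fixed, which is the "real-time change" behavior noted above.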
It should be noted that the specific values of a and b in the above examples are merely for ease of understanding; they may also be other values, and shall not be understood as limiting the present application.
It should also be noted that existing training methods do not change the structure of the network model; they only optimize the weights of each layer structure during training. For example, if the layer structure of the initial network model is 5 layers, the trained network model still has 5 layers. The training method adopted by the present application, by contrast, continually adjusts both the network structure and the parameters of each layer, and thereby determines the network model with the best performance. Since the layer structure of the initially selected network model is not optimal, this method makes the performance of the trained network model better than that of a network model trained by existing methods.
Step S105: training the current network model after the layer structure change by using the sample data.
After the layer structure of the current network model has been changed according to the weight variation of each layer structure, the current network model after the layer structure change continues to be trained using the sample data until training ends. For example, the sample data continues to be used to train the 9-layer network model obtained after the layer structure change above.
As another embodiment, after the current network model after the layer structure change has been trained using the sample data, the method further includes: judging whether the trained current network model after the layer structure change meets a preset condition. If the preset condition is not met, the above steps S103-S105 are repeated; that is, when it is determined that the trained current network model after the layer structure change does not meet the preset condition, the weight variation of each layer structure in the trained current network model after the layer structure change is obtained; the layer structure of the current network model after the layer structure change is changed according to these weight variations; and the current network model after the layer structure is changed again is trained using the sample data. If the preset condition is met, the optimal network model is determined from all the network models obtained by training; that is, when it is determined that the trained current network model after the layer structure change meets the preset condition, the optimal network model is determined from all the network models obtained by training. The process of determining the optimal network model from all the network models obtained by training is: obtaining the loss value corresponding to each network model obtained by training; screening out, from all the network models obtained by training, the network model with the smallest loss value; and determining the network model with the smallest loss value as the optimal network model.
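The final selection step, choosing the model with the smallest loss among all models produced during training, can be sketched as follows. The model names and loss values here are hypothetical placeholders, not results from the patent.

```python
def optimal_model(trained_models):
    """Screen out the model with the smallest loss among all trained models."""
    return min(trained_models, key=lambda m: m["loss"])

# Hypothetical record of every model produced during iterative training.
trained = [{"name": "initial", "loss": 0.92},
           {"name": "first",   "loss": 0.61},
           {"name": "second",  "loss": 0.44},
           {"name": "third",   "loss": 0.31}]

best = optimal_model(trained)
```

Note that the selection runs over every model obtained during training, not only the last one, so an intermediate structure can win if a later structural change happened to hurt performance.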
To facilitate understanding, the above process is illustrated below with reference to the schematic diagram shown in Fig. 2. At the start, the initial network model is trained with the sample data, and the weight variation of each layer structure in the trained initial network model is obtained; the layer structure of the initial network model is then changed according to these weight variations (after the change it is the first network model). The first network model is then trained using the sample data, and it is judged whether the trained first network model meets the preset condition. If not, the weight variation of each layer structure in the trained first network model is obtained, and the layer structure of the first network model is changed according to these weight variations (after the change it is the second network model). The second network model is then trained using the sample data, and it is judged whether the trained second network model meets the preset condition. If not, the weight variation of each layer structure in the trained second network model is obtained, and the layer structure of the second network model is changed according to these weight variations (after the change it is the third network model). The third network model is then trained using the sample data, and it is judged whether the trained third network model meets the preset condition. If not, the cycle continues until the preset condition is met; once it is met, iteration stops, and the optimal network model is selected from all the network models obtained by training. In this example, the optimal network model is selected from the trained initial network model, the trained first network model, the trained second network model, and the trained third network model; for example, the trained third network model.
The above preset condition may be a condition on the number of iterations (where one iteration is one training cycle); for example, the preset condition is met when the number of iterations reaches 100. The preset condition may also be that the loss value of the network model meets a set requirement: the condition is met as soon as the loss value of the trained network model is less than (or, as the case may be, greater than) a set threshold. Since the loss value of the network model may never approach the set threshold, which would cause training to continue indefinitely without terminating, the preset condition may, as another embodiment, combine the two ways above, i.e. iteration count plus loss value. For example, the preset condition is met when the number of iterations reaches 100, and it is also met when the loss value is less than (or greater than) the set threshold. During such iteration, training ends as soon as the loss value reaches the set threshold, regardless of whether 100 iterations have been reached; and if 100 iterations are reached while the loss value has still not reached the set threshold, training likewise ends because the iteration count has been reached.
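The combined iteration-count-plus-loss stopping rule can be sketched as below. The 100-iteration cap comes from the text's example; the loss threshold value is a hypothetical placeholder.

```python
def should_stop(iteration, loss, max_iterations=100, loss_threshold=0.05):
    """Stop when the loss reaches the set threshold OR the iteration cap is hit.

    Either condition alone is sufficient, so training cannot run forever
    even if the loss never approaches the threshold.
    """
    return loss <= loss_threshold or iteration >= max_iterations
```

Used inside the training loop, `should_stop(i, loss)` is checked after each training cycle; the loss test ends training early when the model is already good enough, while the iteration cap guarantees termination.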
Second embodiment
An embodiment of the present application further provides a network model training device 100, as shown in Fig. 3. The network model training device 100 includes: a sample acquisition module 110, a training module 120, a weight variation acquisition module 130, and a change module 140.
The sample acquisition module 110 is configured to obtain sample data carrying labels.
The training module 120 is configured to train the current network model using the sample data.
The weight variation acquisition module 130 is configured to obtain the weight variation of each layer structure in the trained current network model.
The change module 140 is configured to change the layer structure of the current network model according to the weight variation of each layer structure. Optionally, the change module 140 is specifically configured to: select, according to the weight variation of each layer structure, a target layer structure whose weight variation is greater than a first threshold; if the target layer structure is an intermediate layer, insert a preset layer structure before and/or after the target layer structure; if the target layer structure is the first layer, insert a preset layer structure after the target layer structure; and if the target layer structure is the last layer, insert a preset layer structure before the target layer structure. It is also specifically configured to: select, according to the weight variation of each layer structure, the layer structures whose weight variation is greater than the first threshold, and determine, among the selected layer structures, the layer structure with the largest weight variation as the target layer structure. When the current network model is not the initial network model, the change module 140 is optionally specifically configured to: select, according to the weight variation of each layer structure, the newly-added layer structures whose weight variation is less than a second threshold, to obtain the target newly-added layer structure, where a newly-added layer structure is a layer structure newly inserted into the initial network model; and delete the target newly-added layer structure, or replace the target newly-added layer structure. It is also specifically configured to: select, according to the weight variation of each layer structure, the newly-added layer structures whose weight variation is less than the second threshold, and determine, among the selected newly-added layer structures, the layer structure with the smallest weight variation as the target newly-added layer structure.
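The position-dependent insertion rule performed by the change module (insert after a first layer, before a last layer, and before and/or after an intermediate layer) can be sketched as follows. Layer names and the single-insertion choice for the intermediate case are illustrative assumptions.

```python
def insert_preset(layers, target_index, preset):
    """Insert a preset layer around the target layer according to its position."""
    result = list(layers)
    if target_index == 0:                     # first layer: insert after it
        result.insert(1, preset)
    elif target_index == len(layers) - 1:     # last layer: insert before it
        result.insert(target_index, preset)
    else:                                     # intermediate layer: before
        result.insert(target_index, preset)   # (and/or after, per the text)
    return result

layers = ["A", "B", "C", "D", "E"]
after_first = insert_preset(layers, 0, "X")   # insert behind the first layer A
before_last = insert_preset(layers, 4, "Y")   # insert in front of the last layer E
```

The first/last special cases exist because nothing can be inserted before the input layer or after the output layer without changing the model's interface.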
The training module 120 is also configured to train the current network model after the layer structure change using the sample data.
Optionally, the network model training device 100 further includes: a first determining module and a second determining module.
The first determining module is configured to determine that the trained current network model after the layer structure change meets the preset condition.
The second determining module is configured to determine the optimal network model from all the network models obtained by training.
Optionally, the network model training device 100 further includes: a third determining module.
The third determining module is configured to determine that the trained current network model after the layer structure change does not meet the preset condition.
In this case, the weight variation acquisition module 130 is also configured to obtain the weight variation of each layer structure in the trained current network model after the layer structure change.
The change module 140 is also configured to change the layer structure of the current network model after the layer structure change according to the weight variation of each layer structure in that model.
The training module 120 is also configured to train the current network model after the layer structure is changed again using the sample data.
It should be noted that the embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments may be referred to each other.
The technical effects, implementation principles, and results of the network model training device 100 provided by this embodiment of the present application are the same as those of the preceding method embodiments. For brevity, where this device embodiment is silent, reference may be made to the corresponding content in the preceding method embodiments.
Third embodiment
As shown in Fig. 4, Fig. 4 shows a structural block diagram of an electronic device 200 provided by an embodiment of the present application. The electronic device 200 includes: a transceiver 210, a memory 220, a communication bus 230, and a processor 240.
The elements of the transceiver 210, the memory 220, and the processor 240 are electrically connected to each other, directly or indirectly, to realize the transmission or interaction of data. For example, these elements may be electrically connected to each other through one or more communication buses 230 or signal lines. The transceiver 210 is used for sending and receiving data. The memory 220 is used for storing a computer program, for example the software function modules shown in Fig. 3, i.e. the network model training device 100. The network model training device 100 includes at least one software function module that may be stored in the memory 220 in the form of software or firmware, or solidified in the operating system (OS) of the electronic device 200. The processor 240 is used for executing the executable modules stored in the memory 220, such as the software function modules or computer programs included in the network model training device 100. For example, the processor 240 is configured to obtain sample data carrying labels; to train the current network model using the sample data; to obtain the weight variation of each layer structure in the trained current network model; to change the layer structure of the current network model according to the weight variation of each layer structure; and to train the current network model after the layer structure change using the sample data.
The memory 220 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), or the like.
The processor 240 may be an integrated circuit chip with signal processing capability. The above processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It may implement or execute each method, step, and logic block diagram disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor, or the processor 240 may be any conventional processor, etc.
The above electronic device 200 includes, but is not limited to, a network server, a database server, a cloud server, and the like.
It should be noted that the embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments may be referred to each other.
Fourth embodiment
An embodiment of the present application further provides a non-volatile computer-readable storage medium (hereinafter referred to as a storage medium) on which a computer program is stored; when the computer program is run by a computer, for example the above electronic device 200, it executes the steps included in the network model training method provided by the embodiments of the present application.
In the several embodiments provided in the present application, it should be understood that the disclosed device and method may also be implemented in other ways. The device embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the drawings show the possible architectures, functions, and operations of devices, methods, and computer program products according to multiple embodiments of the present application. In this regard, each box in a flowchart or block diagram may represent a module, a program segment, or a part of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the boxes may occur in an order different from that marked in the drawings. For example, two consecutive boxes may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each box in the block diagrams and/or flowcharts, and combinations of boxes in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified function or action, or by a combination of dedicated hardware and computer instructions.
In addition, the functional modules in the embodiments of the present application may be integrated together to form an independent part, each module may exist separately, or two or more modules may be integrated to form an independent part.
If the functions are implemented in the form of software function modules and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a laptop, a server, a network device, or the like) to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above are only specific embodiments of the present application, but the protection scope of the present application is not limited thereto. Any person familiar with the art can easily think of changes or substitutions within the technical scope disclosed by the present application, and these shall all be covered within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A network model training method, characterized in that it comprises:
obtaining sample data carrying labels;
training a current network model using the sample data;
obtaining a weight variation of each layer structure in the trained current network model;
changing a layer structure of the current network model according to the weight variation of each layer structure; and
training the current network model after the layer structure change using the sample data.
2. The method according to claim 1, characterized in that changing the layer structure of the current network model according to the weight variation of each layer structure comprises:
selecting, according to the weight variation of each layer structure, a target layer structure whose weight variation is greater than a first threshold;
if the target layer structure is an intermediate layer, inserting a preset layer structure before and/or after the target layer structure;
if the target layer structure is a first layer, inserting a preset layer structure after the target layer structure; and
if the target layer structure is a last layer, inserting a preset layer structure before the target layer structure.
3. The method according to claim 2, characterized in that selecting, according to the weight variation of each layer structure, the target layer structure whose weight variation is greater than the first threshold comprises:
selecting, according to the weight variation of each layer structure, layer structures whose weight variation is greater than the first threshold; and
determining, among the selected layer structures, the layer structure with the largest weight variation as the target layer structure.
4. The method according to claim 1, characterized in that, when the current network model is not an initial network model, changing the layer structure of the current network model according to the weight variation of each layer structure comprises:
selecting, according to the weight variation of each layer structure, newly-added layer structures whose weight variation is less than a second threshold, to obtain a target newly-added layer structure, wherein a newly-added layer structure is a layer structure newly inserted into the initial network model; and
deleting the target newly-added layer structure, or replacing the target newly-added layer structure.
5. The method according to claim 4, characterized in that selecting, according to the weight variation of each layer structure, the newly-added layer structures whose weight variation is less than the second threshold, to obtain the target newly-added layer structure, comprises:
selecting, according to the weight variation of each layer structure, the newly-added layer structures whose weight variation is less than the second threshold; and
determining, among the selected newly-added layer structures, the layer structure with the smallest weight variation as the target newly-added layer structure.
6. The method according to any one of claims 1-5, characterized in that, after training the current network model after the layer structure change using the sample data, the method further comprises:
determining that the trained current network model after the layer structure change meets a preset condition; and
determining an optimal network model from all network models obtained by training.
7. The method according to any one of claims 1-5, characterized in that, after training the current network model after the layer structure change using the sample data, the method further comprises:
determining that the trained current network model after the layer structure change does not meet a preset condition;
obtaining the weight variation of each layer structure in the trained current network model after the layer structure change;
changing the layer structure of the current network model after the layer structure change according to the weight variation of each layer structure in that model; and
training the current network model after the layer structure is changed again using the sample data.
8. The method according to claim 1, characterized in that the current network model is a U-net network, the sample data is volume data of a CT scan, and the corresponding label is a blood vessel label.
9. A network model training device, characterized in that it comprises:
a sample acquisition module, configured to obtain sample data carrying labels;
a training module, configured to train a current network model using the sample data;
a weight variation acquisition module, configured to obtain a weight variation of each layer structure in the trained current network model; and
a change module, configured to change a layer structure of the current network model according to the weight variation of each layer structure;
wherein the training module is also configured to train the current network model after the layer structure change using the sample data.
10. A storage medium, characterized in that a computer program is stored thereon, and when the computer program is run by a computer, the method according to any one of claims 1-8 is executed.
CN201910541586.5A 2019-06-21 2019-06-21 Network model training method and device and storage medium Active CN110222842B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910541586.5A CN110222842B (en) 2019-06-21 2019-06-21 Network model training method and device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910541586.5A CN110222842B (en) 2019-06-21 2019-06-21 Network model training method and device and storage medium

Publications (2)

Publication Number Publication Date
CN110222842A true CN110222842A (en) 2019-09-10
CN110222842B CN110222842B (en) 2021-04-06

Family

ID=67814256

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910541586.5A Active CN110222842B (en) 2019-06-21 2019-06-21 Network model training method and device and storage medium

Country Status (1)

Country Link
CN (1) CN110222842B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113538079A (en) * 2020-04-17 2021-10-22 北京金山数字娱乐科技有限公司 Recommendation model training method and device, and recommendation method and device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106548234A (en) * 2016-11-17 2017-03-29 北京图森互联科技有限责任公司 A kind of neural networks pruning method and device
CN110335250A (en) * 2019-05-31 2019-10-15 上海联影智能医疗科技有限公司 Network training method, device, detection method, computer equipment and storage medium

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106548234A (en) * 2016-11-17 2017-03-29 北京图森互联科技有限责任公司 A kind of neural networks pruning method and device
CN110335250A (en) * 2019-05-31 2019-10-15 上海联影智能医疗科技有限公司 Network training method, device, detection method, computer equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张士昱 et al., "使用动态增减枝算法优化网络结构的DBN模型" (A DBN model with network structure optimized by a dynamic layer growing-and-pruning algorithm), 《计算机科学与探索》 (Journal of Frontiers of Computer Science and Technology) *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113538079A (en) * 2020-04-17 2021-10-22 北京金山数字娱乐科技有限公司 Recommendation model training method and device, and recommendation method and device

Also Published As

Publication number Publication date
CN110222842B (en) 2021-04-06

Similar Documents

Publication Publication Date Title
Boda et al. Stochastic target hitting time and the problem of early retirement
CN104679743B (en) A kind of method and device of the preference pattern of determining user
CN102568205B (en) Traffic parameter short-time prediction method based on empirical mode decomposition and classification combination prediction in abnormal state
CN108694673A (en) A kind of processing method, device and the processing equipment of insurance business risk profile
CN110060144A (en) Amount model training method, amount appraisal procedure, device, equipment and medium
CN104468413B (en) A kind of network service method and system
CN114721833A (en) Intelligent cloud coordination method and device based on platform service type
CN109376995A (en) Financial data methods of marking, device, computer equipment and storage medium
EP3751496A1 (en) Method and system for building reinforcement learning (rl) based model for generating bids
CN109741177A (en) Appraisal procedure, device and the intelligent terminal of user credit
CN107330464A (en) Data processing method and device
CN103440309A (en) Automatic resource and environment model combination modeling semantic recognition and recommendation method
CN102185731A (en) Network health degree testing method and system
CN109993753A (en) The dividing method and device of urban function region in remote sensing image
CN112651534A (en) Method, device and storage medium for predicting resource supply chain demand
CN103646670A (en) Method and device for evaluating performances of storage system
CN110263136B (en) Method and device for pushing object to user based on reinforcement learning model
CN110222842A (en) A kind of network model training method, device and storage medium
CN108898648A (en) A kind of K line chart building method, system and relevant device
CN110705889A (en) Enterprise screening method, device, equipment and storage medium
CN110275895A (en) It is a kind of to lack the filling equipment of traffic data, device and method
CN112052549B (en) Method for selecting roads in small mesh gathering area
CN107908915A (en) Predict modeling and analysis method, the equipment and storage medium of tunnel crimp
CN114021776A (en) Material combination selection method and device and electronic equipment
CN110472991B (en) Data processing method, device, server and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CP03 Change of name, title or address
CP03 Change of name, title or address

Address after: 100120 rooms 303, 304, 305, 321 and 322, building 3, No. 11, Chuangxin Road, science and Technology Park, Changping District, Beijing

Patentee after: Shukun (Beijing) Network Technology Co.,Ltd.

Address before: Room 1801-156, 16 / F, building 1, yard 16, Guangshun South Street, Chaoyang District, Beijing

Patentee before: SHUKUN (BEIJING) NETWORK TECHNOLOGY Co.,Ltd.