CN109961107A - Training method and apparatus, electronic device, and storage medium for a target detection model - Google Patents
- Publication number
- CN109961107A CN109961107A CN201910315195.1A CN201910315195A CN109961107A CN 109961107 A CN109961107 A CN 109961107A CN 201910315195 A CN201910315195 A CN 201910315195A CN 109961107 A CN109961107 A CN 109961107A
- Authority
- CN
- China
- Prior art keywords
- target detection
- network
- detection model
- loss function
- classification network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Abstract
An embodiment of the present application provides a training method and apparatus, an electronic device, and a storage medium for a target detection model. The target detection model includes a first classification network, and the method comprises: providing at least one second classification network, where during training the input of the second classification network is identical to the input of the first classification network; and training the target detection model based on a total loss function until the total loss function converges, where the total loss function includes the loss function of the target detection model and the loss function of the second classification network. Compared with existing training methods for target detection models, the scheme of this embodiment adds a second classification network during training and trains the target detection model on both the model's own loss function and the loss function of the second classification network. This effectively strengthens the model's learning from false detections and false alarms, and thereby improves the detection accuracy of the target detection model.
Description
Technical field
This application relates to the technical field of image processing, and in particular to a training method and apparatus, an electronic device, and a storage medium for a target detection model.
Background technique
The task of target detection is to find targets of interest in an image. For example, when the target is a face, face detection aims to detect faces and their positions in a scene. Target detection is one of the major problems in computer vision, with long-term research value and broad application demand in fields such as security monitoring and human-computer interaction.
In recent years, target detection has developed rapidly with advances in deep neural networks and hardware. In practical applications, however, detection is often accompanied by a large number of false alarms, i.e., non-target regions mistakenly identified as target regions, which seriously hinders the adoption of the technique. How to suppress false alarms in a target detection network and improve detection precision is therefore an extremely important problem in this field.
Summary of the invention
The purpose of this application is to address at least one of the above technical deficiencies, in particular the high false-alarm rate during target detection.
In a first aspect, an embodiment of the present application provides a training method for a target detection model, the target detection model including a first classification network. The method comprises:
providing at least one second classification network, where during training the input of the second classification network is identical to the input of the first classification network; and
training the target detection model based on a total loss function until the total loss function converges, where the total loss function includes the loss function of the target detection model and the loss function of the second classification network.
In an optional embodiment of the application, the target detection model includes a single-stage detection network structure.
In an optional embodiment of the application, the single-stage detection network structure includes a RetinaNet network structure.
In an optional embodiment of the application, the second classification network includes cascaded convolutional layers and a fully connected layer, where the input of the convolutional layers is connected to the output of the backbone network of the RetinaNet structure.
In an optional embodiment of the application, the loss function of the second classification network includes at least one of: a first loss function determined based on the output of the second classification network, and a second loss function determined based on the outputs of the first and second classification networks.
In an embodiment of the present application, the first loss function is:
L_C1 = -(1-α) * p1^γ * log(1-p1) * (1-y) - α * (1-p1)^γ * log(p1) * y
where L_C1 denotes the first loss function, α is a weighting factor, p1 is the output of the second classification network, y denotes the sample label, and γ is a modulating factor.
In an optional embodiment of the application, the second loss function is:
L_C2 = -(y*(1-M) + (1-y)*M) * (α * p1^γ * log(p1) * M - (1-α) * (1-p1)^γ * log(1-p1) * (1-M))
where L_C2 denotes the second loss function, α is a weighting factor, p1 is the output of the second classification network, y denotes the sample label, γ is a modulating factor, and M is the target-region result label, whose value is determined as follows:
M = 1 if p ≥ th; M = 0 if p < th
where p denotes the output of the first classification network and th denotes a preset threshold.
In an optional embodiment of the application, the loss function of the target detection model includes the loss function of the first classification network and the loss function of the target-box regression network.
In a second aspect, an embodiment of the present application further provides an image detection method, comprising:
obtaining an image to be detected;
detecting the image to be detected with a target detection model, where the target detection model is obtained by training with the training method of the first aspect; and
obtaining the detection result of the image to be detected based on the output of the target detection model.
In a third aspect, an embodiment of the present application provides a training apparatus for a target detection model, the target detection model including a first classification network. The apparatus includes:
a supervision-network setup module, configured to provide at least one second classification network, where during training the input of the second classification network is identical to the input of the first classification network; and
a model training module, configured to train the target detection model based on a total loss function until the total loss function converges, where the total loss function includes the loss function of the target detection model and the loss function of the second classification network.
In an optional embodiment of the application, the target detection model includes a single-stage detection network structure.
In an optional embodiment of the application, the single-stage detection network structure includes a RetinaNet network structure.
In an optional embodiment of the application, the second classification network includes cascaded convolutional layers and a fully connected layer, where the input of the convolutional layers is connected to the output of the backbone network of the RetinaNet structure.
In an optional embodiment of the application, the loss function of the second classification network includes at least one of: a first loss function determined based on the output of the second classification network, and a second loss function determined based on the outputs of the first and second classification networks.
In an optional embodiment of the application, the first loss function is:
L_C1 = -(1-α) * p1^γ * log(1-p1) * (1-y) - α * (1-p1)^γ * log(p1) * y
where L_C1 denotes the first loss function, α is a weighting factor, p1 is the output of the second classification network, y denotes the sample label, and γ is a modulating factor.
In an embodiment of the present application, the second loss function is:
L_C2 = -(y*(1-M) + (1-y)*M) * (α * p1^γ * log(p1) * M - (1-α) * (1-p1)^γ * log(1-p1) * (1-M))
where L_C2 denotes the second loss function, α is a weighting factor, p1 is the output of the second classification network, y denotes the sample label, γ is a modulating factor, and M is the target-region result label, whose value is determined as follows:
M = 1 if p ≥ th; M = 0 if p < th
where p denotes the output of the first classification network and th denotes a preset threshold.
In an optional embodiment of the application, the loss function of the target detection model includes the loss function of the first classification network and the loss function of the target-box regression network.
In a fourth aspect, an embodiment of the present application further provides an image detection apparatus, comprising:
an image acquisition module, configured to obtain an image to be detected; and
an image detection module, configured to detect the image to be detected with a target detection model and obtain the detection result of the image based on the output of the model, where the target detection model is obtained by training with the training method of the first aspect.
In a fifth aspect, this application provides an electronic device comprising a processor and a memory, where:
the memory is configured to store operation instructions; and
the processor is configured to invoke the operation instructions to execute the method of any embodiment of the first or second aspect of this application.
In a sixth aspect, this application provides a computer-readable storage medium storing at least one instruction, at least one program segment, a code set, or an instruction set, which is loaded and executed by a processor to implement the method of any embodiment of the first or second aspect of this application.
The technical solution provided by this application has the following beneficial effect: compared with existing training methods for target detection models, the scheme of this embodiment adds a second classification network during training and trains the target detection model on both the loss function of the second classification network and the loss function of the model itself, which effectively strengthens the model's learning from false detections and false alarms, and thereby improves the detection accuracy of the target detection model.
Detailed description of the invention
To explain the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below.
Fig. 1 is a schematic flowchart of a training method for a target detection model provided by an embodiment of the present application;
Fig. 2 is a schematic diagram of a second classification network provided by an embodiment of the present application;
Fig. 3 is a schematic diagram of another second classification network provided by an embodiment of the present application;
Fig. 4 is a schematic diagram of yet another second classification network provided by an embodiment of the present application;
Fig. 5 is a schematic diagram of still another second classification network provided by an embodiment of the present application;
Fig. 6 is a schematic flowchart of an image detection method provided by an embodiment of the present application;
Fig. 7 is a schematic structural diagram of a training apparatus for a target detection model provided by an embodiment of the present application;
Fig. 8 is a schematic structural diagram of an image detection apparatus provided by an embodiment of the present application;
Fig. 9 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
Specific embodiment
Embodiments of the present application are described in detail below; examples of the embodiments are shown in the accompanying drawings, where identical or similar reference numerals throughout denote identical or similar elements, or elements with identical or similar functions. The embodiments described below with reference to the drawings are exemplary, are only used to explain the application, and are not to be construed as limiting the claims.
Those skilled in the art will appreciate that, unless expressly stated otherwise, the singular forms "a", "an", and "the" used herein may also include the plural forms. It should be further understood that the word "include" used in this specification refers to the presence of the stated features, integers, steps, operations, elements and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof. It should be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intervening elements may be present. In addition, "connected" or "coupled" as used herein may include wireless connection or wireless coupling. The word "and/or" used herein includes all or any unit, and all combinations, of one or more of the associated listed items.
To make the purposes, technical solutions, and advantages of the application clearer, the embodiments of the present application are described in further detail below with reference to the drawings, taking as an example a target detection model that detects face regions in images.
An embodiment of the present application provides a training method for a target detection model, the target detection model including a first classification network. As shown in Fig. 1, the method may include:
Step S110, providing at least one second classification network, where during training the input of the second classification network is identical to the input of the first classification network.
The role of the target detection model is to detect target regions in an image. For example, the target detection model may be a face detection model for detecting face regions in an image, a human detection model for detecting human regions in an image, or another object detection model. The output of the classification network characterizes the target region; for a face detection model, for instance, the output of the classification network may be the probability that a detected region of the input image is a face region.
Step S120, training the target detection model based on a total loss function until the total loss function converges, where the total loss function includes the loss function of the target detection model and the loss function of the second classification network.
A loss function estimates the degree of inconsistency between a model's predictions and the ground truth. It is a non-negative real-valued function; the smaller the loss, the better the robustness of the model. The loss function is the core of the empirical risk function and an important component of the structural risk function. Convergence of the loss function is a limit concept: in general, if the function value tends to some finite value as training progresses, the loss function is said to converge.
It can be understood that the loss function of the target detection model refers to the loss function of the model itself, and its concrete form depends on the structure of the target detection model. For example, if the target detection model includes a classification network, then its loss function includes the loss function corresponding to that classification network; if the target detection model further includes a regression network, then its loss function includes both the classification network's loss function and the regression network's loss function. The form of the loss function of each sub-network of the model (e.g., classification network, regression network) likewise depends on the structure of that sub-network.
In the embodiment of the present application, the loss function of the target detection model includes the loss function of the first classification network and the loss function of the target-box regression network.
That is, in this embodiment, the target detection model may include a first classification network and a target-box regression network, in which case the loss function of the target detection model includes the loss function of the first classification network and the loss function of the target-box regression network. Correspondingly, the total loss function may include the loss function of the first classification network, the loss function of the target-box regression network, and the loss function of the second classification network.
It should be noted that, in practical applications, the criterion for judging whether a loss function (such as the above total loss function) has converged can be configured as needed. For example, during training, if the value of the total loss function tends to some finite value, the function can be considered convergent. Generally, the smaller the total loss the better; as the number of training iterations increases, the value of the total loss function keeps decreasing and then levels off. A convergence condition may be that the difference between the total-loss values of two adjacent training iterations is smaller than a set threshold; when the training result meets this condition, the total loss function can be considered convergent. Of course, other convergence conditions, or other ways of judging whether the function has converged, can also be configured as needed.
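The adjacent-iteration condition described above can be checked with a few lines of code. The following Python is an illustrative sketch only (the function name and default threshold are not part of the application):

```python
def has_converged(loss_history, threshold=1e-4):
    """Return True when the total-loss values of the last two training
    iterations differ by less than the given threshold."""
    if len(loss_history) < 2:
        return False
    return abs(loss_history[-1] - loss_history[-2]) < threshold
```

In a training loop, the total loss would be appended to `loss_history` after each iteration and training stopped once `has_converged` returns True.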
In the embodiment of the present application, when training the target detection model, at least one second classification network is added, and the target detection model is trained based on the loss function of the second classification network together with the loss function of the target detection model. Because the total loss function includes both the loss function of the target detection model and the loss function of the second classification network, and the latter plays a supervisory role during training, the precision of the target detection model can be improved compared with existing training methods. Consequently, when the trained model is used for target detection, false alarms in the detection network are effectively suppressed and the precision of target detection is improved.
In addition, since the second classification network is placed outside the network structure of the target detection model, it does not affect the original network structure of the model. When the target detection model is subsequently used for target-region detection, its network structure is unchanged, so the detection speed is not affected either.
In an optional embodiment of the application, the target detection model includes a single-stage detection network structure.
Of course, the target detection model may also include a multi-stage detection network structure. A single-stage (one-stage) detection network structure may include, but is not limited to, a YOLO (You Only Look Once) structure, an SSD (Single Shot MultiBox Detector) network structure, a RetinaNet network structure, and so on.
In an optional embodiment of the application, the single-stage detection network structure includes a RetinaNet network structure.
RetinaNet is a network structure for detecting targets in images. It is a single network composed of a backbone network and two task-specific sub-networks: the backbone is responsible for computing convolutional features over the whole image; the first sub-network performs image classification on the output of the backbone network (i.e., the first classification network); and the second sub-network is responsible for convolutional bounding-box regression (i.e., the regression network). The first classification network may include cascaded convolutional layers and a fully connected layer. The loss function L_C corresponding to the RetinaNet structure includes the loss function L_cls of the first classification network and the loss function L_bb corresponding to the regression network, i.e., L_C = L_bb + L_cls.
In an optional embodiment of the application, the second classification network includes cascaded convolutional layers and a fully connected layer, where the input of the convolutional layers is connected to the output of the backbone network of the RetinaNet structure. The role of the backbone network, which has been pre-trained in the form of a classifier, is to extract features from the image.
In practical applications, the second classification network can be designed with reference to the first classification network. Its structure may be identical to or different from that of the first classification network; when the two structures are identical, the network parameters of the second classification network may still differ from those of the first. In one example, if the single-stage detection network is a RetinaNet structure whose first classification network includes cascaded convolutional layers and a fully connected layer, then the second classification network may likewise include cascaded convolutional layers and a fully connected layer, with the input of its convolutional layers connected to the output of the backbone network of the RetinaNet structure.
In one example, take a RetinaNet target detection model for illustration. As shown in Fig. 2, this embodiment provides a schematic diagram of a RetinaNet structure and a second classification network. The network branch labelled Branch-c1 in the figure is the second classification network, composed of cascaded convolutional layers Conv3 and a fully connected layer FC3. The part inside the dashed box is the RetinaNet structure, which contains the two network branches Branch-c and Branch-b. Branch-c is the first classification network, composed of cascaded convolutional layers Conv1 and a fully connected layer; Branch-b is the regression network, composed in this example of cascaded convolutional layers Conv2 and a fully connected layer FC2. In this example, the output of Branch-b can be the coordinates of the target region, and the outputs of Branch-c and Branch-c1 can be the probability that the detected region is a target region, i.e., the output result is a probability.
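For illustration only, a second classification branch of the kind described (cascaded convolutional layers followed by a fully connected layer, fed by the backbone feature map) might be sketched in PyTorch as follows. The channel sizes, the pooling layer, and the sigmoid output are assumptions for the sketch, not details specified by the application:

```python
import torch
import torch.nn as nn

class AuxClassifier(nn.Module):
    """Sketch of a Branch-c1-style second classification network:
    cascaded convolutions, then a fully connected layer producing
    the probability p1 that the region is a target."""
    def __init__(self, in_channels=256, hidden=128):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv2d(in_channels, hidden, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, hidden, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),  # pool so the FC layer sees a fixed size
        )
        self.fc = nn.Linear(hidden, 1)

    def forward(self, backbone_features):
        x = self.convs(backbone_features).flatten(1)
        return torch.sigmoid(self.fc(x))  # p1 in the notation above
```

During training, `backbone_features` would be the same backbone output that feeds the first classification network, consistent with the requirement that both classifiers share the same input.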
In an optional embodiment of the application, the loss function of the second classification network includes at least one of: a first loss function determined based on the output of the second classification network, and a second loss function determined based on the outputs of the first and second classification networks.
That is, in practical applications, the loss function of the second classification network admits different combinations: it may include only the first loss function determined based on the output of the second classification network; it may include only the second loss function determined based on the outputs of the first and second classification networks; or it may include both the first loss function and the second loss function.
In an optional embodiment of the application, the first loss function is:
L_C1 = -(1-α) * p1^γ * log(1-p1) * (1-y) - α * (1-p1)^γ * log(p1) * y
where L_C1 denotes the first loss function, α is a weighting factor, p1 is the output of the second classification network, y is the sample label, and γ is a modulating factor.
The sample label marks whether a target region is present in the sample image. For example, if the sample image is used for detecting face regions, then y may be set to 1 when the sample image contains a face, and to 0 when it does not (i.e., the sample image is a background image).
In an optional embodiment of the application, the second loss function is:
L_C2 = -(y*(1-M) + (1-y)*M) * (α * p1^γ * log(p1) * M - (1-α) * (1-p1)^γ * log(1-p1) * (1-M))
where L_C2 denotes the second loss function, α is a weighting factor, p1 is the output of the second classification network, y is the sample label, γ is a modulating factor, and M is the target-region result label, whose value is determined as follows:
M = 1 if p ≥ th; M = 0 if p < th
where p denotes the output of the first classification network and th denotes a preset threshold.
That is, when p ≥ th, M = 1 and the result is considered positive, i.e., a target region; when p < th, M = 0 and the result is considered negative, i.e., not a target region.
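The determination of M and the second loss function L_C2 can likewise be sketched in code; the default values of α, γ, and th are illustrative assumptions:

```python
import math

def target_mask(p, th=0.5):
    """M = 1 when the first classifier's output p meets the threshold th."""
    return 1 if p >= th else 0

def second_loss(p1, p, y, alpha=0.25, gamma=2.0, th=0.5):
    """L_C2 as given above. The gate y*(1-M) + (1-y)*M is 1 exactly when
    the first classifier's verdict M disagrees with the label y (a false
    alarm or a missed detection), which focuses this loss on such cases."""
    m = target_mask(p, th)
    gate = y * (1 - m) + (1 - y) * m
    return -gate * (alpha * p1**gamma * math.log(p1) * m
                    - (1 - alpha) * (1 - p1)**gamma * math.log(1 - p1) * (1 - m))
```

When the first classifier agrees with the label (e.g., y = 1 and p ≥ th), the gate is zero and L_C2 contributes nothing, so only erroneous detections and false alarms are penalized through this term.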
The total loss function is described in detail below for each specific option of the second classification network's loss function, taking a RetinaNet target detection model as an example and using concrete examples.
1. The loss function of the second classification network includes the first loss function L_C1 determined based on the output of the second classification network. In this case, the total loss function L includes the loss function L_C corresponding to the RetinaNet structure (L_C = L_bb + L_cls) and the first loss function L_C1 determined based on the output of the second classification network, e.g., L = L_C1 + L_C.
As an example, as shown in Fig. 3, when the RetinaNet structure is trained, a sample image is fed into the backbone network; the output of the first classification network in the RetinaNet structure is the probability p, the output of the regression network is denoted Box, and the output of the second classification network is the probability p1. In this example, the total loss function can be expressed as:
L = L_C + L_C1 = L_bb + L_cls + L_C1
where
L_cls = -(1-α) * p^γ * log(1-p) * (1-y) - α * (1-p)^γ * log(p) * y
L_C1 = -(1-α) * p1^γ * log(1-p1) * (1-y) - α * (1-p1)^γ * log(p1) * y
2. The loss function of the second classification network includes the second loss function L_C2 determined based on the outputs of the first and second classification networks. In this case, the total loss function L includes the loss function L_C corresponding to the RetinaNet structure (L_C = L_bb + L_cls) and the second loss function L_C2 determined based on the outputs of the first and second classification networks, e.g., L = L_C2 + L_C.
As an example, as shown in Fig. 4, when the RetinaNet structure is trained, a sample image is fed into the backbone network; the output of the first classification network in the RetinaNet structure is the probability p, the output of the regression network is denoted Box, and the output of the second classification network is the probability p1. In this example, the total loss function can be expressed as:
L = L_C2 + L_C = L_bb + L_cls + L_C2
L_C2 = -(y*(1-M) + (1-y)*M) * (α * p1^γ * log(p1) * M - (1-α) * (1-p1)^γ * log(1-p1) * (1-M))
where the value of M is determined by: M = 1 if p ≥ th, and M = 0 if p < th, with th a preset threshold. The concrete form of L_cls is the same as in the preceding embodiment; for details, reference can be made to the embodiment above, which is not repeated here.
3. The loss function of the second classification network includes both the first loss function L_C1 determined based on the output of the second classification network and the second loss function L_C2 determined based on the outputs of the first and second classification networks. In this case, the total loss function L includes the loss function L_C corresponding to the RetinaNet structure (L_C = L_bb + L_cls), the first loss function L_C1 determined based on the output of the second classification network, and the second loss function L_C2 determined based on the outputs of the first and second classification networks, e.g., L = L_C2 + L_C + L_C1.
As an example, as shown in Fig. 5, when the RetinaNet structure is trained, a sample image is fed into the backbone network; the output of the first classification network in the RetinaNet structure is the probability p, the output of the regression network is denoted Box, and the output of the second classification network is the probability p1. In this example, the total loss function can be expressed as:
L = L_C2 + L_C + L_C1 = L_bb + L_cls + L_C2 + L_C1
where the concrete forms of L_cls, L_C2 and L_C1 are the same as in the preceding embodiments; for details, reference can be made to the embodiments above, which are not repeated here.
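For the third variant, the full total loss L = L_bb + L_cls + L_C2 + L_C1 can be sketched as a per-sample scalar computation. Here L_bb is taken as a given box-regression loss value, and the default α, γ, and th are illustrative assumptions:

```python
import math

def focal_term(q, y, alpha=0.25, gamma=2.0):
    """Shared focal-loss form used by L_cls (on p) and L_C1 (on p1)."""
    return (-(1 - alpha) * q**gamma * math.log(1 - q) * (1 - y)
            - alpha * (1 - q)**gamma * math.log(q) * y)

def total_loss(l_bb, p, p1, y, alpha=0.25, gamma=2.0, th=0.5):
    """Third variant: L = L_bb + L_cls + L_C2 + L_C1, combining the box
    regression loss, both classification losses, and the cross term."""
    m = 1 if p >= th else 0
    l_cls = focal_term(p, y, alpha, gamma)
    l_c1 = focal_term(p1, y, alpha, gamma)
    gate = y * (1 - m) + (1 - y) * m  # 1 when first classifier and label disagree
    l_c2 = -gate * (alpha * p1**gamma * math.log(p1) * m
                    - (1 - alpha) * (1 - p1)**gamma * math.log(1 - p1) * (1 - m))
    return l_bb + l_cls + l_c2 + l_c1
```

In practice these terms would be averaged over a batch of anchors/samples; the sketch shows only the composition of the terms for a single prediction.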
Based on the target detection model of the embodiments above, as shown in Fig. 6, an embodiment of the present application further provides an image detection method, comprising:
Step S610, obtaining an image to be detected;
Step S620, detecting the image to be detected with the target detection model;
The target detection model here is the model obtained after training with the training method of the foregoing embodiments. For specific implementations of the training method, reference may be made to the description of the training method in the embodiments above, which is not repeated here. For example, the target detection model may be a trained RetinaNet network structure.
Step S630: obtaining the detection result of the image to be detected based on the output of the target detection model.
Here, the target region can be set according to actual needs and is not limited to a human face region; for example, it may also be a human body region, etc.
That is, when image detection is performed, the image to be detected can be input to the trained target detection model, the target detection model outputs a result, and the detection result of the image to be detected is obtained based on that output.
Based on the same principle as the method shown in Figure 1, an embodiment of the present application further provides a training device 70 for a target detection model, where the target detection model includes a first classification network. As shown in Figure 7, the training device 70 may include a training supervision network setting module 710 and a model training module 720, in which:
the training supervision network setting module 710 is configured to set at least one second classification network, wherein, during training, the input of the second classification network is the same as the input of the first classification network; and
the model training module 720 is configured to train the target detection model based on a total loss function until the total loss function converges, wherein the total loss function includes the loss function of the target detection model and the loss function of the second classification network.
In the embodiment of the present application, the target detection model includes a single-stage detection network structure.
In the embodiment of the present application, the single-stage detection network structure includes a RetinaNet network structure.
In the embodiment of the present application, the second classification network includes a cascaded convolutional layer and a fully connected layer, wherein the input of the convolutional layer is connected to the output of the Backbone network of the RetinaNet network structure.
In the embodiment of the present application, the loss function of the second classification network includes at least one of a first loss function determined based on the output of the second classification network and a second loss function determined based on the outputs of the first classification network and the second classification network.
In the embodiment of the present application, the first loss function is:
LC1 = -(1-α) * p1^γ * log(1-p1) * (1-y) - α * (1-p1)^γ * log(p1) * y
wherein LC1 denotes the first loss function, α is a weight factor, p1 is the output of the second classification network, y denotes the sample label, and γ is a regulatory factor.
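This is a focal-loss-style term. A minimal Python sketch (function and parameter names are ours; the α and γ defaults follow common focal-loss practice and are assumptions, not values from the patent):

```python
import math

def first_loss(p1, y, alpha=0.25, gamma=2.0):
    """Focal-loss-style LC1: p1 is the second classifier's probability, y the label."""
    return (-(1 - alpha) * p1 ** gamma * math.log(1 - p1) * (1 - y)
            - alpha * (1 - p1) ** gamma * math.log(p1) * y)

# A confident correct prediction is penalized far less than a confident wrong one
easy = first_loss(0.9, 1)   # p1 close to the positive label
hard = first_loss(0.1, 1)   # p1 far from the positive label
```

The (1-p1)^γ modulating factor down-weights well-classified examples, the same mechanism RetinaNet's focal loss uses to cope with foreground/background imbalance.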
In the embodiment of the present application, the second loss function is:
LC2 = -(y*(1-M) + (1-y)*M) * (α * p1^γ * log(p1) * M - (1-α) * (1-p1)^γ * log(1-p1) * (1-M))
wherein LC2 denotes the second loss function, α is a weight factor, p1 is the output of the second classification network, y denotes the sample label, γ is a regulatory factor, and M is the target region result label, the value of M being determined by comparing the output p of the first classification network with a preset threshold th.
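A minimal Python sketch of this term (names are ours; we additionally assume M = 1 when the first classification network's output p exceeds th and M = 0 otherwise, since the patent defines M by thresholding p against th):

```python
import math

def second_loss(p1, p, y, th=0.5, alpha=0.25, gamma=2.0):
    """LC2: supervises the second classifier with the first classifier's
    thresholded prediction M; nonzero only where M disagrees with the label y."""
    m = 1 if p > th else 0                  # assumed form of the thresholding rule
    weight = y * (1 - m) + (1 - y) * m      # 1 on disagreement, 0 on agreement
    inner = (alpha * p1 ** gamma * math.log(p1) * m
             - (1 - alpha) * (1 - p1) ** gamma * math.log(1 - p1) * (1 - m))
    return -weight * inner

# When the thresholded first-network output matches the label, the term vanishes
agree = second_loss(0.7, p=0.9, y=1)
disagree = second_loss(0.7, p=0.9, y=0)
```

The leading weight y*(1-M) + (1-y)*M gates the loss so that the second classifier is pushed only on samples where the two networks' views conflict.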
In the embodiment of the present application, the loss function of the target detection model includes the loss function of the first classification network and the loss function of the target box regression network.
The training device for the target detection model in the embodiment of the present application can execute the training method for a target detection model provided by the embodiments of the present application, and its implementation principle is similar. The actions performed by each module of the training device for the target detection model in each embodiment of the present application correspond to the steps of the training method for the target detection model in each embodiment of the present application; for a detailed functional description of each module of the training device, refer to the description of the corresponding training method for the target detection model shown above, which is not repeated here.
Based on the same principle as the method shown in Figure 6, an embodiment of the present application further provides an image detection device 80, as shown in Figure 8. The image detection device 80 may include an image acquisition module 810 and an image detection module 820, in which:
the image acquisition module 810 is configured to obtain an image to be detected; and
the image detection module 820 is configured to detect the image to be detected through the target detection model and to obtain the detection result of the image to be detected based on the output of the target detection model, wherein the target detection model is obtained after training with the training method of the target detection model in the foregoing embodiments.
The image detection device of the embodiment of the present application can execute the image detection method provided by the embodiments of the present application, and its implementation principle is similar. The actions performed by each module of the image detection device in each embodiment of the present application correspond to the steps of the image detection method in each embodiment of the present application; for a detailed functional description of each module of the image detection device, refer to the description of the corresponding image detection method shown above, which is not repeated here.
An embodiment of the present application further provides an electronic device, which may include, but is not limited to, a processor and a memory; the memory is configured to store computer operation instructions, and the processor is configured to execute the method shown in the embodiments by calling the computer operation instructions.
Another embodiment of the present application provides a computer-readable storage medium that stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to realize the corresponding content of the foregoing method embodiments.
An electronic device is provided in an alternative embodiment. As shown in Figure 9, the electronic device 4000 includes a processor 4001 and a memory 4003, wherein the processor 4001 is connected to the memory 4003, for example through a bus 4002. Optionally, the electronic device 4000 may further include a transceiver 4004. It should be noted that, in practical applications, the transceiver 4004 is not limited to one, and the structure of the electronic device 4000 does not constitute a limitation on the embodiments of the present application.
The processor 4001 can be a CPU (Central Processing Unit), a general-purpose processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It can implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the present disclosure. The processor 4001 may also be a combination that realizes computing functions, for example a combination including one or more microprocessors, a combination of a DSP and a microprocessor, and the like.
The bus 4002 may include a path that transfers information between the above components. The bus 4002 may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus 4002 can be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is shown in Figure 9, but this does not mean that there is only one bus or one type of bus.
The memory 4003 can be a ROM (Read-Only Memory) or another type of static storage device that can store static information and instructions, or a RAM (Random Access Memory) or another type of dynamic storage device that can store information and instructions; it can also be an EEPROM (Electrically Erasable Programmable Read-Only Memory), a CD-ROM (Compact Disc Read-Only Memory) or other optical disc storage (including compact discs, laser discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or another magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto.
The memory 4003 is used to store the application program code for executing the solution of the present application, and execution is controlled by the processor 4001. The processor 4001 is configured to execute the application program code stored in the memory 4003 to realize the content shown in any of the foregoing method embodiments.
It should be understood that, although the steps in the flowcharts of the drawings are shown sequentially as indicated by the arrows, these steps are not necessarily executed in the order indicated. Unless explicitly stated herein, the execution of these steps is not strictly ordered, and they can be executed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different times; their execution order is also not necessarily sequential, and they may be executed in turn or alternately with at least part of the sub-steps or stages of other steps.
The above are only some embodiments of the present invention. It should be noted that those skilled in the art can make various improvements and modifications without departing from the principle of the present invention, and these improvements and modifications should also be regarded as falling within the protection scope of the present invention.
Claims (13)
1. A training method for a target detection model, characterized in that the target detection model includes a first classification network, and the method includes:
setting at least one second classification network, wherein, during training, the input of the second classification network is the same as the input of the first classification network; and
training the target detection model based on a total loss function until the total loss function converges, wherein the total loss function includes the loss function of the target detection model and the loss function of the second classification network.
2. The method according to claim 1, characterized in that the target detection model includes a single-stage detection network structure.
3. The method according to claim 2, characterized in that the single-stage detection network structure includes a RetinaNet network structure.
4. The method according to claim 3, characterized in that the second classification network includes a cascaded convolutional layer and a fully connected layer, wherein the input of the convolutional layer is connected to the output of the backbone (Backbone) network of the RetinaNet network structure.
5. The method according to any one of claims 1 to 4, characterized in that the loss function of the second classification network includes at least one of a first loss function determined based on the output of the second classification network and a second loss function determined based on the outputs of the first classification network and the second classification network.
6. The method according to claim 5, characterized in that the first loss function is:
LC1 = -(1-α) * p1^γ * log(1-p1) * (1-y) - α * (1-p1)^γ * log(p1) * y
wherein LC1 denotes the first loss function, α is a weight factor, p1 is the output of the second classification network, y denotes the sample label, and γ is a regulatory factor.
7. The method according to claim 5, characterized in that the second loss function is:
LC2 = -(y*(1-M) + (1-y)*M) * (α * p1^γ * log(p1) * M - (1-α) * (1-p1)^γ * log(1-p1) * (1-M))
wherein LC2 denotes the second loss function, α is a weight factor, p1 is the output of the second classification network, y denotes the sample label, γ is a regulatory factor, and M is the target region result label, the value of M being determined by comparing the output p of the first classification network with a preset threshold th.
8. The method according to claim 1, characterized in that the loss function of the target detection model includes the loss function of the first classification network and the loss function of a target box regression network.
9. An image detection method, characterized in that the method includes:
obtaining an image to be detected;
detecting the image to be detected through a target detection model, wherein the target detection model is obtained by training with the method according to any one of claims 1 to 8; and
obtaining the detection result of the image to be detected based on the output of the target detection model.
10. A training device for a target detection model, characterized in that the target detection model includes a first classification network, and the device includes:
a training supervision network setting module, configured to set at least one second classification network, wherein, during training, the input of the second classification network is the same as the input of the first classification network; and
a model training module, configured to train the target detection model based on a total loss function until the total loss function converges, wherein the total loss function includes the loss function of the target detection model and the loss function of the second classification network.
11. An image detection device, characterized in that the device includes:
an image acquisition module, configured to obtain an image to be detected; and
an image detection module, configured to detect the image to be detected through a target detection model and to obtain the detection result of the image to be detected based on the output of the target detection model, wherein the target detection model is obtained by training with the method according to any one of claims 1 to 8.
12. An electronic device, characterized in that the electronic device includes a processor and a memory;
the memory is configured to store operation instructions; and
the processor is configured to execute the method according to any one of claims 1 to 9 by calling the operation instructions.
13. A computer-readable storage medium, characterized in that the storage medium stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by a processor to realize the method according to any one of claims 1 to 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910315195.1A CN109961107B (en) | 2019-04-18 | 2019-04-18 | Training method and device for target detection model, electronic equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910315195.1A CN109961107B (en) | 2019-04-18 | 2019-04-18 | Training method and device for target detection model, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109961107A true CN109961107A (en) | 2019-07-02 |
CN109961107B CN109961107B (en) | 2022-07-19 |
Family
ID=67026354
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910315195.1A Active CN109961107B (en) | 2019-04-18 | 2019-04-18 | Training method and device for target detection model, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109961107B (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110533640A (en) * | 2019-08-15 | 2019-12-03 | 北京交通大学 | Based on the track circuit disease discrimination method for improving YOLOv3 network model |
CN110991312A (en) * | 2019-11-28 | 2020-04-10 | 重庆中星微人工智能芯片技术有限公司 | Method, apparatus, electronic device, and medium for generating detection information |
CN111768005A (en) * | 2020-06-19 | 2020-10-13 | 北京百度网讯科技有限公司 | Training method and device for lightweight detection model, electronic equipment and storage medium |
CN111950411A (en) * | 2020-07-31 | 2020-11-17 | 上海商汤智能科技有限公司 | Model determination method and related device |
CN112085096A (en) * | 2020-09-09 | 2020-12-15 | 华东师范大学 | Method for detecting local abnormal heating of object based on transfer learning |
CN112308150A (en) * | 2020-11-02 | 2021-02-02 | 平安科技(深圳)有限公司 | Target detection model training method and device, computer equipment and storage medium |
WO2021143231A1 (en) * | 2020-01-17 | 2021-07-22 | 初速度(苏州)科技有限公司 | Target detection model training method, and data labeling method and apparatus |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140073917A1 (en) * | 2012-09-10 | 2014-03-13 | Oregon Health & Science University | Quantification of local circulation with oct angiography |
CN106803071A (en) * | 2016-12-29 | 2017-06-06 | 浙江大华技术股份有限公司 | Object detecting method and device in a kind of image |
CN107145857A (en) * | 2017-04-29 | 2017-09-08 | 深圳市深网视界科技有限公司 | Face character recognition methods, device and method for establishing model |
US20170262733A1 (en) * | 2016-03-10 | 2017-09-14 | Siemens Healthcare Gmbh | Method and System for Machine Learning Based Classification of Vascular Branches |
CN108460341A (en) * | 2018-02-05 | 2018-08-28 | 西安电子科技大学 | Remote sensing image object detection method based on integrated depth convolutional network |
CN108520229A (en) * | 2018-04-04 | 2018-09-11 | 北京旷视科技有限公司 | Image detecting method, device, electronic equipment and computer-readable medium |
CN108694401A (en) * | 2018-05-09 | 2018-10-23 | 北京旷视科技有限公司 | Object detection method, apparatus and system |
CN108875521A (en) * | 2017-12-20 | 2018-11-23 | 北京旷视科技有限公司 | Method for detecting human face, device, system and storage medium |
CN109102024A (en) * | 2018-08-14 | 2018-12-28 | 中山大学 | A kind of Layer semantics incorporation model finely identified for object and its implementation |
CN109360198A (en) * | 2018-10-08 | 2019-02-19 | 北京羽医甘蓝信息技术有限公司 | Bone marrwo cell sorting method and sorter based on deep learning |
CN109472214A (en) * | 2018-10-17 | 2019-03-15 | 福州大学 | One kind is taken photo by plane foreign matter image real-time detection method based on deep learning |
CN109614968A (en) * | 2018-10-10 | 2019-04-12 | 浙江大学 | A kind of car plate detection scene picture generation method based on multiple dimensioned mixed image stylization |
CN109614985A (en) * | 2018-11-06 | 2019-04-12 | 华南理工大学 | A kind of object detection method based on intensive connection features pyramid network |
2019-04-18: application CN201910315195.1A granted as patent CN109961107B (Active)
Patent Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140073917A1 (en) * | 2012-09-10 | 2014-03-13 | Oregon Health & Science University | Quantification of local circulation with oct angiography |
US20170262733A1 (en) * | 2016-03-10 | 2017-09-14 | Siemens Healthcare Gmbh | Method and System for Machine Learning Based Classification of Vascular Branches |
CN106803071A (en) * | 2016-12-29 | 2017-06-06 | 浙江大华技术股份有限公司 | Object detecting method and device in a kind of image |
CN107145857A (en) * | 2017-04-29 | 2017-09-08 | 深圳市深网视界科技有限公司 | Face character recognition methods, device and method for establishing model |
CN108875521A (en) * | 2017-12-20 | 2018-11-23 | 北京旷视科技有限公司 | Method for detecting human face, device, system and storage medium |
CN108460341A (en) * | 2018-02-05 | 2018-08-28 | 西安电子科技大学 | Remote sensing image object detection method based on integrated depth convolutional network |
CN108520229A (en) * | 2018-04-04 | 2018-09-11 | 北京旷视科技有限公司 | Image detecting method, device, electronic equipment and computer-readable medium |
CN108694401A (en) * | 2018-05-09 | 2018-10-23 | 北京旷视科技有限公司 | Object detection method, apparatus and system |
CN109102024A (en) * | 2018-08-14 | 2018-12-28 | 中山大学 | A kind of Layer semantics incorporation model finely identified for object and its implementation |
CN109360198A (en) * | 2018-10-08 | 2019-02-19 | 北京羽医甘蓝信息技术有限公司 | Bone marrwo cell sorting method and sorter based on deep learning |
CN109614968A (en) * | 2018-10-10 | 2019-04-12 | 浙江大学 | A kind of car plate detection scene picture generation method based on multiple dimensioned mixed image stylization |
CN109472214A (en) * | 2018-10-17 | 2019-03-15 | 福州大学 | One kind is taken photo by plane foreign matter image real-time detection method based on deep learning |
CN109614985A (en) * | 2018-11-06 | 2019-04-12 | 华南理工大学 | A kind of object detection method based on intensive connection features pyramid network |
Non-Patent Citations (5)
Title |
---|
TSUNG-YI LIN et al.: "Focal Loss for Dense Object Detection", 《INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV)》 *
WANXIN TIAN et al.: "Learning Better Features for Face Detection with Feature Fusion and Segmentation Supervision", 《HTTPS://ARXIV.ORG/ABS/1811.08557V1》 *
YUANYUAN WANG et al.: "Automatic Ship Detection Based on RetinaNet Using Multi-Resolution Gaofen-3 Imagery", 《REMOTE SENSING》 *
HE PENGPENG: "Research on Multi-Object Detection and Classification in Traffic Scenes Based on Deep Learning", 《China Masters' Theses Full-text Database, Information Science and Technology Series》 *
SU JIANLIN: "How to better understand Kaiming He's 'Focal Loss'?", 《HTTPS://ZHUANLAN.ZHIHU.COM/P/32423092》 *
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110533640A (en) * | 2019-08-15 | 2019-12-03 | 北京交通大学 | Based on the track circuit disease discrimination method for improving YOLOv3 network model |
CN110533640B (en) * | 2019-08-15 | 2022-03-01 | 北京交通大学 | Improved YOLOv3 network model-based track line defect identification method |
CN110991312A (en) * | 2019-11-28 | 2020-04-10 | 重庆中星微人工智能芯片技术有限公司 | Method, apparatus, electronic device, and medium for generating detection information |
WO2021143231A1 (en) * | 2020-01-17 | 2021-07-22 | 初速度(苏州)科技有限公司 | Target detection model training method, and data labeling method and apparatus |
CN111768005A (en) * | 2020-06-19 | 2020-10-13 | 北京百度网讯科技有限公司 | Training method and device for lightweight detection model, electronic equipment and storage medium |
CN111768005B (en) * | 2020-06-19 | 2024-02-20 | 北京康夫子健康技术有限公司 | Training method and device for lightweight detection model, electronic equipment and storage medium |
CN111950411A (en) * | 2020-07-31 | 2020-11-17 | 上海商汤智能科技有限公司 | Model determination method and related device |
CN112085096A (en) * | 2020-09-09 | 2020-12-15 | 华东师范大学 | Method for detecting local abnormal heating of object based on transfer learning |
CN112308150A (en) * | 2020-11-02 | 2021-02-02 | 平安科技(深圳)有限公司 | Target detection model training method and device, computer equipment and storage medium |
CN112308150B (en) * | 2020-11-02 | 2022-04-15 | 平安科技(深圳)有限公司 | Target detection model training method and device, computer equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN109961107B (en) | 2022-07-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109961107A (en) | Training method, device, electronic equipment and the storage medium of target detection model | |
CN110991311B (en) | Target detection method based on dense connection deep network | |
CN110084292B (en) | Target detection method based on DenseNet and multi-scale feature fusion | |
CN107729790A (en) | Quick Response Code localization method and device | |
CN109711440A (en) | A kind of data exception detection method and device | |
CN109615167A (en) | Determine the method, apparatus and electronic equipment of doubtful batch risk trade event | |
CN109840312A (en) | A kind of rejecting outliers method and apparatus of boiler load factor-efficiency curve | |
CN111797826B (en) | Large aggregate concentration area detection method and device and network model training method thereof | |
CN111310759B (en) | Target detection inhibition optimization method and device for dual-mode cooperation | |
CN110263824A (en) | The training method of model, calculates equipment and computer readable storage medium at device | |
CN116152254B (en) | Industrial leakage target gas detection model training method, detection method and electronic equipment | |
CN114882321A (en) | Deep learning model training method, target object detection method and device | |
CN109102026A (en) | A kind of vehicle image detection method, apparatus and system | |
KR102576157B1 (en) | Method and apparatus for high speed object detection using artificial neural network | |
CN111340139B (en) | Method and device for judging complexity of image content | |
CN111950517A (en) | Target detection method, model training method, electronic device and storage medium | |
CN115546586A (en) | Method and device for detecting infrared dim target, computing equipment and storage medium | |
CN110020780A (en) | The method, apparatus and electronic equipment of information output | |
CN111353428B (en) | Action information identification method and device, electronic equipment and storage medium | |
CN116958021A (en) | Product defect identification method based on artificial intelligence, related device and medium | |
CN114782494A (en) | Dynamic target analysis method, device, equipment and storage medium | |
CN110458393B (en) | Method and device for determining risk identification scheme and electronic equipment | |
Tan et al. | An application of an improved FCOS algorithm in detection and recognition of industrial instruments | |
CN113392455A (en) | House type graph scale detection method and device based on deep learning and electronic equipment | |
CN114494792B (en) | Target detection method, device and equipment based on single stage and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
PE01 | Entry into force of the registration of the contract for pledge of patent right | ||
Denomination of invention: Training methods, devices, electronic devices, and storage media for object detection models
Effective date of registration: 2023-04-04
Granted publication date: 2022-07-19
Pledgee: Shanghai Yunxin Venture Capital Co.,Ltd.
Pledgor: MEGVII (BEIJING) TECHNOLOGY Co.,Ltd.
Registration number: Y2023990000192 |