CN109961107A - Training method and apparatus for a target detection model, electronic device, and storage medium - Google Patents

Training method and apparatus for a target detection model, electronic device, and storage medium

Info

Publication number
CN109961107A
Authority
CN
China
Prior art keywords
target detection
network
detection model
loss function
classification network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910315195.1A
Other languages
Chinese (zh)
Other versions
CN109961107B (en
Inventor
李永波
李伯勋
俞刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Maigewei Technology Co Ltd
Original Assignee
Beijing Maigewei Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Maigewei Technology Co Ltd filed Critical Beijing Maigewei Technology Co Ltd
Priority to CN201910315195.1A priority Critical patent/CN109961107B/en
Publication of CN109961107A publication Critical patent/CN109961107A/en
Application granted granted Critical
Publication of CN109961107B publication Critical patent/CN109961107B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/07 Target detection

Abstract

An embodiment of the present application provides a training method and apparatus for a target detection model, an electronic device, and a storage medium. The target detection model includes a first classification network, and the method comprises: setting up at least one second classification network, where during training the input of the second classification network is identical to the input of the first classification network; and training the target detection model based on a total loss function until the total loss function converges, where the total loss function includes the loss function of the target detection model and the loss function of the second classification network. Compared with existing training methods for target detection models, the scheme of this embodiment adds a second classification network during training and trains the target detection model on the combination of the model's own loss function and the loss function of the second classification network. This effectively strengthens the model's learning from false detections and false alarms, thereby improving the detection accuracy of the target detection model.

Description

Training method and apparatus for a target detection model, electronic device, and storage medium
Technical field
This application relates to the technical field of image processing, and in particular to a training method and apparatus for a target detection model, an electronic device, and a storage medium.
Background technique
The task of target detection is to find targets of interest in an image; for example, when the target is a face, face detection aims to detect the faces in a scene and their corresponding positions. Target detection is one of the major problems in the field of computer vision, with long-term research value and wide application demand in fields such as security monitoring and human-computer interaction.
In recent years, target detection technology has developed rapidly along with deep neural networks and hardware. In practical applications, however, target detection is often accompanied by a large number of false alarms, i.e., certain non-target regions are identified as target regions, which seriously hampers the adoption of target detection technology. How to suppress false alarms in a target detection network and improve detection precision is therefore an extremely important problem in the field.
Summary of the invention
The purpose of the application is to solve at least one of the above technical deficiencies, in particular the technical deficiency of a high false-alarm rate during target detection.
In a first aspect, an embodiment of the present application provides a training method for a target detection model, the target detection model including a first classification network, the method comprising:
setting up at least one second classification network, where during training the input of the second classification network is identical to the input of the first classification network; and
training the target detection model based on a total loss function until the total loss function converges, where the total loss function includes the loss function of the target detection model and the loss function of the second classification network.
In an optional embodiment of the application, the target detection model includes a single-stage detection network structure.
In an optional embodiment of the application, the single-stage detection network structure includes a RetinaNet network structure.
In an optional embodiment of the application, the second classification network includes cascaded convolutional layers and a fully connected layer, where the input of the convolutional layers is connected to the output of the backbone network of the RetinaNet network structure.
In an optional embodiment of the application, the loss function of the second classification network includes at least one of a first loss function determined based on the output of the second classification network and a second loss function determined based on the outputs of the first classification network and the second classification network.
In an embodiment of the present application, the first loss function is:
L_C1 = -(1 - α) * p1^γ * log(1 - p1) * (1 - y) - α * (1 - p1)^γ * log(p1) * y
where L_C1 denotes the first loss function, α is a weight factor, p1 is the output of the second classification network, y denotes the sample label, and γ is a modulating factor.
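The first loss function has the form of a focal loss applied to the second classification network's output. A minimal scalar sketch (the default values α = 0.25 and γ = 2 are illustrative assumptions, not values specified in the application):

```python
import math

def first_loss(p1, y, alpha=0.25, gamma=2.0):
    """First loss L_C1: a focal-loss-style term on the second
    classification network's output p1 and the sample label y.
    The alpha and gamma defaults are assumptions for illustration."""
    # Negative-sample term: active when y = 0, down-weighted by p1^gamma.
    neg = (1 - alpha) * (p1 ** gamma) * math.log(1 - p1) * (1 - y)
    # Positive-sample term: active when y = 1, down-weighted by (1 - p1)^gamma.
    pos = alpha * ((1 - p1) ** gamma) * math.log(p1) * y
    return -(neg + pos)
```

As with the standard focal loss, confident correct predictions contribute little: for a positive sample (y = 1), the loss shrinks rapidly as p1 approaches 1.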
In an optional embodiment of the application, the second loss function is:
L_C2 = -(y * (1 - M) + (1 - y) * M) * (α * p1^γ * log(p1) * M - (1 - α) * (1 - p1)^γ * log(1 - p1) * (1 - M))
where L_C2 denotes the second loss function, α is a weight factor, p1 is the output of the second classification network, y denotes the sample label, γ is a modulating factor, and M is the target-region result label, whose value is determined as follows:
M = 1 if p ≥ th, and M = 0 otherwise, where p denotes the output of the first classification network and th denotes a preset threshold.
In an optional embodiment of the application, the loss function of the target detection model includes the loss function of the first classification network and the loss function of the target-box regression network.
In a second aspect, an embodiment of the present application further provides an image detection method, the method comprising:
obtaining an image to be detected;
detecting the image to be detected by a target detection model, the target detection model being obtained by training with the training method for a target detection model in the first aspect of the embodiments of the present application; and
obtaining a detection result of the image to be detected based on the output of the target detection model.
In a third aspect, an embodiment of the present application provides a training device for a target detection model, the target detection model including a first classification network, the device comprising:
a training supervision network setting module, configured to set up at least one second classification network, where during training the input of the second classification network is identical to the input of the first classification network; and
a model training module, configured to train the target detection model based on a total loss function until the total loss function converges, where the total loss function includes the loss function of the target detection model and the loss function of the second classification network.
In an optional embodiment of the application, the target detection model includes a single-stage detection network structure.
In an optional embodiment of the application, the single-stage detection network structure includes a RetinaNet network structure.
In an optional embodiment of the application, the second classification network includes cascaded convolutional layers and a fully connected layer, where the input of the convolutional layers is connected to the output of the backbone network of the RetinaNet network structure.
In an optional embodiment of the application, the loss function of the second classification network includes at least one of a first loss function determined based on the output of the second classification network and a second loss function determined based on the outputs of the first classification network and the second classification network.
In an optional embodiment of the application, the first loss function is:
L_C1 = -(1 - α) * p1^γ * log(1 - p1) * (1 - y) - α * (1 - p1)^γ * log(p1) * y
where L_C1 denotes the first loss function, α is a weight factor, p1 is the output of the second classification network, y denotes the sample label, and γ is a modulating factor.
In an embodiment of the present application, the second loss function is:
L_C2 = -(y * (1 - M) + (1 - y) * M) * (α * p1^γ * log(p1) * M - (1 - α) * (1 - p1)^γ * log(1 - p1) * (1 - M))
where L_C2 denotes the second loss function, α is a weight factor, p1 is the output of the second classification network, y denotes the sample label, γ is a modulating factor, and M is the target-region result label, whose value is determined as follows:
M = 1 if p ≥ th, and M = 0 otherwise, where p denotes the output of the first classification network and th denotes a preset threshold.
In an optional embodiment of the application, the loss function of the target detection model includes the loss function of the first classification network and the loss function of the target-box regression network.
In a fourth aspect, an embodiment of the present application further provides an image detection device, the device comprising:
an image acquisition module, configured to obtain an image to be detected; and
an image detection module, configured to detect the image to be detected by a target detection model and to obtain a detection result of the image to be detected based on the output of the target detection model, the target detection model being obtained by training with the training method for a target detection model in the first aspect of the embodiments of the present application.
In a fifth aspect, the application provides an electronic device, the electronic device including a processor and a memory:
the memory is configured to store operation instructions; and
the processor is configured to execute, by calling the operation instructions, the method shown in any embodiment of the first or second aspect of the application.
In a sixth aspect, the application provides a computer-readable storage medium storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the method shown in any embodiment of the first or second aspect of the application.
The technical solution provided by the present application has the following beneficial effects:
In the scheme of the embodiments of the present application, compared with existing training methods for target detection models, a second classification network is added when training the target detection model, and the model is trained on the combination of its own loss function and the loss function of the second classification network. This effectively strengthens the model's learning from false detections and false alarms, thereby improving the detection accuracy of the target detection model.
Detailed description of the invention
In order to explain the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below.
Fig. 1 is a schematic flowchart of a training method for a target detection model provided by an embodiment of the present application;
Fig. 2 is a schematic diagram of a second classification network provided by an embodiment of the present application;
Fig. 3 is a schematic diagram of another second classification network provided by an embodiment of the present application;
Fig. 4 is a schematic diagram of yet another second classification network provided by an embodiment of the present application;
Fig. 5 is a schematic diagram of still another second classification network provided by an embodiment of the present application;
Fig. 6 is a schematic flowchart of an image detection method provided by an embodiment of the present application;
Fig. 7 is a schematic structural diagram of a training device for a target detection model provided by an embodiment of the present application;
Fig. 8 is a schematic structural diagram of an image detection device provided by an embodiment of the present application;
Fig. 9 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
Specific embodiments
Embodiments of the application are described in detail below; examples of the embodiments are shown in the accompanying drawings, where the same or similar reference numerals throughout denote the same or similar elements, or elements with the same or similar functions. The embodiments described below with reference to the drawings are exemplary, are only used to explain the application, and are not to be construed as limiting the claims.
Those skilled in the art will appreciate that, unless expressly stated otherwise, the singular forms "a", "an", and "the" used herein may also include the plural forms. It should be further understood that the word "include" used in the description of the application refers to the presence of features, integers, steps, operations, elements and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof. It should be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intermediate elements may also be present. In addition, "connection" or "coupling" as used herein may include wireless connection or wireless coupling. The word "and/or" used herein includes all or any unit and all combinations of one or more of the associated listed items.
To make the purposes, technical schemes, and advantages of the application clearer, the embodiments of the present application are described in further detail below with reference to the drawings, taking the case where the target detection model detects face regions in an image as an example.
An embodiment of the application provides a training method for a target detection model, the target detection model including a first classification network. As shown in Fig. 1, the method may include:
Step S110: setting up at least one second classification network, where during training the input of the second classification network is identical to the input of the first classification network.
The role of the target detection model is to detect target regions in an image. For example, the target detection model may be a face detection model for detecting face regions in an image, a human detection model for detecting human regions in an image, or another object detection model. The classification result output by a classification network characterizes the target region; for example, for a face detection model, the output of the classification network may be the probability that a target region in the input image is a face region.
Step S120: training the target detection model based on a total loss function until the total loss function converges, where the total loss function includes the loss function of the target detection model and the loss function of the second classification network.
The loss function estimates the degree of inconsistency between the model's prediction and the ground truth; it is a non-negative real-valued function, and the smaller the loss function, the better the robustness of the model. The loss function is the core of the empirical risk function and an important component of the structural risk function. Convergence of the loss function is a limit concept: in general, if the function value tends to some finite value as the variable changes, the loss function is convergent.
It can be understood that the loss function of the target detection model refers to the loss-function part of the model itself, whose concrete form is related to the structure of the target detection model. For example, if the target detection model includes a classification network, the loss function of the target detection model includes the loss function corresponding to that classification network; if the target detection model further includes a regression network, its loss function includes both the loss function corresponding to the classification network and the loss function corresponding to the regression network. The form of the loss function of each sub-network of the model (e.g., classification network, regression network) is likewise related to the structure of that sub-network.
In the embodiment of the present application, the loss function of the target detection model includes the loss function of the first classification network and the loss function of the target-box regression network.
That is, in the embodiment of the present application, the target detection model may include a first classification network and a target-box regression network, in which case the loss function of the target detection model includes the loss function of the first classification network and the loss function of the target-box regression network. Correspondingly, the total loss function may include the loss function of the first classification network, the loss function of the target-box regression network, and the loss function of the second classification network.
It should be noted that, in practical applications, the way of judging whether a loss function (such as the above total loss function) has converged can be configured according to actual needs. For example, during training, if the value of the total loss function tends to some finite value, the function may be considered convergent. Generally, the smaller the total loss function the better; as the number of training iterations increases, the value of the total loss function keeps decreasing and tends to be stable. One convergence condition is that the difference between the values of the total loss function in two adjacent training iterations is less than a set threshold: when the training result satisfies this condition, the total loss function may be considered convergent. Of course, other convergence conditions may be configured according to actual needs, or other ways of judging whether the function has converged may be used.
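The convergence condition just described (the difference between the total-loss values of two adjacent training iterations falling below a set threshold) can be sketched as follows; the loop structure, the `train_step` interface, and the threshold value are illustrative assumptions:

```python
def train_until_convergence(train_step, max_iters=10000, tol=1e-4):
    """Run training steps until the total loss stabilizes.
    `train_step` is assumed to perform one parameter update and
    return the current value of the total loss function."""
    loss = None
    prev_loss = None
    for _ in range(max_iters):
        loss = train_step()
        # Converged when two adjacent loss values differ by less than tol.
        if prev_loss is not None and abs(prev_loss - loss) < tol:
            break
        prev_loss = loss
    return loss
```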
In the embodiment of the present application, when training the target detection model, at least one second classification network is added, and the target detection model is trained based on the loss function of the second classification network together with the loss function of the target detection model. Since the total loss function includes both the loss function of the target detection model and the loss function of the second classification network, and the latter plays a supervisory role during training, the precision of the target detection model can be improved compared with existing training methods. Therefore, when the trained target detection model is used for target detection, false alarms in the target detection network can be effectively suppressed and the precision of target detection improved.
In addition, since the second classification network is set up outside the network structure of the target detection model, it does not affect the original network structure of the target detection model. When the target detection model is subsequently used for target-region detection, its network structure is unchanged, so the detection speed is not affected either.
In an optional embodiment of the application, the target detection model includes a single-stage detection network structure.
Of course, the target detection model may also include a multi-stage detection network structure. A single-stage (one-stage) detection network structure may include, but is not limited to, the YOLO (You Only Look Once) structure, the SSD (Single Shot MultiBox Detector) network structure, or the RetinaNet network structure.
In an optional embodiment of the application, the single-stage detection network structure includes a RetinaNet network structure.
The RetinaNet network structure is a network for detecting targets in images. It consists of a backbone network and two task-specific sub-networks: the backbone computes convolutional features over the whole image; the first sub-network performs image classification on the backbone's output (i.e., the first classification network); and the second sub-network performs convolutional bounding-box regression (i.e., the regression network). The first classification network may include cascaded convolutional layers and a fully connected layer. The loss function L_C corresponding to the RetinaNet network structure includes the loss function L_cls of the first classification network and the loss function L_bb corresponding to the regression network, i.e., L_C = L_bb + L_cls.
In an optional embodiment of the application, the second classification network includes cascaded convolutional layers and a fully connected layer, where the input of the convolutional layers is connected to the output of the backbone network of the RetinaNet network structure. The backbone network, pre-trained in the form of a classifier, extracts features from the image.
In practical applications, the second classification network can be designed with reference to the first classification network; its structure may or may not be identical to that of the first classification network, and even when the structures are identical, the network parameters of the second classification network may differ from those of the first. In one example, if the single-stage detection network structure is the RetinaNet network structure, whose first classification network includes cascaded convolutional layers and a fully connected layer, then the second classification network may also include cascaded convolutional layers and a fully connected layer, with the input of its convolutional layers connected to the output of the backbone network of the RetinaNet network structure.
In one example, taking a target detection model with the RetinaNet network structure, Fig. 2 shows a schematic diagram of the RetinaNet network structure and the second classification network provided by an embodiment of the application. The network branch labeled Branch-c1 denotes the second classification network, composed of cascaded convolutional layers Conv3 and a fully connected layer FC3. The part inside the dotted box is the RetinaNet network structure, which includes the two network branches Branch-c and Branch-b. Branch-c denotes the first classification network, composed of cascaded convolutional layers Conv1 and a fully connected layer; Branch-b is the regression network, composed in this example of cascaded convolutional layers Conv2 and a fully connected layer FC2. In this example, the output of Branch-b may be the coordinates of the target region, i.e., its output result is coordinates, while the outputs of Branch-c and Branch-c1 may be the probability that the detected region is a target region, i.e., their output results are probabilities.
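The branch wiring described above, with Branch-c, Branch-b, and Branch-c1 all consuming the same backbone output and Branch-c1 present only at training time, can be sketched with stand-in functions (all names and return values are dummies for illustration; the real branches are convolutional networks):

```python
def backbone(image):
    # Stand-in: a real backbone returns convolutional feature maps.
    return image

def branch_c(feature):
    # First classification network: probability p that the region is a target.
    return 0.8

def branch_b(feature):
    # Regression network: target-box coordinates.
    return (10, 20, 50, 60)

def branch_c1(feature):
    # Second classification network: probability p1, used only in training.
    return 0.6

def forward(image, training):
    feature = backbone(image)
    p, box = branch_c(feature), branch_b(feature)
    # Branch-c1 shares the backbone output with the other branches but is
    # detached at inference time, so detection speed is unaffected.
    p1 = branch_c1(feature) if training else None
    return p, box, p1
```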
In an optional embodiment of the application, the loss function of the second classification network includes at least one of a first loss function determined based on the output of the second classification network and a second loss function determined based on the outputs of the first classification network and the second classification network.
That is, in practical applications, there may be different combinations for the loss function of the second classification network: it may include the first loss function determined based on the output of the second classification network, or it may include the second loss function determined based on the outputs of the first and second classification networks, or it may include both.
In an optional embodiment of the application, the first loss function is:
L_C1 = -(1 - α) * p1^γ * log(1 - p1) * (1 - y) - α * (1 - p1)^γ * log(p1) * y
where L_C1 denotes the first loss function, α is a weight factor, p1 is the output of the second classification network, y is the sample label, and γ is a modulating factor.
The sample label marks whether a target region is present in the sample image. For example, if the sample image is used for detecting whether a face region is present in an image, then y may be set to 1 if the sample image contains a face, and to 0 if it does not (i.e., the sample image is a background image).
In an optional embodiment of the application, the second loss function is:
L_C2 = -(y * (1 - M) + (1 - y) * M) * (α * p1^γ * log(p1) * M - (1 - α) * (1 - p1)^γ * log(1 - p1) * (1 - M))
where L_C2 denotes the second loss function, α is a weight factor, p1 is the output of the second classification network, y is the sample label, γ is a modulating factor, and M is the target-region result label, whose value is determined as follows:
M = 1 if p ≥ th, and M = 0 otherwise, where p denotes the output of the first classification network and th denotes a preset threshold.
That is, when p ≥ th, M = 1 and the result is considered positive, i.e., a target region; when p < th, M = 0 and the result is considered negative, i.e., not a target region.
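Putting the threshold rule and L_C2 together in a scalar sketch (the th, α, and γ values are illustrative assumptions): the gating factor y * (1 - M) + (1 - y) * M is nonzero exactly when the first network's thresholded prediction M disagrees with the label y, which is how L_C2 focuses on false alarms and missed detections:

```python
import math

def target_mask(p, th=0.5):
    """M = 1 when the first classification network's output p reaches
    the preset threshold th (predicted positive), else M = 0."""
    return 1 if p >= th else 0

def second_loss(p, p1, y, th=0.5, alpha=0.25, gamma=2.0):
    """Second loss L_C2, computed from the outputs of both
    classification networks and the sample label."""
    m = target_mask(p, th)
    gate = y * (1 - m) + (1 - y) * m  # nonzero only when M != y
    inner = (alpha * (p1 ** gamma) * math.log(p1) * m
             - (1 - alpha) * ((1 - p1) ** gamma) * math.log(1 - p1) * (1 - m))
    return -gate * inner
```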
The total loss function is described in detail below for the specific optional forms of the loss function of the second classification network, taking a target detection model with the RetinaNet network structure as an example and combining concrete examples.
1. The loss function of the second classification network includes the first loss function L_C1 determined based on the output of the second classification network. In this case, the total loss function L includes the loss function L_C corresponding to the RetinaNet network structure (L_C = L_bb + L_cls) and the first loss function L_C1 determined based on the output of the second classification network, e.g., L = L_C1 + L_C.
As an example, as shown in Fig. 3, when training the RetinaNet network structure, after a sample image is input into the backbone network, the output of the first classification network in the RetinaNet network structure is the probability p, the output of the regression network is denoted by Box, and the output of the second classification network is the probability p1. In this example, the total loss function can be expressed as:
L = L_C + L_C1 = L_bb + L_cls + L_C1
where
L_cls = -(1 - α) * p^γ * log(1 - p) * (1 - y) - α * (1 - p)^γ * log(p) * y
L_C1 = -(1 - α) * p1^γ * log(1 - p1) * (1 - y) - α * (1 - p1)^γ * log(p1) * y
2. The loss function of the second classification network includes the second loss function L_C2 determined based on the outputs of the first classification network and the second classification network. In this case, the total loss function L includes the loss function L_C corresponding to the RetinaNet network structure (L_C = L_bb + L_cls) and the second loss function L_C2 determined based on the outputs of the first and second classification networks, e.g., L = L_C2 + L_C.
As an example, as shown in Fig. 4, when training the RetinaNet network structure, after a sample image is input into the backbone network, the output of the first classification network in the RetinaNet network structure is the probability p, the output of the regression network is denoted by Box, and the output of the second classification network is the probability p1. In this example, the total loss function can be expressed as:
L = L_C2 + L_C = L_bb + L_cls + L_C2
L_C2 = -(y * (1 - M) + (1 - y) * M) * (α * p1^γ * log(p1) * M - (1 - α) * (1 - p1)^γ * log(1 - p1) * (1 - M))
where the value of M is determined as before: M = 1 if p ≥ th, and M = 0 otherwise.
The concrete form of L_cls is the same as in the above embodiment; for details, refer to the above embodiment, which is not repeated here.
3, the loss function of the second sorter network includes the first-loss function determining based on the output of the second sorter network LC1, and the second loss function L determining based on the output of the first sorter network and the output of the second sorter networkC2, at this point, Total losses function L includes the corresponding loss function L of RetinaNet network structureC(LC=Lbb+Lcls), be based on the second sorter network The determining first-loss function L of outputC1With the output determination of output and the second sorter network based on the first sorter network Second loss function LC2, such as L=LC2+LC+LC1
As an example, as shown in figure 5, inputting sample image when being trained to RetinaNet network structure After Backbone network, in RetinaNet network structure the output result of the first sorter network be Probability p, Recurrent networks it is defeated For result by being indicated for Box, the output result of the second sorter network is Probability p out1, in the example, total losses function can To indicate are as follows:
L = L_C2 + L_C + L_C1 = L_bb + L_cls + L_C2 + L_C1
Here, the concrete forms of L_cls, L_C2 and L_C1 are the same as in the above embodiments; for details, refer to the above embodiments, and the description is not repeated here.
Based on the target detection model of the embodiments of the present invention, as shown in Fig. 6, an embodiment of the present application further provides an image detection method, the method comprising:
Step S610: obtain an image to be detected;
Step S620: detect the image to be detected by the target detection model;
Here, the target detection model is a model trained by the training method of the target detection model in the foregoing embodiments. For specific implementations of the training method, refer to the description of the training method of the target detection model in the above embodiments, which is not repeated here; for example, the target detection model may be the trained RetinaNet network structure.
Step S630: obtain the detection result of the image to be detected based on the output of the target detection model.
Here, the target region can be set according to actual needs and is not limited to a person's face region; for example, it may also be a human body region, etc.
That is, when image detection is performed, the image to be detected can be input into the trained target detection model, the target detection model outputs a result, and the detection result of the image to be detected can be obtained based on that output.
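The flow of steps S610 to S630 can be sketched as follows. The `dummy_model` callable and the score threshold are illustrative assumptions standing in for the trained RetinaNet model; only the overall input → model output → detection-result flow follows the text.

```python
import numpy as np

def detect(image, model, score_threshold=0.5):
    """Sketch of steps S610-S630: run a trained detection model on an image
    and keep outputs whose classification score passes a threshold.
    `model` is any callable returning (boxes, scores); in the patent this
    role is played by the trained RetinaNet structure."""
    boxes, scores = model(image)          # step S620: forward pass
    keep = scores >= score_threshold      # step S630: derive the detection result
    return boxes[keep], scores[keep]

# Illustrative stand-in for a trained model (not from the patent).
def dummy_model(image):
    boxes = np.array([[0, 0, 10, 10], [5, 5, 20, 20]], dtype=float)
    scores = np.array([0.9, 0.3])
    return boxes, scores
```

For example, `detect(np.zeros((32, 32, 3)), dummy_model)` keeps only the first box, whose score 0.9 exceeds the threshold.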
Based on the same principle as the method shown in Fig. 1, an embodiment of the present application further provides a training device 70 for a target detection model, where the target detection model includes a first classification network. As shown in Fig. 7, the training device 70 may include a training supervision network setting module 710 and a model training module 720, wherein:
the training supervision network setting module 710 is configured to set at least one second classification network, wherein during training the input of the second classification network is the same as the input of the first classification network; and
the model training module 720 is configured to train the target detection model based on a total loss function until the total loss function converges, wherein the total loss function includes the loss function of the target detection model and the loss function of the second classification network.
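The behavior of the model training module — iterating until the total loss function converges — can be sketched with a toy model. The 1-D logistic regressor, learning rate, and convergence tolerance below are illustrative assumptions standing in for the full detection network and its total loss; only the "train until the total loss converges" control flow follows the text.

```python
import numpy as np

def train_until_converged(x, y, lr=0.5, tol=1e-6, max_steps=10000):
    """Toy stand-in for the model training module: repeat parameter updates
    until the change in the (total) loss falls below a tolerance. A 1-D
    logistic model replaces the detection network for illustration."""
    w, b = 0.0, 0.0
    prev_loss = np.inf
    for _ in range(max_steps):
        p = 1.0 / (1.0 + np.exp(-(w * x + b)))   # model output (probabilities)
        loss = -np.mean(y * np.log(p + 1e-7)
                        + (1 - y) * np.log(1 - p + 1e-7))
        if abs(prev_loss - loss) < tol:          # "until the total loss converges"
            break
        prev_loss = loss
        grad = p - y                             # gradient of the loss w.r.t. the logit
        w -= lr * np.mean(grad * x)              # gradient-descent update
        b -= lr * np.mean(grad)
    return w, b, loss
```

On a separable toy set such as `x = [-2, -1, 1, 2]`, `y = [0, 0, 1, 1]`, the loop drives the loss down until successive values differ by less than `tol`.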
In the embodiments of the present application, the target detection model includes a single-stage detection network structure.
In the embodiments of the present application, the single-stage detection network structure includes a RetinaNet network structure.
In the embodiments of the present application, the second classification network includes cascaded convolutional layers and a fully connected layer, wherein the input of the convolutional layers is connected to the output of the Backbone network of the RetinaNet network structure.
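The wiring just described — convolution over the Backbone output followed by a fully connected layer producing the probability p1 — can be sketched with NumPy matrix operations. The layer widths, the use of a 1x1 convolution, the ReLU, and the global pooling step are all illustrative assumptions, not values from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

def second_classifier(feat, w_conv, w_fc):
    """Sketch of the second classification network: a (1x1) convolution over
    the Backbone feature map, global average pooling, then a fully connected
    layer producing the probability p1. All shapes are illustrative."""
    conv_out = np.maximum(feat @ w_conv, 0.0)   # 1x1 conv + ReLU -> (h, w, c_mid)
    pooled = conv_out.mean(axis=(0, 1))         # global average pool -> (c_mid,)
    logit = pooled @ w_fc                       # fully connected layer -> scalar
    return 1.0 / (1.0 + np.exp(-logit))         # probability p1

# Stand-in Backbone feature map and illustrative weights.
feat = rng.standard_normal((8, 8, 16))
w_conv = rng.standard_normal((16, 32)) * 0.1
w_fc = rng.standard_normal(32) * 0.1
p1 = second_classifier(feat, w_conv, w_fc)
```

Since the final activation is a sigmoid, `p1` always lies strictly between 0 and 1, as required of a probability.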
In the embodiments of the present application, the loss function of the second classification network includes at least one of: a first loss function determined based on the output of the second classification network, and a second loss function determined based on the output of the first classification network and the output of the second classification network.
In the embodiments of the present application, the first loss function is:
L_C1 = -(1-α)*p1^γ*log(1-p1)*(1-y) - α*(1-p1)^γ*log(p1)*y
where L_C1 denotes the first loss function, α is a weight factor, p1 is the output of the second classification network, y denotes the sample label, and γ is a regulatory factor.
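The first loss L_C1 can be written directly from the formula above. This NumPy sketch is an illustrative reading, with `eps` added only for numerical safety; `alpha` and `gamma` stand for α and γ.

```python
import numpy as np

def first_loss_lc1(p1, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Illustrative sketch of L_C1: a focal-style loss on the second
    classification network's output p1 against the sample label y."""
    p1 = np.clip(p1, eps, 1.0 - eps)  # guard against log(0)
    return (-(1 - alpha) * p1**gamma * np.log(1 - p1) * (1 - y)
            - alpha * (1 - p1)**gamma * np.log(p1) * y)
```

As in the focal loss, the modulating factor (1-p1)^γ shrinks the penalty for confident correct positives: a positive sample scored 0.99 incurs a far smaller loss than one scored 0.01.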
In the embodiments of the present application, the second loss function is:
L_C2 = -(y*(1-M) + (1-y)*M) * (α*p1^γ*log(p1)*M - (1-α)*(1-p1)^γ*log(1-p1)*(1-M))
where L_C2 denotes the second loss function, α is a weight factor, p1 is the output of the second classification network, y denotes the sample label, γ is a regulatory factor, and M is the target-area result label, the value of which is determined in the following manner:
where p denotes the output of the first classification network and th denotes a preset threshold.
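The patent gives the template for M only as a figure that is not reproduced in this text, so the rule below is a hypothetical reading offered purely for illustration: M set to 1 where the first classification network's output p reaches the preset threshold th, and 0 elsewhere.

```python
import numpy as np

def target_area_mask(p, th=0.5):
    """Hypothetical template for M (the patent's actual template is shown
    only in a figure): mark positions where the first classification
    network's output p reaches the preset threshold th."""
    return (np.asarray(p) >= th).astype(float)
```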
In the embodiments of the present application, the loss function of the target detection model includes the loss function of the first classification network and the loss function of the target box regression network.
The training device for a target detection model of the embodiments of the present application can perform the training method for a target detection model provided by the embodiments of the present application, and its implementation principle is similar. The actions performed by the modules of the training device in the embodiments of the present application correspond to the steps of the training method in the embodiments of the present application; for a detailed functional description of each module of the training device, refer to the description of the corresponding training method shown above, which is not repeated here.
Based on the same principle as the method shown in Fig. 6, an embodiment of the present application further provides an image detection device 80. As shown in Fig. 8, the image detection device 80 may include an image acquisition module 810 and an image detection module 820, wherein:
the image acquisition module 810 is configured to obtain an image to be detected; and
the image detection module 820 is configured to detect the image to be detected by the target detection model and obtain the detection result of the image to be detected based on the output of the target detection model, wherein the target detection model is obtained by training with the training method of the target detection model in the foregoing embodiments.
The image detection device of the embodiments of the present application can perform the image detection method provided by the embodiments of the present application, and its implementation principle is similar. The actions performed by the modules of the image detection device in the embodiments of the present application correspond to the steps of the image detection method in the embodiments of the present application; for a detailed functional description of each module of the image detection device, refer to the description of the corresponding image detection method shown above, which is not repeated here.
An embodiment of the present application further provides an electronic device, which may include, but is not limited to, a processor and a memory; the memory is configured to store computer operation instructions, and the processor is configured to perform the method shown in the embodiments by calling the computer operation instructions.
Another embodiment of the present application provides a computer-readable storage medium storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the corresponding content in the foregoing method embodiments.
An alternative embodiment provides an electronic device. As shown in Fig. 9, the electronic device 4000 includes a processor 4001 and a memory 4003, where the processor 4001 is connected to the memory 4003, for example via a bus 4002. Optionally, the electronic device 4000 may further include a transceiver 4004. It should be noted that in practical applications the number of transceivers 4004 is not limited to one, and the structure of the electronic device 4000 does not constitute a limitation on the embodiments of the present application.
The processor 4001 may be a CPU (Central Processing Unit), a general-purpose processor, a DSP (Digital Signal Processor), an ASIC (Application-Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array), another programmable logic device, a transistor logic device, a hardware component, or any combination thereof, and can implement or execute the various illustrative logical blocks, modules, and circuits described in the present disclosure. The processor 4001 may also be a combination that realizes computing functions, for example a combination including one or more microprocessors, or a combination of a DSP and a microprocessor.
The bus 4002 may include a path for transferring information between the above components. The bus 4002 may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like, and may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is shown in Fig. 9, but this does not mean that there is only one bus or one type of bus.
The memory 4003 may be a ROM (Read-Only Memory) or another type of static storage device capable of storing static information and instructions, a RAM (Random Access Memory) or another type of dynamic storage device capable of storing information and instructions, an EEPROM (Electrically Erasable Programmable Read-Only Memory), a CD-ROM (Compact Disc Read-Only Memory) or other optical disc storage (including compact discs, laser discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and can be accessed by a computer, but is not limited thereto.
The memory 4003 is configured to store application program code for executing the solution of the present application, and execution is controlled by the processor 4001. The processor 4001 is configured to execute the application program code stored in the memory 4003 to implement the content shown in any of the foregoing method embodiments.
It should be understood that although the steps in the flowcharts of the drawings are shown sequentially as indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the execution of these steps is not strictly limited in order, and they may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily completed at the same time but may be performed at different times, and their execution order is not necessarily sequential: they may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
The above are only some embodiments of the present invention. It should be noted that those skilled in the art may make various improvements and modifications without departing from the principle of the present invention, and these improvements and modifications should also be regarded as falling within the protection scope of the present invention.

Claims (13)

1. A training method for a target detection model, characterized in that the target detection model includes a first classification network, and the method comprises:
setting at least one second classification network, wherein during training the input of the second classification network is the same as the input of the first classification network; and
training the target detection model based on a total loss function until the total loss function converges, wherein the total loss function includes a loss function of the target detection model and a loss function of the second classification network.
2. The method according to claim 1, characterized in that the target detection model includes a single-stage detection network structure.
3. The method according to claim 2, characterized in that the single-stage detection network structure includes a RetinaNet network structure.
4. The method according to claim 3, characterized in that the second classification network includes cascaded convolutional layers and a fully connected layer, wherein the input of the convolutional layers is connected to the output of the backbone (Backbone) network of the RetinaNet network structure.
5. The method according to any one of claims 1 to 4, characterized in that the loss function of the second classification network includes at least one of: a first loss function determined based on the output of the second classification network, and a second loss function determined based on the output of the first classification network and the output of the second classification network.
6. The method according to claim 5, characterized in that the first loss function is:
L_C1 = -(1-α)*p1^γ*log(1-p1)*(1-y) - α*(1-p1)^γ*log(p1)*y
where L_C1 denotes the first loss function, α is a weight factor, p1 is the output of the second classification network, y denotes the sample label, and γ is a regulatory factor.
7. The method according to claim 5, characterized in that the second loss function is:
L_C2 = -(y*(1-M) + (1-y)*M) * (α*p1^γ*log(p1)*M - (1-α)*(1-p1)^γ*log(1-p1)*(1-M))
where L_C2 denotes the second loss function, α is a weight factor, p1 is the output of the second classification network, y denotes the sample label, γ is a regulatory factor, and M is the target-area result label, the value of which is determined in the following manner:
where p denotes the output of the first classification network and th denotes a preset threshold.
8. The method according to claim 1, characterized in that the loss function of the target detection model includes a loss function of the first classification network and a loss function of a target box regression network.
9. An image detection method, characterized in that the method comprises:
obtaining an image to be detected;
detecting the image to be detected by a target detection model, wherein the target detection model is obtained by training with the method according to any one of claims 1 to 8; and
obtaining a detection result of the image to be detected based on an output of the target detection model.
10. A training device for a target detection model, characterized in that the target detection model includes a first classification network, and the device comprises:
a training supervision network setting module, configured to set at least one second classification network, wherein during training the input of the second classification network is the same as the input of the first classification network; and
a model training module, configured to train the target detection model based on a total loss function until the total loss function converges, wherein the total loss function includes a loss function of the target detection model and a loss function of the second classification network.
11. An image detection device, characterized in that the device comprises:
an image acquisition module, configured to obtain an image to be detected; and
an image detection module, configured to detect the image to be detected by a target detection model and obtain a detection result of the image to be detected based on an output of the target detection model, wherein the target detection model is obtained by training with the method according to any one of claims 1 to 8.
12. An electronic device, characterized in that the electronic device comprises a processor and a memory;
the memory is configured to store operation instructions; and
the processor is configured to perform, by calling the operation instructions, the method according to any one of claims 1 to 9.
13. A computer-readable storage medium, characterized in that the storage medium stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the method according to any one of claims 1 to 9.
CN201910315195.1A 2019-04-18 2019-04-18 Training method and device for target detection model, electronic equipment and storage medium Active CN109961107B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910315195.1A CN109961107B (en) 2019-04-18 2019-04-18 Training method and device for target detection model, electronic equipment and storage medium


Publications (2)

Publication Number Publication Date
CN109961107A true CN109961107A (en) 2019-07-02
CN109961107B CN109961107B (en) 2022-07-19

Family

ID=67026354



Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140073917A1 (en) * 2012-09-10 2014-03-13 Oregon Health & Science University Quantification of local circulation with oct angiography
CN106803071A (en) * 2016-12-29 2017-06-06 浙江大华技术股份有限公司 Object detecting method and device in a kind of image
CN107145857A (en) * 2017-04-29 2017-09-08 深圳市深网视界科技有限公司 Face character recognition methods, device and method for establishing model
US20170262733A1 (en) * 2016-03-10 2017-09-14 Siemens Healthcare Gmbh Method and System for Machine Learning Based Classification of Vascular Branches
CN108460341A (en) * 2018-02-05 2018-08-28 西安电子科技大学 Remote sensing image object detection method based on integrated depth convolutional network
CN108520229A (en) * 2018-04-04 2018-09-11 北京旷视科技有限公司 Image detecting method, device, electronic equipment and computer-readable medium
CN108694401A (en) * 2018-05-09 2018-10-23 北京旷视科技有限公司 Object detection method, apparatus and system
CN108875521A (en) * 2017-12-20 2018-11-23 北京旷视科技有限公司 Method for detecting human face, device, system and storage medium
CN109102024A (en) * 2018-08-14 2018-12-28 中山大学 A kind of Layer semantics incorporation model finely identified for object and its implementation
CN109360198A (en) * 2018-10-08 2019-02-19 北京羽医甘蓝信息技术有限公司 Bone marrwo cell sorting method and sorter based on deep learning
CN109472214A (en) * 2018-10-17 2019-03-15 福州大学 One kind is taken photo by plane foreign matter image real-time detection method based on deep learning
CN109614968A (en) * 2018-10-10 2019-04-12 浙江大学 A kind of car plate detection scene picture generation method based on multiple dimensioned mixed image stylization
CN109614985A (en) * 2018-11-06 2019-04-12 华南理工大学 A kind of object detection method based on intensive connection features pyramid network


Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
TSUNG-YI LIN等: "Focal Loss for Dense Object Detection", 《INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV)》 *
WANXIN TIAN等: "Learning Better Features for Face Detection with Feature Fusion and Segmentation Supervision", 《HTTPS://ARXIV.ORG/ABS/1811.08557V1》 *
YUANYUAN WANG等: "Automatic Ship Detection Based on RetinaNet Using Multi-Resolution Gaofen-3 Imagery", 《REMOTE SENSE》 *
何朋朋: "基于深度学习的交通场景多目标检测与分类研究", 《中国优秀硕士学位论文全文数据库信息科技辑》 *
苏剑林: "何恺明大神的「Focal Loss」,如何更好地理解?", 《HTTPS://ZHUANLAN.ZHIHU.COM/P/32423092》 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110533640A (en) * 2019-08-15 2019-12-03 北京交通大学 Based on the track circuit disease discrimination method for improving YOLOv3 network model
CN110533640B (en) * 2019-08-15 2022-03-01 北京交通大学 Improved YOLOv3 network model-based track line defect identification method
CN110991312A (en) * 2019-11-28 2020-04-10 重庆中星微人工智能芯片技术有限公司 Method, apparatus, electronic device, and medium for generating detection information
WO2021143231A1 (en) * 2020-01-17 2021-07-22 初速度(苏州)科技有限公司 Target detection model training method, and data labeling method and apparatus
CN111768005A (en) * 2020-06-19 2020-10-13 北京百度网讯科技有限公司 Training method and device for lightweight detection model, electronic equipment and storage medium
CN111768005B (en) * 2020-06-19 2024-02-20 北京康夫子健康技术有限公司 Training method and device for lightweight detection model, electronic equipment and storage medium
CN111950411A (en) * 2020-07-31 2020-11-17 上海商汤智能科技有限公司 Model determination method and related device
CN112085096A (en) * 2020-09-09 2020-12-15 华东师范大学 Method for detecting local abnormal heating of object based on transfer learning
CN112308150A (en) * 2020-11-02 2021-02-02 平安科技(深圳)有限公司 Target detection model training method and device, computer equipment and storage medium
CN112308150B (en) * 2020-11-02 2022-04-15 平安科技(深圳)有限公司 Target detection model training method and device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN109961107B (en) 2022-07-19


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Training methods, devices, electronic devices, and storage media for object detection models

Effective date of registration: 20230404

Granted publication date: 20220719

Pledgee: Shanghai Yunxin Venture Capital Co.,Ltd.

Pledgor: MEGVII (BEIJING) TECHNOLOGY Co.,Ltd.

Registration number: Y2023990000192