CN109800778A

CN109800778A - A Faster RCNN object detection method based on hard example mining

Info

Publication number: CN109800778A
Application number: CN201811463226.XA
Authority: CN
Inventors: 张烨; 樊一超; 郭艺玲; 许艇; 程康
Original assignee: Zhejiang University of Technology ZJUT
Current assignee: Zhejiang University of Technology ZJUT
Priority date: 2018-12-03
Filing date: 2018-12-03
Publication date: 2019-05-24
Anticipated expiration: 2038-12-03
Also published as: CN109800778B

Abstract

A Faster RCNN object detection method based on hard example mining, comprising: step 1, image object detection based on deep learning; step 2, an online hard example mining method with adjusted key parameter settings; step 3, hard negative example mining, which builds on hard example mining by setting the ratio of positive to negative samples in the mini-batch formed by the RPN during training to 1:3; step 4, rejecting redundant boxes to avoid computing the loss multiple times, in which an improved non-maximum suppression algorithm removes redundancy from the proposals generated by the RPN layer in a reasonable way. Without expanding the sample set, the invention relaxes the definition of negative samples and mines more hard training samples online from the samples themselves; it sets a positive-to-negative sample ratio so that the rare, hard-to-train samples with the largest loss are handled reasonably and are easy to compute; and it balances the classification loss and the box regression loss, so that the training loss keeps decreasing.

Description

A Faster RCNN object detection method based on hard example mining

Technical field

The present invention relates to a method that combines online hard example mining (OHEM), hard negative example mining (HNEM) and related techniques.

Technical background

In recent years, with the rapid development of computer science and technology, computer-based image processing and image object detection have also developed at an unprecedented pace. Deep learning, which extracts key target features by learning from massive numbers of digital images, has surpassed humans in object detection and has repeatedly surprised industry. The two major tasks of computer vision are image classification and object localization. In traditional methods, experienced algorithm engineers design a matching template for each target, for example the deformable part model (DPM) or HOG feature extraction. Such detectors localize the target with a sliding window and then classify it by template matching on fixed features. Their problems are that detection is time-consuming, feature matching precision is low, and they are effective only for specific targets, which leads to weak model generalization, among other issues. The deep learning methods that have emerged recently can recognize features effectively for targets in complex scenes, with results far better than traditional methods, but they also have shortcomings: (1) the dataset is large: training a deep learning model needs thousands of samples to learn the data features effectively, and more data gives better results, which makes data collection difficult; (2) hardware requirements are high: training on large data samples needs at least 4 GB of GPU memory; (3) training demands skill: when parameters are set unreasonably, training results are poor and the model becomes hard to train.

Summary of the invention

To overcome the above deficiencies of the prior art, the invention addresses the sample problem with a Faster RCNN object detection method based on hard example mining. Without increasing the number of samples, it combines online hard example mining with hard negative example mining, so that the model learns the features of the existing hard examples in a targeted way and further improves its generalization and robustness.

To achieve the above object, the invention adopts the following technical scheme:

A Faster RCNN object detection method based on hard example mining, comprising the following steps:

Step 1, image object detection based on deep learning;

At present, most deep-learning image object detection models are based on convolutional neural networks, so the invention mainly analyzes Faster RCNN and proposes a reasonable improvement.

Faster RCNN uses Softmax Loss and Smooth L1 Loss to train the classification probability and the box regression jointly. The loss function is:

$$L(\{p_i\},\{t_i\}) = \frac{1}{N_{cls}}\sum_i L_{cls}(p_i, p_i^*) + \lambda\,\frac{1}{N_{reg}}\sum_i p_i^*\,L_{reg}(t_i, t_i^*)$$

where N_cls=256 is the number of foreground samples, and the number of box regressions N_reg=2400 is the maximum number of sliding positions on the last feature map, 40 × 60; i indexes a proposal; p_i is the predicted probability of the corresponding class, foreground or background; p_i* is the ground-truth value, with p_i* = 1 for foreground and p_i* = 0 for background, and is used to compute the box regression loss; t_i = (x_i, y_i, w_i, h_i) is the coordinate information of the proposal, i.e. its center point and its width and height; t_i* = (x_i*, y_i*, w_i*, h_i*) is the corresponding information of the true target object; λ balances the weights of the box regression loss and the classification loss; L_cls is the softmax classification loss; and the box regression loss uses the smooth L1 method, L_reg(t_i, t_i*) = smooth_L1(t_i − t_i*).
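The joint loss described above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions: the function names `smooth_l1` and `joint_loss` are hypothetical, the classification term is written as a two-class log loss on the foreground probability (equivalent to a two-class softmax loss), and the smooth L1 form is the usual one with a transition at 1.

```python
import numpy as np

def smooth_l1(x):
    """Elementwise smooth L1: 0.5*x^2 where |x| < 1, otherwise |x| - 0.5."""
    ax = np.abs(x)
    return np.where(ax < 1.0, 0.5 * x * x, ax - 0.5)

def joint_loss(p, p_star, t, t_star, n_cls=256, n_reg=2400, lam=1.0):
    """Joint classification + box regression loss over a set of proposals.

    p      : (M,) predicted foreground probability per proposal
    p_star : (M,) ground-truth label, 1 = foreground, 0 = background
    t      : (M, 4) predicted (x, y, w, h)
    t_star : (M, 4) ground-truth (x, y, w, h)
    """
    p = np.clip(p, 1e-12, 1.0 - 1e-12)  # avoid log(0)
    # Two-class log loss (assumption: stands in for the softmax loss).
    l_cls = -(p_star * np.log(p) + (1.0 - p_star) * np.log(1.0 - p))
    # Smooth L1 summed over the four box coordinates.
    l_reg = smooth_l1(t - t_star).sum(axis=1)
    # Regression only counts for foreground proposals (p_star = 1).
    return l_cls.sum() / n_cls + lam * (p_star * l_reg).sum() / n_reg
```

Passing `lam=10` reproduces the weighting chosen later in step 2.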

Training with the above method runs into problems to some extent, such as insufficient data for hard-to-train samples and too little training on them. Example mining can solve the hard example training problem, so the method of step 1 is improved in the following steps.

Step 2, based on the online hard example mining method, the key parameter settings adopted by the invention are:

(1) Set the screening mechanism for hard examples. At each iteration, samples are sorted in descending order of the current total loss L({p_i},{t_i}) and the top B/N samples are selected, where B=64 and N=1 is the number of images per training iteration. This speeds up backpropagation, because only a small number of gradients need to be adjusted.

(2) Improve computation speed. The hard examples are selected from the losses computed in the forward pass; simply setting the losses of the non-hard examples to 0 in the backward pass does not reduce the model's GPU memory use. Therefore only the B/N selected hard examples take part in gradient propagation in the backward pass, which reduces the training GPU memory from 3527M to 3057M.

(3) Add an OHEM module to both the RPN layer and the final fully connected layer. Good classification results depend on accurate target localization, i.e. on proposal generation, so an OHEM module is also added to the RPN layer. This helps the box regression find the most accurate position and also makes the feature extraction for classification more effective.

(4) Adjust the weights of the classification loss and the box regression loss appropriately. In the loss function, the classification loss L_cls and the box regression loss L_reg are unbalanced: N_cls=256 is the number of classification samples and N_reg=2400 is the maximum number of sliding positions on the last feature map, so their reciprocals differ by about a factor of 10. Therefore λ=10 is taken, which helps the model learn the corresponding target features in a targeted way while regressing better boxes.

(5) Adjust the non-maximum suppression (NMS) algorithm. Classical non-maximum suppression cannot retain adjacent or overlapping multi-target detection boxes well, which lowers the recall of object detection. The improved non-maximum suppression algorithm of the invention therefore uses a confidence penalty mechanism on the class score, which retains useful proposals while removing redundant ones and further improves mAP. The specific operations are described in step 3.

(6) Use data augmentation to improve the generalization ability of the model. During training, random left-right mirror flipping and adjustment of illumination and saturation increase sample diversity and prevent over-fitting. To improve the model's ability to detect images of different sizes, multi-scale training is used: the short side of the image is set to a random size from {224, 416, 480, 512, 600, 672, 900} and the other side is scaled proportionally. This sample augmentation also further increases mAP.

(7) Adjust the learning rate strategy. The first learning rate drop is set at 40k iterations, equivalent to 8 epochs, and thereafter the learning rate is dropped again every 20k iterations. This improves the global search ability in the early stage and avoids falling into a local minimum; the smaller learning rate used later allows fine adjustment around the minimum, which helps the loss decrease several more times.
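Items (1), (4) and (7) of step 2 can be illustrated with a short Python sketch. It is a minimal illustration, not the patent's implementation: the function names are hypothetical, and `base_lr` and `gamma` are assumed values, since the text gives the drop points (40k, then every 20k iterations) but not the initial learning rate or the decay factor.

```python
import numpy as np

def select_hard_examples(losses, batch_size=64, num_images=1):
    """Item (1): indices of the B/N samples with the largest loss, hardest first."""
    k = min(batch_size // num_images, len(losses))
    order = np.argsort(-np.asarray(losses, dtype=float), kind="stable")
    return order[:k]

# Item (4): the two normalizers differ by roughly a factor of 10,
# which motivates lambda = 10 in the text.
N_CLS, N_REG = 256, 2400
LOSS_WEIGHT_RATIO = N_REG / N_CLS  # (1/N_cls) / (1/N_reg)

def learning_rate(iteration, base_lr=0.001, gamma=0.1,
                  first_drop=40_000, step=20_000):
    """Item (7): step schedule, first drop at 40k, then one drop every 20k.

    base_lr and gamma are assumptions; the source does not state them.
    """
    if iteration < first_drop:
        return base_lr
    drops = 1 + (iteration - first_drop) // step
    return base_lr * gamma ** drops
```

Only the indices returned by `select_hard_examples` would take part in the backward pass, which is the point of item (2): the remaining samples are excluded rather than having their losses zeroed.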

Step 3, hard negative example mining builds on hard example mining: the ratio of positive to negative samples in the mini-batch formed by the RPN during training is set to 1:3, and training is carried out. Many experiments showed that model performance is best when the ratio is 1:3. The specific strategy is:

(1) Remove the lower threshold limit on negative samples. In the original Faster R-CNN, background is decided by the intersection over union (IoU) between the candidate proposals generated by the RPN and the ground-truth box: a proposal whose IoU lies in [0.1, 0.5) is considered background. The shortcoming of this setting is that it ignores the samples with IoU below 0.1, which are rare, have large loss and are important hard negatives, so their features are not learned well. The invention therefore sets the background threshold to [0, 0.5). A positive sample is the proposal with the largest IoU, or one whose IoU lies in the range [0.7, 1.0].

(2) Set the ratio of hard positive to hard negative samples to 1:3. As noted above, combining positive and negative samples and learning more background information improves the model's ability to detect and localize targets against specific backgrounds. The batch size per image is set to 64, so there are 16 positive samples and 48 negative samples.

(3) Since several predicted positive samples may overlap the same label at the same time, non-maximum suppression with a threshold of 0.7 is used; a prediction is deleted when its IoU with the label is below 0.7.

The parameter settings for hard negative example mining in step 3 are summarized as follows:

(1) Parameter name: FG_THRESH; meaning: positive sample IoU threshold; value: [0.7, 1.0];

(2) Parameter name: BG_THRESH_LO; meaning: negative sample IoU threshold; value: [0, 0.5);

(3) Parameter name: HNEM_NMS_THRESH; meaning: non-maximum suppression threshold; value: 0.7;

(4) Parameter name: HNEM_BATCHSIZE; meaning: batch size per image; value: 64;

(5) Parameter name: RPN_FG_FRACTION; meaning: positive sample fraction; value: 0.25;

(6) Parameter name: RPN_BG_FRACTION; meaning: negative sample fraction; value: 0.75.
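The parameters above can be read as a small configuration plus a sampling rule. The sketch below is an assumption-laden illustration: the key `BG_THRESH_HI` and the function `sample_minibatch` are hypothetical names introduced here, the rule "the proposal with the largest IoU is also positive" is omitted for brevity, and choosing the highest-loss samples within each group is one plausible reading of how hard examples fill the 1:3 mini-batch.

```python
import numpy as np

HNEM_CONFIG = {
    "FG_THRESH": 0.7,         # positives: IoU in [0.7, 1.0]
    "BG_THRESH_LO": 0.0,      # negatives: IoU in [0, 0.5)
    "BG_THRESH_HI": 0.5,      # hypothetical key for the upper bound
    "HNEM_NMS_THRESH": 0.7,
    "HNEM_BATCHSIZE": 64,
    "RPN_FG_FRACTION": 0.25,  # 16 positives
    "RPN_BG_FRACTION": 0.75,  # 48 negatives
}

def sample_minibatch(max_iou, losses, cfg=HNEM_CONFIG):
    """Pick up to 16 positive and 48 negative proposals, hardest first.

    max_iou : (M,) best IoU of each proposal with any ground-truth box
    losses  : (M,) current loss of each proposal
    """
    max_iou = np.asarray(max_iou, dtype=float)
    losses = np.asarray(losses, dtype=float)
    fg = np.flatnonzero(max_iou >= cfg["FG_THRESH"])
    bg = np.flatnonzero((max_iou >= cfg["BG_THRESH_LO"])
                        & (max_iou < cfg["BG_THRESH_HI"]))
    n_fg = int(cfg["HNEM_BATCHSIZE"] * cfg["RPN_FG_FRACTION"])
    n_bg = cfg["HNEM_BATCHSIZE"] - n_fg
    # Within each group keep the samples with the largest loss.
    fg = fg[np.argsort(-losses[fg], kind="stable")][:n_fg]
    bg = bg[np.argsort(-losses[bg], kind="stable")][:n_bg]
    return fg, bg
```

Because the lower bound is 0 rather than 0.1, proposals that barely touch any object remain eligible as negatives, which is the relaxation described in item (1) of step 3.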

Step 4, reject redundant boxes to avoid computing the loss multiple times. An improved non-maximum suppression algorithm removes redundancy from the proposals generated by the RPN layer in a reasonable way. The specific operations are:

Instead of crudely deleting every proposal whose IoU is greater than the threshold, the improvement lowers its confidence. The calculation methods used are linear weighting, Gaussian weighting and exponential weighting.

The linear weighting method lowers the confidence of boxes whose IoU is greater than the threshold, i.e. it introduces the idea of a penalty function:

$$s_i = \begin{cases} s_i, & IoU(b_m, b_i) < N_t \\ s_i \cdot a\,\bigl(1 - IoU(b_m, b_i)\bigr), & IoU(b_m, b_i) \ge N_t \end{cases}$$

In the formula, s_i is the confidence score for the current class; a is the weight coefficient, 0 < a ≤ 1; b_m is the box with the highest confidence score; b_i is the current box; IoU(b_m, b_i) is the intersection over union of the two; and N_t is the given threshold.
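A minimal sketch of the linear weighting rule follows, using the scaling factor a(1 − IoU) stated in the text. The function name `linear_weight` is hypothetical, and the defaults a=1 and N_t=0.2 are the values reported later in the experiments.

```python
def linear_weight(score, iou, n_t=0.2, a=1.0):
    """Return the penalized confidence of a box under linear weighting.

    score : confidence s_i of the current box b_i
    iou   : IoU(b_m, b_i) with the highest-scoring box b_m
    n_t   : IoU threshold N_t
    a     : weight coefficient, 0 < a <= 1
    """
    if iou < n_t:
        return score                # below the threshold: unchanged
    return score * a * (1.0 - iou)  # larger overlap, stronger penalty
```

For example, with the defaults a box scoring 0.8 that overlaps the best box with IoU 0.5 is reduced to 0.4, while a box with IoU 0.1 keeps its score.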

The calculation formula of the Gaussian weighting method is:

In the formula, different values of σ change the strength of the penalty function. Compared with linear weighting, the Gaussian weighting function has the advantage of a smooth transition.
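The Gaussian formula itself is not reproduced in this text. As an illustration only, the sketch below assumes the commonly used Gaussian soft-NMS penalty, s_i ← s_i · exp(−IoU² / σ); this exact form is an assumption, not something the source confirms. The function name `gaussian_weight` is hypothetical, and σ=0.3 is the value reported later in the experiments.

```python
import math

def gaussian_weight(score, iou, sigma=0.3):
    """Penalized confidence under an assumed Gaussian decay.

    Assumption: score * exp(-iou^2 / sigma). No IoU threshold is used,
    so the penalty grows smoothly from zero overlap upward.
    """
    return score * math.exp(-(iou * iou) / sigma)
```

Under this assumed form a smaller σ penalizes overlapping boxes more strongly, which matches the statement that σ controls the strength of the penalty.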

The exponential weighting method also uses a threshold and penalizes a box when its IoU is greater than that threshold. Compared with linear weighting, it has a smooth transition at the threshold; compared with Gaussian weighting, it retains more of the weight in the range just before the threshold. Calculation formula:

In the formula, N_t likewise denotes the IoU threshold.

Based on repeated experiments, the detailed procedure of each improved non-maximum suppression algorithm and the situations it suits are summarized as follows:

(1) Linear weighting: boxes are sorted by confidence in descending order and the highest-confidence box of a class is selected as the optimal box; the IoU of the next box is compared with the given threshold N_t. When the IoU is less than the threshold, the confidence score is unchanged; otherwise the confidence is scaled by a(1-IoU), so the larger the IoU, the stronger the penalty. The operation loops; a box whose confidence falls below the given confidence threshold (threshold) is discarded, and the optimal boxes found so far are recorded. This method suits scenes with many duplicate boxes that need low time complexity and fast screening: the penalty is strong and screening finishes quickly, but the result is not very good.

(2) Gaussian weighting: the optimal-confidence box is selected in the same way, but no threshold N_t is set. The confidence decays through an exponential function, so the larger the IoU, the more the confidence score decreases, and the value of σ controls the degree of decay. Confidence scores are compared in a loop; a box is rejected as redundant when its score falls below the set confidence threshold (threshold), and otherwise the high-confidence optimal box is retained. This method suits cases where the number of predicted boxes is moderate, time requirements are not strict and counts must be accurate. The smooth transition in screening helps obtain better regression boxes; although the time complexity is worse than that of linear weighting, the result is better.

(3) Exponential weighting: the IoU between the next-best-confidence box and the optimal box is compared in the same way, and a threshold N_t is set. When the IoU is less than the threshold, the confidence score is unchanged; otherwise the confidence is reduced according to the exponential weighting formula, giving an exponentially decreasing effect. Once all boxes have been examined, the proposals whose confidence is greater than the threshold are retained. This method suits scenes where more regression boxes are kept and only some redundant boxes are deleted; it can be applied when only the presence or absence of a target is detected and no count is needed, so its effect is worse than that of linear and Gaussian weighting.

The calculation method used to lower the confidence can therefore be chosen according to the conditions and the application.
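The looping procedure described above can be sketched as follows. This is a hedged illustration rather than the patent's code: `iou_one_to_many` and `soft_nms` are hypothetical names, boxes are assumed to be in (x1, y1, x2, y2) form, the Gaussian branch assumes the decay exp(−IoU² / σ), and the exponential method is left out because its formula is not reproduced in the text.

```python
import numpy as np

def iou_one_to_many(box, boxes):
    """IoU between one (x1, y1, x2, y2) box and an (M, 4) array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def soft_nms(boxes, scores, method="gaussian",
             n_t=0.2, a=1.0, sigma=0.3, threshold=0.003):
    """Confidence-penalizing NMS; returns kept indices and their final scores."""
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float).copy()
    remaining = list(range(len(scores)))
    keep = []
    while remaining:
        # Select the highest-confidence remaining box as the optimal box b_m.
        m = max(remaining, key=lambda i: scores[i])
        keep.append(m)
        remaining.remove(m)
        if not remaining:
            break
        idx = np.array(remaining)
        overlaps = iou_one_to_many(boxes[m], boxes[idx])
        if method == "linear":
            weight = np.where(overlaps < n_t, 1.0, a * (1.0 - overlaps))
        elif method == "gaussian":
            weight = np.exp(-(overlaps ** 2) / sigma)  # assumed form
        else:
            raise ValueError("method must be 'linear' or 'gaussian'")
        scores[idx] *= weight
        # Discard boxes whose confidence fell below the confidence threshold.
        remaining = [i for i in remaining if scores[i] >= threshold]
    return keep, scores[keep]
```

A short usage example: two heavily overlapping boxes and one distant box. The overlapping box is kept with a reduced score instead of being deleted outright, which is the behaviour the text aims for with adjacent or overlapping targets.

```python
boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]]
keep, kept_scores = soft_nms(boxes, [0.9, 0.8, 0.7])
```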

The advantages of the invention are:

The technique combines online hard example mining with hard negative example mining. Its most prominent feature is that, without expanding the sample set, it relaxes the definition of negative samples and mines more hard training samples online from the samples themselves. It sets a positive-to-negative sample ratio so that the rare, hard-to-train samples with the largest loss are handled reasonably and are easy to compute. It balances the classification loss and the box regression loss, so that the training loss keeps decreasing. In addition, for the problem of missed detections among overlapping targets, the improved non-maximum suppression introduces the penalty function idea in three different ways to lower confidence; experimental comparison shows that Gaussian weighting gives the best model performance, improves recall and solves the missed detection of multiple targets. The improved algorithm can also be extended to detection in other fields, such as product defect detection (missed detections of overlapping chains), pedestrian detection and counting of transport trucks.

Description of the drawings

Fig. 1 is the process by which Anchors of different sizes generate proposals in the invention;

Fig. 2 is the backward pass of the selected hard examples in the invention;

Fig. 3 is the OHEM module added in the invention;

Fig. 4 is the flow of the improved non-maximum suppression algorithm of the invention;

Fig. 5a to Fig. 5f are the loss convergence curves and the learning rate plot before and after the Gaussian NMS improvement of the invention, where Fig. 5a is the RPN layer classification loss, Fig. 5b is the RPN layer box regression loss, Fig. 5c is the fully connected layer classification loss, Fig. 5d is the fully connected layer box regression loss, Fig. 5e is the total loss, and Fig. 5f is the adjusted learning rate strategy;

Fig. 6 is the detection result of the improved Gaussian NMS of the invention.

Specific embodiment

Addressing the sample problem, the invention combines online hard example mining with hard negative example mining without increasing the number of samples, so that the model learns the features of the existing hard examples in a targeted way and further improves its generalization and robustness.

To confirm the feasibility of the method of the invention, the following experiments were carried out with the parameter design above:

The experiments were run on the public VOC2007 and VOC2012 datasets, a 21-class problem with 20 object classes plus background, with more than 5,000 training images and more than 5,000 test images. Different combinations of online hard example mining and hard negative example mining strategies were compared, and the experimental results are analyzed as follows:

With the classical Faster RCNN object detection method as the baseline:

(1) Adding an OHEM module only to the final fully connected layer, or only to the RPN layer, gives no improvement;

(2) Adding OHEM modules to both the RPN layer and the final fully connected layer, using online hard example mining with a random positive-to-negative sample ratio and a classification to box regression loss weight of 1:1, improves the result;

(3) On the basis of (2), setting the RPN classification to box regression loss weight to 1:10, i.e. λ=10, balances the two losses and gives a further improvement of 1.3%, or 1.8% over the original, which shows that balancing the two losses benefits model training;

(4) Replacing the random positive-to-negative ratio with a fixed ratio of 1:3, i.e. hard negative example mining, with λ=10, improves the result by a further 0.4%, while other ratios give no clear improvement, which shows that hard negative example mining contributes to model training;

(5) Using the improved NMS linear weighting algorithm, with the optimal parameters selected through many experiments as a=1, N_t=0.2 and threshold=0.001, gives a further improvement of 1.3%;

(6) Using Gaussian weighting, σ=0.3 and threshold=0.003 are set after comparative analysis. The objects whose detection improves are mostly those that appear in clusters, such as dog, person, bird, sheep and plant, which improve by 3.1%, 7.6%, 9.1%, 1.3% and 24.8% respectively, raising the overall mAP by 6.3%, or 4.4 percentage points over the original;

(7) Using exponential weighting with N_t=0.1 and threshold=0.0001, the improvement is smaller than with methods (5) and (6);

(8) Adjusting the learning rate strategy on top of the best method, the Gaussian NMS improvement, lets the loss decrease again (Fig. 5), and mAP improves by 4.5 over the original;

(9) Training on the VOC2012 training samples and testing on the 07 test set, which is equivalent to adding more training samples, gives an improvement of 3.1 percentage points;

(10) Testing with the Gaussian NMS method on the basis of (9) improves mAP by 1.8 points. Analysis of the loss convergence curves shows that the adjusted learning rate strategy lets the box regression loss and the classification loss keep decreasing;

(11) Under the original learning rate strategy of Faster RCNN, the learning rate is dropped once at 30k iterations; analysis of the experiments found that this strategy cannot search for the global minimum well. The early search period is therefore lengthened and the learning rate drop point is set at 40k, which helps avoid falling into a local minimum.

The analysis of the experimental results above shows that combining OHEM and HNEM with the improved NMS and training strategy yields better model detection results on the existing dataset, an improvement of 4.5 percentage points. Of this, the improved Gaussian non-maximum suppression algorithm contributes 1.8 percentage points over the original model; it significantly improves the recall of the model and is well suited to detecting target objects that appear in groups.

The localization result of the improved NMS algorithm is shown in Fig. 6. The analysis shows that, for overlapping targets, the improved algorithm detects considerably better than the traditional NMS algorithm and reduces the risk of missing target objects. Recall therefore improves, the area under each P-R curve (AP) increases further, and the overall mAP is better as a result.

The content described in the embodiments of this specification merely enumerates forms in which the inventive concept may be realized. The protection scope of the invention should not be construed as limited to the specific forms stated in the embodiments; it also extends to equivalent technical means that those skilled in the art can conceive according to the inventive concept.

Claims

1. A Faster RCNN object detection method based on hard example mining, comprising the following steps:

Step 1, image object detection based on deep learning;

The analysis is based on Faster RCNN. Faster RCNN uses Softmax Loss and Smooth L1 Loss to train the classification probability and the box regression jointly, and the loss function is:

$$L(\{p_i\},\{t_i\}) = \frac{1}{N_{cls}}\sum_i L_{cls}(p_i, p_i^*) + \lambda\,\frac{1}{N_{reg}}\sum_i p_i^*\,L_{reg}(t_i, t_i^*)$$

where N_cls=256 is the number of foreground samples, and the number of box regressions N_reg=2400 is the maximum number of sliding positions on the last feature map, 40 × 60; i indexes a proposal; p_i is the predicted probability of the corresponding class, foreground or background; p_i* is the ground-truth value, with p_i* = 1 for foreground and p_i* = 0 for background, and is used to compute the box regression loss; t_i = (x_i, y_i, w_i, h_i) is the coordinate information of the proposal, i.e. its center point and its width and height; t_i* = (x_i*, y_i*, w_i*, h_i*) is the corresponding information of the true target object; λ balances the weights of the box regression loss and the classification loss; L_cls is the softmax classification loss; and the box regression loss uses the smooth L1 method, L_reg(t_i, t_i*) = smooth_L1(t_i − t_i*);

Step 2, based on the online hard example mining method, the key parameter settings adopted are:

(21) Set the screening mechanism for hard examples; at each iteration, samples are sorted in descending order of the current total loss L({p_i},{t_i}) and the top B/N samples are selected, where B=64 and N=1 is the number of images per training iteration; this speeds up backpropagation, because only a small number of gradients need to be adjusted;

(22) Improve computation speed; the hard examples are selected from the losses computed in the forward pass, and simply setting the losses of the non-hard examples to 0 in the backward pass does not reduce the model's GPU memory use; therefore only the B/N selected hard examples take part in gradient propagation in the backward pass, which reduces the training GPU memory from 3527M to 3057M;

(23) Add an OHEM module to both the RPN layer and the final fully connected layer; good classification results depend on accurate target localization, i.e. on proposal generation, so an OHEM module is also added to the RPN layer, which helps the box regression find the most accurate position and also makes the feature extraction for classification more effective;

(24) Adjust the weights of the classification loss and the box regression loss appropriately; in the loss function, the classification loss L_cls and the box regression loss L_reg are unbalanced: N_cls=256 is the number of classification samples and N_reg=2400 is the maximum number of sliding positions on the last feature map, so their reciprocals differ by about a factor of 10; therefore λ=10 is taken, which helps the model learn the corresponding target features in a targeted way while regressing better boxes;

(25) Adjust the non-maximum suppression (NMS) algorithm; classical non-maximum suppression cannot retain adjacent or overlapping multi-target detection boxes well, which lowers the recall of object detection; the improved non-maximum suppression algorithm of the invention therefore uses a confidence penalty mechanism on the class score, which retains useful proposals while removing redundant ones and further improves mAP; the specific operations are described in step 3;

(26) Use data augmentation to improve the generalization ability of the model; during training, random left-right mirror flipping and adjustment of illumination and saturation increase sample diversity and prevent over-fitting; to improve the model's ability to detect images of different sizes, multi-scale training is used: the short side of the image is set to a random size from {224, 416, 480, 512, 600, 672, 900} and the other side is scaled proportionally; this sample augmentation also further increases mAP;

(27) Adjust the learning rate strategy; the first learning rate drop is set at 40k iterations, equivalent to 8 epochs, and thereafter the learning rate is dropped again every 20k iterations, which improves the global search ability in the early stage and avoids falling into a local minimum; the smaller learning rate used later allows fine adjustment around the minimum, which helps the loss decrease several more times;

Step 3, hard negative example mining builds on hard example mining: the ratio of positive to negative samples in the mini-batch formed by the RPN during training is set to 1:3, and training is carried out, specifically including:

(31) Remove the lower threshold limit on negative samples; in the original Faster R-CNN, background is decided by the intersection over union (IoU) between the candidate proposals generated by the RPN and the ground-truth box: a proposal whose IoU lies in [0.1, 0.5) is considered background; the shortcoming of this setting is that it ignores the samples with IoU below 0.1, which are rare, have large loss and are important hard negatives, so their features are not learned well; the invention therefore sets the background threshold to [0, 0.5); a positive sample is the proposal with the largest IoU, or one whose IoU lies in the range [0.7, 1.0];

(32) Set the ratio of hard positive to hard negative samples to 1:3; as noted above, combining positive and negative samples and learning more background information improves the model's ability to detect and localize targets against specific backgrounds; the batch size per image is set to 64, so there are 16 positive samples and 48 negative samples;

(33) Since several predicted positive samples may overlap the same label at the same time, non-maximum suppression with a threshold of 0.7 is used; a prediction is deleted when its IoU with the label is below 0.7;

The parameters for hard negative example mining in step 3 are set as follows:

FG_THRESH: positive sample IoU threshold; parameter value: [0.7,1.0];

BG_THRESH_LO: negative sample IoU threshold; parameter value: [0,0.5);

HNEM_NMS_THRESH: non-maximum suppression threshold; parameter value: 0.7;

HNEM_BATCHSIZE: batch size per target image; parameter value: 64;

RPN_FG_FRACTION: positive sample fraction; parameter value: 0.25;

RPN_BG_FRACTION: negative sample fraction; parameter value: 0.75;
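The labelling and sampling rules of step 3 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: ranking candidates by their current loss follows the online hard example mining idea the method builds on, `BG_THRESH_HI` is a hypothetical name for the upper end of the background range, and the function name and input layout are invented for the example. At least one proposal is assumed to be present.

```python
import numpy as np

FG_THRESH = 0.7          # positive samples: IoU in [0.7, 1.0]
BG_THRESH_LO = 0.0       # lower end of the background range (0.1 in the original Faster R-CNN)
BG_THRESH_HI = 0.5       # hypothetical name: upper end of the background range
HNEM_BATCHSIZE = 64
RPN_FG_FRACTION = 0.25   # 1:3 positive-to-negative ratio


def sample_hard_minibatch(max_iou, loss):
    """Pick the hardest positives and negatives for one image.

    max_iou[i]: highest IoU of proposal i with any ground-truth box
    loss[i]:    current loss of proposal i (higher means harder)
    """
    max_iou = np.asarray(max_iou, dtype=float)
    loss = np.asarray(loss, dtype=float)

    is_fg = max_iou >= FG_THRESH
    is_fg[np.argmax(max_iou)] = True   # the best-overlapping proposal is always positive
    is_bg = (max_iou >= BG_THRESH_LO) & (max_iou < BG_THRESH_HI) & ~is_fg

    num_fg = int(HNEM_BATCHSIZE * RPN_FG_FRACTION)   # 16
    num_bg = HNEM_BATCHSIZE - num_fg                 # 48

    def hardest(mask, k):
        idx = np.flatnonzero(mask)
        order = np.argsort(-loss[idx], kind="stable")  # largest loss first
        return idx[order[:k]]

    return hardest(is_fg, num_fg), hardest(is_bg, num_bg)


if __name__ == "__main__":
    fg, bg = sample_hard_minibatch([0.05, 0.3, 0.6, 0.9], [0.9, 0.1, 0.5, 0.2])
    print(fg.tolist(), bg.tolist())
```

In the small example, the proposal with IoU 0.05 is kept as a negative; under the original [0.1,0.5) range it would have been ignored despite having the largest loss.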

Step 4, redundant boxes are rejected to avoid computing the loss multiple times. An improved non-maximum suppression algorithm is used to remove, in a reasonable way, the redundancy among the proposals generated by the RPN layer network. The specific operation is:

Instead of crudely deleting every proposal whose IoU exceeds the threshold, the improvement is to reduce its confidence. The calculation methods adopted are linear weighting, Gaussian weighting and exponential weighting;

The linear weighting method reduces the confidence of any box whose IoU exceeds the threshold, that is, it introduces the idea of a penalty function:

In the formula, s_i denotes the confidence score of the current class, a denotes the weight coefficient with 0 < a ≤ 1, b_m denotes the box with the highest confidence score, b_i denotes the current box, IoU(b_m, b_i) denotes the intersection over union of the two, and N_t denotes the given threshold;
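Based on the symbol definitions above and the reduction ratio a(1-IoU) stated in item (41), the linear weighting rule can be written as below. This is a reconstruction from the surrounding description rather than a verbatim formula, and the handling of the boundary case IoU = N_t is an assumption.

```latex
s_i =
\begin{cases}
s_i, & \mathrm{IoU}(b_m, b_i) < N_t \\
a \, \bigl(1 - \mathrm{IoU}(b_m, b_i)\bigr) \, s_i, & \mathrm{IoU}(b_m, b_i) \ge N_t
\end{cases}
```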

The calculation formula of the Gaussian weighting method is:

In the formula, the value of σ determines the strength of the penalty function; compared with linear weighting, the Gaussian weighting function has features such as a smooth transition;
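The text specifies only the role of σ. Assuming the method follows the commonly used Soft-NMS Gaussian penalty (an assumption, since the exact expression is not given), the rule would take the following form, applied to every remaining box without any threshold N_t:

```latex
s_i = s_i \, \exp\!\left( - \frac{\mathrm{IoU}(b_m, b_i)^2}{\sigma} \right)
```

A smaller σ makes the score fall off faster as the IoU grows, while a larger σ gives a milder penalty.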

The exponential weighting method likewise uses a threshold: when the IoU of a box exceeds the threshold, the box is penalized. Compared with the linear weighting method, it has a smooth transition at the threshold; compared with the Gaussian weighting method, it retains more weight in the stage before the threshold. Calculation formula:

In the formula, N_t likewise denotes the IoU threshold;

After repeated experiments, the detailed procedure and the suitable scenario of each improved non-maximum suppression algorithm are summarized as follows:

(41) Linear weighting: the boxes of a given class are sorted by confidence in descending order and the box with the highest confidence is selected as the optimal box. The IoU of the next box with the optimal box is compared with the given threshold N_t; when it is below the threshold, the confidence score is unchanged; otherwise the confidence is reduced by the ratio a(1-IoU), i.e., the larger the IoU, the stronger the penalty. The operation is repeated; pending boxes whose confidence falls below the given confidence threshold are discarded, and the optimal boxes found so far are recorded. This method suits scenarios with many duplicate boxes that require low time complexity and fast screening: the penalty is strong and screening finishes quickly, but the result obtained is not very satisfactory;

(42) Gaussian weighting: the box with the best confidence is likewise selected, but no threshold N_t needs to be set. The confidence decays through an exponential function, i.e., the larger the IoU, the more the confidence score is reduced, and the value of σ controls the degree of decay. The comparison is repeated; boxes whose confidence score falls below the set confidence threshold are rejected as redundant, otherwise the optimal high-confidence boxes are retained. This method suits cases where the number of predicted boxes is moderate, the time requirement is not strict and accurate counting is needed; the smooth transition in screening helps obtain better regression boxes. Although its time complexity is worse than that of the linear method, the result obtained is better;

(43) Exponential weighting: the IoU between the box with the next-best confidence and the optimal box is likewise compared with a set threshold N_t. When it is below the threshold, the confidence score is unchanged; otherwise the confidence is reduced according to the exponential weighting formula, giving an exponentially decaying effect. This continues until all boxes have been examined, and the proposals whose confidence exceeds the threshold are retained. This method suits scenarios in which more regression boxes are retained and only part of the redundant boxes are removed; it can be applied where only the presence or absence of a target needs to be detected and no counting is required, so its effect is poorer than that of the linear and Gaussian weighting methods;

The calculation method used to reduce the confidence is selected according to the conditions and the application scenario.
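The score-decay procedure described in (41) and (42) can be sketched in plain Python as follows. This is a minimal sketch under assumptions: the linear rule uses the a(1-IoU) ratio from (41), the Gaussian rule assumes the common exp(-IoU²/σ) penalty, the exponential variant is omitted because its formula is not given in the text, and the function names, box layout (x1, y1, x2, y2) and default parameter values are illustrative.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, method="linear", n_t=0.7, a=1.0,
             sigma=0.5, score_thresh=0.001):
    """Return (index, score) pairs of the kept boxes, in order of selection."""
    scores = list(scores)              # work on a copy; scores are decayed in place
    remaining = list(range(len(boxes)))
    kept = []
    while remaining:
        # Select the box with the highest current confidence as the optimal box.
        m = max(remaining, key=lambda i: scores[i])
        remaining.remove(m)
        kept.append((m, scores[m]))
        survivors = []
        for i in remaining:
            overlap = iou(boxes[m], boxes[i])
            if method == "linear":
                if overlap >= n_t:                  # penalize only above the threshold
                    scores[i] *= a * (1.0 - overlap)
            elif method == "gaussian":              # no N_t; assumed penalty form
                scores[i] *= math.exp(-(overlap * overlap) / sigma)
            else:
                raise ValueError("method must be 'linear' or 'gaussian'")
            if scores[i] >= score_thresh:           # drop boxes below the confidence threshold
                survivors.append(i)
        remaining = survivors
    return kept


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (0, 0, 10, 9), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(soft_nms(boxes, scores, method="linear"))
    print(soft_nms(boxes, scores, method="gaussian"))
```

In the example, the second box overlaps the first with IoU 0.9, so its score is decayed instead of the box being deleted outright; whether it survives depends on `score_thresh`.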