CN107316058A - Method for improving target detection performance by improving target classification and localization accuracy - Google Patents

Method for improving target detection performance by improving target classification and localization accuracy

Info

Publication number
CN107316058A
CN107316058A
Authority
CN
China
Prior art keywords
image
candidate frame
target detection
frame
convolutional layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710450327.2A
Other languages
Chinese (zh)
Inventor
娄英欣
周芸
付光涛
姜竹青
门爱东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National News Publishes Broadcast Research Institute Of General Bureau Of Radio Film And Television
Beijing University of Posts and Telecommunications
Academy of Broadcasting Science of SAPPRFT
Original Assignee
National News Publishes Broadcast Research Institute Of General Bureau Of Radio Film And Television
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National News Publishes Broadcast Research Institute Of General Bureau Of Radio Film And Television, Beijing University of Posts and Telecommunications filed Critical National News Publishes Broadcast Research Institute Of General Bureau Of Radio Film And Television
Priority to CN201710450327.2A priority Critical patent/CN107316058A/en
Publication of CN107316058A publication Critical patent/CN107316058A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/285Selection of pattern recognition techniques, e.g. of classifiers in a multi-classifier system
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to a method for improving target detection performance by improving target classification and localization accuracy. Its technical features are as follows: image features are extracted with a convolutional neural network architecture, and the outputs of the convolutional layers up to layer M are selected for feature fusion, forming a multi-feature map; a grid is laid over convolutional layer M, and a fixed number of target candidate boxes of fixed sizes is predicted in each grid cell; the candidate boxes are mapped onto the feature maps and cropped, and the cropped results are concatenated into a multi-feature representation; after this result passes through fully connected layers, the image features are classified with a Softmax classifier, and online iterative regression localization is performed with an overlap-area loss function to obtain the final target detection result. The present invention is reasonably designed: features are extracted by a convolutional neural network and fused across multiple layers, the image features are then classified with a Softmax classifier, and localization uses the overlap-area loss function, yielding good object detection results.

Description

Method for improving target detection performance by improving target classification and localization accuracy
Technical field
The present invention belongs to the technical field of target detection, and in particular relates to a method for improving target detection performance by improving target classification and localization accuracy.
Background technology
More than 80% of the information humans perceive about the material world comes from vision. Images convey information to people in different forms; as an important carrier of information that reflects objective reality, an image is intuitive, rich in content and easy to exchange, and is an important component of multimedia, so various applications based on image processing technology have emerged, among which image recognition and detection are the most typical. The purpose of computer vision research is to use computers to perceive, recognize and understand the objective world. Object detection is one of the most common problems in computer vision; it has received extensive attention in the computer vision research community and has broad application prospects. When a machine "opens its eyes" and looks at the world, it must determine which targets are present in its field of view, what each of them is, and where it is located. Vision-based target detection is a cross-disciplinary research problem involving image processing, computer vision, pattern recognition and many other fields. The purpose of target detection is to pick targets out of backgrounds of varying complexity and to mark them with bounding boxes, so that subsequent tasks such as tracking and recognition can be completed. Target detection is therefore the basic task underlying high-level understanding and applications, and its performance directly affects higher-level tasks such as target tracking, action recognition and behavior understanding. Especially in complex scenes, when multiple targets must be processed in real time, automatic extraction and recognition of targets becomes particularly important. Object detection and recognition is thus the foundation of image analysis and understanding, and in-depth research on detection and recognition algorithms is of great significance to both academia and industry. For machines, however, complex backgrounds and the dynamic changes of the targets themselves increase the difficulty of recognition, the huge number of parameters and the high-dimensional matrix operations consume a large amount of processing time, and problems such as detection accuracy and real-time performance still need to be improved.
The main task of target detection is to automatically detect target objects in image sequences, including judging their category and locating their position. A popular current detection pipeline first generates 1K-2K candidate boxes on an image, then extracts features for each candidate box with a CNN (convolutional neural network), feeds the features into a per-class SVM or Softmax classifier to judge whether the target belongs to that class, and finally corrects the position of the candidate box with a regressor to localize the target precisely. Traditional detection algorithms use features such as SIFT, HOG and LBP to find feature points in the image that are invariant under translation, affine transformation and rotation, and match images on that basis to realize detection. However, the quality of the extracted features directly affects classification accuracy; because of the morphological diversity of targets, the diversity of illumination changes and the diversity of backgrounds, designing a robust hand-crafted feature is not easy, so the adaptability of traditional features is limited. Feature extraction based on CNNs, by contrast, has good robustness: a convolutional neural network is a multi-layer perceptron specially designed to recognize two-dimensional shapes, and its structure is highly invariant to translation, scaling, tilting and other common deformations. The CNN model used for feature extraction is obtained by pre-training: it is pre-trained on the full dataset of the ILSVRC 2012 visual recognition challenge, and the pre-trained model is then fine-tuned on the PASCAL VOC 2007 training set, so that image features can be extracted through the CNN.
The widespread use of deep learning in target detection dates from the AlexNet convolutional neural network architecture proposed by Alex Krizhevsky et al., which achieved outstanding results in the ILSVRC 2012 competition; since then, convolutional neural networks have been widely used in all kinds of image-related fields. AlexNet, from Geoffrey Hinton's group, is an 8-layer CNN architecture with 5 convolutional layers and 3 fully connected layers; it halved the error rate of the best algorithm at the time, demonstrated the effectiveness of CNNs as complex models, and showed that GPUs make it possible to obtain training results in an acceptable amount of time. In 2014, Christian Szegedy proposed the GoogLeNet architecture, which won first place in the ILSVRC 2014 classification task; unlike AlexNet, GoogLeNet is deeper (more layers) and wider (more kernels or neurons per layer). In the same year, the VGG-Net architecture proposed by Andrew Zisserman won first place in the ILSVRC 2014 localization task; unlike AlexNet, VGG-Net uses more layers, typically 16-19. In 2015, the ResNet architecture proposed by Kaiming He won first place in both the ILSVRC 2015 classification and localization tasks, using a 152-layer deep convolutional neural network. The success of Professor Hinton's work has attracted the attention of many scholars at home and abroad; meanwhile, industry has joined deep learning research, with Baidu, Google and Facebook setting up deep learning laboratories one after another and applying deep learning to image recognition and classification. Although researchers have proposed many target detection algorithms based on deep convolutional neural networks, and these algorithms have achieved good results, many aspects still need improvement, such as complex image backgrounds, fixed network input sizes, too many candidate boxes, slow training, high memory consumption, inaccurate detection of small objects, complicated procedures and imprecise localization.
Summary of the invention
The object of the present invention is to overcome the deficiencies of the prior art and to propose a reasonably designed, high-precision and highly stable method for improving target detection performance by improving target classification and localization accuracy.
The present invention solves its technical problem by adopting the following technical scheme:
A method for improving target detection performance by improving target classification and localization accuracy comprises the following steps:
Step 1: extract image features with a convolutional neural network architecture, and select the outputs of the convolutional layers up to layer M for feature fusion, forming a multi-feature map;
Step 2: lay a grid over convolutional layer M, and predict a fixed number of target candidate boxes of fixed sizes in each grid cell;
Step 3: map the candidate boxes onto the feature maps and crop them, then concatenate the cropped results into a multi-feature representation;
Step 4: after the above result passes through fully connected layers, classify the image features with a Softmax classifier, and perform online iterative regression localization with an overlap-area loss function to obtain the final target detection result.
The specific method of step 1 comprises the following steps:
(1) First, an image annotated with the ground-truth bounding boxes of its objects is input into the convolutional neural network architecture, and Caffe is used to extract the image features output by the different layers of the network;
(2) Max pooling is applied to the image features output by the earlier convolutional layers, and a deconvolution operation is applied to the image features output by convolutional layer M, so that all outputs have the same size as the output features of the middle convolutional layer;
(3) Finally, the features output by all selected convolutional layers are fused to obtain the multi-feature map of the image (a sketch of this fusion follows this list).
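The fusion in step 1 can be illustrated with a minimal sketch. It assumes PyTorch-style tensors rather than the Caffe implementation referred to above, and the layer names, channel counts and pooling/deconvolution strides are illustrative choices for a VGG-16-like backbone, not values fixed by the invention.

```python
# Minimal sketch of the multi-layer feature fusion in step 1 (PyTorch-style;
# the patent uses Caffe + VGG-16, so names and channel counts here are illustrative).
import torch
import torch.nn as nn

class MultiLayerFusion(nn.Module):
    def __init__(self, ch5=512):
        super().__init__()
        # Earlier layers are max-pooled down to the spatial size of the middle layer (conv3).
        self.pool1 = nn.MaxPool2d(kernel_size=4, stride=4)   # conv1 -> conv3 size
        self.pool2 = nn.MaxPool2d(kernel_size=2, stride=2)   # conv2 -> conv3 size
        # Convolutional layer M (conv5) is upsampled by deconvolution to the same size.
        self.deconv5 = nn.ConvTranspose2d(ch5, ch5, kernel_size=4, stride=4)

    def forward(self, f1, f2, f3, f5):
        u1 = self.pool1(f1)
        u2 = self.pool2(f2)
        u5 = self.deconv5(f5)
        # All four feature maps now share conv3's spatial size; fuse by channel concatenation.
        return torch.cat([u1, u2, f3, u5], dim=1)
```

For example, with a 512x512 input to such a backbone, the conv1, conv2 and conv5 outputs would all be pooled or upsampled to the conv3 resolution (1/4 of the input) before concatenation.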
The implementation method of step 2 comprises the following steps:
(1) A 6*6 grid is laid over the feature map output by convolutional layer M;
(2) 4 candidate boxes that may contain objects are predicted at the center of each grid cell; these 4 candidate boxes have fixed sizes and aspect ratios, the aspect ratios being 1:1, 1:2 and 2:1, and for the 1:1 aspect ratio only, 2 candidate-box sizes, 0.6 and 0.9, are set;
(3) During network training, the ground-truth bounding boxes of the objects are matched with the candidate boxes: candidate boxes whose IoU (intersection area over union area) with a ground-truth box is greater than or equal to 0.7 are retained, and candidate boxes extending beyond the image boundary are deleted;
(4) 100 candidate boxes are finally generated on the feature map of convolutional layer M (a sketch of the generation and matching follows this list).
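A minimal sketch of the candidate-box generation and matching in step 2 follows. Only the 6*6 grid, the four boxes per cell, the three aspect ratios, the IoU >= 0.7 rule and the boundary check come from the text; the pixel sizes and the way a single size is assigned to the 1:2 and 2:1 boxes are illustrative assumptions (the embodiment below uses 32*32 and 64*64 pixels for the 1:1 boxes).

```python
# Minimal sketch of candidate-box generation and matching; sizes are illustrative.
import numpy as np

def generate_candidates(img_w, img_h, grid=6, sizes=(32, 64)):
    """4 boxes per grid cell: aspect ratio 1:1 at two sizes, plus 1:2 and 2:1."""
    boxes = []
    for gy in range(grid):
        for gx in range(grid):
            cx = (gx + 0.5) * img_w / grid
            cy = (gy + 0.5) * img_h / grid
            shapes = [(sizes[0], sizes[0]),      # 1:1, small
                      (sizes[1], sizes[1]),      # 1:1, large
                      (sizes[1], sizes[1] * 2),  # 1:2 (assumed single size)
                      (sizes[1] * 2, sizes[1])]  # 2:1 (assumed single size)
            for w, h in shapes:
                boxes.append([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])
    return np.array(boxes)

def iou(box, gt):
    """Intersection area over union area of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box[0], gt[0]), max(box[1], gt[1])
    ix2, iy2 = min(box[2], gt[2]), min(box[3], gt[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((box[2] - box[0]) * (box[3] - box[1])
             + (gt[2] - gt[0]) * (gt[3] - gt[1]) - inter)
    return inter / union if union > 0 else 0.0

def match_candidates(boxes, gts, img_w, img_h, thr=0.7):
    """Keep boxes inside the image whose IoU with some ground-truth box is >= 0.7."""
    inside = (boxes[:, 0] >= 0) & (boxes[:, 1] >= 0) & \
             (boxes[:, 2] <= img_w) & (boxes[:, 3] <= img_h)
    keep = [b for b, ok in zip(boxes, inside)
            if ok and any(iou(b, gt) >= thr for gt in gts)]
    return np.array(keep)
```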
The implementation method of step 3 comprises the following steps:
(1) The 100 candidate boxes generated on the feature map of convolutional layer M are mapped, according to their positions, onto the corresponding multi-layer feature maps and cropped accordingly;
(2) A 1*1 convolution is applied to the cropped feature-map block, and 3*3 and 5*5 convolutions are then applied to the result respectively;
(3) To obtain global context information, the multi-layer feature map is passed through a max pooling layer and then through a 1*1 convolutional layer and an activation layer;
(4) The outputs of the 1*1 convolution, the 3*3 convolution, the 5*5 convolution and the global context branch are concatenated in order, forming the multi-feature connection of the candidate box (see the sketch after this list).
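The multi-branch connection in step 3 can be sketched as follows (PyTorch-style). The channel counts, paddings and the exact placement of the activation are illustrative assumptions; only the 1*1 / 3*3 / 5*5 branches and the pooled context branch follow the description above.

```python
# Minimal sketch of the multi-branch "multi-feature connection" in step 3.
import torch
import torch.nn as nn

class MultiFeatureConnection(nn.Module):
    def __init__(self, in_ch, mid_ch=128):
        super().__init__()
        # 1x1 convolution first, reducing channels before the larger kernels.
        self.reduce = nn.Conv2d(in_ch, mid_ch, kernel_size=1)
        self.branch3 = nn.Conv2d(mid_ch, mid_ch, kernel_size=3, padding=1)
        self.branch5 = nn.Conv2d(mid_ch, mid_ch, kernel_size=5, padding=2)
        # Context branch: max pooling followed by a 1x1 convolution and activation.
        self.context = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_ch, mid_ch, kernel_size=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, crop):
        r = self.reduce(crop)          # 1x1 branch
        b3 = self.branch3(r)           # 3x3 branch
        b5 = self.branch5(r)           # 5x5 branch
        ctx = self.context(crop)       # pooled context branch
        # Concatenate the four outputs along the channel dimension.
        return torch.cat([r, b3, b5, ctx], dim=1)
```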
The concrete implementation of step 4 comprises the following steps:
(1) After passing through the fully connected layers, the image features are classified with a Softmax classifier; on the target detection dataset, each object class has its own corresponding precision;
(2) Regression localization is performed on the candidate boxes with the overlap-area loss function so that each candidate box moves closer to the ground-truth bounding box of its object; the loss function is based on the intersection area of the candidate box and the ground-truth box divided by their union area;
(3) The candidates are ranked according to the Softmax loss and the overlap-area loss values, positive and negative samples are selected online at a ratio of 3:1, and the updated sample library is fed back onto the multi-layer feature map to continue the iterative regression localization (a sketch of this selection follows this list);
(4) After N iterations, the candidate boxes are closer to the ground-truth bounding boxes of the objects, and once the model is well trained it can be tested on real objects.
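A minimal sketch of the online hard-sample selection in step 4 is given below. It assumes per-candidate loss values are already available as arrays and that the ratio means positives : negatives = 3 : 1 as stated above; how the two losses are combined into one ranking score is an illustrative assumption.

```python
# Minimal sketch of online hard-sample selection at a 3:1 positive/negative ratio.
import numpy as np

def select_hard_samples(softmax_loss, overlap_loss, is_positive, ratio=3):
    """Rank candidates by combined loss and keep positives and negatives at ratio:1."""
    total = np.asarray(softmax_loss) + np.asarray(overlap_loss)   # combined ranking score
    order = np.argsort(-total)                                    # hardest samples first
    pos = [i for i in order if is_positive[i]]
    neg = [i for i in order if not is_positive[i]]
    n_neg = min(len(neg), max(1, len(pos) // ratio))              # positives : negatives = 3 : 1
    return np.array(pos + neg[:n_neg])                            # indices fed back for the next iteration
```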
The advantages and positive effects of the present invention are:
1. The present invention inputs the image into a VGG-16 convolutional neural network to obtain more image information and extract image features, and then fuses multiple layers of the image features to form a multi-feature map. To obtain object candidate boxes quickly, target candidate boxes are generated with fixed aspect ratios and sizes on the feature map of convolutional layer 5 and mapped onto the multi-feature map for cropping. To obtain more information about each candidate box, the cropped results are concatenated into a multi-feature connection and fed into the fully connected layers. To achieve accurate classification and localization, Softmax classification and iterative regression localization with the Overlap loss function are performed, realizing the complete classification and localization of target detection and obtaining detection results better than mainstream detection frameworks such as Faster R-CNN.
2. The present invention is reasonably designed. It performs multi-feature extraction with a deep learning framework to obtain a multi-layer feature representation of the image, thereby achieving more accurate classification, and it adopts a new Overlap (overlap-area) loss function for localization, which detects the position of the target object in the input image more accurately and achieves good results on target detection datasets.
Brief description of the drawings
Fig. 1 is the overall block diagram of the present invention;
Fig. 2 shows the candidate boxes of fixed aspect ratios and sizes generated by the present invention on the feature map of convolutional layer 5;
Fig. 3 shows the Overlap loss function proposed by the present invention between a candidate box and the ground-truth bounding box during localization;
Fig. 4 shows the influence of different numbers of training iterations on the target detection precision of the present invention;
Fig. 5 is the table of target detection precision based on PASCAL VOC.
Detailed description of the embodiments
Embodiments of the present invention are further described below with reference to the accompanying drawings.
A method for improving target detection performance by improving target classification and localization accuracy is shown in Fig. 1. First, to obtain more image information, an image annotated with the ground-truth bounding boxes (Ground Truth) of its objects is input into a VGG-16 convolutional neural network to extract image features, and multiple layers of the image features are fused to form a multi-feature map. Then, to obtain object candidate boxes quickly, target candidate boxes are generated with fixed aspect ratios and sizes on the feature map of convolutional layer 5 and mapped onto the multi-feature map for cropping. Next, to obtain more information about each candidate box, the cropped results are concatenated into a multi-feature connection and fed into the fully connected layers. Finally, to achieve accurate classification and localization, Softmax classification and iterative regression localization with the Overlap loss function are performed, realizing the complete, precision-improved classification and localization of target detection. A specific example is described below:
S1: extract image features based on the VGG-16 convolutional neural network architecture, and fuse the features of convolutional layers 1, 2, 3 and 5 to form a multi-feature map;
S2: lay a grid over convolutional layer 5, and predict a fixed number of target candidate boxes of fixed sizes in each grid cell;
S3: map the candidate boxes onto the feature maps and crop them, then concatenate the cropped results into a multi-feature connection;
S4: after the above result passes through the fully connected layers, classify the image features with the Softmax classifier, and perform online iterative regression localization with the Overlap (overlap-area) loss function to obtain the final target detection result.
In the present embodiment, step S1 further comprises:
S1.1: first, the image annotated with the ground-truth bounding boxes of its objects is input into the VGG-16 convolutional neural network architecture, and Caffe is used to extract the image features output by the different layers of the network;
S1.2: max pooling is applied to the image features output by convolutional layers 1 and 2, and a deconvolution operation is applied to the image features output by convolutional layer 5, so that all outputs have the same size as the output features of convolutional layer 3;
S1.3: finally, the features output by convolutional layers 1, 2, 3 and 5 are fused to obtain the multi-feature map of the image.
Fig. 2 shows the grid laid over the feature map of convolutional layer 5 and the 4 candidate boxes of fixed aspect ratios and sizes generated in each grid cell. Step S2 further comprises:
S2.1: a 6*6 grid is laid over the feature map output by convolutional layer 5;
S2.2: 4 candidate boxes that may contain objects are predicted at the center of each grid cell; these 4 candidate boxes have fixed sizes and aspect ratios, the aspect ratios being 1:1, 1:2 and 2:1, and for the 1:1 aspect ratio only, 2 candidate-box sizes are set, namely 32*32 pixels and 64*64 pixels;
S2.3: during network training, the ground-truth bounding boxes of the objects are matched with the candidate boxes: candidate boxes whose IoU (intersection area over union area) with a ground-truth box is greater than or equal to 0.7 are retained, and candidate boxes extending beyond the image boundary are deleted;
S2.4: about 100 candidate boxes are finally generated on the feature map of convolutional layer 5.
In the present embodiment, step S3 further comprises:
S3.1: the 100 candidate boxes generated on the feature map of convolutional layer 5 are mapped, according to their positions, onto the corresponding multi-layer feature maps and cropped accordingly;
S3.2: a 1*1 convolution is applied to the cropped feature-map block, in order to retain the receptive field of the preceding layer and reduce computation, and 3*3 and 5*5 convolutions are then applied to the result respectively;
S3.3: to obtain global context information, the multi-layer feature map is passed through a max pooling layer and then through a 1*1 convolutional layer and an activation layer, which halves the amount of computation;
S3.4: the outputs of the 1*1 convolution, the 3*3 convolution, the 5*5 convolution and the global context branch are concatenated in order, forming the multi-feature connection of the candidate box.
In the present embodiment, step S4 further comprises:
S4.1: after the result of the convolutional layers and the multi-feature connection passes through 3 fully connected layers, the image features are classified with the Softmax classifier; based on the PASCAL VOC dataset, the classification results cover 20 object classes, each object class having its own corresponding precision;
S4.2: regression localization is performed on the candidate boxes with the Overlap (overlap-area) loss function so that each candidate box moves closer to the ground-truth bounding box of its object; the loss function is based on the intersection area of the candidate box and the ground-truth box divided by their union area, and the closer this value is to 1, the closer the two boxes are;
S4.3: the candidates are ranked according to the Softmax loss and the Overlap loss values, positive and negative samples are selected online at a ratio of 3:1, and the updated sample library is fed back onto the multi-layer feature map to continue the iterative regression localization;
S4.4: after N iterations, the candidate boxes are closer to the ground-truth bounding boxes of the objects, and once the model is well trained it can be tested on real objects.
Fig. 3 shows the Overlap loss function between a candidate box and the ground-truth bounding box proposed by the present invention for localization. Step S4.2 further comprises:
S4.2.1: the input image contains the ground-truth bounding box (Ground Truth) of an object; the coordinates of its upper-left and lower-right corners form a 4-dimensional vector, denoted here x* = (x1*, y1*, x2*, y2*). The object candidate box predicted by the algorithm of the invention has upper-left corner (x1, y1) and lower-right corner (x2, y2), and its coordinates form the 4-dimensional vector x = (x1, y1, x2, y2);
S4.2.2: the traditional coordinate loss is a one-dimensional loss function that sums the losses between the individual coordinate points to compute the overall position-offset loss; however, the traditional method treats each coordinate separately and cannot predict the offset loss between the ground-truth box and the candidate box as a whole. The traditional one-dimensional coordinate loss has the general form

L_coord(x, x*) = Σ_i [ ℓ(x_i, x_i*) + ℓ(y_i, y_i*) ], i = 1, 2,

where ℓ is a per-coordinate loss term (for example, the squared difference);
S4.2.3: in order to predict the offset loss between the ground-truth box and the candidate box as a whole, the 4-dimensional coordinates are regressed jointly, and the Overlap loss function computes the area-offset loss between the ground-truth box and the candidate box. Here I denotes the intersection area of the two boxes and U denotes their union area; the position deviation between the two boxes is evaluated by dividing the intersection area by the union area, and the closer this value is to 1, the better the coordinate regression. With the corners of the intersection box written as x1' = max(x1, x1*), y1' = max(y1, y1*), x2' = min(x2, x2*), y2' = min(y2, y2*), the quantities of the two-dimensional Overlap loss function are

I = (x2' − x1') × (y2' − y1')
U = (x2 − x1)(y2 − y1) + (x2* − x1*)(y2* − y1*) − I
Overlap = I / U
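The Overlap term above can be implemented as a differentiable function for the regression step, as in the following minimal sketch (PyTorch-style). The I and U definitions follow the formulas above; how the ratio enters the training objective is an illustrative assumption here (1 − I/U is used), since the text only states that values closer to 1 are better.

```python
# Minimal sketch of the two-dimensional Overlap term as a differentiable function.
import torch

def overlap_loss(pred, gt, eps=1e-6):
    """pred, gt: tensors of shape (N, 4) holding (x1, y1, x2, y2)."""
    x1 = torch.max(pred[:, 0], gt[:, 0])
    y1 = torch.max(pred[:, 1], gt[:, 1])
    x2 = torch.min(pred[:, 2], gt[:, 2])
    y2 = torch.min(pred[:, 3], gt[:, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)        # I
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_g = (gt[:, 2] - gt[:, 0]) * (gt[:, 3] - gt[:, 1])
    union = area_p + area_g - inter                                 # U
    overlap = inter / (union + eps)                                 # I / U, closer to 1 is better
    return (1.0 - overlap).mean()                                   # assumed training objective
```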
Fig. 4 shows the influence of different numbers of training iterations on target detection precision. Step S4.4 further comprises:
S4.4.1: during training, the candidates are ranked according to the Softmax loss and the Overlap loss values, hard positive and negative samples are selected at a ratio of 3:1, and these samples are fed back onto the multi-layer feature map to be cropped and connected into multi-features again; by mining hard samples, the robustness and detection precision of the proposed system are improved;
S4.4.2: as can be seen from Fig. 4, the classification precision of target detection improves greatly after several rounds of iterative training: after 1 iteration the precision is about 42%, and it rises rapidly as the iterations continue, but after 4 iterations the further improvement is small. Balancing accuracy and speed, the number of training iterations of the system is therefore set to 4, which improves both the classification precision and the regression precision of target detection overall.
The method of the present invention is tested below to illustrate its experimental effect.
Test environment: MATLAB 2014b; Caffe; Ubuntu 14.04 system; NVIDIA GTX 1070p GPU.
Test sequences: the selected test sequences and the corresponding standard ground-truth bounding boxes (Ground Truth) of the detection objects all come from the PASCAL VOC target detection dataset (M. Everingham, L. Van Gool, C. K. Williams, J. Winn, and A. Zisserman, "The PASCAL visual object classes (VOC) challenge," International Journal of Computer Vision, vol. 88, no. 2, pp. 303-338, 2007). The dataset covers 20 classes: person; animals (bird, cat, cow, dog, horse, sheep); vehicles (aeroplane, bicycle, boat, bus, car, motorbike, train); indoor objects (bottle, chair, dining table, potted plant, sofa, TV). The chosen targets are the most common everyday objects, which better demonstrates the practicality of the algorithm. The dataset contains 9,963 images in total, with 24,640 annotated target objects.
Test metrics: the present invention uses two evaluation metrics, precision mAP (mean average precision) and speed fps (frames per second). mAP measures the accuracy of the detection results: the detections are compared with the ideal detection results and a weighted average is computed over all object classes in the database; computing this value for different algorithms shows that the algorithm of the invention obtains better results in the field of object detection. fps measures the speed of detection: it evaluates detection speed by how many frames can be processed per second during testing; computing this value for different algorithms shows the superiority of the algorithm of the invention in the field of object detection. A sketch of the per-class AP computation is given below.
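For illustration, a per-class average precision in the spirit of the PASCAL VOC metric can be computed from ranked detections as in the minimal sketch below; mAP is then the mean of the per-class AP values over the 20 classes. The 11-point interpolation follows the VOC 2007 convention, while the matching of detections to ground truth (which determines the true-positive flags) is assumed to be done beforehand.

```python
# Minimal sketch of a per-class average precision (AP) computation (VOC 2007 style).
import numpy as np

def average_precision(scores, is_true_positive, num_gt):
    """scores: confidence per detection; is_true_positive: 1/0 flag per detection."""
    order = np.argsort(-np.asarray(scores))
    tp = np.cumsum(np.asarray(is_true_positive)[order])
    fp = np.cumsum(1 - np.asarray(is_true_positive)[order])
    recall = tp / max(num_gt, 1)
    precision = tp / np.maximum(tp + fp, 1)
    # 11-point interpolation as in VOC 2007.
    ap = 0.0
    for t in np.linspace(0.0, 1.0, 11):
        p = precision[recall >= t].max() if np.any(recall >= t) else 0.0
        ap += p / 11.0
    return ap
```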
The test results are shown in Fig. 5, which gives the average detection precision over all image classes in the PASCAL VOC dataset. It can be seen that the algorithm of the invention clearly improves mAP compared with other target detection algorithms, where "Invention 4" denotes the model trained with 4 rounds of loop iteration and "Invention 6" the model trained with 6 rounds. The best current target detector, Faster R-CNN, reaches an mAP of 73.2%, while Invention 6 reaches 74.2%, a detection precision 1.0% higher than Faster R-CNN. Moreover, for small objects such as bottles, aircraft and plants, the algorithm of the invention obtains higher detection precision than the other algorithms; for the small-object class "potted plant", for example, it reaches 50.4% mAP, 11.6% mAP higher than Faster R-CNN. These results show that the detection results produced by the algorithm of the invention have higher precision and that the algorithm better solves the problem of small target detection.
Table 1: Target detection speed based on PASCAL VOC
Table 1 gives the results for detection speed over all image classes in the PASCAL VOC dataset. It can be seen that the algorithm of the invention clearly improves fps compared with other target detection algorithms, where "compressed" means that the fully connected layers are compressed with truncated SVD (singular value decomposition); a sketch of this compression is given below. The fastest existing detector, Faster R-CNN, runs at 7 fps; the compressed Invention 4 runs at 12 fps, 2 fps faster than the uncompressed network, and the compressed Invention 6 also runs at 12 fps, again 2 fps faster than the uncompressed network. The speed of the algorithm of the invention is 22 times that of Fast R-CNN, which is close to real-time detection. These results show that the detection produced by the present invention is faster, and that in the two indicators of speed and precision the invention reaches the best detection results, illustrating that the algorithm of the invention is at the frontier of the field.
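The SVD compression of the fully connected layers mentioned for Table 1 can be sketched as follows: a weight matrix is factored into two thinner layers with a truncated singular value decomposition, which reduces the number of multiply-adds. The rank and layer sizes in the example are illustrative assumptions, not values taken from the patent.

```python
# Minimal sketch of truncated-SVD compression of a fully connected layer.
import numpy as np

def compress_fc(W, b, k):
    """Factor an FC layer y = W x + b (W: out x in) into two thinner layers of rank k."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    W1 = Vt[:k, :] * S[:k, None] ** 0.5          # first layer weights: k x in
    W2 = U[:, :k] * S[:k] ** 0.5                 # second layer weights: out x k
    return W1, W2, b                              # y is approximated by W2 @ (W1 @ x) + b

# Example: a 4096x4096 layer compressed to rank 256 keeps roughly 1/8 of the multiply-adds.
```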
It should be emphasized that the embodiments of the present invention are illustrative rather than restrictive; therefore, the present invention includes, and is not limited to, the embodiments described in the detailed description, and any other embodiments derived by those skilled in the art from the technical scheme of the present invention likewise fall within the scope of protection of the present invention.

Claims (5)

1. A method for improving target detection performance by improving target classification and localization accuracy, characterized by comprising the following steps:
Step 1: extract image features with a convolutional neural network architecture, and select the outputs of the convolutional layers up to layer M for feature fusion, forming a multi-feature map;
Step 2: lay a grid over convolutional layer M, and predict a fixed number of target candidate boxes of fixed sizes in each grid cell;
Step 3: map the candidate boxes onto the feature maps and crop them, then concatenate the cropped results into a multi-feature connection;
Step 4: after the above result passes through fully connected layers, classify the image features with a Softmax classifier, and perform online iterative regression localization with an overlap-area loss function to obtain the final target detection result.
2. The method for improving target detection performance by improving target classification and localization accuracy according to claim 1, characterized in that the specific method of step 1 comprises the following steps:
(1) First, an image annotated with the ground-truth bounding boxes of its objects is input into the convolutional neural network architecture, and Caffe is used to extract the image features output by the different layers of the network;
(2) Max pooling is applied to the image features output by the earlier convolutional layers, and a deconvolution operation is applied to the image features output by convolutional layer M, so that all outputs have the same size as the output features of the middle convolutional layer;
(3) Finally, the features output by all selected convolutional layers are fused to obtain the multi-feature map of the image.
3. The method for improving target detection performance by improving target classification and localization accuracy according to claim 1, characterized in that the implementation method of step 2 comprises the following steps:
(1) A 6*6 grid is laid over the feature map output by convolutional layer M;
(2) 4 candidate boxes that may contain objects are predicted at the center of each grid cell; these 4 candidate boxes have fixed sizes and aspect ratios, the aspect ratios being 1:1, 1:2 and 2:1, and for the 1:1 aspect ratio only, 2 candidate-box sizes, 0.6 and 0.9, are set;
(3) During network training, the ground-truth bounding boxes of the objects are matched with the candidate boxes: candidate boxes whose IoU (intersection area over union area) with a ground-truth box is greater than or equal to 0.7 are retained, and candidate boxes extending beyond the image boundary are deleted;
(4) 100 candidate boxes are finally generated on the feature map of convolutional layer M.
4. The method for improving target detection performance by improving target classification and localization accuracy according to claim 1, characterized in that the implementation method of step 3 comprises the following steps:
(1) The 100 candidate boxes generated on the feature map of convolutional layer M are mapped, according to their positions, onto the corresponding multi-layer feature maps and cropped accordingly;
(2) A 1*1 convolution is applied to the cropped feature-map block, and 3*3 and 5*5 convolutions are then applied to the result respectively;
(3) To obtain global context information, the multi-layer feature map is passed through a max pooling layer and then through a 1*1 convolutional layer and an activation layer;
(4) The outputs of the 1*1 convolution, the 3*3 convolution, the 5*5 convolution and the global context branch are concatenated in order, forming the multi-feature connection of the candidate box.
5. the degree of accuracy according to claim 1 by improving target classification and positioning improves target detection performance, it is special Levy and be:The concrete methods of realizing of the step 4 comprises the following steps:
By full articulamentum after, characteristics of image is classified by Softmax sorting algorithms, the data based on target detection Collection, has oneself corresponding precision per type objects;
(2) recurrence positioning is carried out to candidate frame by overlapping area loss function so that true bag of the candidate frame closer to object Peripheral frame, the loss function is candidate frame and the common factor area divided by union area of true encirclement frame;
(3) it is ranked up according to Softmax losses and overlapping area penalty values, positive sample and the ratio of negative sample is filtered out online For 3:1, renewal Sample Storehouse, which is input on multilayer feature figure, proceeds iterative regression positioning;
(4) after iteration n times, candidate frame closer to object true encirclement frame, model training well after can carry out actual object Test.
CN201710450327.2A 2017-06-15 2017-06-15 Improve the method for target detection performance by improving target classification and positional accuracy Pending CN107316058A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710450327.2A CN107316058A (en) 2017-06-15 2017-06-15 Improve the method for target detection performance by improving target classification and positional accuracy

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710450327.2A CN107316058A (en) 2017-06-15 2017-06-15 Improve the method for target detection performance by improving target classification and positional accuracy

Publications (1)

Publication Number Publication Date
CN107316058A true CN107316058A (en) 2017-11-03

Family

ID=60181717

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710450327.2A Pending CN107316058A (en) 2017-06-15 2017-06-15 Improve the method for target detection performance by improving target classification and positional accuracy

Country Status (1)

Country Link
CN (1) CN107316058A (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9501724B1 (en) * 2015-06-09 2016-11-22 Adobe Systems Incorporated Font recognition and font similarity learning using a deep neural network
US20160364633A1 (en) * 2015-06-09 2016-12-15 Adobe Systems Incorporated Font recognition and font similarity learning using a deep neural network
CN104881662A (en) * 2015-06-26 2015-09-02 北京畅景立达软件技术有限公司 Single-image pedestrian detection method
US20170083792A1 (en) * 2015-09-22 2017-03-23 Xerox Corporation Similarity-based detection of prominent objects using deep cnn pooling layers as features
CN105488758A (en) * 2015-11-30 2016-04-13 河北工业大学 Image scaling method based on content awareness
CN106127204A (en) * 2016-06-30 2016-11-16 华南理工大学 A kind of multi-direction meter reading Region detection algorithms of full convolutional neural networks
CN106250812A (en) * 2016-07-15 2016-12-21 汤平 A kind of model recognizing method based on quick R CNN deep neural network
CN106650725A (en) * 2016-11-29 2017-05-10 华南理工大学 Full convolutional neural network-based candidate text box generation and text detection method
CN106650699A (en) * 2016-12-30 2017-05-10 中国科学院深圳先进技术研究院 CNN-based face detection method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
任少卿: "基于特征共享的高效物体检测", 《中国博士学位论文全文数据库(信息科技辑)》 *

Cited By (58)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108171112A (en) * 2017-12-01 2018-06-15 西安电子科技大学 Vehicle identification and tracking based on convolutional neural networks
CN108171112B (en) * 2017-12-01 2021-06-01 西安电子科技大学 Vehicle identification and tracking method based on convolutional neural network
CN108229341B (en) * 2017-12-15 2021-08-06 北京市商汤科技开发有限公司 Classification method and device, electronic equipment and computer storage medium
CN108229341A (en) * 2017-12-15 2018-06-29 北京市商汤科技开发有限公司 Sorting technique and device, electronic equipment, computer storage media, program
CN108229477A (en) * 2018-01-25 2018-06-29 深圳市商汤科技有限公司 For visual correlation recognition methods, device, equipment and the storage medium of image
CN108229477B (en) * 2018-01-25 2020-10-09 深圳市商汤科技有限公司 Visual relevance identification method, device, equipment and storage medium for image
CN108205687B (en) * 2018-02-01 2022-04-01 通号通信信息集团有限公司 Attention mechanism-based positioning loss calculation method and system in target detection system
CN108205687A (en) * 2018-02-01 2018-06-26 通号通信信息集团有限公司 Based on focus mechanism positioning loss calculation method and system in object detection system
WO2019179269A1 (en) * 2018-03-21 2019-09-26 广州极飞科技有限公司 Method and apparatus for acquiring boundary of area to be operated, and operation route planning method
CN108830131A (en) * 2018-04-10 2018-11-16 中科院微电子研究所昆山分所 Traffic target detection and distance measuring method based on deep learning
CN108830131B (en) * 2018-04-10 2021-05-04 昆山微电子技术研究院 Deep learning-based traffic target detection and ranging method
CN108875577A (en) * 2018-05-11 2018-11-23 深圳市易成自动驾驶技术有限公司 Object detection method, device and computer readable storage medium
CN108830280B (en) * 2018-05-14 2021-10-26 华南理工大学 Small target detection method based on regional nomination
CN108830280A (en) * 2018-05-14 2018-11-16 华南理工大学 A kind of small target detecting method based on region nomination
CN108805064A (en) * 2018-05-31 2018-11-13 中国农业大学 A kind of fish detection and localization and recognition methods and system based on deep learning
CN110555354A (en) * 2018-05-31 2019-12-10 北京深鉴智能科技有限公司 Feature screening method and apparatus, target detection method and apparatus, electronic apparatus, and storage medium
CN108805210B (en) * 2018-06-14 2022-03-04 深圳深知未来智能有限公司 Bullet hole identification method based on deep learning
CN108805210A (en) * 2018-06-14 2018-11-13 深圳深知未来智能有限公司 A kind of shell hole recognition methods based on deep learning
CN110610184A (en) * 2018-06-15 2019-12-24 阿里巴巴集团控股有限公司 Method, device and equipment for detecting salient object of image
CN110610184B (en) * 2018-06-15 2023-05-12 阿里巴巴集团控股有限公司 Method, device and equipment for detecting salient targets of images
CN108830224A (en) * 2018-06-19 2018-11-16 武汉大学 A kind of high-resolution remote sensing image Ship Target Detection method based on deep learning
CN108830224B (en) * 2018-06-19 2021-04-02 武汉大学 High-resolution remote sensing image ship target detection method based on deep learning
CN109086779A (en) * 2018-07-28 2018-12-25 天津大学 A kind of attention target identification method based on convolutional neural networks
CN109086779B (en) * 2018-07-28 2021-11-09 天津大学 Attention target identification method based on convolutional neural network
CN110874641A (en) * 2018-08-29 2020-03-10 松下电器(美国)知识产权公司 Information processing method and information processing system
US10902314B2 (en) 2018-09-19 2021-01-26 Industrial Technology Research Institute Neural network-based classification method and classification device thereof
CN110956060A (en) * 2018-09-27 2020-04-03 北京市商汤科技开发有限公司 Motion recognition method, driving motion analysis method, device and electronic equipment
CN109493370A (en) * 2018-10-12 2019-03-19 西南交通大学 A kind of method for tracking target based on spatial offset study
CN109493370B (en) * 2018-10-12 2021-07-02 西南交通大学 Target tracking method based on space offset learning
CN109447066B (en) * 2018-10-18 2021-08-20 中国人民武装警察部队海警学院 Rapid and accurate single-stage target detection method and device
CN109447066A (en) * 2018-10-18 2019-03-08 中国人民武装警察部队海警学院 A kind of quick accurately single phase object detection method and device
CN109492685A (en) * 2018-10-31 2019-03-19 中国矿业大学 A kind of target object visible detection method for symmetrical feature
CN111126421A (en) * 2018-10-31 2020-05-08 浙江宇视科技有限公司 Target detection method, device and readable storage medium
CN109492685B (en) * 2018-10-31 2022-05-24 煤炭科学研究总院 Target object visual detection method for symmetric characteristics
CN109583483B (en) * 2018-11-13 2020-12-11 中国科学院计算技术研究所 Target detection method and system based on convolutional neural network
CN109508672A (en) * 2018-11-13 2019-03-22 云南大学 A kind of real-time video object detection method
CN109583483A (en) * 2018-11-13 2019-04-05 中国科学院计算技术研究所 A kind of object detection method and system based on convolutional neural networks
CN109697464A (en) * 2018-12-17 2019-04-30 环球智达科技(北京)有限公司 Method and system based on the identification of the precision target of object detection and signature search
CN111325075A (en) * 2018-12-17 2020-06-23 北京华航无线电测量研究所 Video sequence target detection method
CN111325075B (en) * 2018-12-17 2023-11-07 北京华航无线电测量研究所 Video sequence target detection method
CN109685008A (en) * 2018-12-25 2019-04-26 云南大学 A kind of real-time video object detection method
CN109784349A (en) * 2018-12-25 2019-05-21 东软集团股份有限公司 Image object detection model method for building up, device, storage medium and program product
CN109711326A (en) * 2018-12-25 2019-05-03 云南大学 A kind of video object detection method based on shallow-layer residual error network
CN109816012A (en) * 2019-01-22 2019-05-28 南京邮电大学 A kind of multiscale target detection method of integrating context information
CN109816012B (en) * 2019-01-22 2022-07-12 南京邮电大学 Multi-scale target detection method fusing context information
CN109918951A (en) * 2019-03-12 2019-06-21 中国科学院信息工程研究所 A kind of artificial intelligence process device side channel system of defense based on interlayer fusion
CN110245675B (en) * 2019-04-03 2023-02-10 复旦大学 Dangerous object detection method based on millimeter wave image human body context information
CN110245675A (en) * 2019-04-03 2019-09-17 复旦大学 A kind of dangerous objects detection method based on millimeter-wave image human body contextual information
CN110059667A (en) * 2019-04-28 2019-07-26 上海应用技术大学 Pedestrian counting method
CN110110722A (en) * 2019-04-30 2019-08-09 广州华工邦元信息技术有限公司 A kind of region detection modification method based on deep learning model recognition result
CN110222641A (en) * 2019-06-06 2019-09-10 北京百度网讯科技有限公司 The method and apparatus of image for identification
CN110348384B (en) * 2019-07-12 2022-06-17 沈阳理工大学 Small target vehicle attribute identification method based on feature fusion
CN110348384A (en) * 2019-07-12 2019-10-18 沈阳理工大学 A kind of Small object vehicle attribute recognition methods based on Fusion Features
CN110909604A (en) * 2019-10-23 2020-03-24 深圳市华讯方舟太赫兹科技有限公司 Security image detection method, terminal device and computer storage medium
CN110909604B (en) * 2019-10-23 2024-04-19 深圳市重投华讯太赫兹科技有限公司 Security check image detection method, terminal equipment and computer storage medium
CN111160353A (en) * 2019-12-27 2020-05-15 广州亚信技术有限公司 License plate recognition method, device and equipment
CN111968087A (en) * 2020-08-13 2020-11-20 中国农业科学院农业信息研究所 Plant disease area detection method
CN111968087B (en) * 2020-08-13 2023-11-07 中国农业科学院农业信息研究所 Plant disease area detection method

Similar Documents

Publication Publication Date Title
CN107316058A (en) Improve the method for target detection performance by improving target classification and positional accuracy
CN107563381B (en) Multi-feature fusion target detection method based on full convolution network
CN109034210A (en) Object detection method based on super Fusion Features Yu multi-Scale Pyramid network
Wang et al. Autonomous garbage detection for intelligent urban management
Ouyang et al. DeepID-Net: Object detection with deformable part based convolutional neural networks
CN108171112A (en) Vehicle identification and tracking based on convolutional neural networks
Zhang et al. Pedestrian detection method based on Faster R-CNN
CN109800628A (en) A kind of network structure and detection method for reinforcing SSD Small object pedestrian detection performance
Wan et al. Ceramic tile surface defect detection based on deep learning
CN107818302A (en) Non-rigid multiple dimensioned object detecting method based on convolutional neural networks
CN107451602A (en) A kind of fruits and vegetables detection method based on deep learning
CN106446930A (en) Deep convolutional neural network-based robot working scene identification method
CN109932730A (en) Laser radar object detection method based on multiple dimensioned monopole three dimensional detection network
CN110321891A (en) A kind of big infusion medical fluid foreign matter object detection method of combined depth neural network and clustering algorithm
CN107808376A (en) A kind of detection method of raising one's hand based on deep learning
CN105243139A (en) Deep learning based three-dimensional model retrieval method and retrieval device thereof
CN106127161A (en) Fast target detection method based on cascade multilayer detector
CN110569926B (en) Point cloud classification method based on local edge feature enhancement
CN105787488A (en) Image feature extraction method and device realizing transmission from whole to local
Xu et al. Occlusion problem-oriented adversarial faster-RCNN scheme
Yang et al. Road crack detection using deep neural network with receptive field block
Ouadiay et al. Simultaneous object detection and localization using convolutional neural networks
Zhang et al. A precise apple leaf diseases detection using BCTNet under unconstrained environments
Wang et al. A review of object detection based on convolutional neural networks and deep learning
Zhang et al. Multiple Objects Detection based on Improved Faster R-CNN

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20171103

RJ01 Rejection of invention patent application after publication