CN110348357A - A fast target detection method based on deep convolutional neural networks - Google Patents
- Publication number
- CN110348357A (application CN201910594388.5A)
- Authority
- CN
- China
- Prior art keywords
- layer
- model
- convolution kernel
- frame
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
Abstract
The invention discloses a fast target detection method based on deep convolutional neural networks. First, a basic SSD detection model is constructed and trained on preprocessed data to obtain an original trained model. Then the importance of each convolution kernel is measured and a kernel pruning strategy removes the unimportant kernels, which simplifies the feature extraction network of the detection model and yields a compressed model. Specifically, a subset of the input channels of layer i+1 is chosen so that the output it produces is as close as possible to the original output of layer i+1; the other input channels of layer i+1 can then be removed, and with them the corresponding convolution kernels of layer i, which prunes the model's convolution kernels. Each time a convolutional layer has been pruned, the compressed model is fine-tuned to restore detection accuracy. After all convolutional layers have been pruned, the final compressed detection model is obtained. Through model compression, the invention allows the model to be deployed on mobile devices, raising detection speed while maintaining detection accuracy.
Description
Technical field
The present invention relates to the field of computer vision, and in particular to a fast target detection method based on deep convolutional neural networks.
Technical background
The most direct ways in which humans acquire information are hearing and vision. Studies show that 80 to 90 percent of the information humans obtain comes through the eyes. For this reason, visual mechanisms have long been an important area of research, computer vision in particular, which has achieved many major breakthroughs in recent years with the development of deep learning. Target detection is an important part of visual perception and understanding: the speed and accuracy with which objects are detected directly determine the quality of the information obtained. Target detection also has broad application prospects. In autonomous driving, 3D target detection enables fast localization so that a vehicle can avoid obstacles and travel smoothly. In manufacturing, AI-based quality inspection automatically identifies defects in products and quickly picks out unqualified items, which improves both the accuracy and the speed of inspection and saves a great deal of labor. In traffic systems, real-time detection on video allows license plate numbers to be recognized quickly. In short, target detection, and fast target detection in particular, plays an increasingly important role in daily life.
Target detection mainly comprises two processes: localization and recognition. Compared with general image recognition tasks it adds only the localization process, yet its models are considerably more complex to implement. In traditional target detection, the difficulties lie in feature extraction and feature classification. Traditional methods extract hand-crafted features from the target using Haar wavelets, LBP, SIFT, HOG (histogram of oriented gradients) and similar methods, and then classify them with a cascaded AdaBoost classifier, a support vector machine, DPM and similar methods. However, because feature extraction relies mainly on low-level information, high-level features carrying rich semantic information are extracted insufficiently, and the extracted features are task-specific. As a result, recognition accuracy is low and only a narrow range of object types can be recognized.
Therefore, despite many years of research, traditional detection methods were never widely adopted. That changed in 2012, when the AlexNet model appeared and computer vision research achieved a historic breakthrough: the model produced remarkable results in large-scale image recognition and took first place in the ImageNet competition. This progress is due on the one hand to improved computer hardware, which brought significant gains in large-scale data storage and computing speed, and on the other hand to advances in machine learning algorithms, especially deep neural networks, which greatly improved feature extraction, high-level feature extraction in particular. Since then, target detection methods based on convolutional neural networks have emerged in rapid succession and flourished.
At present, target detection methods based on convolutional neural networks fall into two main classes: two-stage detection and single-stage detection. The best-known two-stage detectors are the R-CNN series. They use region-based detection and split target detection into two processes: selecting bounding-box proposals, mainly by regression, and classifying the object. Fast R-CNN, Faster R-CNN and Mask R-CNN were proposed subsequently; although detection speed and accuracy improved, these models remain far from real-time detection in practical applications. The other approach extracts features from the input image with a convolutional neural network and then performs regression directly, treating localization and recognition as a single process; typical representatives are YOLO and SSD. Although detection speed is greatly improved, these models still cannot meet the real-time requirements of production applications.
Summary of the invention
The main technical problem solved by the invention is to provide a fast target detection method based on deep convolutional neural networks that prunes the convolution kernels of an original detection model, reducing the model size and making the model easier to deploy. At the same time, detection speed is improved while detection accuracy is preserved, so the method can address problems such as video surveillance and pedestrian detection in intelligent transportation systems.
The invention proposes a fast target detection method based on deep convolutional neural networks comprising training an original model, compressing the model, and fine-tuning the compressed model. Specifically, it mainly comprises the following four steps:
Step1: Preprocess the image data of the training set. Specifically, the image data is augmented by randomly cropping a fixed region, randomly cropping a region of random size, color change and brightness distortion, and is then randomly flipped horizontally. Finally, the augmented images are normalized to a fixed size of w × h.
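As an illustration only, the preprocessing of Step1 might be sketched in Python with NumPy as follows. The function name `augment`, the crop range, the brightness and color ranges, the flip probability and the nearest-neighbour resize are all assumptions, since the description does not specify them; only a single random-size crop is shown, and the corresponding update of the box annotations is omitted.

```python
import numpy as np

def augment(image, rng, out_hw=(300, 300)):
    """Augment one H x W x 3 image with values in [0, 255]; returns floats in [0, 1]."""
    h, w, _ = image.shape
    # Random crop of random size (assumed range: 50% to 100% of each side).
    ch = int(rng.integers(h // 2, h + 1))
    cw = int(rng.integers(w // 2, w + 1))
    top = int(rng.integers(0, h - ch + 1))
    left = int(rng.integers(0, w - cw + 1))
    img = image[top:top + ch, left:left + cw].astype(np.float32)
    # Brightness distortion: add one random offset to every pixel (assumed range).
    img = np.clip(img + rng.uniform(-32.0, 32.0), 0.0, 255.0)
    # Color change: scale each channel independently (assumed range).
    img = np.clip(img * rng.uniform(0.5, 1.5, size=(1, 1, 3)), 0.0, 255.0)
    # Random horizontal flip with probability 0.5.
    if rng.random() < 0.5:
        img = img[:, ::-1]
    # Nearest-neighbour resize to the fixed size, then scale to [0, 1].
    oh, ow = out_hw
    rows = np.arange(oh) * ch // oh
    cols = np.arange(ow) * cw // ow
    return img[rows][:, cols] / 255.0

rng = np.random.default_rng(0)
sample = augment(rng.uniform(0, 255, size=(375, 500, 3)), rng)
```

Here `out_hw` plays the role of the fixed size w × h; the embodiment uses 300 × 300.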
The preprocessed training set is then input into the SSD model and trained to obtain the initial model:
(1) Construct the SSD model. VGG16 with its fully connected layers removed serves as the base feature extraction network, to which the six convolutional layers Conv6, Conv7, Conv8, Conv9, Conv10 and Conv11 are added. The feature maps of layers Conv4_3, Conv7, Conv8_2, Conv9_2, Conv10_2 and Conv11_2 are extracted as prediction layers;
(2) Input the preprocessed training set into the SSD model for feature extraction, and generate on each prediction layer a fixed number of prior candidate boxes of different sizes and aspect ratios. The prior candidate boxes are then matched with the ground-truth boxes annotated in the training images to obtain the positive and negative samples used in training, on which classification and regression prediction are performed respectively. First, each ground-truth box is matched with the prior candidate box with which it has the largest intersection over union (IoU). Then, each remaining candidate box is matched with a ground-truth box whose IoU with it is greater than 0.5. Prior candidate boxes matched with a ground-truth box serve as positive samples and all others as negative samples. The negative samples are sorted in descending order of prediction confidence and the top-ranked ones are selected so that the ratio of positive to negative samples is 1:3.
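The matching strategy above can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions: boxes are (xmin, ymin, xmax, ymax) tuples, at least one ground-truth box is present, and the names `iou`, `match_priors` and `hard_negatives` as well as the per-box ranking `score` are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def match_priors(priors, truths, threshold=0.5):
    """For each prior box, return the index of its matched ground-truth box, or -1."""
    match = [-1] * len(priors)
    # Stage 1: every ground-truth box claims the prior box with the largest IoU.
    for t, gt in enumerate(truths):
        best = max(range(len(priors)), key=lambda p: iou(priors[p], gt))
        match[best] = t
    # Stage 2: each remaining prior box takes its best ground-truth box if IoU > threshold.
    for p, prior in enumerate(priors):
        if match[p] != -1:
            continue
        overlaps = [iou(prior, gt) for gt in truths]
        t = max(range(len(truths)), key=overlaps.__getitem__)
        if overlaps[t] > threshold:
            match[p] = t
    return match

def hard_negatives(match, score, ratio=3):
    """Keep the top-ranked negatives so that positives : negatives is at most 1 : ratio."""
    negatives = [p for p, t in enumerate(match) if t == -1]
    negatives.sort(key=lambda p: score[p], reverse=True)
    num_pos = sum(1 for t in match if t != -1)
    return negatives[:ratio * num_pos]
```

With `ratio=3` the selection corresponds to the 1:3 proportion of positive to negative samples stated above.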
During this training process, the network is trained by backpropagation with the SGD gradient descent optimization algorithm to obtain the initial trained model. The loss function used in training is:
$$L(z, p, l, g) = \frac{1}{N}\left(L_{conf}(z, p) + \alpha L_{loc}(z, l, g)\right) \tag{1}$$
where N is the number of prior candidate boxes matched with ground-truth boxes, $L_{loc}$ is the localization loss, $L_{conf}$ is the classification confidence loss, α is a regularization parameter, z is the input image, p is the target class, l is the box predicted by the model, and g is the annotated box.
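A weighted sum of a confidence loss and a localization loss, normalized by N, might be computed as in the sketch below. The smooth L1 localization term is an assumption taken from the usual SSD formulation, since the description does not define $L_{loc}$ or $L_{conf}$ in detail; `ssd_loss` and its parameters are hypothetical names, and the per-box confidence losses are assumed to be computed elsewhere.

```python
def smooth_l1(x):
    """Smooth L1 penalty: quadratic near zero, linear elsewhere."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5

def ssd_loss(conf_losses, loc_pred, loc_target, num_matched, alpha=1.0):
    """Return (L_conf + alpha * L_loc) / N for one batch.

    conf_losses: per-box classification losses of the selected positives and negatives.
    loc_pred, loc_target: box offsets of the matched (positive) prior boxes.
    num_matched: N, the number of prior boxes matched with ground-truth boxes.
    """
    if num_matched == 0:
        return 0.0  # no matched boxes, so the loss is defined as zero
    l_conf = sum(conf_losses)
    l_loc = sum(
        smooth_l1(p - t)
        for pred_box, target_box in zip(loc_pred, loc_target)
        for p, t in zip(pred_box, target_box)
    )
    return (l_conf + alpha * l_loc) / num_matched
```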
Step2: Apply the convolution kernel pruning strategy to the initial model obtained in Step1, compressing it to obtain a compressed model. In the base convolutional neural network VGG16 (Conv1_1-Conv4_3), a pruning strategy based on convolution kernel channels and size is set: kernel channels that contribute strongly to convolutional feature extraction are retained, and those with little influence on feature extraction are discarded. Specifically, features are extracted from the images. Let the i-th convolutional feature layer have $C_i$ channels and width and height $H_i$ and $D_i$ respectively; it is denoted $I_i$, with $I_i \in \mathbb{R}^{C_i \times H_i \times D_i}$. Correspondingly, the convolution kernels have size $n_i \times C_i \times K_i \times K_i$, that is, there are $n_i$ kernels, each with $C_i$ channels and width and height both equal to $K_i$; the kernels of layer i are denoted $W_i$, with $W_i \in \mathbb{R}^{n_i \times C_i \times K_i \times K_i}$. The aim is to remove the unimportant kernels in $W_i$ in order to reduce the number of model parameters. The main idea of kernel pruning is as follows. The input of layer i+1 is the output of layer i, and its output is the input of layer i+2. If the input of layer i+1 formed from a subset of the channels of layer i yields approximately the same output of layer i+1, then the other channels of layer i feeding layer i+1 can be removed, and the corresponding convolution kernels in layer i can be removed as well.
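The effect of removing kernels on the parameter count can be illustrated with a short calculation. The layer sizes below are hypothetical and biases are ignored; the point is only that removing kernels from layer i shrinks both layer i (fewer kernels) and layer i+1 (fewer input channels).

```python
def conv_params(n_filters, in_channels, k):
    """Weight count of a convolution with kernel shape n x C x K x K (biases ignored)."""
    return n_filters * in_channels * k * k

# Hypothetical sizes: layer i has 128 kernels over 64 channels, layer i+1 has 256 kernels.
n_i, c_i, n_next, k = 128, 64, 256, 3
removed = 64  # kernels removed from layer i

before = conv_params(n_i, c_i, k) + conv_params(n_next, n_i, k)
after = conv_params(n_i - removed, c_i, k) + conv_params(n_next, n_i - removed, k)
print(before, after)
```

In this hypothetical case, removing half of the kernels of layer i halves the combined weight count of the two layers.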
The detailed process is as follows.
Let y be a randomly sampled point in the input feature layer of layer i+2, that is, in the output feature layer of layer i+1:
$$y = \sum_{c=1}^{C}\sum_{k_1=1}^{K}\sum_{k_2=1}^{K} \hat{W}_{c,k_1,k_2}\, x_{c,k_1,k_2} + b \tag{2}$$
where $\hat{W}$ and $x$ are respectively the convolution kernel and the sliding window corresponding to layer i+1, c is the kernel channel index, C is the maximum number of channels, $k_1$ and $k_2$ index width and height, both with maximum value K, and b is the corresponding bias;
The output of each channel of the convolution of the layer i+1 kernel with the sliding window is
$$\hat{x}_c = \sum_{k_1=1}^{K}\sum_{k_2=1}^{K} \hat{W}_{c,k_1,k_2}\, x_{c,k_1,k_2} \tag{3}$$
The randomly sampled point $\hat{y}$ in the output feature layer of layer i+1 is then expressed as:
$$\hat{y} = \sum_{c=1}^{C} \hat{x}_c \tag{4}$$
where $\hat{y} = y - b$.
If the input of layer i+1 formed from a subset of the channels of layer i yields approximately the same output of layer i+1, the other channels of layer i feeding layer i+1 can be removed, specifically $\hat{y} = \sum_{c \in S} \hat{x}_c$, where S is a subset of the feature layer channels with $S \subset \{1, 2, \dots, C\}$. If this equation holds, then for any $c \notin S$ the corresponding channel feature $\hat{x}_c$ can be removed. At the same time, because the input of layer i+1 is the output of layer i, the corresponding convolution kernels in layer i can also be removed, which prunes the kernels of layer i;
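The decomposition of one output point into per-channel contributions can be checked numerically. The sketch below assumes NumPy, random values and hypothetical sizes (C = 4, K = 3); `W` stands for one kernel of layer i+1 and `x` for one sliding window of its input.

```python
import numpy as np

rng = np.random.default_rng(0)
C, K = 4, 3
W = rng.normal(size=(C, K, K))  # one kernel of layer i+1
x = rng.normal(size=(C, K, K))  # one sliding window of the layer i+1 input
b = 0.1                         # bias

y = float((W * x).sum() + b)         # output point of layer i+1
x_hat = (W * x).sum(axis=(1, 2))     # contribution of each input channel
y_hat = y - b                        # output point without the bias

# Keeping only the channels in S leaves a residual equal to the
# summed contribution of the removed channels T.
S, T = [0, 2], [1, 3]
residual = y_hat - x_hat[S].sum()
```

The residual is small exactly when the removed channels contribute little to the output, which is what the channel selection below exploits.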
During training, let the training set be $\{(\hat{x}_m, \hat{y}_m)\}_{m=1}^{M}$, where M is the product of the number of images and the number of spatial positions of the convolutional feature; $\hat{x}_m = (\hat{x}_{m,1}, \dots, \hat{x}_{m,C})$ is the m-th input convolution feature, in which $\hat{x}_{m,j}$, obtained by formula (3), is the output of the j-th channel of the convolution of the corresponding kernel with the sliding window; and $\hat{y}_m$, obtained by formula (4), is the m-th randomly sampled point in the output feature layer. The original channel selection problem then becomes the following optimization problem:
$$\arg\min_{S} \sum_{m=1}^{M}\Big(\hat{y}_m - \sum_{j \in S}\hat{x}_{m,j}\Big)^2, \quad \text{s.t. } |S| = C \times r,\ S \subset \{1, 2, \dots, C\} \tag{5}$$
where |S| is the number of elements in the channel subset S and r is the compression ratio. Let T denote the subset of feature layer channels to be removed; then the intersection of T and S is empty and their union is the channel set {1, 2, ..., C}, and the above formula is converted into:
$$\arg\min_{T} \sum_{m=1}^{M}\Big(\sum_{j \in T}\hat{x}_{m,j}\Big)^2, \quad \text{s.t. } |T| = C \times (1 - r),\ T \subset \{1, 2, \dots, C\} \tag{6}$$
In general |T| < |S|, so in actual training the pruning of kernel channels is realized by optimizing formula (6). This optimization yields the convolution kernels to be removed from layer i. Meanwhile, to preserve model performance after the kernels are removed, the reconstruction error, formula (7), is minimized.
Formula (7) is solved by ordinary least squares.
Following the above steps, the convolution kernels of the first 10 convolutional layers (Conv1_1-Conv4_3) of VGG16, the base model of the detection model, are pruned to obtain the compressed model.
Step3: After each feature layer of the model has been pruned in Step2, the model is immediately fine-tuned: the compressed model is trained on the preprocessed training set and saved. Fine-tuning uses the training procedure of Step1 to obtain the detection model after kernel pruning and so raise the detection accuracy of the pruned model. Typically training is repeated for one to two epochs, and the resulting model is saved.
Step4: Repeat Step2 and Step3 several times: the model fine-tuned in Step3 is again pruned with the kernel pruning strategy of Step2 to compress it further, until all convolution kernels with little influence on detection performance have been removed. The model obtained from the final fine-tuning is saved as the final compressed detection model.
Beneficial effects of the invention:
The invention reduces the model size and improves detection speed while preserving detection accuracy as far as possible.
In Step1, data augmentation makes the model more robust to target scale and size. It also raises the accuracy of the original detection model as far as possible, which raises the upper bound on the detection accuracy of the compressed model.
In Step2, the importance of the convolution kernels in the original model is measured and the kernels that do not affect detection performance are removed, compressing the model while maintaining detection accuracy.
In Step3, the compressed network is fine-tuned so that the detection performance of the compressed model is again optimal.
In Step4, Step2 and Step3 are repeated several times to prune the kernels of all convolutional layers while maintaining detection accuracy, yielding the final detection model.
Brief description of the drawings
Fig. 1 is a flow diagram of a preferred embodiment of the invention;
Fig. 2 shows the SSD network structure used as the base detector in the embodiment of the invention, with VGG-16 as the base network.
Detailed description of the embodiments
The preferred embodiments of the invention are described in detail below with reference to the drawings, so that the advantages and features of the invention can be more easily understood by those skilled in the art and the scope of protection of the invention is more clearly defined.
Embodiment 1: The invention can be applied in many fields. In traffic systems, targets can be located in real time by detection on surveillance video; in criminal investigation, suspects can be located by fast detection; in autonomous driving, road scenes can be located quickly so as to avoid pedestrians and obstacles. To show the generality of the method, its use is illustrated below with experiments on the public Pascal VOC dataset, in which 20 classes are detected in total. The experiments were run on Ubuntu 18.04 with an Intel Core i7-8700K CPU (3.7 GHz × 6), Python 3.6, an NVIDIA GeForce RTX 2070 graphics card, and the PyTorch 1.0 deep learning framework.
Step1: Preprocess the image data of the training set, then input the preprocessed training set into the SSD model and train it to obtain the initial model. First, the SSD network model is built with VGG16 without its fully connected layers as the base feature extraction network; the overall network structure is shown in Fig. 2. The VOC2007 and VOC2012 training and validation sets are used as the training dataset, 16551 training images in total; the test set is the VOC2007 test set, with 4952 images. The data is then preprocessed: the image data is augmented by randomly cropping a fixed region, randomly cropping a region of random size, color change, brightness distortion and similar methods, and is then randomly flipped horizontally. Finally, the augmented images are normalized to a fixed size of 300×300. The preprocessed data is input into the SSD detection model for feature extraction, and classification and regression are performed on six prediction layers of different scales. In training, the batch size is 32 and there are 120000 iterations in total; the network is trained by backpropagation with the SGD gradient descent optimization algorithm to obtain the initial trained model.
Step2: Apply the convolution kernel pruning strategy to the initial model obtained in Step1, compressing it to obtain a compressed model. Features are extracted from the training data with the initial model to obtain the feature layers of the first 10 convolutional layers of VGG16 (Conv1_1-Conv4_3). Let y be the input of layer i+2, that is, the output of feature layer i+1; by the convolution operation:
$$y = \sum_{c=1}^{C}\sum_{k_1=1}^{K}\sum_{k_2=1}^{K} \hat{W}_{c,k_1,k_2}\, x_{c,k_1,k_2} + b \tag{2}$$
where $\hat{W}$ and $x$ are respectively the convolution kernel and the sliding window corresponding to layer i+1, c is the kernel channel index, C is the maximum number of channels, $k_1$ and $k_2$ index width and height, both with maximum value K, and b is the corresponding bias;
Further, the output of each channel of the convolution of the layer i+1 kernel with the sliding window is
$$\hat{x}_c = \sum_{k_1=1}^{K}\sum_{k_2=1}^{K} \hat{W}_{c,k_1,k_2}\, x_{c,k_1,k_2} \tag{3}$$
The randomly sampled point $\hat{y}$ in the output feature layer of layer i+1 can then be expressed as:
$$\hat{y} = \sum_{c=1}^{C} \hat{x}_c \tag{4}$$
where $\hat{y} = y - b$.
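From the above formulas, the training set $\{(\hat{x}_m, \hat{y}_m)\}_{m=1}^{M}$ is obtained. The aim is to optimize the following formula so as to remove the unimportant convolutional feature channels:
$$\arg\min_{T} \sum_{m=1}^{M}\Big(\sum_{j \in T}\hat{x}_{m,j}\Big)^2, \quad \text{s.t. } |T| = C \times (1 - r),\ T \subset \{1, 2, \dots, C\} \tag{6}$$
where T is the set of convolution channels to be removed, C is the number of channels of the convolutional layer, {1, 2, ..., C} being its channel set, and r is the compression ratio.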
To solve the above optimization problem, first set T = ∅, that is, |T| = 0, set the compression ratio r = 0.5, and initialize the solution of formula (6) as min_val → +∞. Then, while |T| < C × (1 − r), perform the following operation: for any channel m in the channel set, set T′ = T ∪ {m} and evaluate formula (6) with T′ to obtain val; if val < min_val, update min_val = val and T = T′; otherwise, continue the operation with the next channel. In this way the value of formula (6) is kept minimal each time a channel is added, which yields the set T of convolution kernels to be removed and thereby compresses the network model.
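The greedy procedure above might be sketched with NumPy as follows. This is an illustrative reading, not the embodiment's actual code: `X` is assumed to be the M × C matrix whose entry (m, j) is the per-channel output of sample m, the running minimum is reset for every added channel, and `greedy_channel_selection` is a hypothetical name.

```python
import numpy as np

def greedy_channel_selection(X, r):
    """Return the sorted list T of channels to remove, keeping a fraction r of the C channels."""
    M, C = X.shape
    num_remove = round(C * (1 - r))
    T = []
    removed_sum = np.zeros(M)  # per-sample sum over the channels already in T
    while len(T) < num_remove:
        min_val, best = np.inf, None
        for m in range(C):
            if m in T:
                continue
            # Objective value if channel m were added to T.
            val = float(((removed_sum + X[:, m]) ** 2).sum())
            if val < min_val:
                min_val, best = val, m
        T.append(best)
        removed_sum += X[:, best]
    return sorted(T)
```

Each pass adds the single channel that increases the objective least, so the cost is roughly C evaluations per removed channel rather than a search over all subsets.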
Meanwhile, to preserve model performance after the kernels are removed, the reconstruction error is minimized; the minimization is solved by ordinary least squares.
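One possible form of this least-squares step is sketched below, assuming that the reconstruction error is the squared difference between each sampled output point and a weighted sum of the retained channel outputs. That form, the function name `reconstruction_weights` and the matrix layout of `X` are assumptions; the description does not spell them out here.

```python
import numpy as np

def reconstruction_weights(X, y_hat, keep):
    """Least-squares weights w minimizing ||y_hat - X[:, keep] @ w||^2.

    X: (M, C) per-channel outputs; y_hat: (M,) sampled output points without bias;
    keep: indices of the retained channels.
    """
    X_keep = X[:, keep]
    w, *_ = np.linalg.lstsq(X_keep, y_hat, rcond=None)
    return w
```

The resulting weights can be read as per-channel scaling factors for the retained kernels, compensating for the contribution of the removed channels.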
Following the above steps, the convolution kernels of the first 10 convolutional layers (Conv1_1-Conv4_3) of VGG16, the base model of the detection model, are pruned to obtain the compressed model.
Step3: After each feature layer of the model has been pruned in Step2, the model is immediately fine-tuned and saved. Fine-tuning uses the training procedure of Step1 to obtain the detection model after kernel pruning and so raise the detection accuracy of the pruned model. Typically training is repeated for one to two epochs, and the resulting model is saved.
Step4: Repeat Step2 and Step3 several times; in this embodiment they are repeated 3 times, giving the final detection model.
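The overall loop of Step2 to Step4 can be summarized in a few lines. The sketch assumes two caller-supplied functions, `prune_layer` and `finetune`, which are hypothetical placeholders for the pruning of one layer and for fine-tuning with the Step1 training procedure.

```python
def prune_and_finetune(model, layers, prune_layer, finetune, rounds=3, epochs=2):
    """Prune each layer in turn, fine-tuning after every pruned layer, for several rounds.

    prune_layer(model, layer) and finetune(model, epochs) are supplied by the caller
    and must each return the updated model.
    """
    for _ in range(rounds):
        for layer in layers:
            model = prune_layer(model, layer)  # Step2: prune the kernels of one layer
            model = finetune(model, epochs)    # Step3: fine-tune to restore accuracy
    return model
```

With `rounds=3` and the ten layers Conv1_1 to Conv4_3 this corresponds to the 3 repetitions used in this embodiment.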
The above steps give the detection results of the original model and of the model after kernel pruning. Table 1 lists the model size, detection speed and detection accuracy before and after pruning. As the table shows, compared with the original model, detection accuracy falls slightly, from 77.3% to 75.2%, but model size is reduced from 105.2M to 13.8M and detection speed rises from 46 FPS (frames per second) to 200 FPS. At the cost of a slight loss of detection accuracy, this meets the requirements for deploying the model on mobile devices and for real-time detection in practical production applications.
Table 1: Performance comparison of the original and compressed models
Model | Model size (M) | Detection speed (FPS) | Detection accuracy (mAP, %) |
Original model | 105.2 | 46 | 77.3 |
Compressed model | 13.8 | 200 | 75.2 |
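The relative gains implied by Table 1 can be worked out directly; the variable names below are illustrative only.

```python
original = {"size_m": 105.2, "fps": 46, "map": 77.3}
compressed = {"size_m": 13.8, "fps": 200, "map": 75.2}

size_reduction = original["size_m"] / compressed["size_m"]  # roughly 7.6 times smaller
speedup = compressed["fps"] / original["fps"]               # roughly 4.3 times faster
map_drop = original["map"] - compressed["map"]              # about 2.1 points of mAP
print(f"{size_reduction:.2f}x smaller, {speedup:.2f}x faster, {map_drop:.1f} mAP points lost")
```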
Compared with other existing methods, this embodiment first trains an initial detection model on the training data, then assesses the importance of the convolution kernels of the detector's feature extraction network and removes the unimportant ones, thereby simplifying the model. Unlike earlier approaches that assess the importance of kernels from the statistics of feature layer i, the method uses feature layer i+1 to guide the assessment of the kernels of feature layer i. The structure of the original model is not changed during kernel pruning, which helps preserve the accuracy of the model. After kernel pruning is complete, the model is fine-tuned so that the compressed model performs optimally. The algorithm compresses the model so that it can be deployed on mobile devices, improves detection speed, and largely maintains detection accuracy.
The above is only an embodiment of the invention and does not limit its scope. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the invention.
Claims (6)
1. A fast target detection method based on deep convolutional neural networks, characterized by comprising the following steps:
Step1: preprocessing the image data of the training set, and inputting the preprocessed training set into an SSD model for training to obtain an initial model;
Step2: applying a convolution kernel pruning strategy to the initial model obtained in Step1, compressing the initial model to obtain a compressed model;
Step3: training the compressed model with the preprocessed training set, that is, fine-tuning the compressed model;
Step4: repeating Step2 and Step3 several times to obtain the final detection model.
2. The fast target detection method based on deep convolutional neural networks according to claim 1, characterized in that the preprocessing of the image data of the training set in Step1 is specifically: augmenting the image data by randomly cropping a fixed region, randomly cropping a region of random size, color change and brightness distortion, then randomly flipping the images horizontally, and finally normalizing the augmented images to a fixed size of w × h.
3. The fast target detection method based on deep convolutional neural networks according to claim 1, characterized in that the detailed process of training the SSD model in Step1 to obtain the initial trained model is:
(1) constructing the SSD model, with VGG16 without its fully connected layers as the base feature extraction network, adding the six convolutional layers Conv6, Conv7, Conv8, Conv9, Conv10 and Conv11, and extracting the feature maps of layers Conv4_3, Conv7, Conv8_2, Conv9_2, Conv10_2 and Conv11_2 as prediction layers;
(2) inputting the preprocessed training set into the SSD model for feature extraction, generating on each prediction layer a fixed number of prior candidate boxes of different sizes and aspect ratios, then matching the prior candidate boxes with the ground-truth boxes annotated in the image data of the training set to obtain the positive and negative samples used in training, and performing classification and regression prediction respectively; in this training process, the network is trained by backpropagation with the SGD gradient descent optimization algorithm to obtain the initial trained model.
4. The fast target detection method based on deep convolutional neural networks according to claim 3, characterized in that the strategy for matching the prior candidate boxes with the ground-truth boxes annotated in the image data of the training set is: first, each ground-truth box is matched with the prior candidate box with which it has the largest intersection over union (IoU); then, each remaining candidate box is matched with a ground-truth box whose IoU with it is greater than 0.5; prior candidate boxes matched with a ground-truth box serve as positive samples and all others as negative samples; the negative samples are sorted in descending order of prediction confidence and the top-ranked ones are selected so that the ratio of positive to negative samples is 1:3.
5. The fast target detection method based on deep convolutional neural networks according to claim 3, characterized in that the loss function in the training process is:
$$L(z, p, l, g) = \frac{1}{N}\left(L_{conf}(z, p) + \alpha L_{loc}(z, l, g)\right)$$
where N is the number of prior candidate boxes matched with ground-truth boxes, $L_{loc}$ is the localization loss, $L_{conf}$ is the classification confidence loss, α is a regularization parameter, z is the input image, p is the target class, l is the box predicted by the model, and g is the annotated box.
6. The fast target detection method based on deep convolutional neural networks according to claim 1, characterized in that the convolution kernel pruning strategy is: extracting features from the preprocessed training set obtained in Step1 with the initial model to obtain the feature layers of the first 10 convolutional layers of VGG16, and pruning the convolution kernels, the detailed process being:
let y be a randomly sampled point in the input feature layer of layer i+2, that is, in the output feature layer of layer i+1:
$$y = \sum_{c=1}^{C}\sum_{k_1=1}^{K}\sum_{k_2=1}^{K} \hat{W}_{c,k_1,k_2}\, x_{c,k_1,k_2} + b$$
where $\hat{W}$ and $x$ are respectively the convolution kernel and the sliding window corresponding to layer i+1, c is the kernel channel index, C is the maximum number of channels, $k_1$ and $k_2$ index width and height, both with maximum value K, and b is the corresponding bias;
the output of each channel of the convolution of the layer i+1 kernel with the sliding window is
$$\hat{x}_c = \sum_{k_1=1}^{K}\sum_{k_2=1}^{K} \hat{W}_{c,k_1,k_2}\, x_{c,k_1,k_2}$$
the randomly sampled point $\hat{y}$ in the output feature layer of layer i+1 is then expressed as:
$$\hat{y} = \sum_{c=1}^{C} \hat{x}_c$$
where $\hat{y} = y - b$;
if the input of layer i+1 formed from a subset of the channels of layer i yields approximately the same output of layer i+1, the other channels of layer i feeding layer i+1 can be removed, specifically $\hat{y} = \sum_{c \in S} \hat{x}_c$, where S is a subset of the feature layer channels with $S \subset \{1, 2, \dots, C\}$; if this equation holds, then for any $c \notin S$ the corresponding channel feature $\hat{x}_c$ can be removed; at the same time, because the input of layer i+1 is the output of layer i, the corresponding convolution kernels in layer i can also be removed, which prunes the kernels of layer i;
during training, let the training set be $\{(\hat{x}_m, \hat{y}_m)\}_{m=1}^{M}$, where M is the product of the number of images and the number of spatial positions of the convolutional feature, $\hat{x}_m = (\hat{x}_{m,1}, \dots, \hat{x}_{m,C})$ is the m-th input convolution feature, $\hat{x}_{m,j}$ is the output of the j-th channel of the convolution of the corresponding kernel with the sliding window, and $\hat{y}_m$ is the m-th randomly sampled point in the output feature layer; the original channel selection problem then becomes the following optimization problem:
$$\arg\min_{S} \sum_{m=1}^{M}\Big(\hat{y}_m - \sum_{j \in S}\hat{x}_{m,j}\Big)^2, \quad \text{s.t. } |S| = C \times r,\ S \subset \{1, 2, \dots, C\}$$
where |S| is the number of elements in the channel subset S and r is the compression ratio; let T denote the subset of feature layer channels to be removed, then the intersection of T and S is empty and their union is the channel set {1, 2, ..., C}, and the above formula is converted into:
$$\arg\min_{T} \sum_{m=1}^{M}\Big(\sum_{j \in T}\hat{x}_{m,j}\Big)^2, \quad \text{s.t. } |T| = C \times (1 - r),\ T \subset \{1, 2, \dots, C\}$$
By above-mentioned optimization, i-th layer of convolution kernel to be removed has been obtained, meanwhile, the model after being removed in order to ensure convolution kernel
Can, minimize its reconstructed error.
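The channel selection in claim 6 asks for a removed subset T whose summed channel contributions are as small as possible over the sampled points, followed by a step that minimizes the reconstruction error. The claim does not name a solver, so the sketch below assumes a greedy search for T and an ordinary least-squares rescaling of the kept channels; the function names and the `(M, C)` array layout are hypothetical.

```python
import numpy as np

def select_channels_to_remove(x_hat, r):
    """Greedily choose the removed channel subset T.

    x_hat: (M, C) array, x_hat[m, j] is the j-th channel contribution at sample m.
    r:     compression ratio, read here as the fraction of channels kept (|S| = C * r).
    Each step adds the channel that keeps sum_m (sum_{j in T} x_hat[m, j])**2 smallest.
    """
    m, c = x_hat.shape
    num_remove = c - int(round(c * r))
    removed = []
    partial = np.zeros(m)  # running sum of the removed channels at each sample
    for _ in range(num_remove):
        candidates = [j for j in range(c) if j not in removed]
        costs = [np.sum((partial + x_hat[:, j]) ** 2) for j in candidates]
        best = candidates[int(np.argmin(costs))]
        removed.append(best)
        partial += x_hat[:, best]
    return sorted(removed)

def reconstruction_weights(x_hat, y_hat, kept):
    """Per-channel scales w minimizing sum_m (y_hat[m] - w . x_hat[m, kept])**2."""
    w, *_ = np.linalg.lstsq(x_hat[:, kept], y_hat, rcond=None)
    return w

# Toy data: channels 1 and 3 contribute almost nothing, so they should be pruned.
x_hat = np.array([
    [1.0,  0.001,  2.0,  0.001],
    [2.0, -0.001, -1.0,  0.001],
    [3.0,  0.001,  0.5, -0.001],
    [4.0, -0.001,  1.0, -0.001],
])
y_hat = x_hat.sum(axis=1)
removed = select_channels_to_remove(x_hat, r=0.5)
kept = [j for j in range(x_hat.shape[1]) if j not in removed]
weights = reconstruction_weights(x_hat, y_hat, kept)
```

In a real network the learned scales would be folded into the remaining kernels of the next layer; that step is omitted here because the claim does not describe it.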
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910594388.5A CN110348357B (en) | 2019-07-03 | 2019-07-03 | Rapid target detection method based on deep convolutional neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110348357A (en) | 2019-10-18 |
CN110348357B (en) | 2022-10-11 |
Family ID=68177705
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910594388.5A Active CN110348357B (en) | 2019-07-03 | 2019-07-03 | Rapid target detection method based on deep convolutional neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110348357B (en) |
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150022069A1 (en) * | 2013-07-22 | 2015-01-22 | Cooler Master Development Corporation | Storage device carrier |
CN105469100A (en) * | 2015-11-30 | 2016-04-06 | 广东工业大学 | Deep learning-based skin biopsy image pathological characteristic recognition method |
US20180114114A1 (en) * | 2016-10-21 | 2018-04-26 | Nvidia Corporation | Systems and methods for pruning neural networks for resource efficient inference |
CN109002807A (en) * | 2018-07-27 | 2018-12-14 | 重庆大学 | A kind of Driving Scene vehicle checking method based on SSD neural network |
CN109344731A (en) * | 2018-09-10 | 2019-02-15 | 电子科技大学 | The face identification method of lightweight neural network based |
CN109271946A (en) * | 2018-09-28 | 2019-01-25 | 清华大学深圳研究生院 | A method of attention object real-time detection is realized in mobile phone terminal |
CN109508634A (en) * | 2018-09-30 | 2019-03-22 | 上海鹰觉科技有限公司 | Ship Types recognition methods and system based on transfer learning |
CN109389102A (en) * | 2018-11-23 | 2019-02-26 | 合肥工业大学 | The system of method for detecting lane lines and its application based on deep learning |
Non-Patent Citations (3)
Title |
---|
6小贱: ""卷积的滑动窗口实现"", 《HTTPS://WWW.CNBLOGS.COM/XIAOJIANLIU/ARTICLE/9931499.HTML》, 8 November 2018 (2018-11-08), pages 1 - 7 * |
SHULI CHENG: ""A novel deep hashing method for fast image retrieval"", 《THE VISUAL COMPUTER》, vol. 35, no. 9, 13 August 2018 (2018-08-13), pages 1255 - 1266, XP036855953, DOI: 10.1007/s00371-018-1583-x * |
靳丽蕾: ""一种用于卷积神经网络压缩的混合剪枝方法"", 《小型微型计算机系统》, vol. 39, no. 12, 11 December 2018 (2018-12-11), pages 2596 - 2601 * |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021077947A1 (en) * | 2019-10-22 | 2021-04-29 | 北京市商汤科技开发有限公司 | Image processing method, apparatus and device, and storage medium |
CN111080598A (en) * | 2019-12-12 | 2020-04-28 | 哈尔滨市科佳通用机电股份有限公司 | Bolt and nut missing detection method for coupler yoke key safety crane |
CN113051961A (en) * | 2019-12-26 | 2021-06-29 | 深圳市光鉴科技有限公司 | Depth map face detection model training method, system, equipment and storage medium |
CN111291887A (en) * | 2020-03-06 | 2020-06-16 | 北京迈格威科技有限公司 | Neural network training method, image recognition method, device and electronic equipment |
CN111291887B (en) * | 2020-03-06 | 2023-11-10 | 北京迈格威科技有限公司 | Neural network training method, image recognition device and electronic equipment |
CN114248819A (en) * | 2020-09-25 | 2022-03-29 | 中车株洲电力机车研究所有限公司 | Railway intrusion foreign matter unmanned aerial vehicle detection method, device and system based on deep learning |
CN114248819B (en) * | 2020-09-25 | 2023-12-29 | 中车株洲电力机车研究所有限公司 | Railway intrusion foreign matter unmanned aerial vehicle detection method, device and system based on deep learning |
CN112580558A (en) * | 2020-12-25 | 2021-03-30 | 烟台艾睿光电科技有限公司 | Infrared image target detection model construction method, detection method, device and system |
CN114429618A (en) * | 2022-01-06 | 2022-05-03 | 电子科技大学 | Congestion identification method based on improved AlexNet network model |
CN115272980A (en) * | 2022-09-22 | 2022-11-01 | 常州海图信息科技股份有限公司 | Conveying belt surface detection method and system based on machine vision |
Also Published As
Publication number | Publication date |
---|---|
CN110348357B (en) | 2022-10-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110348357A (en) | A kind of fast target detection method based on depth convolutional neural networks | |
CN112418117B (en) | Small target detection method based on unmanned aerial vehicle image | |
Zhang et al. | Pedestrian detection method based on Faster R-CNN | |
CN110097053B (en) | Improved fast-RCNN-based electric power equipment appearance defect detection method | |
CN109902806A (en) | Method is determined based on the noise image object boundary frame of convolutional neural networks | |
CN108229550B (en) | Cloud picture classification method based on multi-granularity cascade forest network | |
CN109359666A (en) | A kind of model recognizing method and processing terminal based on multiple features fusion neural network | |
CN111898432B (en) | Pedestrian detection system and method based on improved YOLOv3 algorithm | |
CN111401293B (en) | Gesture recognition method based on Head lightweight Mask scanning R-CNN | |
CN111860587B (en) | Detection method for small targets of pictures | |
CN103971106A (en) | Multi-view human facial image gender identification method and device | |
Lv et al. | A visual identification method for the apple growth forms in the orchard | |
CN104484890A (en) | Video target tracking method based on compound sparse model | |
CN110969121A (en) | High-resolution radar target recognition algorithm based on deep learning | |
CN112749663B (en) | Agricultural fruit maturity detection system based on Internet of things and CCNN model | |
CN109165658B (en) | Strong negative sample underwater target detection method based on fast-RCNN | |
CN110533100A (en) | A method of CME detection and tracking is carried out based on machine learning | |
CN111540203B (en) | Method for adjusting green light passing time based on fast-RCNN | |
CN116721414A (en) | Medical image cell segmentation and tracking method | |
CN114170511A (en) | Pavement crack disease identification method based on Cascade RCNN | |
CN115035381A (en) | Lightweight target detection network of SN-YOLOv5 and crop picking detection method | |
Guo et al. | Grape leaf disease detection based on attention mechanisms | |
Sun et al. | Deep learning based pedestrian detection | |
CN109558803A (en) | SAR target discrimination method based on convolutional neural networks Yu NP criterion | |
CN109784291B (en) | Pedestrian detection method based on multi-scale convolution characteristics |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||