CN109543632A - A deep-network pedestrian detection method guided by fused shallow-layer features - Google Patents


Info

Publication number
CN109543632A
CN109543632A (application CN201811438351.5A)
Authority
CN
China
Prior art keywords: layer, network, shallow, feature, guidance
Prior art date
Legal status: Pending (assumed; not a legal conclusion)
Application number
CN201811438351.5A
Other languages
Chinese (zh)
Inventor
邓红霞
马垚
杨晓峰
李海芳
杨雅茹
Current Assignee
Taiyuan University of Technology
Original Assignee
Taiyuan University of Technology
Priority date
Filing date
Publication date
Application filed by Taiyuan University of Technology
Priority to CN201811438351.5A
Publication of CN109543632A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to the field of computer-vision object detection, and more specifically to a deep-network pedestrian detection method guided by fused shallow-layer features. Building on the Faster R-CNN detection algorithm, it fuses the histogram of oriented gradients feature and an improved texture feature with the deep network features to obtain accurate pedestrian features. With deep learning at the core and shallow-layer features as guidance, the method combines the complementary strengths of shallow and deep learning. The results show that the proposed improvement performs well in both detection accuracy and speed, and effectively solves the problem of missed detections in complex scenes and for very small targets.

Description

A deep-network pedestrian detection method guided by fused shallow-layer features
Technical field
The present invention relates to the field of computer-vision object detection, and more specifically to a deep-network pedestrian detection method guided by fused shallow-layer features.
Background technique
In recent years, fully digital, networked video surveillance systems have shown increasingly clear advantages; their openness, integration and flexibility provide broad room for the informatization of society as a whole. Pedestrian detection is one of their key technologies. However, because the scenes in which pedestrians appear are complex, and pedestrians vary in shape, pose and distance, designing a pedestrian detection algorithm that is both robust and highly accurate has become a research focus.
At present, pedestrian detection methods fall into two main classes: those based on hand-crafted features and those based on deep convolutional neural networks (CNNs). Hand-crafted-feature methods are comparatively simple and efficient, while CNN-based methods are more effective but less efficient. Classical hand-crafted-feature algorithms include the HOG feature of Dalal et al., which captures local image variation, and the combined HOG-LBP feature proposed by Wang et al.; both are paired with an SVM classifier for detection. To handle changes in target pose, Felzenszwalb et al. used the Deformable Parts Model (DPM). Dollár et al. [4] proposed Integral Channel Features (ICF), extracting local feature channels from HOG together with the LUV color channels (HOG+LUV) and cascading AdaBoost as the classifier, which effectively handles target occlusion; their later ACF method further accelerates pedestrian detection. All of these methods combine feature extraction with classifier design.
With the advent of deep learning, methods based on deep convolutional neural networks (CNNs) have also achieved great success in object detection. They typically first generate candidate target boxes and then classify these candidates with a trained CNN model. In 2014, Ross Girshick et al. proposed Regions with CNN features (R-CNN), the classic CNN detection network, but computing the region candidates is very expensive. The subsequent improvement, Fast R-CNN, achieves near-real-time detection with very deep networks, but ignores the time spent generating region proposals. The following year, Kaiming He et al. proposed Faster R-CNN, whose average detection speed is more than 1000 times that of R-CNN: candidate boxes are computed by a deep Region Proposal Network (RPN), so performance improves greatly without sacrificing computation cost. Jiale Cao et al. proposed combining HOG+LUV features with deep network features for AdaBoost classification, which raises the detection rate and improves performance on targets in complex scenes.
Although pedestrian detection algorithms based on Faster R-CNN have been very successful, there is still room for improvement. 1) Most methods use only the last-layer features of the convolutional neural network, classified with softmax or an SVM. In fact, different layers of a CNN represent different image characteristics: the first few layers better describe local image variation, while the last few layers abstractly describe the global image. These features can therefore also be used for training and detection. 2) Many methods perform feature extraction and classification independently; carrying out the whole process as one piece yields an end-to-end detection pipeline.
Summary of the invention
To overcome the shortcomings of the prior art, the present invention provides a deep-network pedestrian detection method guided by fused shallow-layer features, which solves the problem of missed detections that current pedestrian detection algorithms suffer in complex scenes and for very small targets.
To solve the above technical problem, the invention adopts the following technical scheme:
A deep-network pedestrian detection method guided by fused shallow-layer features, comprising the following steps:
S1. Prepare a data set for training the network, divide it into a training set and a test set, and preprocess it;
S2. Extract a feature network from the images of the training set in S1, dividing it into five blocks; extract deep network features from the convolutional neural network, and extract shallow-layer features after the first block of the network;
S3. Fuse the shallow-layer features extracted in S2 with the deep network features, using the VGG16 network as the model, and classify the fused features;
S4. Design the network structure and train the convolutional neural network on the training images from S1 to obtain a well-performing convolutional neural network model;
S5. Using the model parameters obtained in S4, test the convolutional neural network on the test set from S1 to perform pedestrian detection.
Further, the preprocessing in S1 is as follows: choose the Caltech pedestrian data set, convert it into individual labelled images, and generate a .TXT file recording the label coordinates. The data set consists of 11 video groups, set00 to set10, with a video resolution of 640×480.
Further, the shallow-layer features include the histogram of oriented gradients feature and the texture feature.
Further, the fusion in S3 is performed as follows: the extracted shallow-layer features and deep network features are converted into one-dimensional vectors at a "flatten" layer and concatenated, and two fully connected layers are then added.
Further, each of the two fully connected layers has 4096 neurons.
Further, the network structure in S4 uses the Faster R-CNN detection framework.
Further, anchors of four scales are selected in the Faster R-CNN algorithm: the anchor areas are set to the four classes {64², 128², 256², 512²}, and for each area the aspect ratios {1:1, 1:2, 2:1} are used.
Further, a small network is slid over the convolutional feature map output by the last shared convolutional layer of the Faster R-CNN algorithm; this network is fully connected to a 3×3 spatial window of the convolutional features output by the shared convolutional layer.
Further, a ReLU-family activation function is used in the convolutional layers, specifically f(x) = x for x > 0 and f(x) = αx for x ≤ 0, where the gradient α is a constant with α ∈ (0, 1) and x is the input value of the neuron.
Further, redundant correct detection windows are eliminated during pedestrian detection by non-maximum suppression. The overlap ratio of two detection windows is o(w1, w2) = area(w1 ∩ w2) / area(w1 ∪ w2), where w1 and w2 denote the two windows and θ is the threshold; θ is set to 0.7.
Compared with the prior art, the advantageous effects of the present invention are as follows:
The present invention provides a deep-network pedestrian detection method guided by fused shallow-layer features. On the basis of the Faster R-CNN detection algorithm, the histogram of oriented gradients feature and an improved texture feature are fused with the deep network features to obtain accurate pedestrian features; with deep learning at the core and shallow-layer features as guidance, the complementary strengths of shallow and deep learning are combined. The results show that the proposed improvement performs well in detection accuracy and speed, and effectively solves the problem of missed detections in complex scenes and for very small targets.
Detailed description of the invention
Fig. 1 is the network design of the deep-network pedestrian detection method guided by fused shallow-layer features provided by the invention;
Fig. 2 is the structure of the RPN network used;
Fig. 3 is the feature-fusion framework;
Fig. 4 is the representation of the improved texture feature;
Fig. 5 is the feature-fusion scheme;
Fig. 6 compares the model loss of VGG16 and ResNet50;
Fig. 7 compares the classifier accuracy of VGG16 and ResNet50;
Fig. 8 compares the loss of each activation function;
Fig. 9 shows the accuracy of the VGG, VGG+HOG, VGG+LBP and VGG+HOG+LBP features;
Fig. 10 shows partial pedestrian detection results.
Specific embodiment
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings. The described embodiments are only a part of the embodiments of the invention, not all of them. All other embodiments obtained by those of ordinary skill in the art without creative effort, based on the embodiments of the present invention, shall fall within the protection scope of the invention.
A deep-network pedestrian detection method guided by fused shallow-layer features comprises the following steps:
S1. Prepare a data set for training the network, divide it into a training set and a test set, and preprocess it: choose the Caltech pedestrian data set, which consists of 11 video groups set00 to set10 at a video resolution of 640×480; convert it into individual labelled images and generate a .TXT file recording the label coordinates;
S2. Extract a feature network from the images of the training set in S1, dividing it into five blocks; extract deep network features from the convolutional neural network, and extract the shallow-layer features, namely the histogram of oriented gradients feature and the texture feature, after the first block of the network;
S3. Fuse the shallow-layer features extracted in S2 with the deep network features, using the VGG16 network as the model, and classify the fused features: the extracted shallow-layer and deep network features are converted into one-dimensional vectors at a "flatten" layer, concatenated, and followed by two fully connected layers of 4096 neurons each;
S4. Design the network structure and train the convolutional neural network on the training images from S1 to obtain a well-performing convolutional neural network model;
S5. Using the model obtained in S4, test the convolutional neural network on the test set from S1 to perform pedestrian detection.
In this embodiment the network structure uses the Faster R-CNN detection framework: a convolutional network first produces target candidate boxes, which are then classified and regressed. To improve pedestrian detection accuracy and solve the missed detections in complex scenes and for very small targets, fused shallow-layer features are used to guide the deep features of the convolutional network, fusing the histogram of oriented gradients feature and the texture feature with the convolutional features.
Further, a ReLU-family activation function is used in the convolutional layers, specifically f(x) = x for x > 0 and f(x) = αx for x ≤ 0, where the gradient α is a constant with α ∈ (0, 1) and x is the input value of the neuron.
Further, redundant correct detection windows are eliminated during pedestrian detection by non-maximum suppression; the overlap ratio of two detection windows is o(w1, w2) = area(w1 ∩ w2) / area(w1 ∪ w2), where w1 and w2 denote the two windows and θ is the threshold, set to 0.7.
Embodiment
(1) Network structure design
To improve pedestrian detection accuracy, effectively solve missed detections in complex scenes and for very small targets, and enhance the generalization ability of deep learning, the method builds on the Faster R-CNN detection algorithm. Fused shallow-layer features guide the deep features of the convolutional network: the histogram of oriented gradients feature and the texture feature are fused with the convolutional features, with deep learning at the core and shallow-layer features as guidance, combining the complementary strengths of shallow and deep learning. The network structure of the invention is shown in Fig. 1. The architecture contains convolutional layers, pooling layers and fully connected layers, and an activation function is applied in every convolutional layer. After the last convolutional layer an RPN network is used; the RPN contains a convolutional layer connected to a classification layer and a regression layer and generates the target candidate boxes. It is followed by the fully connected layers and, finally, the network's classification and regression layers. Table 1 lists the detailed parameters of the network model.
Table 1. Detailed parameters of the network model
The biggest advantage of the Faster R-CNN algorithm is the Region Proposal Network (RPN), a fully convolutional network that shares the full-image convolutional features with the detection network, as shown in Fig. 2. Because the pedestrians in the chosen data set differ in build and height, anchors of four scales are selected to cover size differences such as tall, short, stout and thin: the anchor areas are set to the four classes {64², 128², 256², 512²}, and each area uses the aspect ratios {1:1, 1:2, 2:1}. To generate target candidate boxes, a small network is slid over the convolutional feature map output by the last shared convolutional layer; this network is fully connected to an n×n spatial window of that feature map (here n = 3). Each sliding window is mapped to a low-dimensional vector and fed to two sibling 1×1 convolutional layers: a candidate-box classification layer (cls, the probability that each candidate is target/non-target) and a candidate-box regression layer (reg, the coordinate encoding of each candidate box).
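The anchor enumeration described above (four areas, three aspect ratios, twelve anchors per sliding-window position) can be sketched in a few lines of Python. Solving width = sqrt(area · w/h) for each area and w:h ratio is an assumption about how the area/ratio combination is realised, and the function name is illustrative, not from the patent:

```python
import math

def generate_anchors(areas=(64**2, 128**2, 256**2, 512**2),
                     ratios=((1, 1), (1, 2), (2, 1))):
    """Enumerate (width, height) anchor shapes for every area/ratio pair.

    For a target area A and width:height ratio w:h, the shape that has
    exactly that area and ratio is width = sqrt(A * w / h), height = A / width.
    """
    anchors = []
    for area in areas:
        for (rw, rh) in ratios:
            width = math.sqrt(area * rw / rh)
            height = area / width
            anchors.append((width, height))
    return anchors

shapes = generate_anchors()  # 4 areas x 3 ratios = 12 anchor shapes
```

In the RPN these twelve shapes are centred on every position of the shared feature map; the sketch only produces the shape list.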
(2) Selection and extraction of image features
Many kinds of features influence pedestrian detection. This embodiment chooses two shallow hand-crafted features, the histogram of oriented gradients feature (HOG) and the texture feature (LBP), together with the deep features of the convolutional neural network.
Each convolutional layer of the network represents different image characteristics: the image channels of the first few layers better describe local image variation, while those of the last few layers abstractly describe the global image structure. Shallow-layer features also describe image variation well: the HOG feature describes local edge direction and variation, and the improved LBP feature captures edge and local shape information. By contrast, the convolutional features are simple to obtain and relatively efficient to compute. The HOG feature, the improved LBP feature and the convolutional features are therefore integrated to construct the feature layers of the image. The basic feature-fusion framework, shown in Fig. 3, can be divided into three parts: 1) first, multi-layer image channels L1 to LN are generated by the convolutional neural network, with the traditional shallow hand-crafted channels (HOG+LBP) taken after the first block; each layer contains multiple image channels; 2) second, feature extraction: the HOG and LBP features are extracted after the first block of the network, and L1 to LN provide the convolutional network model features; 3) finally, the target candidate boxes are classified and regressed end to end.
2.1 Shallow-layer features
The histogram of oriented gradients (HOG) is widely accepted as one of the best features for capturing edge and local shape information. The HOG algorithm starts from the gradients
Gx(x, y) = I(x+1, y) − I(x−1, y)
Gy(x, y) = I(x, y+1) − I(x, y−1)
where Gx(x, y) is the horizontal gradient at pixel (x, y) of the input image, Gy(x, y) is the vertical gradient, and I(x, y) is the pixel value at (x, y).
Rectangular blocks (R-HOG) are used for the HOG feature. An R-HOG block is characterized by three parameters: the number of cells per block, the number of pixels per cell, and the number of histogram channels per cell. Before the computation, a Gaussian spatial window is applied to each block, which reduces the weight of the pixels around the edge. By tuning these parameters, the optimal setting for pedestrian detection is 3×3 cells per block, 6×6 pixels per cell and 9 histogram channels.
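The gradient step of the HOG pipeline above can be sketched directly from the two difference formulas. This is a minimal illustration assuming a grayscale image stored as a nested list indexed image[y][x]; the Gaussian window and the voting into the 9 histogram channels are omitted, and the function names are not from the patent:

```python
import math

def hog_gradients(image, x, y):
    """Central-difference gradients at interior pixel (x, y):
    Gx(x,y) = I(x+1,y) - I(x-1,y) and Gy(x,y) = I(x,y+1) - I(x,y-1).
    """
    gx = image[y][x + 1] - image[y][x - 1]
    gy = image[y + 1][x] - image[y - 1][x]
    return gx, gy

def gradient_magnitude_orientation(gx, gy):
    """Magnitude and unsigned orientation in [0, 180) degrees, the
    quantities each pixel would vote into a cell's histogram channels."""
    magnitude = math.hypot(gx, gy)
    orientation = math.degrees(math.atan2(gy, gx)) % 180.0
    return magnitude, orientation
```

A full HOG descriptor would evaluate these at every pixel, accumulate per-cell histograms and normalize per block.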
The local binary pattern (LBP) is a method for describing local image features; its key advantages are invariance to monotonic gray-level changes and computational efficiency. The original LBP is defined on a 3×3 neighbourhood: taking the gray value of the centre pixel as the threshold, the gray values of the 8 neighbouring pixels are compared with it; a neighbour greater than or equal to the centre value is marked 1, otherwise 0. The surrounding 0-1 sequence, read in order, forms an 8-bit binary number, which is converted to a decimal integer; this integer is the LBP value characterizing the window, and it reflects the texture information of the region. The basic formula is
LBP(P, R) = Σ_{p=0}^{P−1} s(g_p − g_c) · 2^p, with s(x) = 1 if x ≥ 0 and s(x) = 0 otherwise,
where g_c is the gray value of the centre pixel, g_p (p = 0, 1, …, P−1) is the gray value of a neighbouring pixel at radius R, and P is the number of neighbouring pixels. However, this LBP feature uses the gray values of a fixed neighbourhood: when the scale of the image changes, the LBP coding goes wrong and can no longer correctly reflect the texture around the pixel. Therefore, to adapt to textures of different scales and to satisfy the requirements of gray-level and rotation invariance, the 3×3 neighbourhood is extended to an arbitrary one: a circular neighbourhood replaces the square one, and the improved LBP operator allows any number of pixels in a circular neighbourhood of radius R. This yields an LBP operator with P sampling points in a circular region of radius R, as shown in Fig. 4.
The coordinates of each sampling point are (x_p, y_p) = (x_c + R·cos(2πp/P), y_c − R·sin(2πp/P)), where (x_c, y_c) is the centre pixel.
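A minimal sketch of the improved circular LBP operator, combining the LBP(P, R) formula with the circular sampling coordinates above. Nearest-neighbour rounding of the off-grid sample positions is an assumption made here for brevity; a faithful implementation would interpolate, and the function name is illustrative:

```python
import math

def lbp_value(image, xc, yc, P=8, R=1.0):
    """Circular LBP: sample P points on a circle of radius R around the
    centre (xc, yc), threshold each sample against the centre gray value
    g_c, and pack the 0/1 results into an integer:
        LBP(P, R) = sum_{p=0}^{P-1} s(g_p - g_c) * 2^p,  s(x)=1 if x>=0.
    `image` is a nested list indexed image[y][x]."""
    gc = image[yc][xc]
    code = 0
    for p in range(P):
        xp = xc + R * math.cos(2.0 * math.pi * p / P)
        yp = yc - R * math.sin(2.0 * math.pi * p / P)
        gp = image[int(round(yp))][int(round(xp))]  # nearest-neighbour sample
        if gp >= gc:
            code |= 1 << p
    return code
```

On a uniform patch every sample equals the centre, so all P bits are set; a centre brighter than all its neighbours gives code 0.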
2.2 Deep convolutional network features and feature fusion
The deep convolutional features of the images are extracted automatically with the ResNet50 and VGG16 network models. The extracted histogram of oriented gradients feature, texture feature and convolutional neural network features are converted into one-dimensional vectors at a "flatten" layer and concatenated; two fully connected layers, each with 4096 neurons, are then added, and the convolutional neural network model is trained. The fusion scheme is shown in Fig. 5.
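The flatten-and-concatenate step described above can be sketched in plain Python. This is a schematic under the assumption that each feature map is a nested [channel][row][column] list; in the actual model the same operation happens inside the network graph before the two 4096-unit fully connected layers:

```python
def flatten(feature_map):
    """Flatten a [channel][row][column] feature map into a 1-D list,
    mirroring the 'flatten' layer before the fully connected layers."""
    return [v for channel in feature_map for row in channel for v in row]

def fuse_features(*feature_maps):
    """Concatenate the flattened HOG, LBP and CNN feature maps into one
    vector, the input of the fully connected classification layers."""
    fused = []
    for fm in feature_maps:
        fused.extend(flatten(fm))
    return fused
```

For example, a 1-channel 2×2 hand-crafted map and a 2-channel 1×1 convolutional map fuse into a single 6-element vector.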
2.3 Eliminating overlapping pedestrian detections
The adjacent area around a pedestrian box contains many fairly correct detection windows with high mutual overlap. Although RPN candidate selection and detection based on the Faster R-CNN algorithm eliminate many non-pedestrian windows, they cannot eliminate the redundant correct ones. Non-maximum suppression (NMS) is used here; the overlap ratio of two detection windows is defined as
o(w1, w2) = area(w1 ∩ w2) / area(w1 ∪ w2)
where w1 and w2 denote the two windows. The threshold θ is set to 0.7: windows whose overlap with a higher-scoring window exceeds 0.7 are eliminated, which speeds up detection and reduces computation cost.
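A minimal sketch of the overlap ratio and the greedy suppression described above, assuming detection windows in corner format (x1, y1, x2, y2); the function names are illustrative, not from the patent:

```python
def iou(w1, w2):
    """Overlap ratio o(w1, w2) = area(w1 ∩ w2) / area(w1 ∪ w2),
    with windows given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(w1[0], w2[0]), max(w1[1], w2[1])
    ix2, iy2 = min(w1[2], w2[2]), min(w1[3], w2[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area1 = (w1[2] - w1[0]) * (w1[3] - w1[1])
    area2 = (w2[2] - w2[0]) * (w2[3] - w2[1])
    union = area1 + area2 - inter
    return inter / union if union > 0 else 0.0

def nms(windows, scores, theta=0.7):
    """Greedy non-maximum suppression: keep the highest-scoring window,
    then drop every remaining window whose overlap with it exceeds theta.
    Returns the indices of the kept windows."""
    order = sorted(range(len(windows)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(windows[best], windows[i]) <= theta]
    return keep
```

Two nearly coincident boxes collapse to the higher-scoring one, while a distant box survives untouched.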
(3) Results and analysis
The invention uses the Caltech pedestrian data set, which consists of 11 video groups set00 to set10. Groups set00 to set07 are used for training and the remainder for testing. The videos were shot by a vehicle-mounted camera, totalling about 10 hours at a resolution of 640×480. As required by the invention, the data set is converted into individual labelled images, forming 122,817 pictures with 285,558 pedestrians in total. The data set contains pedestrians far away and close up; some occluded by vehicles and some occluding each other; people standing at bus stops and people walking on the road; pictures showing only half a person; complex backgrounds; and varied poses.
3.1 Network model tests
Two popular convolutional network models, ResNet50 and VGG16, are used to extract features and train the network. The convolution kernel size is 3×3, and the other initial parameters follow the pre-trained models. The RPN network uses two loss functions, the RPN regression loss and the RPN classification loss; when learning the convolutional neural network we also use a detection regression loss and a detection classification loss, so the network loss is the sum of these four losses. As shown in Figs. 6 and 7, VGG16 trains with higher accuracy and lower loss than ResNet50, so VGG16 performs better. This shows that in pedestrian detection, network depth improves performance to a certain extent, but deeper is not always better. The VGG16 network model is therefore selected to solve the technical problem.
3.2 Activation function tests and analysis
In a convolutional neural network, the activation function applies a nonlinear transformation to the input signal. Its output feeds the neurons of the next layer, improving the network's nonlinear modelling ability so that it can learn and execute more complex tasks. Without an activation function the whole network, even with many hidden layers, is equivalent to a single-layer network. In classification problems no single activation function is always right; the choice depends on the specific problem and must be tuned. We compare the influence of the ReLU, tanh, Leaky ReLU and PReLU functions on the result.
The tanh function is
tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x)).
When the input is very large or very small, the output is almost flat and the gradient is very small, which hampers weight updates. The ReLU function was proposed to improve on this shortcoming.
The ReLU activation function is defined as
f(x) = max(0, x).
When the input is negative, ReLU does not activate at all, so the corresponding weights cannot be updated. For the case x < 0 the function is correspondingly improved to
f(x) = x for x > 0, f(x) = αx for x ≤ 0,
which replaces the zero gradient with a small non-zero one; α is generally set small, e.g. 0.01, giving the Leaky ReLU function. Similarly, a parameterised ReLU function (PReLU) can be used, making α a parameter to be learned, generally taking a value between 0 and 1. As shown in Fig. 8, the ReLU-family activation used in the convolutional layers of this embodiment yields a comparatively lower loss rate and good nonlinearity. Table 2 lists the loss of each activation function.
Table 2. Loss of each activation function
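The activation functions compared above can be written out directly from their formulas. This is a sketch: α = 0.01 for Leaky ReLU matches the value suggested in the text, and PReLU is the same formula with α learned rather than fixed:

```python
import math

def tanh_act(x):
    """tanh(x) = (e^x - e^-x) / (e^x + e^-x); saturates for large |x|."""
    return math.tanh(x)

def relu(x):
    """ReLU: f(x) = max(0, x); zero gradient for x < 0."""
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: f(x) = x for x > 0, alpha*x otherwise, alpha in (0, 1).
    With alpha as a learned parameter this becomes PReLU."""
    return x if x > 0 else alpha * x
```

The sketch makes the trade-off concrete: tanh saturates toward ±1, ReLU zeroes negative inputs entirely, and Leaky ReLU keeps a small non-zero slope there.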
3.3 Feature fusion tests
Using the VGG16 network as the model, the fused features are classified and tested. As shown in Fig. 9, accuracy increases after the traditional hand-crafted features are fused with the convolutional network features. In summary, the method is used for pedestrian detection and solves the missed detections in complex scenes and for very small targets.
In Fig. 10, (a1)(b1)(c1) are detection results without the HOG and LBP features fused into the convolutional network, and (a2)(b2)(c2) are the results after fusing them in. In group (a) an occluded distant pedestrian is detected; in group (b) an occluded near pedestrian is detected; in group (c) an incomplete pedestrian is detected. After the improvement, occluded targets, distant pedestrians and incomplete pedestrians in the image are all detected, and detection performance increases.
Only the preferred embodiments of the present invention are described in detail above, but the invention is not limited to the above embodiments. Within the knowledge of a person skilled in the art, various changes can be made without departing from the purpose of the invention, and all such changes shall fall within the protection scope of the invention.

Claims (10)

1. A deep-network pedestrian detection method guided by fused shallow-layer features, characterized by comprising the following steps:
S1. preparing a data set for training the network, dividing it into a training set and a test set, and preprocessing it;
S2. extracting a feature network from the images of the training set in S1, dividing it into five blocks; extracting deep network features from the convolutional neural network, and extracting shallow-layer features after the first block of the convolutional neural network;
S3. fusing the shallow-layer features extracted in S2 with the deep network features, using the VGG16 network as the model, and classifying the fused features;
S4. designing the network structure and training the convolutional neural network on the training images from S1 to obtain a well-performing convolutional neural network model;
S5. using the model parameters obtained in S4, testing the convolutional neural network on the test set from S1 to perform pedestrian detection.
2. The deep-layer network pedestrian detection method based on shallow-layer feature fusion guidance according to claim 1, characterized in that the preprocessing of the data set in S1 comprises: selecting the Caltech pedestrian data set, converting it into individual labeled images, and generating a .TXT file that records the coordinates of each label.
3. The deep-layer network pedestrian detection method based on shallow-layer feature fusion guidance according to claim 1, characterized in that the shallow-layer features comprise histogram of oriented gradients (HOG) features and texture features.
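As a concrete illustration of these shallow descriptors, the sketch below computes a HOG-style gradient-orientation histogram and an LBP-style texture histogram with plain NumPy. All parameters (window size, 9 orientation bins, 8-neighbor LBP) are illustrative assumptions, not values fixed by the claims.

```python
import numpy as np

def hog_like(gray, bins=9):
    # Gradient magnitude and unsigned orientation, pooled into one
    # orientation histogram (a simplified, whole-image HOG).
    gy, gx = np.gradient(gray.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    hist, _ = np.histogram(ang, bins=bins, range=(0, 180), weights=mag)
    s = hist.sum()
    return hist / s if s > 0 else hist

def lbp_hist(gray):
    # 8-neighbor local binary pattern codes, summarized as a histogram.
    c = gray[1:-1, 1:-1]
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offs):
        nb = gray[1 + dy:gray.shape[0] - 1 + dy,
                  1 + dx:gray.shape[1] - 1 + dx]
        code |= ((nb >= c).astype(np.uint8) << bit)
    hist, _ = np.histogram(code, bins=256, range=(0, 256), density=True)
    return hist

img = np.random.rand(64, 128)  # pedestrian-window-sized dummy image
feat = np.concatenate([hog_like(img), lbp_hist(img)])
print(feat.shape)  # 9 orientation bins + 256 LBP bins
```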
4. The deep-layer network pedestrian detection method based on shallow-layer feature fusion guidance according to claim 1, characterized in that the specific fusion method in S3 is: the extracted shallow-layer features and the deep network features are each converted into a one-dimensional vector by a flatten layer and concatenated together, after which two fully connected layers are added.
5. The deep-layer network pedestrian detection method based on shallow-layer feature fusion guidance according to claim 4, characterized in that the number of neurons in each of the two fully connected layers is 4096.
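The fusion of claims 4 and 5 can be sketched as follows: flatten the deep feature map and the shallow feature vector, concatenate them, and pass the result through two fully connected layers of 4096 neurons each. The feature-map and descriptor sizes are assumed for illustration, and the weights are random (untrained).

```python
import numpy as np

rng = np.random.default_rng(0)

deep_map = rng.standard_normal((32, 7, 7))  # deep CNN feature map (assumed size)
shallow = rng.standard_normal(256)          # shallow HOG/LBP vector (assumed size)

# "flatten" layer: each feature becomes a one-dimensional vector,
# then the two vectors are concatenated.
fused = np.concatenate([deep_map.ravel(), shallow])

def fc(x, out_dim, rng):
    # A fully connected layer (random weights) followed by ReLU.
    w = rng.standard_normal((out_dim, x.size)) * 0.01
    b = np.zeros(out_dim)
    return np.maximum(w @ x + b, 0.0)

h1 = fc(fused, 4096, rng)  # first 4096-neuron fully connected layer
h2 = fc(h1, 4096, rng)     # second 4096-neuron fully connected layer
print(fused.size, h2.shape)
```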
6. The deep-layer network pedestrian detection method based on shallow-layer feature fusion guidance according to claim 1, characterized in that the network structural framework in S4 uses the Faster R-CNN detection framework.
7. The deep-layer network pedestrian detection method based on shallow-layer feature fusion guidance according to claim 6, characterized in that anchors of four scales are selected in the Faster R-CNN algorithm, the anchor areas being set to the four classes {64², 128², 256², 512²}; under each area, the aspect ratios {1:1, 1:2, 2:1} are chosen.
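A minimal sketch of this anchor scheme: each of the four areas is combined with each of the three aspect ratios, giving 12 anchors per sliding-window position. The centering convention and box format (x1, y1, x2, y2) are illustrative assumptions.

```python
import numpy as np

def make_anchors(cx, cy, scales=(64, 128, 256, 512), ratios=(1.0, 0.5, 2.0)):
    # For each area s*s and ratio r = width/height, solve w*h = area and
    # w/h = r, then center the box at (cx, cy).
    anchors = []
    for s in scales:
        area = float(s * s)
        for r in ratios:
            h = np.sqrt(area / r)
            w = r * h  # so w * h == area and w / h == r
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return np.array(anchors)

a = make_anchors(0, 0)
print(a.shape)  # 4 scales x 3 ratios = 12 anchors, 4 coordinates each
```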
8. The deep-layer network pedestrian detection method based on shallow-layer feature fusion guidance according to claim 6, characterized in that a small network is slid over the convolutional feature map output by the last shared convolutional layer of the Faster R-CNN algorithm, this network being fully connected to a 3 × 3 spatial window of the convolutional feature map output by the shared convolutional layers.
9. The deep-layer network pedestrian detection method based on shallow-layer feature fusion guidance according to claim 1, characterized in that a ReLU activation function is used in the convolutional neural network layers, the activation function being specifically: f(x) = x for x > 0 and f(x) = αx for x ≤ 0, where the gradient α is a constant, α ∈ (0, 1), and x is the input value of the neuron.
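The activation in claim 9 is a leaky-ReLU variant. A small sketch, with α = 0.1 as an illustrative choice (the patent only requires α ∈ (0, 1)):

```python
import numpy as np

def leaky_relu(x, alpha=0.1):
    # f(x) = x for x > 0, f(x) = alpha * x for x <= 0
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, x, alpha * x)

# Negative inputs are scaled by alpha instead of being zeroed out,
# so gradients still flow for x <= 0.
print(leaky_relu([-2.0, 0.0, 3.0]))
```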
10. The deep-layer network pedestrian detection method based on shallow-layer feature fusion guidance according to claim 1, characterized in that non-maximum suppression is used during pedestrian detection to eliminate redundant detection windows; the overlap ratio of two detection windows is IoU(w1, w2) = area(w1 ∩ w2) / area(w1 ∪ w2), where w1 and w2 respectively denote the two detection windows and θ is the threshold; the threshold θ is set to 0.7.
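A sketch of claim 10: compute the overlap (intersection-over-union) of detection windows given as (x1, y1, x2, y2), and suppress any window whose overlap with a higher-scoring window exceeds θ = 0.7. The greedy keep-loop below is a standard NMS formulation, assumed here for illustration.

```python
import numpy as np

def iou(w1, w2):
    # Intersection-over-union of two windows (x1, y1, x2, y2).
    ix1, iy1 = max(w1[0], w2[0]), max(w1[1], w2[1])
    ix2, iy2 = min(w1[2], w2[2]), min(w1[3], w2[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    a1 = (w1[2] - w1[0]) * (w1[3] - w1[1])
    a2 = (w2[2] - w2[0]) * (w2[3] - w2[1])
    return inter / (a1 + a2 - inter)

def nms(boxes, scores, theta=0.7):
    # Keep boxes in descending score order, dropping any box whose IoU
    # with an already-kept box exceeds theta.
    order = np.argsort(scores)[::-1]
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= theta for j in keep):
            keep.append(int(i))
    return keep

boxes = np.array([[0, 0, 10, 10], [0.5, 0.5, 10.5, 10.5], [20, 20, 30, 30]], float)
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))  # → [0, 2]: the second box overlaps the first above 0.7
```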
CN201811438351.5A 2018-11-28 2018-11-28 A kind of deep layer network pedestrian detection method based on the guidance of shallow-layer Fusion Features Pending CN109543632A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811438351.5A CN109543632A (en) 2018-11-28 2018-11-28 A kind of deep layer network pedestrian detection method based on the guidance of shallow-layer Fusion Features


Publications (1)

Publication Number Publication Date
CN109543632A true CN109543632A (en) 2019-03-29

Family

ID=65850971

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811438351.5A Pending CN109543632A (en) 2018-11-28 2018-11-28 A kind of deep layer network pedestrian detection method based on the guidance of shallow-layer Fusion Features

Country Status (1)

Country Link
CN (1) CN109543632A (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109993122A (en) * 2019-04-02 2019-07-09 中国石油大学(华东) A kind of pedestrian based on depth convolutional neural networks multiplies staircase anomaly detection method
CN110163867A (en) * 2019-04-02 2019-08-23 成都真实维度科技有限公司 A method of divided automatically based on lesion faulted scanning pattern
CN110569971A (en) * 2019-09-09 2019-12-13 吉林大学 convolutional neural network single-target identification method based on LeakyRelu activation function
CN110674979A (en) * 2019-09-11 2020-01-10 腾讯科技(深圳)有限公司 Risk prediction model training method, prediction device, medium and equipment
CN110719444A (en) * 2019-11-07 2020-01-21 中国人民解放军国防科技大学 Multi-sensor fusion omnibearing monitoring and intelligent camera shooting method and system
CN110738648A (en) * 2019-10-12 2020-01-31 山东浪潮人工智能研究院有限公司 camera shell paint spraying detection system and method based on multilayer convolutional neural network
CN111898427A (en) * 2020-06-22 2020-11-06 西北工业大学 Multispectral pedestrian detection method based on feature fusion deep neural network
CN111985504A (en) * 2020-08-17 2020-11-24 中国平安人寿保险股份有限公司 Copying detection method, device, equipment and medium based on artificial intelligence
CN112233088A (en) * 2020-10-14 2021-01-15 哈尔滨市科佳通用机电股份有限公司 Brake hose loss detection method based on improved Faster-rcnn
CN113609887A (en) * 2021-04-26 2021-11-05 中国石油大学(华东) Sea surface oil spill detection method integrating deep learning decision and shallow learning decision
CN113887649A (en) * 2021-10-19 2022-01-04 齐鲁工业大学 Target detection method based on fusion of deep-layer features and shallow-layer features
CN114078230A (en) * 2021-11-19 2022-02-22 西南交通大学 Small target detection method for self-adaptive feature fusion redundancy optimization

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106203506A (en) * 2016-07-11 2016-12-07 上海凌科智能科技有限公司 A kind of pedestrian detection method based on degree of depth learning art


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
刘浏: "Research and Implementation of a Pedestrian Recognition System for Camera Networks Based on Deep Learning", China Master's Theses Full-text Database, Information Science and Technology *
叶晨 et al.: "Thyroid Nodule Detection Method Based on CNN Transfer Learning", Computer Engineering and Applications *
曹本: "Research on Face Authentication Algorithms Based on Deep-Learning Feature Extraction", China Master's Theses Full-text Database, Information Science and Technology *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110163867A (en) * 2019-04-02 2019-08-23 成都真实维度科技有限公司 A method of divided automatically based on lesion faulted scanning pattern
CN109993122A (en) * 2019-04-02 2019-07-09 中国石油大学(华东) A kind of pedestrian based on depth convolutional neural networks multiplies staircase anomaly detection method
CN110569971B (en) * 2019-09-09 2022-02-08 吉林大学 Convolutional neural network single-target identification method based on LeakyRelu activation function
CN110569971A (en) * 2019-09-09 2019-12-13 吉林大学 convolutional neural network single-target identification method based on LeakyRelu activation function
CN110674979A (en) * 2019-09-11 2020-01-10 腾讯科技(深圳)有限公司 Risk prediction model training method, prediction device, medium and equipment
CN110738648A (en) * 2019-10-12 2020-01-31 山东浪潮人工智能研究院有限公司 camera shell paint spraying detection system and method based on multilayer convolutional neural network
CN110719444A (en) * 2019-11-07 2020-01-21 中国人民解放军国防科技大学 Multi-sensor fusion omnibearing monitoring and intelligent camera shooting method and system
CN111898427A (en) * 2020-06-22 2020-11-06 西北工业大学 Multispectral pedestrian detection method based on feature fusion deep neural network
CN111985504A (en) * 2020-08-17 2020-11-24 中国平安人寿保险股份有限公司 Copying detection method, device, equipment and medium based on artificial intelligence
CN112233088A (en) * 2020-10-14 2021-01-15 哈尔滨市科佳通用机电股份有限公司 Brake hose loss detection method based on improved Faster-rcnn
CN112233088B (en) * 2020-10-14 2021-08-06 哈尔滨市科佳通用机电股份有限公司 Brake hose loss detection method based on improved Faster-rcnn
CN113609887A (en) * 2021-04-26 2021-11-05 中国石油大学(华东) Sea surface oil spill detection method integrating deep learning decision and shallow learning decision
CN113887649A (en) * 2021-10-19 2022-01-04 齐鲁工业大学 Target detection method based on fusion of deep-layer features and shallow-layer features
CN114078230A (en) * 2021-11-19 2022-02-22 西南交通大学 Small target detection method for self-adaptive feature fusion redundancy optimization
CN114078230B (en) * 2021-11-19 2023-08-25 西南交通大学 Small target detection method for self-adaptive feature fusion redundancy optimization

Similar Documents

Publication Publication Date Title
CN109543632A (en) A kind of deep layer network pedestrian detection method based on the guidance of shallow-layer Fusion Features
CN108573276B (en) Change detection method based on high-resolution remote sensing image
Chen et al. Survey of pedestrian action recognition techniques for autonomous driving
CN106909902B (en) Remote sensing target detection method based on improved hierarchical significant model
Luo et al. Multi-scale traffic vehicle detection based on faster R–CNN with NAS optimization and feature enrichment
CN104050471B (en) Natural scene character detection method and system
Costea et al. Creating roadmaps in aerial images with generative adversarial networks and smoothing-based optimization
CN108460403A (en) The object detection method and system of multi-scale feature fusion in a kind of image
CN114202696A (en) SAR target detection method and device based on context vision and storage medium
CN109902806A (en) Method is determined based on the noise image object boundary frame of convolutional neural networks
Li et al. A method of cross-layer fusion multi-object detection and recognition based on improved faster R-CNN model in complex traffic environment
CN108304873A (en) Object detection method based on high-resolution optical satellite remote-sensing image and its system
CN106845487A (en) A kind of licence plate recognition method end to end
CN111783523B (en) Remote sensing image rotating target detection method
CN108960404B (en) Image-based crowd counting method and device
CN106096602A (en) A kind of Chinese licence plate recognition method based on convolutional neural networks
CN113609896B (en) Object-level remote sensing change detection method and system based on dual-related attention
CN106023257A (en) Target tracking method based on rotor UAV platform
CN104504395A (en) Method and system for achieving classification of pedestrians and vehicles based on neural network
CN113160062B (en) Infrared image target detection method, device, equipment and storage medium
CN106529494A (en) Human face recognition method based on multi-camera model
CN113158943A (en) Cross-domain infrared target detection method
CN113326735B (en) YOLOv 5-based multi-mode small target detection method
CN109902585A (en) A kind of three modality fusion recognition methods of finger based on graph model
CN109993803A (en) The intellectual analysis and evaluation method of city tone

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination