CN109543632A - A deep-network pedestrian detection method guided by fused shallow-layer features - Google Patents


Info

Publication number
CN109543632A
CN109543632A (application CN201811438351.5A)
Authority
CN
China
Prior art keywords: layer, network, shallow, feature, guidance
Prior art date
Legal status: Pending (assumed; not a legal conclusion)
Application number
CN201811438351.5A
Other languages
Chinese (zh)
Inventor
邓红霞
马垚
杨晓峰
李海芳
杨雅茹
Current Assignee
Taiyuan University of Technology
Original Assignee
Taiyuan University of Technology
Priority date
Filing date
Publication date
Application filed by Taiyuan University of Technology
Priority to CN201811438351.5A
Publication of CN109543632A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to the field of computer-vision object detection, and more specifically to a deep-network pedestrian detection method guided by fused shallow-layer features. Building on the Faster R-CNN detection algorithm, it fuses the histogram of oriented gradients feature and an improved texture feature with the deep network features to obtain accurate pedestrian features. With deep learning at the core and shallow-layer features as guidance, the method combines the complementary strengths of shallow and deep learning. The results show that the proposed improvement performs well in both detection accuracy and speed, and effectively solves the problem of missed detections in complex scenes and for very small targets.

Description

A deep-network pedestrian detection method guided by fused shallow-layer features
Technical field
The present invention relates to the field of computer-vision object detection, and more specifically to a deep-network pedestrian detection method guided by fused shallow-layer features.
Background technique
In recent years, fully digital, networked video surveillance systems have shown increasingly clear advantages; their openness, integration and flexibility provide broad room for the informatization of society as a whole. Pedestrian detection is one of their key technologies. However, because the scenes in which pedestrians appear are complex, and pedestrians vary in shape, pose and distance, designing a pedestrian detection algorithm that is both robust and highly accurate has become a research focus.
At present, pedestrian detection methods fall into two main classes: those based on hand-crafted features and those based on deep convolutional neural networks (CNNs). Hand-crafted-feature methods are comparatively simple and efficient, while CNN-based methods are more effective but less efficient. Classical hand-crafted-feature algorithms include the HOG feature of Dalal et al., which captures local image variation, and the combined HOG-LBP feature proposed by Wang et al.; both are paired with an SVM classifier for detection. To handle changes in target pose, Felzenszwalb et al. used the Deformable Parts Model (DPM). Dollár et al. [4] proposed Integral Channel Features (ICF), extracting local feature channels from HOG together with the LUV color channels (HOG+LUV) and cascading AdaBoost as the classifier, which effectively handles target occlusion; their later ACF method further accelerates pedestrian detection. All of these methods combine feature extraction with classifier design.
With the advent of deep learning, methods based on deep convolutional neural networks (CNNs) have also achieved great success in object detection. They typically first generate candidate target boxes and then classify these candidates with a trained CNN model. In 2014, Ross Girshick et al. proposed Regions with CNN features (R-CNN), the classic CNN detection network, but computing the region candidates is very expensive. The subsequent improvement, Fast R-CNN, achieves near-real-time detection with very deep networks, but ignores the time spent generating region proposals. The following year, Kaiming He et al. proposed Faster R-CNN, whose average detection speed is more than 1000 times that of R-CNN: candidate boxes are computed by a deep Region Proposal Network (RPN), so performance improves greatly without sacrificing computation cost. Jiale Cao et al. proposed combining HOG+LUV features with deep network features for AdaBoost classification, which raises the detection rate and improves performance on targets in complex scenes.
Although pedestrian detection algorithms based on Faster R-CNN have been very successful, there is still room for improvement. 1) Most methods use only the last-layer features of the convolutional neural network, classified with softmax or an SVM. In fact, different layers of a CNN represent different image characteristics: the first few layers better describe local image variation, while the last few layers abstractly describe the global image. These features can therefore also be used for training and detection. 2) Many methods perform feature extraction and classification independently; carrying out the whole process as one piece yields an end-to-end detection pipeline.
Summary of the invention
To overcome the shortcomings of the prior art, the present invention provides a deep-network pedestrian detection method guided by fused shallow-layer features, which solves the problem of missed detections that current pedestrian detection algorithms suffer in complex scenes and for very small targets.
To solve the above technical problem, the invention adopts the following technical scheme:
A deep-network pedestrian detection method guided by fused shallow-layer features, comprising the following steps:
S1. Prepare a data set for training the network, divide it into a training set and a test set, and preprocess it;
S2. Extract a feature network from the images of the training set in S1, dividing it into five blocks; extract deep network features from the convolutional neural network, and extract shallow-layer features after the first block of the network;
S3. Fuse the shallow-layer features extracted in S2 with the deep network features, using the VGG16 network as the model, and classify the fused features;
S4. Design the network structure and train the convolutional neural network on the training images from S1 to obtain a well-performing convolutional neural network model;
S5. Using the model parameters obtained in S4, test the convolutional neural network on the test set from S1 to perform pedestrian detection.
Further, the preprocessing in S1 is as follows: choose the Caltech pedestrian data set, convert it into individual labelled images, and generate a .TXT file recording the label coordinates. The data set consists of 11 video groups, set00 to set10, with a video resolution of 640×480.
Further, the shallow-layer features include the histogram of oriented gradients feature and the texture feature.
Further, the fusion in S3 is performed as follows: the extracted shallow-layer features and deep network features are converted into one-dimensional vectors at a "flatten" layer and concatenated, and two fully connected layers are then added.
Further, each of the two fully connected layers has 4096 neurons.
Further, the network structure in S4 uses the Faster R-CNN detection framework.
Further, anchors of four scales are selected in the Faster R-CNN algorithm: the anchor areas are set to the four classes {64², 128², 256², 512²}, and for each area the aspect ratios {1:1, 1:2, 2:1} are used.
Further, a small network is slid over the convolutional feature map output by the last shared convolutional layer of the Faster R-CNN algorithm; this network is fully connected to a 3×3 spatial window of the convolutional features output by the shared convolutional layer.
Further, a ReLU-family activation function is used in the convolutional layers, specifically f(x) = x for x > 0 and f(x) = αx for x ≤ 0, where the gradient α is a constant with α ∈ (0, 1) and x is the input value of the neuron.
Further, redundant correct detection windows are eliminated during pedestrian detection by non-maximum suppression. The overlap ratio of two detection windows is o(w1, w2) = area(w1 ∩ w2) / area(w1 ∪ w2), where w1 and w2 denote the two windows and θ is the threshold; θ is set to 0.7.
Compared with the prior art, the advantageous effects of the present invention are as follows:
The present invention provides a deep-network pedestrian detection method guided by fused shallow-layer features. On the basis of the Faster R-CNN detection algorithm, the histogram of oriented gradients feature and an improved texture feature are fused with the deep network features to obtain accurate pedestrian features; with deep learning at the core and shallow-layer features as guidance, the complementary strengths of shallow and deep learning are combined. The results show that the proposed improvement performs well in detection accuracy and speed, and effectively solves the problem of missed detections in complex scenes and for very small targets.
Detailed description of the invention
Fig. 1 is the network design of the deep-network pedestrian detection method guided by fused shallow-layer features provided by the invention;
Fig. 2 is the structure of the RPN network used;
Fig. 3 is the feature-fusion framework;
Fig. 4 is the representation of the improved texture feature;
Fig. 5 is the feature-fusion scheme;
Fig. 6 compares the model loss of VGG16 and ResNet50;
Fig. 7 compares the classifier accuracy of VGG16 and ResNet50;
Fig. 8 compares the loss of each activation function;
Fig. 9 shows the accuracy of the VGG, VGG+HOG, VGG+LBP and VGG+HOG+LBP features;
Fig. 10 shows partial pedestrian detection results.
Specific embodiment
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings. The described embodiments are only a part of the embodiments of the invention, not all of them. All other embodiments obtained by those of ordinary skill in the art without creative effort, based on the embodiments of the present invention, shall fall within the protection scope of the invention.
A deep-network pedestrian detection method guided by fused shallow-layer features comprises the following steps:
S1. Prepare a data set for training the network, divide it into a training set and a test set, and preprocess it: choose the Caltech pedestrian data set, which consists of 11 video groups set00 to set10 at a video resolution of 640×480; convert it into individual labelled images and generate a .TXT file recording the label coordinates;
S2. Extract a feature network from the images of the training set in S1, dividing it into five blocks; extract deep network features from the convolutional neural network, and extract the shallow-layer features, namely the histogram of oriented gradients feature and the texture feature, after the first block of the network;
S3. Fuse the shallow-layer features extracted in S2 with the deep network features, using the VGG16 network as the model, and classify the fused features: the extracted shallow-layer and deep network features are converted into one-dimensional vectors at a "flatten" layer, concatenated, and followed by two fully connected layers of 4096 neurons each;
S4. Design the network structure and train the convolutional neural network on the training images from S1 to obtain a well-performing convolutional neural network model;
S5. Using the model obtained in S4, test the convolutional neural network on the test set from S1 to perform pedestrian detection.
In this embodiment the network structure uses the Faster R-CNN detection framework: a convolutional network first produces target candidate boxes, which are then classified and regressed. To improve pedestrian detection accuracy and solve the missed detections in complex scenes and for very small targets, fused shallow-layer features are used to guide the deep features of the convolutional network, fusing the histogram of oriented gradients feature and the texture feature with the convolutional features.
Further, a ReLU-family activation function is used in the convolutional layers, specifically f(x) = x for x > 0 and f(x) = αx for x ≤ 0, where the gradient α is a constant with α ∈ (0, 1) and x is the input value of the neuron.
Further, redundant correct detection windows are eliminated during pedestrian detection by non-maximum suppression; the overlap ratio of two detection windows is o(w1, w2) = area(w1 ∩ w2) / area(w1 ∪ w2), where w1 and w2 denote the two windows and θ is the threshold, set to 0.7.
Embodiment
(1) Network structure design
To improve pedestrian detection accuracy, effectively solve missed detections in complex scenes and for very small targets, and enhance the generalization ability of deep learning, the method builds on the Faster R-CNN detection algorithm. Fused shallow-layer features guide the deep features of the convolutional network: the histogram of oriented gradients feature and the texture feature are fused with the convolutional features, with deep learning at the core and shallow-layer features as guidance, combining the complementary strengths of shallow and deep learning. The network structure of the invention is shown in Fig. 1. The architecture contains convolutional layers, pooling layers and fully connected layers, and an activation function is applied in every convolutional layer. After the last convolutional layer an RPN network is used; the RPN contains a convolutional layer connected to a classification layer and a regression layer and generates the target candidate boxes. It is followed by the fully connected layers and, finally, the network's classification and regression layers. Table 1 lists the detailed parameters of the network model.
Table 1. Detailed parameters of the network model
The biggest advantage of the Faster R-CNN algorithm is the Region Proposal Network (RPN), a fully convolutional network that shares the full-image convolutional features with the detection network, as shown in Fig. 2. Because the pedestrians in the chosen data set differ in build and height, anchors of four scales are selected to cover size differences such as tall, short, stout and thin: the anchor areas are set to the four classes {64², 128², 256², 512²}, and each area uses the aspect ratios {1:1, 1:2, 2:1}. To generate target candidate boxes, a small network is slid over the convolutional feature map output by the last shared convolutional layer; this network is fully connected to an n×n spatial window of that feature map (here n = 3). Each sliding window is mapped to a low-dimensional vector and fed to two sibling 1×1 convolutional layers: a candidate-box classification layer (cls, the probability that each candidate is target/non-target) and a candidate-box regression layer (reg, the coordinate encoding of each candidate box).
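The anchor enumeration described above (four areas, three aspect ratios, twelve anchors per sliding-window position) can be sketched in a few lines of Python. Solving width = sqrt(area · w/h) for each area and w:h ratio is an assumption about how the area/ratio combination is realised, and the function name is illustrative, not from the patent:

```python
import math

def generate_anchors(areas=(64**2, 128**2, 256**2, 512**2),
                     ratios=((1, 1), (1, 2), (2, 1))):
    """Enumerate (width, height) anchor shapes for every area/ratio pair.

    For a target area A and width:height ratio w:h, the shape that has
    exactly that area and ratio is width = sqrt(A * w / h), height = A / width.
    """
    anchors = []
    for area in areas:
        for (rw, rh) in ratios:
            width = math.sqrt(area * rw / rh)
            height = area / width
            anchors.append((width, height))
    return anchors

shapes = generate_anchors()  # 4 areas x 3 ratios = 12 anchor shapes
```

In the RPN these twelve shapes are centred on every position of the shared feature map; the sketch only produces the shape list.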
(2) Selection and extraction of image features
Many kinds of features influence pedestrian detection. This embodiment chooses two shallow hand-crafted features, the histogram of oriented gradients feature (HOG) and the texture feature (LBP), together with the deep features of the convolutional neural network.
Each convolutional layer of the network represents different image characteristics: the image channels of the first few layers better describe local image variation, while those of the last few layers abstractly describe the global image structure. Shallow-layer features also describe image variation well: the HOG feature describes local edge direction and variation, and the improved LBP feature captures edge and local shape information. By contrast, the convolutional features are simple to obtain and relatively efficient to compute. The HOG feature, the improved LBP feature and the convolutional features are therefore integrated to construct the feature layers of the image. The basic feature-fusion framework, shown in Fig. 3, can be divided into three parts: 1) first, multi-layer image channels L1 to LN are generated by the convolutional neural network, with the traditional shallow hand-crafted channels (HOG+LBP) taken after the first block; each layer contains multiple image channels; 2) second, feature extraction: the HOG and LBP features are extracted after the first block of the network, and L1 to LN provide the convolutional network model features; 3) finally, the target candidate boxes are classified and regressed end to end.
2.1 Shallow-layer features
The histogram of oriented gradients (HOG) is widely accepted as one of the best features for capturing edge and local shape information. The HOG algorithm starts from the gradients
Gx(x, y) = I(x+1, y) − I(x−1, y)
Gy(x, y) = I(x, y+1) − I(x, y−1)
where Gx(x, y) is the horizontal gradient at pixel (x, y) of the input image, Gy(x, y) is the vertical gradient, and I(x, y) is the pixel value at (x, y).
Rectangular blocks (R-HOG) are used for the HOG feature. An R-HOG block is characterized by three parameters: the number of cells per block, the number of pixels per cell, and the number of histogram channels per cell. Before the computation, a Gaussian spatial window is applied to each block, which reduces the weight of the pixels around the edge. By tuning these parameters, the optimal setting for pedestrian detection is 3×3 cells per block, 6×6 pixels per cell and 9 histogram channels.
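The gradient step of the HOG pipeline above can be sketched directly from the two difference formulas. This is a minimal illustration assuming a grayscale image stored as a nested list indexed image[y][x]; the Gaussian window and the voting into the 9 histogram channels are omitted, and the function names are not from the patent:

```python
import math

def hog_gradients(image, x, y):
    """Central-difference gradients at interior pixel (x, y):
    Gx(x,y) = I(x+1,y) - I(x-1,y) and Gy(x,y) = I(x,y+1) - I(x,y-1).
    """
    gx = image[y][x + 1] - image[y][x - 1]
    gy = image[y + 1][x] - image[y - 1][x]
    return gx, gy

def gradient_magnitude_orientation(gx, gy):
    """Magnitude and unsigned orientation in [0, 180) degrees, the
    quantities each pixel would vote into a cell's histogram channels."""
    magnitude = math.hypot(gx, gy)
    orientation = math.degrees(math.atan2(gy, gx)) % 180.0
    return magnitude, orientation
```

A full HOG descriptor would evaluate these at every pixel, accumulate per-cell histograms and normalize per block.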
The local binary pattern (LBP) is a method for describing local image features; its key advantages are invariance to monotonic gray-level changes and computational efficiency. The original LBP is defined on a 3×3 neighbourhood: taking the gray value of the centre pixel as the threshold, the gray values of the 8 neighbouring pixels are compared with it; a neighbour greater than or equal to the centre value is marked 1, otherwise 0. The surrounding 0-1 sequence, read in order, forms an 8-bit binary number, which is converted to a decimal integer; this integer is the LBP value characterizing the window, and it reflects the texture information of the region. The basic formula is
LBP(P, R) = Σ_{p=0}^{P−1} s(g_p − g_c) · 2^p, with s(x) = 1 if x ≥ 0 and s(x) = 0 otherwise,
where g_c is the gray value of the centre pixel, g_p (p = 0, 1, …, P−1) is the gray value of a neighbouring pixel at radius R, and P is the number of neighbouring pixels. However, this LBP feature uses the gray values of a fixed neighbourhood: when the scale of the image changes, the LBP coding goes wrong and can no longer correctly reflect the texture around the pixel. Therefore, to adapt to textures of different scales and to satisfy the requirements of gray-level and rotation invariance, the 3×3 neighbourhood is extended to an arbitrary one: a circular neighbourhood replaces the square one, and the improved LBP operator allows any number of pixels in a circular neighbourhood of radius R. This yields an LBP operator with P sampling points in a circular region of radius R, as shown in Fig. 4.
The coordinates of each sampling point are (x_p, y_p) = (x_c + R·cos(2πp/P), y_c − R·sin(2πp/P)), where (x_c, y_c) is the centre pixel.
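A minimal sketch of the improved circular LBP operator, combining the LBP(P, R) formula with the circular sampling coordinates above. Nearest-neighbour rounding of the off-grid sample positions is an assumption made here for brevity; a faithful implementation would interpolate, and the function name is illustrative:

```python
import math

def lbp_value(image, xc, yc, P=8, R=1.0):
    """Circular LBP: sample P points on a circle of radius R around the
    centre (xc, yc), threshold each sample against the centre gray value
    g_c, and pack the 0/1 results into an integer:
        LBP(P, R) = sum_{p=0}^{P-1} s(g_p - g_c) * 2^p,  s(x)=1 if x>=0.
    `image` is a nested list indexed image[y][x]."""
    gc = image[yc][xc]
    code = 0
    for p in range(P):
        xp = xc + R * math.cos(2.0 * math.pi * p / P)
        yp = yc - R * math.sin(2.0 * math.pi * p / P)
        gp = image[int(round(yp))][int(round(xp))]  # nearest-neighbour sample
        if gp >= gc:
            code |= 1 << p
    return code
```

On a uniform patch every sample equals the centre, so all P bits are set; a centre brighter than all its neighbours gives code 0.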
2.2 Deep convolutional network features and feature fusion
The deep convolutional features of the images are extracted automatically with the ResNet50 and VGG16 network models. The extracted histogram of oriented gradients feature, texture feature and convolutional neural network features are converted into one-dimensional vectors at a "flatten" layer and concatenated; two fully connected layers, each with 4096 neurons, are then added, and the convolutional neural network model is trained. The fusion scheme is shown in Fig. 5.
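The flatten-and-concatenate step described above can be sketched in plain Python. This is a schematic under the assumption that each feature map is a nested [channel][row][column] list; in the actual model the same operation happens inside the network graph before the two 4096-unit fully connected layers:

```python
def flatten(feature_map):
    """Flatten a [channel][row][column] feature map into a 1-D list,
    mirroring the 'flatten' layer before the fully connected layers."""
    return [v for channel in feature_map for row in channel for v in row]

def fuse_features(*feature_maps):
    """Concatenate the flattened HOG, LBP and CNN feature maps into one
    vector, the input of the fully connected classification layers."""
    fused = []
    for fm in feature_maps:
        fused.extend(flatten(fm))
    return fused
```

For example, a 1-channel 2×2 hand-crafted map and a 2-channel 1×1 convolutional map fuse into a single 6-element vector.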
2.3 Eliminating overlapping pedestrian detections
The adjacent area around a pedestrian box contains many fairly correct detection windows with high mutual overlap. Although RPN candidate selection and detection based on the Faster R-CNN algorithm eliminate many non-pedestrian windows, they cannot eliminate the redundant correct ones. Non-maximum suppression (NMS) is used here; the overlap ratio of two detection windows is defined as
o(w1, w2) = area(w1 ∩ w2) / area(w1 ∪ w2)
where w1 and w2 denote the two windows. The threshold θ is set to 0.7: windows whose overlap with a higher-scoring window exceeds 0.7 are eliminated, which speeds up detection and reduces computation cost.
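A minimal sketch of the overlap ratio and the greedy suppression described above, assuming detection windows in corner format (x1, y1, x2, y2); the function names are illustrative, not from the patent:

```python
def iou(w1, w2):
    """Overlap ratio o(w1, w2) = area(w1 ∩ w2) / area(w1 ∪ w2),
    with windows given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(w1[0], w2[0]), max(w1[1], w2[1])
    ix2, iy2 = min(w1[2], w2[2]), min(w1[3], w2[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area1 = (w1[2] - w1[0]) * (w1[3] - w1[1])
    area2 = (w2[2] - w2[0]) * (w2[3] - w2[1])
    union = area1 + area2 - inter
    return inter / union if union > 0 else 0.0

def nms(windows, scores, theta=0.7):
    """Greedy non-maximum suppression: keep the highest-scoring window,
    then drop every remaining window whose overlap with it exceeds theta.
    Returns the indices of the kept windows."""
    order = sorted(range(len(windows)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(windows[best], windows[i]) <= theta]
    return keep
```

Two nearly coincident boxes collapse to the higher-scoring one, while a distant box survives untouched.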
(3) Results and analysis
The invention uses the Caltech pedestrian data set, which consists of 11 video groups set00 to set10. Groups set00 to set07 are used for training and the remainder for testing. The videos were shot by a vehicle-mounted camera, totalling about 10 hours at a resolution of 640×480. As required by the invention, the data set is converted into individual labelled images, forming 122,817 pictures with 285,558 pedestrians in total. The data set contains pedestrians far away and close up; some occluded by vehicles and some occluding each other; people standing at bus stops and people walking on the road; pictures showing only half a person; complex backgrounds; and varied poses.
3.1 Network model tests
Two popular convolutional network models, ResNet50 and VGG16, are used to extract features and train the network. The convolution kernel size is 3×3, and the other initial parameters follow the pre-trained models. The RPN network uses two loss functions, the RPN regression loss and the RPN classification loss; when learning the convolutional neural network we also use a detection regression loss and a detection classification loss, so the network loss is the sum of these four losses. As shown in Figs. 6 and 7, VGG16 trains with higher accuracy and lower loss than ResNet50, so VGG16 performs better. This shows that in pedestrian detection, network depth improves performance to a certain extent, but deeper is not always better. The VGG16 network model is therefore selected to solve the technical problem.
3.2 Activation function tests and analysis
In a convolutional neural network, the activation function applies a nonlinear transformation to the input signal. Its output feeds the neurons of the next layer, improving the network's nonlinear modelling ability so that it can learn and execute more complex tasks. Without an activation function the whole network, even with many hidden layers, is equivalent to a single-layer network. In classification problems no single activation function is always right; the choice depends on the specific problem and must be tuned. We compare the influence of the ReLU, tanh, Leaky ReLU and PReLU functions on the result.
The tanh function is
tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x)).
When the input is very large or very small, the output is almost flat and the gradient is very small, which hampers weight updates. The ReLU function was proposed to improve on this shortcoming.
The ReLU activation function is defined as
f(x) = max(0, x).
When the input is negative, ReLU does not activate at all, so the corresponding weights cannot be updated. For the case x < 0 the function is correspondingly improved to
f(x) = x for x > 0, f(x) = αx for x ≤ 0,
which replaces the zero gradient with a small non-zero one; α is generally set small, e.g. 0.01, giving the Leaky ReLU function. Similarly, a parameterised ReLU function (PReLU) can be used, making α a parameter to be learned, generally taking a value between 0 and 1. As shown in Fig. 8, the ReLU-family activation used in the convolutional layers of this embodiment yields a comparatively lower loss rate and good nonlinearity. Table 2 lists the loss of each activation function.
Table 2. Loss of each activation function
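The activation functions compared above can be written out directly from their formulas. This is a sketch: α = 0.01 for Leaky ReLU matches the value suggested in the text, and PReLU is the same formula with α learned rather than fixed:

```python
import math

def tanh_act(x):
    """tanh(x) = (e^x - e^-x) / (e^x + e^-x); saturates for large |x|."""
    return math.tanh(x)

def relu(x):
    """ReLU: f(x) = max(0, x); zero gradient for x < 0."""
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: f(x) = x for x > 0, alpha*x otherwise, alpha in (0, 1).
    With alpha as a learned parameter this becomes PReLU."""
    return x if x > 0 else alpha * x
```

The sketch makes the trade-off concrete: tanh saturates toward ±1, ReLU zeroes negative inputs entirely, and Leaky ReLU keeps a small non-zero slope there.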
3.3 Feature fusion tests
Using the VGG16 network as the model, the fused features are classified and tested. As shown in Fig. 9, accuracy increases after the traditional hand-crafted features are fused with the convolutional network features. In summary, the method is used for pedestrian detection and solves the missed detections in complex scenes and for very small targets.
In Fig. 10, (a1)(b1)(c1) are detection results without the HOG and LBP features fused into the convolutional network, and (a2)(b2)(c2) are the results after fusing them in. In group (a) an occluded distant pedestrian is detected; in group (b) an occluded near pedestrian is detected; in group (c) an incomplete pedestrian is detected. After the improvement, occluded targets, distant pedestrians and incomplete pedestrians in the image are all detected, and detection performance increases.
Only the preferred embodiments of the present invention are described in detail above, but the invention is not limited to the above embodiments. Within the knowledge of a person skilled in the art, various changes can be made without departing from the purpose of the invention, and all such changes shall fall within the protection scope of the invention.

Claims (10)

1. A deep-network pedestrian detection method guided by fused shallow-layer features, characterized by comprising the following steps:
S1. preparing a data set for training the network, dividing it into a training set and a test set, and preprocessing it;
S2. extracting a feature network from the images of the training set in S1, dividing it into five blocks; extracting deep network features from the convolutional neural network, and extracting shallow-layer features after the first block of the convolutional neural network;
S3. fusing the shallow-layer features extracted in S2 with the deep network features, using the VGG16 network as the model, and classifying the fused features;
S4. designing the network structure and training the convolutional neural network on the training images from S1 to obtain a well-performing convolutional neural network model;
S5. using the model parameters obtained in S4, testing the convolutional neural network on the test set from S1 to perform pedestrian detection.
2. The deep-layer network pedestrian detection method based on shallow-layer feature fusion guidance according to claim 1, characterized in that the preprocessing of the data set in S1 comprises: selecting the Caltech pedestrian data set, converting it into individual labeled images, and generating a .TXT file that records the coordinates of each label.
3. The deep-layer network pedestrian detection method based on shallow-layer feature fusion guidance according to claim 1, characterized in that the shallow-layer features comprise histogram of oriented gradients (HOG) features and texture features.
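As a concrete illustration of these shallow descriptors, the sketch below computes a HOG-style gradient-orientation histogram and an LBP-style texture histogram with plain NumPy. All parameters (window size, 9 orientation bins, 8-neighbor LBP) are illustrative assumptions, not values fixed by the claims.

```python
import numpy as np

def hog_like(gray, bins=9):
    # Gradient magnitude and unsigned orientation, pooled into one
    # orientation histogram (a simplified, whole-image HOG).
    gy, gx = np.gradient(gray.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    hist, _ = np.histogram(ang, bins=bins, range=(0, 180), weights=mag)
    s = hist.sum()
    return hist / s if s > 0 else hist

def lbp_hist(gray):
    # 8-neighbor local binary pattern codes, summarized as a histogram.
    c = gray[1:-1, 1:-1]
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offs):
        nb = gray[1 + dy:gray.shape[0] - 1 + dy,
                  1 + dx:gray.shape[1] - 1 + dx]
        code |= ((nb >= c).astype(np.uint8) << bit)
    hist, _ = np.histogram(code, bins=256, range=(0, 256), density=True)
    return hist

img = np.random.rand(64, 128)  # pedestrian-window-sized dummy image
feat = np.concatenate([hog_like(img), lbp_hist(img)])
print(feat.shape)  # 9 orientation bins + 256 LBP bins
```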
4. The deep-layer network pedestrian detection method based on shallow-layer feature fusion guidance according to claim 1, characterized in that the specific fusion method in S3 is: the extracted shallow-layer features and the deep network features are each converted into a one-dimensional vector by a flatten layer and concatenated together, after which two fully connected layers are added.
5. The deep-layer network pedestrian detection method based on shallow-layer feature fusion guidance according to claim 4, characterized in that the number of neurons in each of the two fully connected layers is 4096.
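The fusion of claims 4 and 5 can be sketched as follows: flatten the deep feature map and the shallow feature vector, concatenate them, and pass the result through two fully connected layers of 4096 neurons each. The feature-map and descriptor sizes are assumed for illustration, and the weights are random (untrained).

```python
import numpy as np

rng = np.random.default_rng(0)

deep_map = rng.standard_normal((32, 7, 7))  # deep CNN feature map (assumed size)
shallow = rng.standard_normal(256)          # shallow HOG/LBP vector (assumed size)

# "flatten" layer: each feature becomes a one-dimensional vector,
# then the two vectors are concatenated.
fused = np.concatenate([deep_map.ravel(), shallow])

def fc(x, out_dim, rng):
    # A fully connected layer (random weights) followed by ReLU.
    w = rng.standard_normal((out_dim, x.size)) * 0.01
    b = np.zeros(out_dim)
    return np.maximum(w @ x + b, 0.0)

h1 = fc(fused, 4096, rng)  # first 4096-neuron fully connected layer
h2 = fc(h1, 4096, rng)     # second 4096-neuron fully connected layer
print(fused.size, h2.shape)
```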
6. The deep-layer network pedestrian detection method based on shallow-layer feature fusion guidance according to claim 1, characterized in that the network structural framework in S4 uses the Faster R-CNN detection framework.
7. The deep-layer network pedestrian detection method based on shallow-layer feature fusion guidance according to claim 6, characterized in that anchors of four scales are selected in the Faster R-CNN algorithm, the anchor areas being set to the four classes {64², 128², 256², 512²}; under each area, the aspect ratios {1:1, 1:2, 2:1} are chosen.
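A minimal sketch of this anchor scheme: each of the four areas is combined with each of the three aspect ratios, giving 12 anchors per sliding-window position. The centering convention and box format (x1, y1, x2, y2) are illustrative assumptions.

```python
import numpy as np

def make_anchors(cx, cy, scales=(64, 128, 256, 512), ratios=(1.0, 0.5, 2.0)):
    # For each area s*s and ratio r = width/height, solve w*h = area and
    # w/h = r, then center the box at (cx, cy).
    anchors = []
    for s in scales:
        area = float(s * s)
        for r in ratios:
            h = np.sqrt(area / r)
            w = r * h  # so w * h == area and w / h == r
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return np.array(anchors)

a = make_anchors(0, 0)
print(a.shape)  # 4 scales x 3 ratios = 12 anchors, 4 coordinates each
```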
8. The deep-layer network pedestrian detection method based on shallow-layer feature fusion guidance according to claim 6, characterized in that a small network is slid over the convolutional feature map output by the last shared convolutional layer of the Faster R-CNN algorithm, this network being fully connected to a 3 × 3 spatial window of the convolutional feature map output by the shared convolutional layers.
9. The deep-layer network pedestrian detection method based on shallow-layer feature fusion guidance according to claim 1, characterized in that a ReLU activation function is used in the convolutional neural network layers, the activation function being specifically: f(x) = x for x > 0 and f(x) = αx for x ≤ 0, where the gradient α is a constant, α ∈ (0, 1), and x is the input value of the neuron.
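The activation in claim 9 is a leaky-ReLU variant. A small sketch, with α = 0.1 as an illustrative choice (the patent only requires α ∈ (0, 1)):

```python
import numpy as np

def leaky_relu(x, alpha=0.1):
    # f(x) = x for x > 0, f(x) = alpha * x for x <= 0
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, x, alpha * x)

# Negative inputs are scaled by alpha instead of being zeroed out,
# so gradients still flow for x <= 0.
print(leaky_relu([-2.0, 0.0, 3.0]))
```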
10. The deep-layer network pedestrian detection method based on shallow-layer feature fusion guidance according to claim 1, characterized in that non-maximum suppression is used during pedestrian detection to eliminate redundant detection windows; the overlap ratio of two detection windows is IoU(w1, w2) = area(w1 ∩ w2) / area(w1 ∪ w2), where w1 and w2 respectively denote the two detection windows and θ is the threshold; the threshold θ is set to 0.7.
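A sketch of claim 10: compute the overlap (intersection-over-union) of detection windows given as (x1, y1, x2, y2), and suppress any window whose overlap with a higher-scoring window exceeds θ = 0.7. The greedy keep-loop below is a standard NMS formulation, assumed here for illustration.

```python
import numpy as np

def iou(w1, w2):
    # Intersection-over-union of two windows (x1, y1, x2, y2).
    ix1, iy1 = max(w1[0], w2[0]), max(w1[1], w2[1])
    ix2, iy2 = min(w1[2], w2[2]), min(w1[3], w2[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    a1 = (w1[2] - w1[0]) * (w1[3] - w1[1])
    a2 = (w2[2] - w2[0]) * (w2[3] - w2[1])
    return inter / (a1 + a2 - inter)

def nms(boxes, scores, theta=0.7):
    # Keep boxes in descending score order, dropping any box whose IoU
    # with an already-kept box exceeds theta.
    order = np.argsort(scores)[::-1]
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= theta for j in keep):
            keep.append(int(i))
    return keep

boxes = np.array([[0, 0, 10, 10], [0.5, 0.5, 10.5, 10.5], [20, 20, 30, 30]], float)
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))  # → [0, 2]: the second box overlaps the first above 0.7
```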
CN201811438351.5A 2018-11-28 2018-11-28 A kind of deep layer network pedestrian detection method based on the guidance of shallow-layer Fusion Features Pending CN109543632A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811438351.5A CN109543632A (en) 2018-11-28 2018-11-28 A kind of deep layer network pedestrian detection method based on the guidance of shallow-layer Fusion Features


Publications (1)

Publication Number Publication Date
CN109543632A true CN109543632A (en) 2019-03-29

Family

ID=65850971

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811438351.5A Pending CN109543632A (en) 2018-11-28 2018-11-28 A kind of deep layer network pedestrian detection method based on the guidance of shallow-layer Fusion Features

Country Status (1)

Country Link
CN (1) CN109543632A (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109993122A (en) * 2019-04-02 2019-07-09 中国石油大学(华东) A kind of pedestrian based on depth convolutional neural networks multiplies staircase anomaly detection method
CN110163867A (en) * 2019-04-02 2019-08-23 成都真实维度科技有限公司 A method of divided automatically based on lesion faulted scanning pattern
CN110569971A (en) * 2019-09-09 2019-12-13 吉林大学 convolutional neural network single-target identification method based on LeakyRelu activation function
CN110674979A (en) * 2019-09-11 2020-01-10 腾讯科技(深圳)有限公司 Risk prediction model training method, prediction device, medium and equipment
CN110719444A (en) * 2019-11-07 2020-01-21 中国人民解放军国防科技大学 Multi-sensor fusion omnibearing monitoring and intelligent camera shooting method and system
CN110738648A (en) * 2019-10-12 2020-01-31 山东浪潮人工智能研究院有限公司 camera shell paint spraying detection system and method based on multilayer convolutional neural network
CN111898427A (en) * 2020-06-22 2020-11-06 西北工业大学 Multispectral pedestrian detection method based on feature fusion deep neural network
CN111985504A (en) * 2020-08-17 2020-11-24 中国平安人寿保险股份有限公司 Copying detection method, device, equipment and medium based on artificial intelligence
CN112233088A (en) * 2020-10-14 2021-01-15 哈尔滨市科佳通用机电股份有限公司 Brake hose loss detection method based on improved Faster-rcnn
CN113609887A (en) * 2021-04-26 2021-11-05 中国石油大学(华东) Sea surface oil spill detection method integrating deep learning decision and shallow learning decision
CN113887649A (en) * 2021-10-19 2022-01-04 齐鲁工业大学 Target detection method based on fusion of deep-layer features and shallow-layer features
CN114078230A (en) * 2021-11-19 2022-02-22 西南交通大学 Small target detection method for self-adaptive feature fusion redundancy optimization

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106203506A (en) * 2016-07-11 2016-12-07 上海凌科智能科技有限公司 A kind of pedestrian detection method based on degree of depth learning art


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
刘浏: "Research and Implementation of a Pedestrian Recognition System for Camera Networks Based on Deep Learning", China Master's Theses Full-text Database, Information Science and Technology *
叶晨 et al.: "Thyroid Nodule Detection Method Based on CNN Transfer Learning", Computer Engineering and Applications *
曹本: "Research on Face Authentication Algorithms Based on Deep-Learning Feature Extraction", China Master's Theses Full-text Database, Information Science and Technology *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110163867A (en) * 2019-04-02 2019-08-23 成都真实维度科技有限公司 A method of divided automatically based on lesion faulted scanning pattern
CN109993122A (en) * 2019-04-02 2019-07-09 中国石油大学(华东) A kind of pedestrian based on depth convolutional neural networks multiplies staircase anomaly detection method
CN110569971B (en) * 2019-09-09 2022-02-08 吉林大学 Convolutional neural network single-target identification method based on LeakyRelu activation function
CN110569971A (en) * 2019-09-09 2019-12-13 吉林大学 convolutional neural network single-target identification method based on LeakyRelu activation function
CN110674979A (en) * 2019-09-11 2020-01-10 腾讯科技(深圳)有限公司 Risk prediction model training method, prediction device, medium and equipment
CN110738648A (en) * 2019-10-12 2020-01-31 山东浪潮人工智能研究院有限公司 camera shell paint spraying detection system and method based on multilayer convolutional neural network
CN110719444A (en) * 2019-11-07 2020-01-21 中国人民解放军国防科技大学 Multi-sensor fusion omnibearing monitoring and intelligent camera shooting method and system
CN111898427A (en) * 2020-06-22 2020-11-06 西北工业大学 Multispectral pedestrian detection method based on feature fusion deep neural network
CN111985504A (en) * 2020-08-17 2020-11-24 中国平安人寿保险股份有限公司 Copying detection method, device, equipment and medium based on artificial intelligence
CN112233088A (en) * 2020-10-14 2021-01-15 哈尔滨市科佳通用机电股份有限公司 Brake hose loss detection method based on improved Faster-rcnn
CN112233088B (en) * 2020-10-14 2021-08-06 哈尔滨市科佳通用机电股份有限公司 Brake hose loss detection method based on improved Faster-rcnn
CN113609887A (en) * 2021-04-26 2021-11-05 中国石油大学(华东) Sea surface oil spill detection method integrating deep learning decision and shallow learning decision
CN113887649A (en) * 2021-10-19 2022-01-04 齐鲁工业大学 Target detection method based on fusion of deep-layer features and shallow-layer features
CN114078230A (en) * 2021-11-19 2022-02-22 西南交通大学 Small target detection method for self-adaptive feature fusion redundancy optimization
CN114078230B (en) * 2021-11-19 2023-08-25 西南交通大学 Small target detection method for self-adaptive feature fusion redundancy optimization

Similar Documents

Publication Publication Date Title
CN109543632A (en) A kind of deep layer network pedestrian detection method based on the guidance of shallow-layer Fusion Features
CN108573276B (en) Change detection method based on high-resolution remote sensing image
Chen et al. Survey of pedestrian action recognition techniques for autonomous driving
CN106909902B (en) Remote sensing target detection method based on improved hierarchical significant model
Luo et al. Multi-scale traffic vehicle detection based on faster R–CNN with NAS optimization and feature enrichment
CN104050471B (en) Natural scene character detection method and system
Costea et al. Creating roadmaps in aerial images with generative adversarial networks and smoothing-based optimization
CN108460403A (en) The object detection method and system of multi-scale feature fusion in a kind of image
CN114202696A (en) SAR target detection method and device based on context vision and storage medium
CN109902806A (en) Method is determined based on the noise image object boundary frame of convolutional neural networks
Li et al. A method of cross-layer fusion multi-object detection and recognition based on improved faster R-CNN model in complex traffic environment
CN108304873A (en) Object detection method based on high-resolution optical satellite remote-sensing image and its system
CN106845487A (en) A kind of licence plate recognition method end to end
CN111783523B (en) Remote sensing image rotating target detection method
CN108960404B (en) Image-based crowd counting method and device
CN106096602A (en) A kind of Chinese licence plate recognition method based on convolutional neural networks
CN113609896B (en) Object-level remote sensing change detection method and system based on dual-related attention
CN106023257A (en) Target tracking method based on rotor UAV platform
CN104504395A (en) Method and system for achieving classification of pedestrians and vehicles based on neural network
CN113160062B (en) Infrared image target detection method, device, equipment and storage medium
CN106529494A (en) Human face recognition method based on multi-camera model
CN113158943A (en) Cross-domain infrared target detection method
CN113326735B (en) YOLOv 5-based multi-mode small target detection method
CN109902585A (en) A kind of three modality fusion recognition methods of finger based on graph model
CN109993803A (en) The intellectual analysis and evaluation method of city tone

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination