CN109840498A - Real-time pedestrian detection method, neural network, and target detection layer - Google Patents

Real-time pedestrian detection method, neural network, and target detection layer

Info

Publication number
CN109840498A
CN109840498A (application CN201910095995.7A, granted as CN109840498B)
Authority
CN
China
Prior art keywords
frame, pedestrian, target, layer, real
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910095995.7A
Other languages
Chinese (zh)
Other versions
CN109840498B (en)
Inventor
胡永健
阿尔法西·萨吉尔·艾哈迈德·萨吉尔
刘琲贝
王宇飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Sino Singapore International Joint Research Institute
Original Assignee
South China University of Technology SCUT
Sino Singapore International Joint Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT and Sino Singapore International Joint Research Institute
Priority to CN201910095995.7A (granted as CN109840498B)
Publication of CN109840498A
Application granted
Publication of CN109840498B
Status: Expired - Fee Related
Anticipated expiration

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a real-time pedestrian detection method whose main steps are: determining a default resolution; reading a video frame; determining the number of partition blocks from a zoom factor; resizing the video frame; partitioning the video frame; stacking the partitioned sub-blocks and extracting features; predicting the coordinate parameters and pedestrian confidences of candidate pedestrian boxes; filtering out the final pedestrian boxes; adjusting the zoom factor according to the pedestrian sizes in the current frame; and continuing with the next frame until the whole detection task is completed. The invention also discloses a neural network comprising 7 (or 8, or 9) convolutional layers, and a target detection layer that performs two functions: predicting pedestrian target-box coordinates and predicting target-box confidences. By adaptively scaling video frames via the zoom factor, the present invention maintains detection accuracy and computation speed while particularly improving the detection of small pedestrian targets.

Description

Real-time pedestrian detection method, neural network, and target detection layer
Technical field
The present invention relates to the field of deep-learning-based video processing, and in particular to a real-time pedestrian detection method based on deep convolutional neural networks, together with the corresponding neural network and target detection layer.
Background art
Object detection is an important computer vision technique. Within it, pedestrian detection has broad application prospects in frontier fields such as intelligent robotics, video surveillance, and autonomous driving, and has attracted attention from both academia and industry. Many pedestrian detection methods have been invented over the past decade, but numerous practical application problems remain to be solved; pedestrian detection is still a highly challenging task in computer vision.
Conventional pedestrian detection algorithms are mostly based on hand-designed features such as SIFT, SURF, and HOG. With the development of deep learning, and in particular the convolutional neural network (CNN), which is highly effective in image analysis tasks, deep learning algorithms have come to be used for pedestrian recognition and detection. Cai et al. published "A unified multi-scale deep convolutional neural network for fast object detection" at the 2016 European Conference on Computer Vision (ECCV 2016), using different convolutional layers in a CNN to match images at different scales and combining them for end-to-end training to perform detection at multiple scales. Compared with conventional pedestrian detection algorithms, this algorithm improves detection accuracy, but its recognition speed is slow: it reaches only 15 frames/second even on an NVIDIA Titan GPU, which makes real-time requirements difficult to meet. Du et al. published "Fused DNN: A deep neural network fusion approach to fast and robust pedestrian detection" at the 2017 IEEE Winter Conference on Applications of Computer Vision (WACV 2017), improving detection accuracy with multiple parallel CNNs; however, due to the excessive number of network parameters, its detection speed is slow, reaching only about 3 frames/second on an NVIDIA Titan X GPU. Brazil et al. published "Illuminating pedestrians via simultaneous detection and segmentation" at the 2017 International Conference on Computer Vision (ICCV 2017), achieving better pedestrian detection in crowds through detection and segmentation networks that share features. However, due to the complex network structure, the method consumes a large amount of storage space, and its detection speed also falls short of real-time requirements. Besides the high computational cost and poor real-time performance, all of the above methods miss a large number of pedestrians who are far from the camera and thus small in size, making them unable to meet the detection requirements of practical application scenarios.
In practical applications, because of varying background complexity, pedestrian appearance (different sizes or clothing), lighting/weather conditions, and partial occlusion, pedestrian detection methods based on deep learning generally require complex neural networks to reach the required detection accuracy, at the cost of increased algorithm complexity and reduced real-time performance.
Summary of the invention
The purpose of the present invention is to overcome the shortcomings and deficiencies of the prior art by providing a real-time pedestrian detection method that, while guaranteeing detection accuracy, achieves fast detection of pedestrian targets of various sizes through an adaptive zooming technique, thereby improving real-time performance.
The purpose of the present invention is achieved by the following technical solution: a real-time pedestrian detection method that automatically partitions video frames according to the size of pedestrians in the video, performs a single iteration per video frame, and outputs pedestrian target boxes and pedestrian confidences for efficient detection; the method comprises the following steps:
Determine the default resolution of the video received by the network: Hd×Wd×3, where Hd and Wd are the height and width of the image, respectively, and 3 is the number of color channels;
Read the current frame I, with resolution H×W×3;
Determine the number of partition blocks B of the current frame I from the value of the zoom factor z;
Resize the current frame I to H'×W' according to the zoom factor z and the number of partition blocks B;
Normalize the pixel values of the resized frame;
Partition the normalized frame into B sub-images;
Arrange the sub-images of the current frame into a tensor of dimensions (B, Hd, Wd, 3), perform feature extraction, and obtain the feature map together with the pedestrian target-box coordinates and corresponding confidences for the frame;
Screen valid boxes from the target boxes; the retained target boxes and their corresponding pedestrian-class confidences serve as the output of pedestrian detection;
Compute the average height Hped of all pedestrians detected in the current frame, and set minimum and maximum thresholds Hθ_min and Hθ_max. If Hped < Hθ_min, increase the zoom factor z by 1; if Hped > Hθ_max, decrease z by 1; otherwise keep z unchanged. Repeat for the next frame until the whole video has been processed.
Preferably, the number of partition blocks B of the current frame I is determined by the following method:
Further, for the first frame, the zoom factor z is initialized to 0.
Preferably, resizing the current frame I according to the zoom factor z and the number of partition blocks B proceeds as follows:
When B=1, set H'=Hd, W'=Wd;
When B=2, set H'=Hd;
When B>2, set
Further, the purpose of resizing the video frame is to ensure that, after the frame is partitioned into B blocks, each block has resolution Hd×Wd, meeting the input requirements of the neural network.
Preferably, normalizing the resized frame consists of dividing each pixel value in the resized frame by the pixel-value upper limit so that it is normalized to the interval [0,1].
Preferably, partitioning the normalized frame into B sub-images covers the following 3 cases:
When B=1, no partitioning is performed and the whole frame is input to the network model;
When B=2, the frame is split vertically into two parts. Denoting the column and row coordinates of a pixel in the current frame I by x and y, one part is Il = I(x, y), 0≤x<Wd, 0≤y<Hd, and the other is Ir = I(x, y), W'−Wd≤x<W', 0≤y<Hd;
When B=z², the frame is partitioned into z rows and z columns, for a total of z² sub-images, each of size Hd×Wd.
Preferably, screening valid boxes from the candidate target boxes proceeds as follows:
Set a confidence threshold θ and an upper limit kbox on the number of target boxes. Among the Hout×Wout×9 candidate boxes, retain only those with confidence not less than θ, keeping at most kbox of them, where Hout and Wout are the height and width of the output feature map, respectively. The retained target boxes and their corresponding pedestrian-class confidences serve as the output of pedestrian detection.
Preferably, when reading the current frame, if the video frame to be detected is a single-channel (e.g. grayscale) image, the channel is simply replicated to construct a 3-channel image.
A neural network comprising 7 convolutional layers, of which the 1st is a regular convolutional layer and each subsequent layer is a depthwise separable convolutional layer;
The 1st convolutional layer uses 32 filters of size 3×3, followed by a batch normalization (Batch Normalization, BN) layer and a rectified linear unit (Rectified Linear Unit, ReLU) layer;
Each depthwise separable convolutional layer is a block composed of a group of network layers, in order: depthwise convolutional layer, ReLU layer, BN layer, 1×1 convolutional layer, ReLU layer, BN layer;
Convolutional layers 1, 3, 5, and 7 use convolution kernels with stride [2,2] to downsample the feature map; the remaining convolutional layers use stride [1,1];
The numbers of filters in the first 6 feature extraction layers are 32, 64, 128, 128, 256, and 256 in order; the remaining feature extraction layers use 512 filters; all filters are of size 3×3;
The final output of the neural network is a feature map of dimensions (B, Hout, Wout, 512), where Hout and Wout are the height and width of the output feature map, respectively.
Preferably, the designed neural network is lightweight, with a small total number of parameters; storing the network structure requires only about 2.3 MB.
A target detection layer that performs two functions: pedestrian target-box coordinate prediction and target-box confidence prediction;
Pedestrian target-box prediction is performed by 4×9=36 1×1 filters, predicting 9 candidate target boxes for each grid cell on the feature map; each target box is determined by four parameters: the leftmost coordinate xmin, the rightmost coordinate xmax, the topmost coordinate ymin, and the bottommost coordinate ymax;
Target-box confidence prediction is performed by 2×9=18 1×1 filters, computing, for the 9 candidate target boxes on each grid cell, their class confidences over the two classes pedestrian and background.
Compared with the prior art, the present invention has the following advantages and beneficial effects:
The convolutional neural network used in the present invention is lightweight, with few parameters, fast computation, high efficiency, and strong real-time performance. By adaptively scaling video frames via the zoom factor, the invention maintains detection accuracy and computation speed while particularly improving the detection of small pedestrian targets.
Brief description of the drawings
Fig. 1 is the flowchart of pedestrian detection in an embodiment of the present invention.
Fig. 2 is the structure diagram of the convolutional neural network used in an embodiment of the present invention.
Fig. 3 is a schematic diagram of detection results in an embodiment of the present invention.
Specific embodiment
To better understand the technical solution of the present invention, embodiments are described in detail below with reference to the accompanying drawings; embodiments of the present invention are not limited thereto.
Embodiment
On the premise of guaranteeing detection and localization accuracy, the present invention solves the problems of low efficiency and insufficient speed in deep-learning-based pedestrian detection methods. The network model used is very small, about 2.3 MB, and consumes few computing resources, yet achieves satisfactory detection speed and accuracy: after training on a small dataset and fine-tuning, it obtains an mAP of 84.2% (81% mAP without fine-tuning).
The steps of one embodiment of pedestrian detection are described in detail as follows:
Step 1: determine the default resolution of the video received by the network: Hd×Wd×3. In this example Hd=256 and Wd=448.
Step 2: read a frame image from a camera or an existing video sequence. In this example, color video of resolution 1080×1920×3 is read directly from an IP camera, and the current frame is denoted I.
Step 3: determine the number of partition blocks B of the current frame I from the value of the zoom factor z according to the following formula:
In this example, for the first frame, the zoom factor z is initialized to 0, so B=1. For subsequent frames, if z=1 then B=2; if z=2 then B=4; and so on.
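The formula itself is not reproduced in this text, but the worked examples (z=0 gives B=1, z=1 gives B=2, z=2 gives B=4) are consistent with B=1 for z=0, B=2 for z=1, and B=z² for z≥2. A minimal sketch under that assumption:

```python
def num_blocks(z: int) -> int:
    """Number of partition blocks B for zoom factor z.

    The patent's formula is omitted in this text; this mapping is
    reconstructed from the worked examples (z=0 -> B=1, z=1 -> B=2,
    z=2 -> B=4) and is an assumption for z > 2.
    """
    if z <= 0:
        return 1
    if z == 1:
        return 2
    return z * z  # consistent with the B = z^2 partition case

print([num_blocks(z) for z in (0, 1, 2, 3)])  # [1, 2, 4, 9]
```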
Step 4: resize the current frame I to H'×W' according to the zoom factor z and the number of partition blocks B, covering the following 3 cases:
a. When B=1, set H'=Hd, W'=Wd;
b. When B=2, set H'=Hd;
c. When B>2, set
In this example, for the first frame, since B=1, case 1 applies and the video frame is resized to 256×448. For subsequent frames, if B=2 the frame is resized to 256×672; if B=4 (corresponding to zoom factor z=2) the frame is resized to 512×896; and so on.
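The W' expressions for the B=2 and B>2 cases are missing from this text, but the worked examples (256×672 for B=2, 512×896 for B=4) suggest W'=3·Wd/2 for B=2 (so the two Wd-wide halves overlap by Wd/2) and (H', W')=(z·Hd, z·Wd) for B=z². A sketch under those reconstructed assumptions:

```python
def resized_shape(z: int, B: int, Hd: int = 256, Wd: int = 448):
    """Target size (H', W') of the frame before partitioning.

    The B=2 and B>2 formulas are omitted in this text; W' = 3*Wd//2
    for B=2 and (z*Hd, z*Wd) for B=z**2 are reconstructed from the
    worked examples 256x672 and 512x896.
    """
    if B == 1:
        return Hd, Wd
    if B == 2:
        return Hd, 3 * Wd // 2  # two Wd-wide halves overlapping by Wd/2
    return z * Hd, z * Wd       # z rows x z columns of Hd x Wd blocks

print(resized_shape(0, 1))  # (256, 448)
print(resized_shape(1, 2))  # (256, 672)
print(resized_shape(2, 4))  # (512, 896)
```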
Step 5: divide each pixel value of the resized frame by the pixel-value upper limit, normalizing it to the interval [0,1].
Step 6: partition the normalized frame into B sub-images, covering the following 3 cases:
a. When B=1, no partitioning is performed and the whole frame is input to the network model;
b. When B=2, the frame is split vertically into left and right parts, respectively Il = I(x, y), 0≤x<Wd, 0≤y<Hd and Ir = I(x, y), W'−Wd≤x<W', 0≤y<Hd;
c. When B=z², the frame is partitioned into z rows and z columns, for a total of z² sub-images, each of size Hd×Wd.
In this example, for the first frame, since B=1, case 1 applies: no partitioning is performed and the whole frame is input directly to the network model.
For subsequent frames, if B=2, the frame is split vertically into left and right parts: the left part comprises columns 0 through 447, and the right part columns 224 through 671. If B=4 (corresponding to zoom factor z=2), the frame is partitioned evenly into 2 rows and 2 columns, for a total of 4 sub-images. Other cases follow analogously.
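The three partition cases above can be sketched as follows. The B=2 split uses the overlapping column ranges given in the text; the B=z² case assumes a non-overlapping z-by-z grid of Hd×Wd tiles:

```python
import numpy as np

def split_frame(frame: np.ndarray, z: int, B: int,
                Hd: int = 256, Wd: int = 448) -> np.ndarray:
    """Partition a normalized (H', W', 3) frame into a (B, Hd, Wd, 3) stack.

    The B=2 split follows the text (left: columns 0..Wd-1,
    right: columns W'-Wd..W'-1, overlapping in the middle);
    the B=z**2 case assumes a non-overlapping z-by-z tiling.
    """
    H, W = frame.shape[:2]
    if B == 1:
        return frame[None]            # whole frame, shape (1, Hd, Wd, 3)
    if B == 2:
        left = frame[:, :Wd]          # columns 0 .. Wd-1
        right = frame[:, W - Wd:]     # columns W'-Wd .. W'-1
        return np.stack([left, right])
    tiles = [frame[r * Hd:(r + 1) * Hd, c * Wd:(c + 1) * Wd]
             for r in range(z) for c in range(z)]
    return np.stack(tiles)            # shape (z*z, Hd, Wd, 3)

print(split_frame(np.zeros((256, 672, 3)), 1, 2).shape)  # (2, 256, 448, 3)
print(split_frame(np.zeros((512, 896, 3)), 2, 4).shape)  # (4, 256, 448, 3)
```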
Step 7: arrange the sub-images of the current frame into a tensor of dimensions (B, Hd, Wd, 3) and input it to the lightweight convolutional neural network for feature extraction. In this example, for the first frame, since B=1, the data input to the network has shape (1, 256, 448, 3); for subsequent frames the input shape is (B, 256, 448, 3). The final output of the feature extraction network is a feature map of dimensions (B, Hout, Wout, 512); in this example Hout and Wout are 16 and 28, respectively.
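The values Hout=16 and Wout=28 follow directly from the network description: layers 1, 3, 5, and 7 use stride [2,2], so the spatial size is halved four times (256/2⁴=16, 448/2⁴=28) while the other layers preserve it. A sketch of that bookkeeping:

```python
def output_hw(Hd: int = 256, Wd: int = 448, stride2_layers: int = 4):
    """Spatial size of the final feature map.

    Convolutional layers 1, 3, 5, 7 use stride [2, 2] and each halves
    H and W; the remaining layers use stride [1, 1] and preserve them.
    """
    factor = 2 ** stride2_layers
    return Hd // factor, Wd // factor

print(output_hw())  # (16, 28)
```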
Step 8: input the feature map obtained in step 7 to the target detection layer to obtain the pedestrian target-box coordinates and corresponding confidences for the frame. The target detection layer performs two functions: pedestrian target-box coordinate prediction and target-box confidence prediction.
Pedestrian target-box prediction predicts 9 candidate target boxes for each grid cell on the feature map; each target box is determined by four parameters: the leftmost coordinate xmin, the rightmost coordinate xmax, the topmost coordinate ymin, and the bottommost coordinate ymax. In this embodiment, for the first frame, 16×28×9=4032 candidate target boxes are predicted; for subsequent frames, if B=2, 2×16×28×9=8064 candidate target boxes are predicted; and so on.
Target-box confidence prediction computes, for each candidate target box, its class confidences over the two classes pedestrian and background.
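The candidate-box counts quoted above are simply B·Hout·Wout·9, matching the layer's 36 (=4×9) coordinate filters and 18 (=2×9) confidence filters of size 1×1:

```python
def num_candidates(B: int, Hout: int = 16, Wout: int = 28,
                   anchors: int = 9) -> int:
    """Total candidate boxes produced by the target detection layer.

    Per grid cell, 4*9=36 1x1 filters predict (xmin, xmax, ymin, ymax)
    for 9 candidate boxes, and 2*9=18 1x1 filters predict the
    (pedestrian, background) confidences.
    """
    return B * Hout * Wout * anchors

print(num_candidates(1))  # 4032
print(num_candidates(2))  # 8064
```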
The structure of the convolutional neural network used in the embodiment (covering steps 7 and 8) is shown in Fig. 2.
Step 9: screen valid boxes from the candidate target boxes. Set a confidence threshold θ and an upper limit kbox on the number of target boxes; among the candidate boxes predicted in the previous step, retain only those with confidence not less than θ, keeping at most kbox of them. The retained target boxes and their corresponding class confidences serve as the output of pedestrian detection. In this embodiment, θ=0.01 and kbox=200.
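The screening step can be sketched as follows. Note the text does not specify which boxes to keep when more than kbox pass the threshold; sorting by confidence is an assumption here:

```python
def screen_boxes(boxes, confidences, theta: float = 0.01, k_box: int = 200):
    """Keep at most k_box boxes with pedestrian confidence >= theta.

    The tie-breaking rule when more than k_box boxes pass the threshold
    is not stated in the text; sorting by descending confidence is an
    assumption.
    """
    kept = [(c, b) for c, b in zip(confidences, boxes) if c >= theta]
    kept.sort(key=lambda cb: cb[0], reverse=True)
    kept = kept[:k_box]
    return [b for _, b in kept], [c for c, _ in kept]

boxes = [(0, 10, 0, 30), (5, 15, 2, 40), (1, 3, 1, 3)]
confs = [0.90, 0.40, 0.005]
kept_boxes, kept_confs = screen_boxes(boxes, confs)
print(kept_confs)  # [0.9, 0.4] -- the 0.005 box falls below theta
```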
Step 10: compute the average height Hped of all pedestrians detected in the current frame, and set thresholds Hθ_min and Hθ_max. If Hped < Hθ_min, increase the zoom factor z by 1; if Hped > Hθ_max, decrease z by 1; otherwise keep z unchanged. Return to step 2 to detect the next frame until the whole video has been processed. In this embodiment, Hθ_min=80 and Hθ_max=150.
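The zoom-factor update of step 10 can be sketched as follows. Clamping z at 0 and keeping z unchanged when nothing is detected are assumptions; the text covers neither case explicitly:

```python
def update_zoom(z: int, heights, h_min: float = 80, h_max: float = 150) -> int:
    """Adapt the zoom factor from the detected pedestrians' heights.

    Small pedestrians (low average height) raise z, so the next frame
    is enlarged and partitioned more finely; large pedestrians lower z.
    Clamping at z=0 and the no-detection behavior are assumptions.
    """
    if not heights:
        return z                    # no detections: keep z (assumption)
    h_avg = sum(heights) / len(heights)
    if h_avg < h_min:
        return z + 1
    if h_avg > h_max:
        return max(0, z - 1)
    return z

print(update_zoom(0, [40, 60]))    # 1  (small pedestrians -> zoom in)
print(update_zoom(2, [200, 180]))  # 1  (large pedestrians -> zoom out)
```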
A lightweight convolutional neural network comprising 7 convolutional layers, of which the 1st is a regular convolutional layer and each subsequent layer is a depthwise separable convolutional layer;
a. The 1st convolutional layer uses 32 filters of size 3×3, followed by a batch normalization (Batch Normalization, BN) layer and a rectified linear unit (Rectified Linear Unit, ReLU) layer;
b. Each depthwise separable convolutional layer is a block composed of a group of network layers, in order: depthwise convolutional layer, ReLU layer, BN layer, 1×1 convolutional layer, ReLU layer, BN layer;
c. Convolutional layers 1, 3, 5, and 7 use convolution kernels with stride [2,2] to downsample the feature map; the remaining convolutional layers use stride [1,1];
d. The numbers of filters in the first 6 feature extraction layers are 32, 64, 128, 128, 256, and 256 in order; the remaining feature extraction layers use 512 filters; all filters are of size 3×3;
The final output of the neural network is a feature map of dimensions (B, Hout, Wout, 512), where Hout and Wout are the height and width of the output feature map, respectively.
The designed neural network is lightweight, with a small total number of parameters; storing the network structure requires only about 2.3 MB. The number of convolutional layers may also be 8 or 9.
A target detection layer that performs two functions: pedestrian target-box coordinate prediction and target-box confidence prediction;
Pedestrian target-box prediction is performed by 4×9=36 1×1 filters, predicting 9 candidate target boxes for each grid cell on the feature map; each target box is determined by four parameters: the leftmost coordinate xmin, the rightmost coordinate xmax, the topmost coordinate ymin, and the bottommost coordinate ymax;
Target-box confidence prediction is performed by 2×9=18 1×1 filters, computing, for the 9 candidate target boxes on each grid cell, their class confidences over the two classes pedestrian and background.
In the embodiment, the detection result of one frame is shown in Fig. 3. The training process of the convolutional neural network does not use adaptive zooming: 256×448 training images are fed directly into the network in batches of size B=32. The training objective combines a classification loss (Softmax loss) for confidence prediction with a smooth L1 loss for pedestrian box localization.
The computer running the embodiment has a Core i5-6500 3.20 GHz ×4 CPU and a GTX 1080 Ti GPU; the software environment is Caffe (version 1.0.0-rc3). For video of size 256×448, pedestrian detection for one frame takes only about 7 milliseconds, averaging 142 frames per second, which satisfies most real-time monitoring requirements.
The above embodiment is a preferred embodiment of the present invention, but embodiments of the present invention are not limited thereto; any change, modification, substitution, combination, or simplification made without departing from the spirit and principle of the present invention shall be an equivalent replacement and is included within the scope of protection of the present invention.

Claims (10)

1. A real-time pedestrian detection method, characterized in that a video frame is automatically partitioned according to the size of pedestrians in the video, a single iteration is performed per video frame, and pedestrian target boxes and pedestrian confidences are output; the method comprises the following steps:
determining the default resolution of the video received by the network: Hd×Wd×3, where Hd and Wd are the height and width of the image, respectively, and 3 is the number of color channels;
reading the current frame I, with resolution H×W×3;
determining the number of partition blocks B of the current frame I from the value of the zoom factor z;
resizing the current frame I to H'×W' according to the zoom factor z and the number of partition blocks B;
normalizing the pixel values of the resized frame;
partitioning the normalized frame into B sub-images;
arranging the sub-images of the current frame into a tensor of dimensions (B, Hd, Wd, 3), performing feature extraction, and obtaining the feature map together with the pedestrian target-box coordinates and corresponding confidences for the frame;
screening valid boxes from the target boxes, the retained target boxes and their corresponding pedestrian-class confidences serving as the output of pedestrian detection;
computing the average height Hped of all pedestrians detected in the current frame, and setting minimum and maximum thresholds Hθ_min and Hθ_max: if Hped < Hθ_min, increasing the zoom factor z by 1; if Hped > Hθ_max, decreasing z by 1; otherwise keeping z unchanged; and repeating the above to detect the next frame until the whole video has been processed.
2. The real-time pedestrian detection method according to claim 1, characterized in that the number of partition blocks B of the current frame I is determined by the following method:
3. The real-time pedestrian detection method according to claim 2, characterized in that, for the first frame, the zoom factor z is initialized to 0.
4. The real-time pedestrian detection method according to claim 1, characterized in that resizing the current frame I according to the zoom factor z and the number of partition blocks B proceeds as follows:
when B=1, setting H'=Hd, W'=Wd;
when B=2, setting H'=Hd;
when B>2, setting
5. The real-time pedestrian detection method according to claim 1, characterized in that normalizing the resized frame consists of dividing each pixel value in the resized frame by the pixel-value upper limit so that it is normalized to the interval [0,1].
6. The real-time pedestrian detection method according to claim 1, characterized in that partitioning the normalized frame into B sub-images covers the following 3 cases:
when B=1, no partitioning is performed and the whole frame is input to the network model;
when B=2, the frame is split vertically into two parts; denoting the column and row coordinates of a pixel in the current frame I by x and y, one part is Il = I(x, y), 0≤x<Wd, 0≤y<Hd, and the other is Ir = I(x, y), W'−Wd≤x<W', 0≤y<Hd;
when B=z², the frame is partitioned into z rows and z columns, for a total of z² sub-images, each of size Hd×Wd.
7. The real-time pedestrian detection method according to claim 1, characterized in that screening valid boxes from the candidate target boxes comprises the following steps:
setting a confidence threshold θ and an upper limit kbox on the number of target boxes; among the Hout×Wout×9 candidate boxes, retaining only those with confidence not less than θ, keeping at most kbox of them, where Hout and Wout are the height and width of the output feature map, respectively; the retained target boxes and their corresponding pedestrian-class confidences serving as the output of pedestrian detection.
8. The real-time pedestrian detection method according to claim 1, characterized in that, when reading the current frame, if the video frame to be detected is a single-channel image, the channel is replicated to construct a 3-channel image.
9. A neural network, characterized by comprising 7 convolutional layers, of which the 1st is a regular convolutional layer and each subsequent layer is a depthwise separable convolutional layer;
the 1st convolutional layer uses 32 filters of size 3×3, followed by a batch normalization layer and a rectified linear unit layer;
each depthwise separable convolutional layer is a block composed of a group of network layers, in order: depthwise convolutional layer, ReLU layer, BN layer, 1×1 convolutional layer, ReLU layer, BN layer;
convolutional layers 1, 3, 5, and 7 use convolution kernels with stride [2,2] to downsample the feature map, and the remaining convolutional layers use stride [1,1];
the numbers of filters in the first 6 feature extraction layers are 32, 64, 128, 128, 256, and 256 in order, the remaining feature extraction layers use 512 filters, and all filters are of size 3×3;
the final output of the neural network is a feature map of dimensions (B, Hout, Wout, 512), where Hout and Wout are the height and width of the output feature map, respectively.
10. A target detection layer, characterized in that the target detection layer performs two functions: pedestrian target-box coordinate prediction and target-box confidence prediction;
pedestrian target-box prediction is performed by 4×9=36 1×1 filters, predicting 9 candidate target boxes for each grid cell on the feature map, each target box being determined by four parameters: the leftmost coordinate xmin, the rightmost coordinate xmax, the topmost coordinate ymin, and the bottommost coordinate ymax;
target-box confidence prediction is performed by 2×9=18 1×1 filters, computing, for the 9 candidate target boxes on each grid cell, their class confidences over the two classes pedestrian and background.
CN201910095995.7A 2019-01-31 2019-01-31 Real-time pedestrian detection method, neural network and target detection layer Expired - Fee Related CN109840498B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910095995.7A CN109840498B (en) 2019-01-31 2019-01-31 Real-time pedestrian detection method, neural network and target detection layer


Publications (2)

Publication Number Publication Date
CN109840498A (en) 2019-06-04
CN109840498B (en) 2020-12-15

Family

ID=66884347

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910095995.7A Expired - Fee Related CN109840498B (en) 2019-01-31 2019-01-31 Real-time pedestrian detection method, neural network and target detection layer

Country Status (1)

Country Link
CN (1) CN109840498B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110728200A (en) * 2019-09-23 2020-01-24 武汉大学 Real-time pedestrian detection method and system based on deep learning
CN113111770A (en) * 2021-04-12 2021-07-13 杭州赛鲁班网络科技有限公司 Video processing method, device, terminal and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104318216A (en) * 2014-10-28 2015-01-28 Ningbo University Method for recognizing and matching pedestrian targets across blind areas in video surveillance
CN106845621A (en) * 2017-01-18 2017-06-13 Shandong University Dense crowd counting method and system based on deep convolutional neural networks
CN107316320A (en) * 2017-06-19 2017-11-03 Jiangxi Hongdu Aviation Industry Group Co., Ltd. GPU-accelerated real-time pedestrian detection system
CN107578021A (en) * 2017-09-13 2018-01-12 Beijing Wen'an Intelligent Technology Co., Ltd. Pedestrian detection method, apparatus and system based on a deep-learning network
CN108960198A (en) * 2018-07-28 2018-12-07 Tianjin University Road traffic sign detection and recognition method based on a residual SSD model
CN109063559A (en) * 2018-06-28 2018-12-21 Southeast University Pedestrian detection method based on improved region regression
CN109117717A (en) * 2018-06-29 2019-01-01 Guangzhou Fenghuo Zhongzhi Digital Technology Co., Ltd. Urban pedestrian detection method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Shaoqing Ren et al.: "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks", IEEE Transactions on Pattern Analysis and Machine Intelligence *
Yuxin Peng et al.: "Object-Part Attention Model for Fine-Grained Image Classification", IEEE Transactions on Image Processing *

Also Published As

Publication number Publication date
CN109840498B (en) 2020-12-15

Similar Documents

Publication Publication Date Title
Jia et al. Detection and segmentation of overlapped fruits based on optimized mask R-CNN application in apple harvesting robot
CN113065558A (en) Lightweight small-target detection method combining an attention mechanism
CN104463117B (en) Face recognition sample collection method and system based on video
CN109410168B (en) Modeling method of convolutional neural network for determining sub-tile classes in an image
CN108898145A (en) Salient object detection method for images combining deep learning
CN108830196A (en) Pedestrian detection method based on feature pyramid network
CN110929593B (en) Real-time saliency pedestrian detection method based on detail discrimination
CN110443763B (en) Convolutional neural network-based image shadow removal method
CN107808132A (en) Scene image classification method fusing topic models
CN108647694A (en) Correlation filtering target tracking method based on context awareness and automatic response
CN107909081A (en) Fast acquisition and calibration method for image datasets in deep learning
CN109377445A (en) Model training method, image background replacement method, apparatus, and electronic system
CN103262119A (en) Method and system for segmenting an image
CN112150821A (en) Lightweight vehicle detection model construction method, system and device
CN110705412A (en) Video target detection method based on motion history images
CN109685045A (en) Moving target tracking method and system for video streams
CN108038455A (en) Deep-learning-based image recognition method for a bionic robotic peacock
CN113297956B (en) Vision-based gesture recognition method and system
CN107273933A (en) Construction method of an image tracking classifier and face tracking method applying it
CN105825168A (en) Golden snub-nosed monkey face detection and tracking algorithm based on S-TLD
CN110532959A (en) Real-time violence detection system based on a two-channel 3D convolutional neural network
CN110852199A (en) Foreground extraction method based on a dual-frame encoding-decoding model
CN114359245A (en) Method for detecting surface defects of products in industrial scenes
CN112163508A (en) Character recognition method and system based on real scenes and an OCR terminal
CN109840498A (en) Real-time pedestrian detection method, neural network and target detection layer

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20201215
Termination date: 20220131