CN109840498A - Real-time pedestrian detection method, neural network, and target detection layer - Google Patents
- Publication number: CN109840498A (application CN201910095995.7A)
- Authority: CN (China)
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Abstract
The invention discloses a real-time pedestrian detection method. The method mainly comprises: determining a default resolution; reading a video frame; determining the number of split blocks from a zoom factor; resizing the video frame; splitting the video frame; stacking the split sub-blocks and extracting features; predicting the coordinate parameters and pedestrian confidences of candidate pedestrian boxes; screening out the final pedestrian boxes; adjusting the zoom factor according to the pedestrian sizes in the current frame; and processing the next frame until the whole detection task is completed. The invention also discloses a neural network comprising 7, 8, or 9 convolutional layers, and a target detection layer that performs two functions: pedestrian target box coordinate prediction and target box confidence prediction. By adaptively scaling video frames through the zoom factor, the invention improves detection of small pedestrian targets in particular while maintaining detection accuracy and computation speed.
Description
Technical field
The present invention relates to the technical field of deep-learning video processing, and in particular to a real-time pedestrian detection method based on a deep convolutional neural network, together with a neural network and a target detection layer.
Background art
Object detection is an important computer vision technique. Within it, pedestrian detection has broad application prospects in frontier fields such as intelligent robotics, video surveillance, and autonomous driving, and has attracted attention from both academia and industry. Many pedestrian detection methods have been invented over the past decade, but numerous practical application problems remain unsolved, and pedestrian detection is still an extremely challenging task in computer vision.
Conventional pedestrian detection algorithms are mostly based on hand-designed features such as SIFT, SURF, and HOG. With the development of deep learning, and in particular of convolutional neural networks (CNNs), which are effective in image analysis tasks, deep learning algorithms began to be used for pedestrian recognition and detection. Cai et al. published "A unified multi-scale deep convolutional neural network for fast object detection" at the European Conference on Computer Vision (ECCV 2016), using different convolutional layers in a CNN to match images at different scales, combining detection at multiple scales with end-to-end training. Compared with conventional pedestrian detection algorithms this improves detection accuracy, but recognition is slow: it reaches only 15 frames/second on an NVIDIA Titan GPU, which is difficult to reconcile with real-time requirements. Du et al. published "Fused DNN: A deep neural network fusion approach to fast and robust pedestrian detection" at the IEEE Winter Conference on Applications of Computer Vision (WACV 2017), improving detection accuracy with multiple parallel CNNs; however, because of the excessive number of network parameters, detection is slow, reaching only about 3 frames/second on an NVIDIA Titan X GPU. Brazil et al. published "Illuminating pedestrians via simultaneous detection and segmentation" at the International Conference on Computer Vision (ICCV 2017), better handling pedestrian detection in crowds by sharing features between detection and segmentation networks; but the complex network structure consumes a large amount of storage, and detection speed also falls short of real-time requirements. Besides their high computational cost and poor real-time performance, the above methods miss many pedestrians that are far from the camera and small in size, making them unable to meet the detection requirements of practical application scenarios.
In practical applications, because backgrounds vary in complexity, pedestrians differ in appearance (different sizes or clothing), lighting and weather conditions change, and partial occlusion occurs, pedestrian detection methods based on deep learning generally require complex neural networks to reach the required detection accuracy, at the cost of increased algorithmic complexity and reduced real-time performance.
Summary of the invention
The purpose of the present invention is to overcome the shortcomings and deficiencies of the prior art by providing a real-time pedestrian detection method which, on the premise of guaranteed detection accuracy, achieves fast detection of pedestrian targets of various sizes through an adaptive zoom technique, improving the real-time performance of the algorithm.
The purpose of the present invention is realized by the following technical solution: a real-time pedestrian detection method that automatically splits video frames according to the size of the pedestrians in the video and performs a single iteration per frame, outputting pedestrian target boxes and pedestrian confidences for efficient detection. The method comprises the following steps:
Determine the default resolution Hd × Wd × 3 of the video accepted by the network, where Hd and Wd are the image height and width and 3 is the number of color channels;
Read the current frame I, of resolution H × W × 3;
Determine the number of split blocks B of the current frame I from the value of the zoom factor z;
According to z and B, resize the current frame I to H' × W';
Normalize the pixel values of the resized frame;
Split the normalized frame into B sub-images;
Arrange the sub-images of the current frame into dimensions (B, Hd, Wd, 3), extract features, and obtain the pedestrian target box coordinates and corresponding confidences from the feature map;
Screen valid boxes from the target boxes; the retained boxes and their pedestrian-class confidences serve as the pedestrian detection output;
Compute the average height Hped of all pedestrians detected in the current frame, with minimum and maximum thresholds Hθ_min and Hθ_max. If Hped < Hθ_min, increase the zoom factor z by 1; if Hped > Hθ_max, decrease z by 1; otherwise keep z unchanged. Repeat to detect the next frame until the whole video has been processed.
Preferably, the number of split blocks B of the current frame I is determined from the zoom factor z as follows: B = 1 when z = 0; B = 2 when z = 1; B = z² when z ≥ 2.
Further, for the first frame, the zoom factor z is initialized to 0.
Preferably, the current frame I is resized according to z and B as follows:
When B = 1, set H' = Hd and W' = Wd;
When B = 2, set H' = Hd and W' = 3Wd/2;
When B > 2 (B = z²), set H' = z·Hd and W' = z·Wd.
Further, the purpose of resizing the video frame is to ensure that, after the frame is split into B blocks, every block has resolution Hd × Wd and thus meets the input requirements of the neural network.
Preferably, normalizing the resized frame means dividing each pixel value in the resized frame by the pixel value upper limit, so that all values fall in the interval [0, 1].
Preferably, splitting the normalized frame into B sub-images covers the following 3 cases:
When B = 1, no split is made and the whole frame is input to the network model;
When B = 2, the frame is split vertically into two parts. Denoting the column and row coordinates of a pixel in the current frame I by x and y, one part is Il = I(x, y), 0 ≤ x < Wd, 0 ≤ y < Hd, and the other is Ir = I(x, y), W' − Wd ≤ x < W', 0 ≤ y < Hd;
When B = z², the frame is split into z rows and z columns, z² sub-images in total, each of size Hd × Wd.
Preferably, valid boxes are screened from the candidate target boxes as follows:
Set a confidence threshold θ and an upper limit kbox on the number of boxes. Among the Hout × Wout × 9 candidate boxes, retain only those with confidence not less than θ, keeping at most kbox of them, where Hout and Wout are the height and width of the output feature map;
The retained target boxes and their pedestrian-class confidences serve as the pedestrian detection output.
Preferably, when reading the current frame, if the video frame to be detected is a single-channel (e.g. grayscale) image, the channel is simply replicated to construct a 3-channel image.
A neural network comprises 7 convolutional layers, of which the 1st is a regular convolutional layer and every subsequent layer is a depthwise separable convolutional layer;
The 1st convolutional layer uses 32 filters of size 3 × 3, followed by a batch normalization (BN) layer and a rectified linear unit (ReLU) layer;
Each depthwise separable convolutional layer is a block composed of one group of layers, in order: a depthwise convolutional layer, a ReLU layer, a BN layer, a 1 × 1 convolutional layer, a ReLU layer, and a BN layer;
The 1st, 3rd, 5th, and 7th convolutional layers use convolution kernels with stride [2, 2] to downsample the feature map; the remaining convolutional layers use stride [1, 1];
The numbers of filters in the first 6 feature extraction layers are 32, 64, 128, 128, 256, and 256 in order; the remaining feature extraction layers have 512 filters; all filters are of size 3 × 3;
The final output of the neural network is a feature map of dimensions (B, Hout, Wout, 512), where Hout and Wout are the height and width of the output feature map.
Preferably, the designed neural network is a lightweight network: its total number of parameters is small, and storing the network structure takes only about 2.3 MB.
A target detection layer realizes two functions: pedestrian target box coordinate prediction and target box confidence prediction;
Pedestrian target box prediction is realized by 4 × 9 = 36 filters of size 1 × 1, predicting 9 candidate boxes for each cell of the feature map; each box is determined by four parameters: the leftmost coordinate xmin, the rightmost coordinate xmax, the top coordinate ymin, and the bottom coordinate ymax;
Target box confidence prediction is realized by 2 × 9 = 18 filters of size 1 × 1, computing for each of the 9 candidate boxes on each cell its classification confidence over two classes, pedestrian and background.
Compared with the prior art, the present invention has the following advantages and beneficial effects: the convolutional neural network used is a lightweight network with few parameters, so the algorithm runs fast, efficiently, and in real time. By adaptively scaling video frames through the zoom factor, it particularly improves detection of small pedestrian targets while maintaining detection accuracy and computation speed.
Brief description of the drawings
Fig. 1 is a flow chart of pedestrian detection according to an embodiment of the present invention.
Fig. 2 is a structure diagram of the convolutional neural network used in the embodiment of the present invention.
Fig. 3 is a schematic diagram of a detection result of the embodiment of the present invention.
Specific embodiment
For a better understanding of the technical solution of the present invention, an embodiment is described in detail below with reference to the accompanying drawings; embodiments of the present invention are not limited thereto.
Embodiment
On the premise of guaranteed detection and localization accuracy, this embodiment addresses the efficiency and speed problems of deep-learning-based pedestrian detection methods. The network model used by the present invention is very small, about 2.3 MB, and consumes few computing resources, yet achieves satisfactory detection speed and accuracy: after training on a small dataset and fine-tuning, it obtains 84.2% mAP (81% mAP without fine-tuning).
One embodiment step of pedestrian detection is described in detail as follows:
In the first step, the default resolution Hd × Wd × 3 of the video accepted by the network is determined; in this example Hd = 256 and Wd = 448.
In the second step, a frame is read from a camera or an existing video sequence. In this example, 1080 × 1920 × 3 color video is read directly from an IP camera; the current frame is denoted I.
In the third step, the number of split blocks B of the current frame I is determined from the value of the zoom factor z: B = 1 when z = 0, B = 2 when z = 1, and B = z² when z ≥ 2. In this example, for the first frame, z is initialized to 0, so B = 1. For subsequent frames, if z = 1 then B = 2, if z = 2 then B = 4, and so on.
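The mapping from zoom factor to block count given above can be sketched in Python as follows (the function name is illustrative; the piecewise rule follows the values quoted in this example):

```python
def split_count(z):
    """Number of sub-images B for zoom factor z:
    z=0 -> B=1, z=1 -> B=2, z>=2 -> B=z**2."""
    if z == 0:
        return 1
    if z == 1:
        return 2
    return z * z
```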
In the fourth step, the current frame I is resized to H' × W' according to z and B, covering the following 3 cases:
a. When B = 1, set H' = Hd, W' = Wd;
b. When B = 2, set H' = Hd, W' = 3Wd/2;
c. When B > 2 (B = z²), set H' = z·Hd, W' = z·Wd.
In this example, for the first frame, B = 1 and the 1st case applies: the frame is resized to 256 × 448. For subsequent frames, if B = 2 the frame is resized to 256 × 672, and if B = 4 (the case z = 2) it is resized to 512 × 896, and so on.
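The resizing rule can be checked against the figures quoted above with a small sketch (the function name and the default Hd, Wd arguments are taken from this example; the B = 2 width 3·Wd/2 matches the 256 × 672 case in the text):

```python
def adjusted_size(z, hd=256, wd=448):
    """Resized frame (H', W') for zoom factor z, using the example
    defaults Hd=256, Wd=448."""
    if z == 0:          # B = 1: whole frame at default resolution
        return hd, wd
    if z == 1:          # B = 2: widen to 3*Wd/2 for two overlapping halves
        return hd, wd * 3 // 2
    return z * hd, z * wd   # B = z**2: a z-by-z grid of Hd x Wd blocks
```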
In the fifth step, each pixel value in the resized frame is divided by the pixel value upper limit, normalizing it to the interval [0, 1].
In the sixth step, the normalized frame is split into B sub-images, covering the following 3 cases:
a. When B = 1, no split is made and the whole frame is input to the network model;
b. When B = 2, the frame is split vertically into left and right parts, Il = I(x, y), 0 ≤ x < Wd, 0 ≤ y < Hd, and Ir = I(x, y), W' − Wd ≤ x < W', 0 ≤ y < Hd;
c. When B = z², the frame is split into z rows and z columns, z² sub-images in total, each of size Hd × Wd.
In this example, for the first frame, B = 1 and the 1st case applies: the whole frame is input to the network model without splitting. For subsequent frames, if B = 2, the frame is split vertically into left and right parts, the left part spanning columns 0 to 447 and the right part columns 224 to 671. If B = 4 (the case z = 2), the frame is split evenly into 2 rows and 2 columns, 4 sub-images in total. Other cases follow by analogy.
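The three splitting cases above can be expressed as index ranges over the resized frame (a sketch; the function name is illustrative, and the overlapping B = 2 halves reproduce the column ranges 0–447 and 224–671 given in this example):

```python
def subimage_spans(z, hd=256, wd=448):
    """Return (row_start, row_end, col_start, col_end) for each sub-image
    of the resized frame, half-open ranges, for zoom factor z."""
    if z == 0:                       # B = 1: the whole frame
        return [(0, hd, 0, wd)]
    if z == 1:                       # B = 2: overlapping left/right halves
        wp = wd * 3 // 2             # resized width W' = 3*Wd/2
        return [(0, hd, 0, wd), (0, hd, wp - wd, wp)]
    # B = z**2: an even z-by-z grid of Hd x Wd blocks
    return [(r * hd, (r + 1) * hd, c * wd, (c + 1) * wd)
            for r in range(z) for c in range(z)]
```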
In the seventh step, the sub-images of the current frame are arranged into dimensions (B, Hd, Wd, 3) and input to the lightweight convolutional neural network for feature extraction. In this example, for the first frame, B = 1 and the input data shape is (1, 256, 448, 3); for subsequent frames it is (B, 256, 448, 3). The final output of the feature extraction network is a feature map of dimensions (B, Hout, Wout, 512); in this example Hout = 16 and Wout = 28.
In the eighth step, the feature map from the seventh step is input to the target detection layer to obtain the pedestrian target box coordinates and corresponding confidences. The target detection layer realizes two functions: pedestrian target box coordinate prediction and target box confidence prediction.
Pedestrian target box prediction predicts 9 candidate boxes for each cell of the feature map; each box is determined by four parameters: the leftmost coordinate xmin, the rightmost coordinate xmax, the top coordinate ymin, and the bottom coordinate ymax. In this embodiment, for the first frame, 16 × 28 × 9 = 4032 candidate boxes are predicted; for subsequent frames, if B = 2, 2 × 16 × 28 × 9 = 8064 candidate boxes are predicted, and so on.
Target box confidence prediction computes for each candidate box its classification confidence over two classes, pedestrian and background.
The structure of the convolutional neural network used in the embodiment (covering the seventh and eighth steps) is shown in Fig. 2.
9th step screens valid frame from candidate target frame.Confidence threshold value θ and target frame number upper limit k is setbox, from
Only retain the frame that confidence level is not less than θ in the candidate frame of previous step prediction, and retains quantity and be no more than kboxIt is a.The mesh retained
Mark frame and its corresponding classification confidence level can be used as the output result of pedestrian detection.In this embodiment, θ=0.01, k are setbox
=200.
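The screening rule can be sketched as follows (the text fixes only θ and the upper bound kbox; keeping the highest-confidence boxes when the limit is exceeded is an assumption):

```python
def screen_boxes(boxes, theta=0.01, k_box=200):
    """Keep boxes with confidence >= theta, at most k_box of them.
    `boxes` is a list of (confidence, (xmin, ymin, xmax, ymax)) pairs.
    Which boxes survive the k_box cap is not specified in the text;
    highest-confidence-first is an assumption."""
    kept = [bx for bx in boxes if bx[0] >= theta]
    kept.sort(key=lambda bx: bx[0], reverse=True)
    return kept[:k_box]
```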
Tenth step calculates the average height H for all pedestrians that present frame detectsped, and given threshold Hθ_minAnd Hθ_max,
If Hped< Hθ_min, then zoom factor z is increased by 1, if Hped> Hθ_max, then zoom factor z is reduced 1, other situations are then kept
Zoom factor is constant.Second step is repeated, next frame video is detected, until whole section of video detection finishes.In this embodiment, if
Determine Hθ_min=80, Hθ_max=150.
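The zoom-factor update can be sketched as follows, with the embodiment's thresholds as defaults (how an empty detection list is handled, and whether z is clamped at 0, are not specified in the text; both are assumptions here):

```python
def update_zoom(z, heights, h_min=80, h_max=150):
    """Adjust the zoom factor from the mean detected pedestrian height.
    Assumptions: no detections leave z unchanged, and z never drops
    below its initial value 0."""
    if not heights:
        return z
    h_ped = sum(heights) / len(heights)
    if h_ped < h_min:        # pedestrians too small: zoom in further
        return z + 1
    if h_ped > h_max:        # pedestrians large enough: zoom back out
        return max(z - 1, 0)
    return z
```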
A lightweight convolutional neural network comprises 7 convolutional layers, of which the 1st is a regular convolutional layer and every subsequent layer is a depthwise separable convolutional layer;
a. The 1st convolutional layer uses 32 filters of size 3 × 3, followed by a batch normalization (BN) layer and a rectified linear unit (ReLU) layer;
b. Each depthwise separable convolutional layer is a block composed of one group of layers, in order: a depthwise convolutional layer, a ReLU layer, a BN layer, a 1 × 1 convolutional layer, a ReLU layer, and a BN layer;
c. The 1st, 3rd, 5th, and 7th convolutional layers use convolution kernels with stride [2, 2] to downsample the feature map; the remaining convolutional layers use stride [1, 1];
d. The numbers of filters in the first 6 feature extraction layers are 32, 64, 128, 128, 256, and 256 in order; the remaining feature extraction layers have 512 filters; all filters are of size 3 × 3.
The final output of the neural network is a feature map of dimensions (B, Hout, Wout, 512), where Hout and Wout are the height and width of the output feature map.
The designed neural network is a lightweight network with a small total number of parameters; storing the network structure takes only about 2.3 MB. The number of convolutional layers can also be 8 or 9.
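As a consistency check on the downsampling described above, the following pure-Python sketch (the stride and channel lists restate the text; the helper name is illustrative, and exact halving assumes 'same'-style padding) confirms that four stride-2 layers reduce a 256 × 448 input to the 16 × 28 feature map of the embodiment:

```python
# Layer configuration restated from the description: 7 convolutional
# layers, stride [2,2] at layers 1, 3, 5, 7 and [1,1] elsewhere;
# filter counts 32, 64, 128, 128, 256, 256 then 512.
STRIDES = [2, 1, 2, 1, 2, 1, 2]
CHANNELS = [32, 64, 128, 128, 256, 256, 512]

def output_hw(h, w, strides=STRIDES):
    """Spatial size of the output feature map, assuming each stride-2
    convolution exactly halves height and width."""
    for s in strides:
        h, w = h // s, w // s
    return h, w
```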
A target detection layer realizes two functions: pedestrian target box coordinate prediction and target box confidence prediction;
Pedestrian target box prediction is realized by 4 × 9 = 36 filters of size 1 × 1, predicting 9 candidate boxes for each cell of the feature map; each box is determined by four parameters: the leftmost coordinate xmin, the rightmost coordinate xmax, the top coordinate ymin, and the bottom coordinate ymax;
Target box confidence prediction is realized by 2 × 9 = 18 filters of size 1 × 1, computing for each of the 9 candidate boxes on each cell its classification confidence over two classes, pedestrian and background.
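The filter counts above fix only the sizes of the detection layer's per-cell outputs: 36 coordinate values and 18 class scores. A minimal decoding sketch, assuming the values are laid out anchor by anchor and that the two-class confidence is a softmax over the scores (both are assumptions; the text does not specify the layout or the normalization):

```python
import math

def decode_cell(raw36, raw18):
    """Decode one feature-map cell: 36 values -> 9 boxes of
    (xmin, xmax, ymin, ymax); 18 values -> 9 softmaxed
    (pedestrian, background) confidence pairs."""
    boxes = [tuple(raw36[4 * i:4 * i + 4]) for i in range(9)]
    confs = []
    for i in range(9):
        p, b = raw18[2 * i], raw18[2 * i + 1]
        m = max(p, b)                       # stabilize the softmax
        ep, eb = math.exp(p - m), math.exp(b - m)
        confs.append((ep / (ep + eb), eb / (ep + eb)))
    return boxes, confs
```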
A detection result for one frame of the embodiment is shown in Fig. 3. The training of the convolutional neural network does not use adaptive scaling: 256 × 448 training images are fed directly into the network in batches of B = 32. The training objective combines a classification loss (softmax loss) for confidence prediction with a smooth L1 loss for pedestrian box positions.
The computer running the embodiment has a Core i5-6500 3.20 GHz × 4 CPU and a GTX 1080 Ti GPU, with Caffe (version 1.0.0-rc3) as the software environment. For video of size 256 × 448, detecting pedestrians in one frame takes only about 7 milliseconds, i.e. 142 frames per second on average, which satisfies most real-time monitoring requirements.
The above embodiment is a preferred embodiment of the present invention, but embodiments of the present invention are not limited by it; any changes, modifications, substitutions, combinations, or simplifications made without departing from the spirit and principles of the present invention are equivalent substitutions and are included within the scope of the present invention.
Claims (10)
1. A real-time pedestrian detection method, characterized in that video frames are automatically split according to the size of the pedestrians in the video and a single iteration is performed per frame, outputting pedestrian target boxes and pedestrian confidences; the method comprises the following steps:
determining the default resolution Hd × Wd × 3 of the video accepted by the network, where Hd and Wd are the image height and width and 3 is the number of color channels;
reading the current frame I, of resolution H × W × 3;
determining the number of split blocks B of the current frame I from the value of the zoom factor z;
resizing the current frame I to H' × W' according to z and B;
normalizing the pixel values of the resized frame;
splitting the normalized frame into B sub-images;
arranging the sub-images of the current frame into dimensions (B, Hd, Wd, 3), extracting features, and obtaining the pedestrian target box coordinates and corresponding confidences from the feature map;
screening valid boxes from the target boxes, the retained boxes and their pedestrian-class confidences serving as the pedestrian detection output;
computing the average height Hped of all pedestrians detected in the current frame, with minimum and maximum thresholds Hθ_min and Hθ_max; if Hped < Hθ_min, increasing the zoom factor z by 1; if Hped > Hθ_max, decreasing z by 1; otherwise keeping z unchanged; and repeating to detect the next frame until the whole video has been processed.
2. The real-time pedestrian detection method according to claim 1, characterized in that the number of split blocks B of the current frame I is determined as follows: B = 1 when z = 0; B = 2 when z = 1; B = z² when z ≥ 2.
3. The real-time pedestrian detection method according to claim 2, characterized in that, for the first frame, the zoom factor z is initialized to 0.
4. The real-time pedestrian detection method according to claim 1, characterized in that the current frame I is resized according to the zoom factor z and the number of split blocks B as follows:
when B = 1, setting H' = Hd, W' = Wd;
when B = 2, setting H' = Hd, W' = 3Wd/2;
when B > 2 (B = z²), setting H' = z·Hd, W' = z·Wd.
5. The real-time pedestrian detection method according to claim 1, characterized in that normalizing the resized frame means dividing each pixel value in the resized frame by the pixel value upper limit so that it falls in the interval [0, 1].
6. The real-time pedestrian detection method according to claim 1, characterized in that splitting the normalized frame into B sub-images covers the following 3 cases:
when B = 1, making no split and inputting the whole frame to the network model;
when B = 2, splitting the frame vertically into two parts and, denoting the column and row coordinates of a pixel in the current frame I by x and y, one part being Il = I(x, y), 0 ≤ x < Wd, 0 ≤ y < Hd, and the other Ir = I(x, y), W' − Wd ≤ x < W', 0 ≤ y < Hd;
when B = z², splitting the frame into z rows and z columns, z² sub-images in total, each of size Hd × Wd.
7. The real-time pedestrian detection method according to claim 1, characterized in that valid boxes are screened from the candidate target boxes as follows: setting a confidence threshold θ and an upper limit kbox on the number of boxes; among the Hout × Wout × 9 candidate boxes, retaining only those with confidence not less than θ, at most kbox of them, where Hout and Wout are the height and width of the output feature map; the retained target boxes and their pedestrian-class confidences serving as the pedestrian detection output.
8. The real-time pedestrian detection method according to claim 1, characterized in that, when reading the current frame, if the video frame to be detected is a single-channel image, the channel is replicated to construct a 3-channel image.
9. A neural network, characterized by comprising 7 convolutional layers, of which the 1st is a regular convolutional layer and every subsequent layer is a depthwise separable convolutional layer;
the 1st convolutional layer using 32 filters of size 3 × 3, followed by a batch normalization layer and a rectified linear unit layer;
each depthwise separable convolutional layer being a block composed of one group of layers, in order: a depthwise convolutional layer, a ReLU layer, a BN layer, a 1 × 1 convolutional layer, a ReLU layer, and a BN layer;
the 1st, 3rd, 5th, and 7th convolutional layers using convolution kernels with stride [2, 2] to downsample the feature map, the remaining convolutional layers using stride [1, 1];
the numbers of filters in the first 6 feature extraction layers being 32, 64, 128, 128, 256, and 256 in order, the remaining feature extraction layers having 512 filters, all filters being of size 3 × 3;
the final output of the neural network being a feature map of dimensions (B, Hout, Wout, 512), where Hout and Wout are the height and width of the output feature map.
10. A target detection layer, characterized in that the target detection layer realizes two functions: pedestrian target box coordinate prediction and target box confidence prediction;
pedestrian target box prediction being realized by 4 × 9 = 36 filters of size 1 × 1, predicting 9 candidate boxes for each cell of the feature map, each box being determined by four parameters: the leftmost coordinate xmin, the rightmost coordinate xmax, the top coordinate ymin, and the bottom coordinate ymax;
target box confidence prediction being realized by 2 × 9 = 18 filters of size 1 × 1, computing for each of the 9 candidate boxes on each cell its classification confidence over two classes, pedestrian and background.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910095995.7A CN109840498B (en) | 2019-01-31 | 2019-01-31 | Real-time pedestrian detection method, neural network and target detection layer |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109840498A true CN109840498A (en) | 2019-06-04 |
CN109840498B CN109840498B (en) | 2020-12-15 |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104318216A (en) * | 2014-10-28 | 2015-01-28 | 宁波大学 | Method for recognizing and matching pedestrian targets across blind area in video surveillance |
CN106845621A (en) * | 2017-01-18 | 2017-06-13 | 山东大学 | Dense population number method of counting and system based on depth convolutional neural networks |
CN107316320A (en) * | 2017-06-19 | 2017-11-03 | 江西洪都航空工业集团有限责任公司 | The real-time pedestrian detecting system that a kind of use GPU accelerates |
CN107578021A (en) * | 2017-09-13 | 2018-01-12 | 北京文安智能技术股份有限公司 | Pedestrian detection method, apparatus and system based on deep learning network |
CN108960198A (en) * | 2018-07-28 | 2018-12-07 | 天津大学 | A kind of road traffic sign detection and recognition methods based on residual error SSD model |
CN109063559A (en) * | 2018-06-28 | 2018-12-21 | 东南大学 | A kind of pedestrian detection method returned based on improvement region |
CN109117717A (en) * | 2018-06-29 | 2019-01-01 | 广州烽火众智数字技术有限公司 | A kind of city pedestrian detection method |
Non-Patent Citations (2)
Title |
---|
Shaoqing Ren et al.: "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks", IEEE Transactions on Pattern Analysis and Machine Intelligence * |
Yuxin Peng et al.: "Object-Part Attention Model for Fine-Grained Image Classification", IEEE Transactions on Image Processing * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110728200A (en) * | 2019-09-23 | 2020-01-24 | 武汉大学 | Real-time pedestrian detection method and system based on deep learning |
CN113111770A (en) * | 2021-04-12 | 2021-07-13 | 杭州赛鲁班网络科技有限公司 | Video processing method, device, terminal and storage medium |
CN113111770B (en) * | 2021-04-12 | 2022-09-13 | 杭州赛鲁班网络科技有限公司 | Video processing method, device, terminal and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN109840498B (en) | 2020-12-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Jia et al. | Detection and segmentation of overlapped fruits based on optimized mask R-CNN application in apple harvesting robot | |
CN113065558A (en) | Lightweight small target detection method combined with attention mechanism | |
CN104463117B (en) | A face recognition sample collection method and system based on video mode | |
CN109410168B (en) | Modeling method of convolutional neural network for determining sub-tile classes in an image | |
CN108898145A (en) | An image salient object detection method combining deep learning | |
CN108830196A (en) | Pedestrian detection method based on feature pyramid network | |
CN110929593B (en) | Real-time significance pedestrian detection method based on detail discrimination | |
CN110443763B (en) | Convolutional neural network-based image shadow removing method | |
CN107808132A (en) | A scene image classification method fusing topic models | |
CN108647694A (en) | Correlation filter target tracking method based on context awareness and automatic response | |
CN107909081A (en) | A method for rapid acquisition and rapid calibration of image datasets in deep learning | |
CN109377445A (en) | Model training method, and method, apparatus and electronic system for replacing image backgrounds | |
CN103262119A (en) | Method and system for segmenting an image | |
CN112150821A (en) | Lightweight vehicle detection model construction method, system and device | |
CN110705412A (en) | Video target detection method based on motion history image | |
CN109685045A (en) | A video-stream-based moving target tracking method and system | |
CN108038455A (en) | Deep-learning-based image recognition method for a bionic robotic peacock | |
CN113297956B (en) | Gesture recognition method and system based on vision | |
CN107273933A (en) | A construction method for an image tracking classifier and a face tracking method applying it | |
CN105825168A (en) | Golden snub-nosed monkey face detection and tracking algorithm based on S-TLD | |
CN110532959A (en) | Real-time violence detection system based on a dual-channel 3D convolutional neural network | |
CN110852199A (en) | Foreground extraction method based on double-frame coding and decoding model | |
CN114359245A (en) | Method for detecting surface defects of products in industrial scene | |
CN112163508A (en) | Character recognition method and system based on real scene and OCR terminal | |
CN109840498A (en) | A real-time pedestrian detection method, neural network, and target detection layer |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20201215 Termination date: 20220131 |