CN110490174A - Multi-scale pedestrian detection method based on feature fusion - Google Patents
- Publication number: CN110490174A
- Application number: CN201910799142.1A
- Authority: CN (China)
- Prior art keywords: pedestrian detection, layer, network, multi-scale, convolutional
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06N3/045—Combinations of networks (under G06N3/04—Architecture, e.g. interconnection topology; G06N3/02—Neural networks; G06N3/00—Computing arrangements based on biological models; G06N—Computing arrangements based on specific computational models; G06—Computing; G—Physics)
- G06N3/08—Learning methods (under G06N3/02—Neural networks; G06N3/00—Computing arrangements based on biological models)
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands (under G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data)
- G06F2218/08—Feature extraction (under G06F2218/00—Aspects of pattern recognition specially adapted for signal processing)
Abstract
The invention discloses a multi-scale pedestrian detection method based on feature fusion, belonging to the field of pedestrian detection in computer vision. It addresses the prior-art problems that feature-fusion methods used for pedestrian detection cause low small-target detection accuracy, while multi-stage detection methods have long detection times and high resource requirements and therefore cannot reach real time. The invention comprises: preprocessing an acquired pedestrian detection dataset; constructing a multi-scale pedestrian detection convolutional neural network based on feature fusion, the network comprising a shared convolutional neural network for feature-fusion extraction and scale sub-networks for detecting the fused features; inputting the preprocessed pedestrian detection dataset into the multi-scale pedestrian detection convolutional neural network for training, obtaining the trained network; and inputting a pedestrian image to be detected into the trained multi-scale pedestrian detection convolutional neural network to obtain the final detection result.
Description
Technical field
A multi-scale pedestrian detection method based on feature fusion, used for multi-scale pedestrian detection, belonging to the field of pedestrian detection in computer vision.
Background art
Pedestrian detection is one of the important topics in computer vision and pattern recognition. It can be divided simply into two tasks: localization and classification. Localization identifies the specific position of a pedestrian in the image and produces the corresponding regression box. Classification assigns a label to the pedestrian target; since pedestrian detection involves only two classes, pedestrian and background, the classification task is comparatively easy to complete. The most important task of pedestrian detection is therefore accurately localizing the pedestrian target. Pedestrian detection has strong practical value: combined with technologies such as multi-person tracking and person re-identification, it can be applied to driverless vehicle control systems, intelligent robots, intelligent video surveillance, human behavior analysis, pedestrian flow statistics, and intelligent transportation.
Because pedestrians have the dual attributes of rigid and flexible objects, with varied postures and shapes, their apparent features are strongly affected by clothing, pose, viewing angle, and so on, and they additionally face occlusion, illumination, and other factors. This makes pedestrian target detection an extremely challenging research direction in computer vision. Commonly used pedestrian detection methods fall mainly into two classes: methods based on motion detection and methods based on machine learning. The main idea of motion-detection methods is to extract moving foreground targets via background modeling and then classify those foreground targets with a classifier to judge whether they are pedestrians. Machine-learning methods are the mainstream algorithms in the pedestrian detection field; they can be further divided into methods that add a classifier on top of handcrafted features and methods based on deep learning. However, none of these methods satisfactorily solves the multi-scale problem in pedestrian target detection, mainly because the features of large-scale and small-scale pedestrian targets differ greatly, which is difficult to resolve with prior-art means.
Current solutions are mainly feature fusion and multi-stage detection. Although feature fusion can obtain more robust features, the features of small-scale and large-scale pedestrian targets differ greatly: the body skeleton of a large pedestrian target, for example, provides rich information for human target detection, whereas a small pedestrian target lacks sufficient skeleton features. Feature-fusion methods therefore cannot solve the problem, and detection accuracy for small-scale pedestrian targets, i.e., small targets, is reduced. Multi-stage detection first obtains a preliminary detection result from one detection network and then feeds that preliminary result back into the network trunk for further detection. This is a serial approach: later detection results depend heavily on earlier ones, so the network's running speed cannot be accelerated by parallel computation. Moreover, if a picture contains multiple pedestrians, the preliminary detection yields multiple different regions that must all be sent into the network. Supposing the preliminary detection produces 20 detection boxes, the second detection stage requires 20 forward passes of the network, and further cascaded detectors require still more forward passes. This situation both increases the hardware's computational resource requirements and makes network inference take too long to reach real time. An effective solution to multi-scale pedestrian detection is therefore an urgent problem.
Summary of the invention
In view of the above problems, the purpose of the present invention is to provide a multi-scale pedestrian detection method based on feature fusion, solving the prior-art problems that feature-fusion methods used for pedestrian detection cause low small-target detection accuracy, while multi-stage detection methods have long detection times and high resource requirements and thus cannot reach real time.
In order to achieve the above object, the present invention adopts the following technical scheme:
A multi-scale pedestrian detection method based on feature fusion, comprising the following steps:
S1: preprocess the acquired pedestrian detection dataset;
S2: construct a multi-scale pedestrian detection convolutional neural network based on feature fusion, the network comprising a shared convolutional neural network for feature-fusion extraction and scale sub-networks for detecting the fused features;
S3: input the preprocessed pedestrian detection dataset into the multi-scale pedestrian detection convolutional neural network for training, obtaining the trained network;
S4: input the pedestrian image to be detected into the trained multi-scale pedestrian detection convolutional neural network to obtain the final detection result.
Further, the pedestrian detection dataset in step S1 is a training set extracted from the Caltech dataset. The Caltech dataset has 11 folders, Set00 to Set10, each containing multiple videos with resolution 640*480.
Preprocessing converts each frame image in the pedestrian detection dataset to the standard VOC data format and regenerates the corresponding annotation file in .xml format, i.e., the file suffix is .xml.
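The preprocessing step above (one VOC-style .xml annotation file per frame) can be sketched as follows. This is a minimal illustration using Python's standard library; the file name, image size, and box coordinates are invented example values, not data from the patent.

```python
import xml.etree.ElementTree as ET

def make_voc_annotation(filename, width, height, boxes):
    """Build a minimal VOC-style .xml annotation string.

    boxes: list of (label, xmin, ymin, xmax, ymax) tuples.
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = "3"
    for label, xmin, ymin, xmax, ymax in boxes:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = label
        bnd = ET.SubElement(obj, "bndbox")
        ET.SubElement(bnd, "xmin").text = str(xmin)
        ET.SubElement(bnd, "ymin").text = str(ymin)
        ET.SubElement(bnd, "xmax").text = str(xmax)
        ET.SubElement(bnd, "ymax").text = str(ymax)
    return ET.tostring(root, encoding="unicode")

# Example: one annotated pedestrian in a 640*480 Caltech-style frame
# (hypothetical file name and coordinates).
xml_text = make_voc_annotation("set00_v000_f0001.jpg", 640, 480,
                               [("person", 100, 120, 140, 220)])
```

In practice one such file would be written per extracted frame, with the suffix .xml as described above.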
Further, the shared convolutional neural network in step S2 comprises, in order: a first convolutional layer, a first BN layer, a first ReLU layer, a second convolutional layer, a second BN layer, a second ReLU layer, a max-pooling layer, a first dense residual module, a first max-pooling layer, a second dense residual module, a second max-pooling layer, a third dense residual module, and a third max-pooling layer. Each of the first, second, and third dense residual modules comprises three convolutional layers and a ResNet residual structure, wherein the first convolutional layer reduces the channel dimension of the input data, the second convolutional layer raises the channel dimension of the first layer's output, and the ResNet residual structure adds the input data to the second layer's output to obtain the summed result; the third convolutional layer reduces the dimension of the summed result. The outputs of the three convolutional layers are densely connected, i.e., features of different levels are stacked together, so each dense residual module outputs features containing both shallow and deep features.
Further, the output dimension of the shared convolutional neural network is 28*28*512. The kernel size of the first and second convolutional layers is 3*3, with stride 1 and SAME padding; the kernel size of the max-pooling layer and of the first, second, and third max-pooling layers is 2*2, with stride 2. Within each dense residual module, the first convolutional layer is a 1*1*(c/2) convolution, the second is 1*1*c, and the third is 1*1*(c/2), where 1*1 denotes the kernel size and c denotes the number of channels of the feature map.
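The dense residual module just described (1*1*(c/2) reduction, 1*1*c raise, ResNet shortcut add, 1*1*(c/2) reduction, then dense concatenation of the three levels' features) can be sketched numerically. This is an assumed reading of the wiring: random weights and NumPy stand in for a real convolution library, with each 1*1 convolution implemented as a per-pixel linear map over channels.

```python
import numpy as np

def conv1x1(x, w):
    """1*1 convolution: a per-pixel linear map over channels.
    x: (h, w, c_in) feature map; w: (c_in, c_out) weights."""
    return np.einsum("hwi,io->hwo", x, w)

def dense_residual_module(x, rng):
    """Sketch of the dense residual module described above."""
    h, w, c = x.shape
    w1 = rng.standard_normal((c, c // 2)) * 0.1
    w2 = rng.standard_normal((c // 2, c)) * 0.1
    w3 = rng.standard_normal((c, c // 2)) * 0.1
    y1 = conv1x1(x, w1)    # first conv: channel reduction, c -> c/2
    y2 = conv1x1(y1, w2)   # second conv: channel raise, c/2 -> c
    y2 = y2 + x            # ResNet residual shortcut: add the input
    y3 = conv1x1(y2, w3)   # third conv: reduce the summed result
    # dense connection: stack the three levels' features together,
    # so shallow (y1) and deep (y2, y3) features both appear in the output
    return np.concatenate([y1, y2, y3], axis=-1)

rng = np.random.default_rng(0)
out = dense_residual_module(rng.standard_normal((28, 28, 512)), rng)
```

With c = 512 the concatenated output carries c/2 + c + c/2 = 2c = 1024 channels, showing how shallow and deep features coexist in one module output.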
Further, in step S2, the scale sub-network comprises a backbone network and a branch network; the outputs of the backbone and branch networks are weighted to give the final detection result.
The backbone network comprises a large-scale sub-network for detecting large-scale targets and a small-scale sub-network for detecting small-scale targets. Each comprises, in order: a first convolutional layer, first BN layer, first ReLU layer, second convolutional layer, second BN layer, second ReLU layer, first max-pooling layer, third convolutional layer, third BN layer, third ReLU layer, fourth convolutional layer, fourth BN layer, fourth ReLU layer, second max-pooling layer, dense residual module, fifth convolutional layer, fifth BN layer, fifth ReLU layer, and loss function.
The branch network comprises a scale-aware weighting layer that assigns weights to the outputs of the large-scale and small-scale sub-networks according to the height information in the shared convolutional neural network's output. The weight computation formula in the scale-aware weighting layer is:
where ω_l is the weight of the large-scale sub-network, ω_s is the weight of the small-scale sub-network, h̄ is the average height of pedestrian targets, α and β are proportionality coefficients optimized by backpropagation, and h is the height of a given pedestrian target.
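The weight formula itself is not reproduced in this text, so the following sketch assumes a common sigmoid-style scale-aware form: the large-scale weight ω_l rises as the target height h exceeds the average height h̄, and ω_s is taken as 1 - ω_l. Both the functional form and the complementary-weight assumption are illustrative, not the patent's actual formula.

```python
import math

def scale_weights(h, h_bar, alpha=1.0, beta=10.0):
    """Assumed sigmoid-style scale-aware weighting: taller-than-average
    targets weight the large-scale sub-network more heavily."""
    w_l = 1.0 / (1.0 + alpha * math.exp(-(h - h_bar) / beta))
    w_s = 1.0 - w_l   # assumed complementary small-scale weight
    return w_l, w_s

def fuse(score_large, score_small, h, h_bar):
    """Weighted combination of the two sub-networks' detection scores."""
    w_l, w_s = scale_weights(h, h_bar)
    return w_l * score_large + w_s * score_small
```

Here alpha and beta play the role of the proportionality coefficients α and β, which in the method are optimized by backpropagation rather than fixed as above.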
Further, the kernel size of the first, second, third, and fourth convolutional layers in the backbone network is 3*3; the kernel size of the fifth convolutional layer is 3*3, with stride 1 and SAME padding. The kernel size of the first and second max-pooling layers is 2*2, with stride 2.
Further, the detection framework at the output of the fifth convolutional layer of the large-scale and small-scale sub-networks uses the YOLO algorithm. The anchors in the YOLO algorithm are obtained by k-means clustering of the pedestrian bbox height-to-width-ratio features in the pedestrian detection dataset; the anchor area is set to 7*7, and at this area size the chosen height-to-width ratios are {3:1, 5:2, 5:3}, where bbox denotes the annotation box.
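The anchor-clustering idea above (k-means over annotated pedestrian height-to-width ratios) can be sketched in one dimension as follows. The box list is invented example data, and this simplified stand-in clusters raw ratios with squared-distance assignment rather than the IoU-based distance some YOLO variants use.

```python
import random

def kmeans_1d(values, k, iters=50, seed=0):
    """Plain 1-D k-means over bbox height/width ratios, a simplified
    stand-in for the anchor clustering described in the text."""
    rng = random.Random(seed)
    centers = rng.sample(values, k)
    for _ in range(iters):
        # assign each ratio to its nearest center
        clusters = [[] for _ in range(k)]
        for v in values:
            i = min(range(k), key=lambda j: abs(v - centers[j]))
            clusters[i].append(v)
        # recompute centers; keep the old center if a cluster empties
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

# Hypothetical annotated pedestrian boxes as (width, height) pairs.
boxes = [(40, 120), (42, 118), (30, 75), (28, 72), (36, 60), (35, 58)]
ratios = [h / w for w, h in boxes]
anchors = kmeans_1d(ratios, 3)
```

The three cluster centers play the role of the chosen anchor ratios; on real Caltech annotations the clustering would be run over the full training set.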
Further, the loss function is the weighted sum of a classification cross-entropy loss and a localization Smooth L1 loss. Stochastic gradient descent is used as the optimization method, the initial learning rate is set to 0.001, and training terminates when the loss no longer decreases.
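A minimal sketch of such a weighted loss, assuming a scalar pedestrian probability per box and coordinate-wise Smooth L1 on the box regression; the weighting coefficients w_cls and w_loc are illustrative, since their values are not stated in the text.

```python
import math

def smooth_l1(x):
    """Smooth L1: 0.5*x^2 for |x| < 1, |x| - 0.5 otherwise."""
    return 0.5 * x * x if abs(x) < 1 else abs(x) - 0.5

def detection_loss(p_pedestrian, is_pedestrian, box_pred, box_gt,
                   w_cls=1.0, w_loc=1.0):
    """Weighted sum of classification cross-entropy and localization
    Smooth L1, as described above (two classes: pedestrian/background)."""
    eps = 1e-12  # numerical guard for log
    ce = -(is_pedestrian * math.log(p_pedestrian + eps)
           + (1 - is_pedestrian) * math.log(1 - p_pedestrian + eps))
    loc = sum(smooth_l1(p - g) for p, g in zip(box_pred, box_gt))
    return w_cls * ce + w_loc * loc
```

In training, this scalar would be minimized by stochastic gradient descent starting from learning rate 0.001, stopping once it no longer decreases.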
Further, in step S3, the parameters of a shared convolutional neural network pretrained on the ImageNet dataset are used as the initial parameters of the shared convolutional neural network, while the scale sub-networks use distribution-initialized parameters, i.e., the common deep-learning initialization. During training of the multi-scale pedestrian detection convolutional neural network, backpropagation is performed by gradient descent to update the parameters.
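The initialization and update scheme of step S3 can be sketched as follows. The parameter layout (flat lists keyed by name) and the small-Gaussian stand-in for "distribution initialization" are assumptions for illustration only.

```python
import random

def init_parameters(pretrained_shared, scale_subnet_sizes, rng):
    """Shared-network parameters are copied from a pretrained model;
    scale sub-network parameters are drawn from a distribution
    (a small Gaussian here, standing in for the usual init)."""
    params = dict(pretrained_shared)  # copy ImageNet-pretrained weights
    for name, n in scale_subnet_sizes.items():
        params[name] = [0.01 * rng.gauss(0, 1) for _ in range(n)]
    return params

def sgd_step(params, grads, lr=0.001):
    """One gradient-descent update: p <- p - lr * grad."""
    return {k: [p - lr * g for p, g in zip(params[k], grads[k])]
            for k in params}

# Hypothetical tiny parameter sets for illustration.
rng = random.Random(0)
params = init_parameters({"shared_w": [1.0, 2.0]}, {"scale_w": 3}, rng)
```

Gradients would come from backpropagating the detection loss; the learning rate 0.001 matches the initial rate stated above.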
Compared with the prior art, the advantages of the present invention are as follows:
One, the shared convolutional neural network based on feature fusion makes full use of the shallow and deep features of the convolutional network: shallow features have high resolution, which benefits accurate localization, while deep features are rich in semantic information, which benefits correct classification.
Two, the shared convolutional neural network based on feature fusion is combined with the scale sub-networks, solving the problem of differing target scales in pedestrian detection; the small-scale and large-scale sub-networks are fused into a unified framework, providing an end-to-end solution and finally obtaining good pedestrian detection results.
Three, the multi-scale pedestrian detection framework based on feature fusion is built on YOLO. Bounding boxes suited to pedestrian detection are designed for the height-to-width characteristics of pedestrians, i.e., the annotated data are clustered with the k-means idea to obtain the pedestrian height-to-width-ratio distribution, making the pedestrian detection results more accurate.
Brief description of the drawings
Fig. 1 is a flow diagram of the present invention;
Fig. 2 is a structural schematic diagram of the dense residual module in the present invention, where w is the feature-map width, h is the feature-map height, and c is the number of feature-map channels;
Fig. 3 is a structural schematic diagram of the multi-scale pedestrian detection convolutional neural network in the present invention, where conv denotes a convolutional layer, stride denotes the stride, pool denotes a max-pooling layer, DRM denotes a dense residual module, weight1 denotes the weight of the large-scale network, and weight2 denotes the weight of the small-scale network;
Fig. 4 is a schematic diagram of the multi-scale pedestrian detection framework based on feature fusion in the present invention;
Fig. 5 is a schematic diagram of the detection results of the prior-art YOLO detection method in the embodiment;
Fig. 6 is a schematic diagram of the detection results of the multi-scale pedestrian detection convolutional neural network detection method in the embodiment.
Specific embodiment
The present invention is further described below in conjunction with the drawings and specific embodiments.
A multi-scale pedestrian detection method based on feature fusion, comprising the following steps:
S1: preprocess the acquired pedestrian detection dataset. The pedestrian detection dataset is a training set extracted from the Caltech dataset; the Caltech dataset has 11 folders, Set00 to Set10, each containing multiple videos with resolution 640*480.
Preprocessing converts each frame image in the pedestrian detection dataset to the standard VOC data format and regenerates the corresponding annotation file in .xml format, i.e., the file suffix is .xml.
S2: construct a multi-scale pedestrian detection convolutional neural network based on feature fusion, the network comprising a shared convolutional neural network for feature-fusion extraction and scale sub-networks for detecting the fused features. The shared convolutional neural network comprises, in order: a first convolutional layer, first BN layer, first ReLU layer, second convolutional layer, second BN layer, second ReLU layer, max-pooling layer, first dense residual module, first max-pooling layer, second dense residual module, second max-pooling layer, third dense residual module, and third max-pooling layer. Each of the first, second, and third dense residual modules comprises three convolutional layers and a ResNet residual structure, wherein the first convolutional layer reduces the channel dimension of the input data, the second convolutional layer raises the channel dimension of the first layer's output, and the ResNet residual structure adds the input data to the second layer's output through a shortcut connection to obtain the summed result; the third convolutional layer reduces the dimension of the summed result. The outputs of the three convolutional layers are densely connected, i.e., features of different levels are stacked together, so each dense residual module outputs features containing both shallow and deep features. The output dimension of the shared convolutional neural network is 28*28*512; the kernel size of the first and second convolutional layers is 3*3, with 64 channels, stride 1, and SAME padding; the kernel size of the max-pooling layer and of the first, second, and third max-pooling layers is 2*2, with stride 2. Within each dense residual module, the first convolutional layer is a 1*1*(c/2) convolution, the second is 1*1*c, and the third is 1*1*(c/2), where 1*1 denotes the kernel size and c denotes the number of channels of the feature map.
The scale sub-network comprises a backbone network and a branch network; the outputs of the backbone and branch networks are weighted to give the final detection result.
The backbone network comprises a large-scale sub-network for detecting large-scale targets and a small-scale sub-network for detecting small-scale targets. Each comprises, in order: a first convolutional layer, first BN layer, first ReLU layer, second convolutional layer, second BN layer, second ReLU layer, first max-pooling layer, third convolutional layer, third BN layer, third ReLU layer, fourth convolutional layer, fourth BN layer, fourth ReLU layer, second max-pooling layer, dense residual module, fifth convolutional layer, fifth BN layer, fifth ReLU layer, and loss function.
The branch network comprises a scale-aware weighting layer that assigns weights to the outputs of the large-scale and small-scale sub-networks according to the height information in the shared convolutional neural network's output. The weight computation formula in the scale-aware weighting layer is:
where ω_l is the weight of the large-scale sub-network, ω_s is the weight of the small-scale sub-network, h̄ is the average height of pedestrian targets, α and β are proportionality coefficients optimized by backpropagation, and h is the height of a given pedestrian target.
The kernel size of the first, second, third, and fourth convolutional layers in the backbone network is 3*3, with 512 channels; the kernel size of the fifth convolutional layer is 3*3, with 12 channels, stride 1, and SAME padding. The kernel size of the first and second max-pooling layers is 2*2, with stride 2.
The detection framework at the output of the fifth convolutional layer of the large-scale and small-scale sub-networks uses the YOLO algorithm. The anchors in the YOLO algorithm are obtained by k-means clustering of the pedestrian bbox height-to-width-ratio features in the pedestrian detection dataset; the anchor area is set to 7*7, and at this area size the chosen height-to-width ratios are {3:1, 5:2, 5:3}, where bbox denotes the annotation box.
The loss function is the weighted sum of a classification cross-entropy loss and a localization Smooth L1 loss. Stochastic gradient descent is used as the optimization method, the initial learning rate is set to 0.001, and training terminates when the loss no longer decreases.
S3: input the preprocessed pedestrian detection dataset into the multi-scale pedestrian detection convolutional neural network for training (the training method is an existing approach), obtaining the trained network. Specifically, the parameters of a shared convolutional neural network pretrained on the ImageNet dataset are used as the initial parameters of the shared convolutional neural network, while the scale sub-networks use distribution-initialized parameters, i.e., the common deep-learning initialization. During training of the multi-scale pedestrian detection convolutional neural network, backpropagation is performed by stochastic gradient descent to update the parameters.
S4: input the pedestrian image to be detected into the trained multi-scale pedestrian detection convolutional neural network to obtain the final detection result.
Embodiment
A test set is extracted from the Caltech dataset. The 448*448*3 pedestrian images to be detected in the test set are input into the trained multi-scale pedestrian detection convolutional neural network, giving the final detection results shown in Fig. 6.
The same 448*448*3 test images are detected with the existing YOLO method; the detection results obtained are shown in Fig. 5.
The experiments compare the detection accuracy (mAP) and the number of video frames processed per second (FPS) of the prior-art YOLO detection method, the multi-scale pedestrian detection convolutional neural network detection method of the present invention, and the multi-scale pedestrian detection convolutional neural network + YOLO algorithm of the present invention. The comparison table is as follows:
It can clearly be seen that using YOLO alone gives very poor detection accuracy, that is, very poor accuracy for small targets. This is mainly because the Caltech dataset has a very large proportion of small-scale pedestrians and the annotated data contain occluded targets, which increases the difficulty of detection. The present invention solves the problem of differing scales in pedestrian detection very well and therefore performs well on this dataset. Meanwhile, combining the multi-scale pedestrian detection convolutional neural network with the YOLO algorithm, i.e., detecting not only on the 7*7 feature map but, in the last experiment, on the two scales {7*7, 14*14}, improves the final detection results by about 2%. In summary, the present invention significantly improves detection accuracy at only a small reduction in FPS.
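The two-scale detection of the last experiment can be illustrated by where detections live on each feature map: a 7*7 grid gives coarse cells suited to large pedestrians, a 14*14 grid finer cells suited to small ones, and their detections are pooled. This sketch only computes cell centers for a 448*448 input; full YOLO decoding (offsets, anchors, non-maximum suppression) is omitted.

```python
def grid_cell_centers(grid, img_size=448):
    """Centers of the cells of a grid*grid YOLO-style feature map,
    in image pixel coordinates."""
    step = img_size / grid
    return [((i + 0.5) * step, (j + 0.5) * step)
            for j in range(grid) for i in range(grid)]

coarse = grid_cell_centers(7)    # 49 coarse cells, 64-pixel spacing
fine = grid_cell_centers(14)     # 196 fine cells, 32-pixel spacing
all_cells = coarse + fine        # detections from both scales pooled
```

Detecting on both grids roughly quintuples the number of candidate cells, which is consistent with a modest FPS cost for an accuracy gain.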
The above is only a representative embodiment among the numerous concrete applications of the present invention and does not constitute any restriction on its protection scope. All technical solutions formed by transformation or equivalent replacement fall within the protection scope of the present invention.
Claims (9)
1. A multi-scale pedestrian detection method based on feature fusion, characterized by comprising the following steps:
S1: preprocess the acquired pedestrian detection dataset;
S2: construct a multi-scale pedestrian detection convolutional neural network based on feature fusion, the network comprising a shared convolutional neural network for feature-fusion extraction and scale sub-networks for detecting the fused features;
S3: input the preprocessed pedestrian detection dataset into the multi-scale pedestrian detection convolutional neural network for training, obtaining the trained network;
S4: input the pedestrian image to be detected into the trained multi-scale pedestrian detection convolutional neural network to obtain the final detection result.
2. The multi-scale pedestrian detection method based on feature fusion according to claim 1, characterized in that the pedestrian detection dataset in step S1 is a training set extracted from the Caltech dataset; the Caltech dataset has 11 folders, Set00 to Set10, each containing multiple videos with resolution 640*480;
preprocessing converts each frame image in the pedestrian detection dataset to the standard VOC data format and regenerates the corresponding annotation file in .xml format, i.e., the file suffix is .xml.
3. The multi-scale pedestrian detection method based on feature fusion according to claim 1 or 2, characterized in that the shared convolutional neural network in step S2 comprises, in order: a first convolutional layer, a first BN layer, a first ReLU layer, a second convolutional layer, a second BN layer, a second ReLU layer, a max-pooling layer, a first dense residual module, a first max-pooling layer, a second dense residual module, a second max-pooling layer, a third dense residual module, and a third max-pooling layer; each of the first, second, and third dense residual modules comprises three convolutional layers and a ResNet residual structure, wherein the first convolutional layer reduces the channel dimension of the input data, the second convolutional layer raises the channel dimension of the first layer's output, and the ResNet residual structure adds the input data to the second layer's output to obtain the summed result; the third convolutional layer reduces the dimension of the summed result; the outputs of the three convolutional layers are densely connected, i.e., features of different levels are stacked together, so each dense residual module outputs features containing both shallow and deep features.
4. The multi-scale pedestrian detection method based on feature fusion according to claim 3, characterized in that the output dimension of the shared convolutional neural network is 28*28*512; the kernel size of the first and second convolutional layers is 3*3, with stride 1 and SAME padding; the kernel size of the max-pooling layer and of the first, second, and third max-pooling layers is 2*2, with stride 2; within each dense residual module, the first convolutional layer is a 1*1*(c/2) convolution, the second is 1*1*c, and the third is 1*1*(c/2), where 1*1 denotes the kernel size and c denotes the number of channels of the feature map.
5. The multi-scale pedestrian detection method based on feature fusion according to claim 1 or 2, characterized in that, in step S2, the scale sub-network comprises a backbone network and a branch network, and the outputs of the backbone and branch networks are weighted to give the final detection result;
the backbone network comprises a large-scale sub-network for detecting large-scale targets and a small-scale sub-network for detecting small-scale targets, each comprising, in order: a first convolutional layer, first BN layer, first ReLU layer, second convolutional layer, second BN layer, second ReLU layer, first max-pooling layer, third convolutional layer, third BN layer, third ReLU layer, fourth convolutional layer, fourth BN layer, fourth ReLU layer, second max-pooling layer, dense residual module, fifth convolutional layer, fifth BN layer, fifth ReLU layer, and loss function;
the branch network comprises a scale-aware weighting layer that assigns weights to the outputs of the large-scale and small-scale sub-networks according to the height information in the shared convolutional neural network's output; the weight computation formula in the scale-aware weighting layer is:
where ω_l is the weight of the large-scale sub-network, ω_s is the weight of the small-scale sub-network, h̄ is the average height of pedestrian targets, α and β are proportionality coefficients optimized by backpropagation, and h is the height of a given pedestrian target.
6. The multi-scale pedestrian detection method based on feature fusion according to claim 5, wherein
the kernel sizes of the first, second, third, and fourth convolutional layers in the core network are 3*3; the kernel size of the fifth convolutional layer is 3*3, with a stride of 1 and SAME padding; the kernel sizes of the first and second max-pooling layers are 2*2, with a stride of 2.
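These layer parameters determine the spatial resolution at each stage: 3*3 stride-1 SAME convolutions preserve the feature-map size, and each 2*2 stride-2 max pool halves it. A small sketch (with a hypothetical 224-pixel input, not stated in the patent) tracks this:

```python
def spatial_size(size, stack):
    # Track the feature-map side length through the claim-6 stack:
    # 3*3 stride-1 SAME convolutions preserve the size, and
    # 2*2 stride-2 max pools halve it.
    for layer in stack:
        if layer == "pool":
            size //= 2
        # "conv" layers (3*3, stride 1, SAME) leave the size unchanged
    return size

# Layer order from claims 5-6 (BN/ReLU layers omitted, as they do not
# change spatial size): conv, conv, pool, conv, conv, pool, conv
CORE_STACK = ["conv", "conv", "pool", "conv", "conv", "pool", "conv"]
```

For example, `spatial_size(224, CORE_STACK)` returns 56: the two pooling layers reduce the input by a factor of 4.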
7. The multi-scale pedestrian detection method based on feature fusion according to claim 6, wherein
the outputs of the fifth convolutional layers of the large-scale sub-network and the small-scale sub-network use the YOLO algorithm as the detection framework; the anchors in the YOLO algorithm are obtained by k-means clustering of the height-width ratios of the pedestrian bboxes in the pedestrian detection data set; the anchor area is set to 7*7, and under this area size the chosen height-width ratios are {3:1, 5:2, 5:3}, where bbox denotes the annotation box.
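Once the k-means clustering has produced the height-width ratios, each anchor's width and height follow from the fixed area: with w*h = area and h/w = r, one gets w = sqrt(area/r) and h = w*r. A sketch of that conversion (the clustering step itself is omitted):

```python
import math

def anchors_from_ratios(ratios, area=49.0):
    # Claim 7: the anchor area is 7*7 (= 49). For each height:width
    # ratio r, solve w*h = area and h/w = r, giving w = sqrt(area/r)
    # and h = w*r.
    boxes = []
    for num, den in ratios:
        r = num / den
        w = math.sqrt(area / r)
        boxes.append((w, w * r))
    return boxes

# The three clustered ratios chosen in the claim: 3:1, 5:2, 5:3
ANCHORS = anchors_from_ratios([(3, 1), (5, 2), (5, 3)])
```

Every resulting box has area 49 while its height-width ratio matches the cluster, so the anchors stay tall and narrow, as pedestrians tend to be.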
8. The multi-scale pedestrian detection method based on feature fusion according to claim 7, wherein
the loss function is the weighted sum of a classification cross-entropy loss and a localization Smooth L1 loss; stochastic gradient descent is used as the optimization method, the initial learning rate is set to 0.001, and training terminates when the loss no longer decreases.
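The claim does not spell the two terms out; a minimal sketch of the stated combination, assuming the standard Smooth L1 definition and a hypothetical balance weight `lam` (the claim only says "weighted sum"):

```python
import math

def smooth_l1(x):
    # Localization term of claim 8: quadratic near zero, linear beyond 1,
    # which keeps gradients bounded for large regression errors.
    return 0.5 * x * x if abs(x) < 1.0 else abs(x) - 0.5

def detection_loss(p_true, residuals, lam=1.0):
    # Weighted sum of the classification cross-entropy (-log of the
    # probability assigned to the true class) and the Smooth L1 over the
    # box-regression residuals. lam is a hypothetical balance weight.
    cls_loss = -math.log(max(p_true, 1e-12))
    loc_loss = sum(smooth_l1(r) for r in residuals)
    return cls_loss + lam * loc_loss
```

A perfect prediction (true-class probability 1 and zero residuals) yields zero loss, and either term alone drives the total upward.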
9. The multi-scale pedestrian detection method based on feature fusion according to claim 1, wherein
in step S3, the parameters of a shared convolutional neural network pre-trained on the ImageNet data set are used as the initial parameters of the shared convolutional neural network, and the initial parameters of the scale sub-network are drawn from a distribution, i.e., the common deep-learning initialization; during training of the multi-scale pedestrian detection convolutional neural network, backpropagation is performed by stochastic gradient descent to update the parameters.
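The two initialization paths and the update rule in claims 8-9 can be sketched as follows; the Gaussian standard deviation is an assumption (the claim only says "a distribution"), while the 0.001 learning rate comes from claim 8:

```python
import random

def init_params(n, pretrained=None, std=0.01):
    # Claim 9: the shared backbone reuses ImageNet-pretrained weights
    # when available; the scale sub-networks instead draw fresh values
    # from a distribution (here a zero-mean Gaussian, a common
    # deep-learning initialization).
    if pretrained is not None:
        return list(pretrained)
    return [random.gauss(0.0, std) for _ in range(n)]

def sgd_step(params, grads, lr=0.001):
    # One stochastic-gradient-descent update applied to backpropagated
    # gradients, using the initial learning rate stated in claim 8.
    return [p - lr * g for p, g in zip(params, grads)]
```

Training repeats `sgd_step` over mini-batches until the loss stops decreasing, per the claim's termination condition.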
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910799142.1A CN110490174A (en) | 2019-08-27 | 2019-08-27 | Multiple dimensioned pedestrian detection method based on Fusion Features |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110490174A true CN110490174A (en) | 2019-11-22 |
Family
ID=68554575
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910799142.1A Pending CN110490174A (en) | 2019-08-27 | 2019-08-27 | Multiple dimensioned pedestrian detection method based on Fusion Features |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110490174A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180211130A1 (en) * | 2015-07-29 | 2018-07-26 | Nokia Technologies Oy | Object detection with neural network |
US20180307921A1 (en) * | 2017-04-25 | 2018-10-25 | Uber Technologies, Inc. | Image-Based Pedestrian Detection |
CN109033979A (en) * | 2018-06-29 | 2018-12-18 | 西北工业大学 | Indoor pedestrian detection method based on WIFI and camera sensor decision level fusion |
CN109102025A (en) * | 2018-08-15 | 2018-12-28 | 电子科技大学 | Pedestrian based on deep learning combined optimization recognition methods again |
Non-Patent Citations (2)
Title |
---|
JIANAN LI et al.: "Scale-Aware Fast R-CNN for Pedestrian Detection", IEEE Transactions on Multimedia * |
WANG Qiang: "Design and Implementation of a Pedestrian Detection System in Intelligent Video Surveillance", China Master's Theses Full-text Database, Information Science and Technology Series * |
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111259736B (en) * | 2020-01-08 | 2023-04-07 | 上海海事大学 | Real-time pedestrian detection method based on deep learning in complex environment |
CN111259736A (en) * | 2020-01-08 | 2020-06-09 | 上海海事大学 | Real-time pedestrian detection method based on deep learning in complex environment |
CN113128316A (en) * | 2020-01-15 | 2021-07-16 | 北京四维图新科技股份有限公司 | Target detection method and device |
CN111488918A (en) * | 2020-03-20 | 2020-08-04 | 天津大学 | Transformer substation infrared image equipment detection method based on convolutional neural network |
CN111582092B (en) * | 2020-04-27 | 2023-12-22 | 西安交通大学 | Pedestrian abnormal behavior detection method based on human skeleton |
CN111582092A (en) * | 2020-04-27 | 2020-08-25 | 西安交通大学 | Pedestrian abnormal behavior detection method based on human skeleton |
CN113516140A (en) * | 2020-05-07 | 2021-10-19 | 阿里巴巴集团控股有限公司 | Image processing method, model training method, system and equipment |
CN111986255A (en) * | 2020-09-07 | 2020-11-24 | 北京凌云光技术集团有限责任公司 | Multi-scale anchor initialization method and device of image detection model |
CN111986255B (en) * | 2020-09-07 | 2024-04-09 | 凌云光技术股份有限公司 | Multi-scale anchor initializing method and device of image detection model |
CN112164034A (en) * | 2020-09-15 | 2021-01-01 | 郑州金惠计算机系统工程有限公司 | Workpiece surface defect detection method and device, electronic equipment and storage medium |
CN112164034B (en) * | 2020-09-15 | 2023-04-28 | 郑州金惠计算机系统工程有限公司 | Workpiece surface defect detection method and device, electronic equipment and storage medium |
CN112016527A (en) * | 2020-10-19 | 2020-12-01 | 成都大熊猫繁育研究基地 | Panda behavior recognition method, system, terminal and medium based on deep learning |
CN112990073A (en) * | 2021-03-31 | 2021-06-18 | 南京农业大学 | Suckling period piglet activity rule statistical system based on edge calculation |
CN113269038A (en) * | 2021-04-19 | 2021-08-17 | 南京邮电大学 | Multi-scale-based pedestrian detection method |
CN113269038B (en) * | 2021-04-19 | 2022-07-15 | 南京邮电大学 | Multi-scale-based pedestrian detection method |
CN113139979A (en) * | 2021-04-21 | 2021-07-20 | 广州大学 | Edge identification method based on deep learning |
CN113505640A (en) * | 2021-05-31 | 2021-10-15 | 东南大学 | Small-scale pedestrian detection method based on multi-scale feature fusion |
CN113850284A (en) * | 2021-07-04 | 2021-12-28 | 天津大学 | Multi-operation detection method based on multi-scale feature fusion and multi-branch prediction |
CN114652326A (en) * | 2022-01-30 | 2022-06-24 | 天津大学 | Real-time brain fatigue monitoring device based on deep learning and data processing method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110490174A (en) | Multiple dimensioned pedestrian detection method based on Fusion Features | |
Dewi et al. | Synthetic Data generation using DCGAN for improved traffic sign recognition | |
CN109800689B (en) | Target tracking method based on space-time feature fusion learning | |
US11494938B2 (en) | Multi-person pose estimation using skeleton prediction | |
CN104599275B (en) | The RGB-D scene understanding methods of imparametrization based on probability graph model | |
CN106845499A (en) | A kind of image object detection method semantic based on natural language | |
CN104281853A (en) | Behavior identification method based on 3D convolution neural network | |
Gong et al. | Object detection based on improved YOLOv3-tiny | |
CN108764019A (en) | A kind of Video Events detection method based on multi-source deep learning | |
CN110555387A (en) | Behavior identification method based on local joint point track space-time volume in skeleton sequence | |
CN110046677B (en) | Data preprocessing method, map construction method, loop detection method and system | |
CN110232361A (en) | Human body behavior intension recognizing method and system based on the dense network of three-dimensional residual error | |
CN114821640A (en) | Skeleton action identification method based on multi-stream multi-scale expansion space-time diagram convolution network | |
CN110197121A (en) | Moving target detecting method, moving object detection module and monitoring system based on DirectShow | |
CN114689038A (en) | Fruit detection positioning and orchard map construction method based on machine vision | |
Shah et al. | Detection of different types of blood cells: A comparative analysis | |
Chu et al. | Target tracking via particle filter and convolutional network | |
CN117423032A (en) | Time sequence dividing method for human body action with space-time fine granularity, electronic equipment and computer readable storage medium | |
CN117079095A (en) | Deep learning-based high-altitude parabolic detection method, system, medium and equipment | |
CN110826575A (en) | Underwater target identification method based on machine learning | |
Xie et al. | Automatic parking space detection system based on improved YOLO algorithm | |
CN115909086A (en) | SAR target detection and identification method based on multistage enhanced network | |
CN112926681B (en) | Target detection method and device based on deep convolutional neural network | |
Qin et al. | Joint prediction and association for deep feature multiple object tracking | |
Putro et al. | Fast person detector with efficient multi-level contextual block for supporting assistive robot |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20191122 |