CN110321869A - Person detection and extraction method based on a multi-scale fusion network - Google Patents
- Publication number
- CN110321869A CN110321869A CN201910617365.1A CN201910617365A CN110321869A CN 110321869 A CN110321869 A CN 110321869A CN 201910617365 A CN201910617365 A CN 201910617365A CN 110321869 A CN110321869 A CN 110321869A
- Authority
- CN
- China
- Prior art keywords
- image
- detection
- pedestrian
- density estimation
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
- G06V20/53—Recognition of crowd images, e.g. recognition of crowd congestion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
Abstract
The present invention relates to a person detection and extraction method based on a multi-scale fusion network, which uses a computer to detect pedestrian targets with a detection method based on crowd density estimation. Using a computer as the analysis platform, the method extracts one frame from a fire video, inputs it into a trained crowd density estimation model, and outputs a crowd density estimation image of the same size. The density estimation image reflects whether the image contains pedestrian targets and where each pedestrian target is located, thereby completing the detection and extraction of pedestrian targets in the image. The method applies a deep learning network directly to the original image without extra processing; compared with traditional feature extraction methods, it improves detection efficiency and adapts better to the perspective distortion of different camera views, enabling effective detection and rapid assessment of pedestrian targets in fire videos.
Description
Technical field
The present invention relates to a person detection and extraction method based on a multi-scale fusion network, belongs to the field of computer vision, and is a video person detection and extraction method for fire cause investigation.
Background technique
During case investigation, pedestrians around the fire scene are screened to judge whether their behavior is related to the fire. In today's society, pedestrian target detection is an important research topic, and detecting pedestrian targets in images plays an important role in many fields. With technological progress in recent years, especially in artificial intelligence, pedestrian target detection has advanced significantly, evolving from methods based on extracting pedestrian target features to methods based on deep learning. Deep learning methods greatly outperform earlier methods in both detection accuracy and detection efficiency.
Traditional pedestrian target detection relies mainly on the features of the pedestrian target. Facing massive volumes of video data, investigators need to spend a long time observing images, and these traditional detection methods have drawbacks and limitations. First, conventional methods use feature extraction: they extract various features of pedestrian targets, match them against the image to be detected, and take the number of matched feature points as corresponding to the total number of pedestrians in the image. This approach depends on the invariance of each pedestrian's feature points; when a pedestrian is partially occluded (for example, only a leg or a foot appears in the picture), it cannot be judged as a pedestrian, the corresponding pedestrian feature points cannot be extracted well, and the detection cannot be completed. Second, video surveillance platforms are now installed in more and more places, each camera shoots from a different angle, and different shooting angles appear as different perspective distortions in the image, so the images captured by different cameras differ. Feature-extraction methods are sensitive to perspective distortion: they achieve relatively high detection rates for a given shooting angle, but once the scene changes, pedestrian detection performance drops sharply, so detection results vary greatly across shooting scenes.
Summary of the invention
In view of the state of the prior art and its existing deficiencies, the present invention provides a person detection and extraction method based on a multi-scale fusion network. When pedestrians appear in a fire video scene, the method uses a computer to detect pedestrian targets with a detection method based on crowd density estimation. Using a computer as the analysis platform, the method extracts one frame from the fire video, inputs it into a trained crowd density estimation model, and outputs a crowd density estimation image of the same size. The density estimation image reflects whether there are pedestrian targets in the image and where each pedestrian target is located, thereby completing the detection and extraction of pedestrian targets in the image.
To achieve the above object, the technical solution adopted by the present invention is: a person detection and extraction method based on a multi-scale fusion network, characterized in that: using a computer as the detection platform and image processing methods, the person targets in the original image are detected with deep-learning-based technology; the fire video images to be analyzed are chosen in advance, and the detection of pedestrian targets in the fire video images is completed; the specific steps are as follows:
1. Training of the crowd density estimation model:
1) Preliminarily process the training fire video images. The preliminary processing includes marking the regions of the image most likely to contain pedestrian targets and annotating the pedestrian positions in the picture. The pedestrians in the image are annotated using Gaussian blurring: every head in the picture is marked, and the marked head coordinates are then converted into the corresponding density map. A simple conversion is used: a normally distributed patch is generated at the head position of each pedestrian, and the values within the patch sum to 1. The Gaussian blur formula is as follows:
If a mark point is at image position x_i, it can be represented as δ(x − x_i); the corresponding ground-truth density map Y is then obtained by convolution with a normalized Gaussian kernel G_σ:

Y(x) = Σ_{x_i ∈ S} δ(x − x_i) * G_σ(x)

where S denotes the set of mark points; since each Gaussian kernel integrates to 1, the integral of the entire density map equals the total number of people in the image;
2) Input the annotated images into the crowd density estimation network and train the model. The annotated images need image augmentation, i.e., flipping, scaling, and random cropping; the processed images are input into the network as the training set data to complete the training of the network parameters;
3) After the same preprocessing, input the test fire video images into the crowd density estimation model and test the model's performance. Likewise, one tenth of the entire dataset is set aside as the validation set; after training is complete, the performance is validated, and the training effect on the test set is assessed through the performance on the validation set;
4) Finally, based on the difference between the test results and the actual results, keep optimizing the network's loss function until the result is optimal. The loss function formula is as follows:

L(θ) = (1 / 2N) Σ_{i=1}^{N} ‖F(X_i; θ) − F_i‖²

where θ is the parameter set of the model, N is the number of training samples, X_i is the i-th training sample, F_i is the ground-truth density map corresponding to X_i, F(X_i; θ) is the model's prediction, and L, the difference between the two, represents the loss;
2. Application of the crowd density estimation model:
Using a computer, preliminarily process the fire video to be detected and mark the regions of the image where pedestrians are likely to appear. Input the processed image to be detected into the crowd density estimation model for prediction; the output is a crowd density estimation map, which completes the detection of whether the image contains pedestrian targets.
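The application step above reduces to "integrate the predicted density map and threshold the count". A minimal sketch under stated assumptions: the patent publishes no code, and `estimate_density` below is a hypothetical stand-in for the trained model that fakes a map containing two unit-mass heads for illustration.

```python
import numpy as np

def estimate_density(frame):
    # Hypothetical stand-in for the trained crowd density estimation
    # model: it must return a density map the same size as the input
    # frame. Here we fake a map with two unit-mass "pedestrians".
    h, w = frame.shape[:2]
    density = np.zeros((h, w), dtype=np.float64)
    density[h // 4, w // 4] = 1.0
    density[h // 2, w // 2] = 1.0
    return density

def detect_pedestrians(frame, count_threshold=0.5):
    """Decide whether a frame contains pedestrian targets: the integral
    (sum) of the density map approximates the number of people."""
    density = estimate_density(frame)
    count = float(density.sum())
    return count >= count_threshold, count

frame = np.zeros((240, 320, 3), dtype=np.uint8)  # one extracted video frame
has_people, count = detect_pedestrians(frame)
```

With a real model in place of the stand-in, the same thresholding on the integrated map gives the presence decision described in the text.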
The beneficial effects of the present invention are: the present invention uses a pedestrian target detection method based on density estimation; it is a video person detection and extraction method applied to fire cause investigation that performs crowd density analysis on fire video images to judge whether the video images contain pedestrian targets, and it is a method based on deep learning. The present invention trains the crowd density estimation model with existing fire video images and optimizes the model by adjusting its parameters; the image to be detected is then input into the model to generate the corresponding crowd density map, from which it can be directly determined whether there are pedestrian targets in the image and where each pedestrian target is located.
The present invention uses deep learning technology for assisted analysis of video images and has a great advantage in detecting pedestrian targets in images. Using deep learning, the present invention generates the crowd density distribution map of the corresponding scene; by analyzing the generated crowd density distribution map, the pedestrian targets in the original image can be detected, improving the efficiency and accuracy of pedestrian target detection.
First, this method is based on video image analysis and is accurate and intuitive.
Second, unlike previous detection methods based on pedestrian features, this method is insensitive to occlusion by flames or objects in the image and to camera perspective distortion.
Third, in view of the extensive deployment of surveillance everywhere today and its future growth, this method is broadly applicable and has wide room for popularization.
In short, the pedestrian target detection method based on crowd density estimation has many advantages over traditional feature-extraction-based detection: the deep learning network can be applied directly to the original image without extra processing; compared with traditional feature extraction, detection efficiency is improved and adaptability to the perspective distortion of different images is better; the method can effectively detect and quickly assess pedestrian targets in fire videos.
Detailed description of the invention
Fig. 1 is the crowd picture to be detected by the present invention;
Fig. 2 is the video image, obtained after preprocessing by the present invention, of the regions where pedestrians may appear;
Fig. 3 is a schematic diagram of pedestrian annotation in the present invention;
Fig. 4 is the structure of the crowd density estimation model of the present invention;
Fig. 5 is a schematic diagram of the image generated by the crowd density estimation model of the present invention;
Fig. 6 is the flow chart of the present invention.
Specific embodiment
As shown in Figures 1 to 6, the person detection and extraction method based on a multi-scale fusion network is a video person detection and extraction method applied to fire cause investigation. Using a computer as the detection platform and image processing methods, the person targets in the original image are detected with deep-learning-based technology; a fire video image to be analyzed is chosen in advance, and the detection of pedestrian targets in the fire video image can be completed; the specific steps are as follows:
1. Training of the crowd density estimation model:
1) Preliminarily process the training fire video images. The preliminary processing includes marking the regions of the image most likely to contain pedestrian targets and annotating the pedestrian positions in the picture. The pedestrians in the image are annotated using Gaussian blurring: every head in the picture is marked, and the marked head coordinates are then converted into the corresponding density map. We use a simple conversion: a normally distributed patch is generated at the head position of each pedestrian, and the values within the patch sum to 1. The Gaussian blur formula is as follows:

If a mark point is at image position x_i, it can be represented as δ(x − x_i). The corresponding ground-truth density map Y is then obtained by convolution with a normalized Gaussian kernel G_σ:

Y(x) = Σ_{x_i ∈ S} δ(x − x_i) * G_σ(x)

where S denotes the set of mark points. Since each Gaussian kernel integrates to 1, the integral of the entire density map equals the total number of people in the image.
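The annotation step above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the head coordinates are already annotated, and it uses `scipy.ndimage.gaussian_filter`, whose normalized kernel reproduces the sum-to-one Gaussian patches described in the text.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def make_density_map(shape, head_points, sigma=4.0):
    """Convert annotated head coordinates into a ground-truth density map.
    Each head contributes a normalized Gaussian, so the integral of the
    map equals the number of annotated people."""
    density = np.zeros(shape, dtype=np.float64)
    for y, x in head_points:
        density[int(y), int(x)] = 1.0       # delta(x - x_i) at each mark point
    return gaussian_filter(density, sigma)  # convolve with a normalized Gaussian

heads = [(30, 40), (80, 120), (100, 200)]   # assumed annotations for one frame
dmap = make_density_map((240, 320), heads)
total = dmap.sum()                          # integral of the map, about 3
```

The sigma of 4 pixels is an assumption; the patent does not state the kernel width used.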
2) Input the annotated images into the crowd density estimation network and train the model. The annotated images need image augmentation: flipping, scaling, random cropping, and similar operations. The processed images are input into the network as the training set data to complete the training of the network parameters.
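The augmentation step can be sketched like this (an illustrative sketch, not the patent's code: scaling is omitted for brevity, and the crop size of half the image is an arbitrary assumption). The point is that image and density map must be transformed together so they stay aligned.

```python
import numpy as np

def augment(image, density, rng):
    """Flip and randomly crop an image together with its density map,
    keeping the two aligned, as in the augmentation step above."""
    if rng.random() < 0.5:                 # horizontal flip, half the time
        image = image[:, ::-1]
        density = density[:, ::-1]
    h, w = image.shape[:2]
    ch, cw = h // 2, w // 2                # assumed crop size: half the image
    top = int(rng.integers(0, h - ch + 1))
    left = int(rng.integers(0, w - cw + 1))
    return (image[top:top + ch, left:left + cw],
            density[top:top + ch, left:left + cw])

rng = np.random.default_rng(0)
img = np.zeros((240, 320, 3), dtype=np.uint8)
den = np.zeros((240, 320), dtype=np.float64)
aug_img, aug_den = augment(img, den, rng)
```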
3) After the same preprocessing, input the test fire video images into the crowd density estimation model and test the model's performance. Likewise, we set aside roughly one tenth of the entire dataset as the validation set; after training is complete, the performance is validated, and the training effect on the test set is assessed through the performance on the validation set.
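The hold-out described in step 3) can be sketched as follows. The one-tenth fraction follows the text; the seed and the index-based bookkeeping are assumptions for illustration.

```python
import numpy as np

def split_dataset(n_samples, val_fraction=0.1, seed=0):
    """Hold out roughly one tenth of the dataset as a validation set;
    the remaining indices form the training set."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)           # shuffle sample indices
    n_val = max(1, int(n_samples * val_fraction))
    return idx[n_val:], idx[:n_val]            # (train indices, validation indices)

train_idx, val_idx = split_dataset(500)        # 450 training, 50 validation samples
```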
4) Finally, based on the difference between the test results and the actual results, keep optimizing the network's loss function until the result is optimal. The loss function formula is as follows:

L(θ) = (1 / 2N) Σ_{i=1}^{N} ‖F(X_i; θ) − F_i‖²

where θ is the parameter set of the model, N is the number of training samples, X_i is the i-th training sample, F_i is the ground-truth density map corresponding to X_i, F(X_i; θ) is the model's prediction, and L, the difference between the two, represents the loss.
2. Application of the crowd density estimation model:
Using a computer, preliminarily process the fire video to be detected and mark the regions of the image where pedestrians are likely to appear. Input the processed image to be detected into the crowd density estimation model for prediction; the output is a crowd density estimation map. This completes the detection of whether the image contains pedestrian targets.
Fig. 1 is the crowd picture to be detected. To reject interference from regions of the image that cannot contain pedestrian targets, the regions where no pedestrians will appear are blurred out, and only the regions where pedestrians may appear are used for training. Fig. 2 is the processed crowd picture.
Fig. 3 is a schematic diagram of pedestrian annotation in the present invention, i.e., the image produced after the neural counting network: a single pedestrian target is indicated by a dot, multiple clustered pedestrians are indicated by stacked dots, and the person count of the whole image is finally completed by integration.
Fig. 4 is the structure of the crowd density estimation model, i.e., the network structure of the present invention: a two-column network with convolution kernels of different sizes, used in order to extract image features at different scales. The first column extracts larger-scale image features with convolution kernels ranging from 11*11 to 7*7; the second column extracts smaller-scale image features using only 3*3 convolution kernels. The two columns are then fused. Because each of the two front columns contains two max-pooling layers, the output images have only one quarter of the original resolution, so after fusion a column of deconvolution layers restores the output to the original image size, avoiding the loss of too much information from the original image.
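The resolution bookkeeping in this description can be checked with a small sketch. The exact layer configuration is not published, so this assumes the convolutions are 'same'-padded (they preserve spatial size) and that only the two 2x2 max-pooling layers and the two stride-2 deconvolution layers change the resolution.

```python
def column_output_size(size, n_pools=2):
    """Spatial size after one column: 'same'-padded convolutions keep the
    size, and each of the n_pools 2x2 max-pooling layers halves it."""
    for _ in range(n_pools):
        size //= 2
    return size

def deconv_recover(size, n_up=2):
    """A column of stride-2 deconvolution layers doubles the size per
    layer, restoring the original resolution after fusion."""
    for _ in range(n_up):
        size *= 2
    return size

h = 256
fused = column_output_size(h)     # 64: one quarter of the original resolution
restored = deconv_recover(fused)  # 256: back to the input size
```

This is why the deconvolution column is needed: without it, the fused density map would be a quarter of the input size in each dimension.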
Fig. 5 is a schematic diagram of the image generated by the crowd density estimation model of the present invention, i.e., the density image generated after the image passes through the neural network model; the total person count is the integral of the image.
Fig. 6 is the flow chart of the present invention, i.e., the flow chart of the algorithm of the invention. The fire video is first split by processing into multiple video frames; after denoising and enhancement, each frame is input into the deep learning network model, which generates a corresponding person-density image of the same size as the original video frame. The person count is completed by integrating the generated density image, and the person targets are obtained.
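The Fig. 6 flow can be sketched end to end. This is illustrative only: `run_model` is a hypothetical stand-in for the trained network, and the denoising and enhancement steps are left as a comment.

```python
import numpy as np

def run_model(frame):
    # Hypothetical stand-in for the trained deep learning network:
    # returns a density map the same size as the frame, here with a
    # single unit-mass "person" at the centre for illustration.
    h, w = frame.shape[:2]
    density = np.zeros((h, w), dtype=np.float64)
    density[h // 2, w // 2] = 1.0
    return density

def count_people_in_video(frames):
    """Per-frame person counts following the Fig. 6 flow: each frame is
    fed to the model and its density map is integrated to get a count."""
    counts = []
    for frame in frames:
        # (denoising and enhancement preprocessing would go here)
        density = run_model(frame)
        counts.append(round(float(density.sum())))
    return counts

frames = [np.zeros((120, 160, 3), dtype=np.uint8) for _ in range(3)]
counts = count_people_in_video(frames)
```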
Claims (1)
1. A person detection and extraction method based on a multi-scale fusion network, characterized in that: using a computer as the detection platform and image processing methods, the person targets in the original image are detected with deep-learning-based technology; the fire video images to be analyzed are chosen in advance, and the detection of the pedestrian targets in the fire video images can be completed; the specific steps are as follows:
1. Training of the crowd density estimation model:
1) preliminarily process the training fire video images; the preliminary processing includes marking the regions of the image most likely to contain pedestrian targets and annotating the pedestrian positions in the picture; the pedestrians in the image are annotated using Gaussian blurring, every head in the picture is marked, and the marked head coordinates are then converted into the corresponding density map using a simple conversion, i.e., a normally distributed patch is generated at the head position of each pedestrian, with the values within the patch summing to 1; the Gaussian blur formula is as follows:
if a mark point is at image position x_i, it can be represented as δ(x − x_i); the corresponding ground-truth density map Y is then obtained by convolution with a normalized Gaussian kernel G_σ, Y(x) = Σ_{x_i ∈ S} δ(x − x_i) * G_σ(x), where S denotes the set of mark points; the integral of the entire density map equals the total number of people in the image;
2) input the annotated images into the crowd density estimation network and train the model; the annotated images need image augmentation, i.e., flipping, scaling, and random cropping; the processed images are input into the network as the training set data to complete the training of the network parameters;
3) after the same preprocessing, input the test fire video images into the crowd density estimation model and test the model's performance; likewise, one tenth of the entire dataset is set aside as the validation set; after training is complete, the performance is validated, and the training effect on the test set is assessed through the performance on the validation set;
4) finally, based on the difference between the test results and the actual results, keep optimizing the network's loss function until the result is optimal; the loss function formula is as follows:

L(θ) = (1 / 2N) Σ_{i=1}^{N} ‖F(X_i; θ) − F_i‖²

where θ is the parameter set of the model, N is the number of training samples, X_i is the i-th training sample, F_i is the ground-truth density map corresponding to X_i, F(X_i; θ) is the model's prediction, and L, the difference between the two, represents the loss;
2. Application of the crowd density estimation model:
using a computer, preliminarily process the fire video to be detected and mark the regions of the image where pedestrians are likely to appear; input the processed image to be detected into the crowd density estimation model for prediction; the output is the crowd density estimation map, which completes the detection of whether the image contains pedestrian targets.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910617365.1A CN110321869A (en) | 2019-07-10 | 2019-07-10 | Person detection and extraction method based on a multi-scale fusion network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910617365.1A CN110321869A (en) | 2019-07-10 | 2019-07-10 | Person detection and extraction method based on a multi-scale fusion network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110321869A true CN110321869A (en) | 2019-10-11 |
Family
ID=68123160
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910617365.1A Pending CN110321869A (en) | 2019-07-10 | 2019-07-10 | Person detection and extraction method based on a multi-scale fusion network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110321869A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111767881A (en) * | 2020-07-06 | 2020-10-13 | 中兴飞流信息科技有限公司 | Self-adaptive crowd density estimation device based on AI technology |
CN112115862A (en) * | 2020-09-18 | 2020-12-22 | 广东机场白云信息科技有限公司 | Crowded scene pedestrian detection method combined with density estimation |
CN113762219A (en) * | 2021-11-03 | 2021-12-07 | 恒林家居股份有限公司 | Method, system and storage medium for identifying people in mobile conference room |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2709065A1 (en) * | 2012-09-17 | 2014-03-19 | Lakeside Labs GmbH | Concept for counting moving objects passing a plurality of different areas within a region of interest |
CN104574352A (en) * | 2014-09-02 | 2015-04-29 | 重庆大学 | Crowd density grade classification method based on foreground image |
CN107301387A (en) * | 2017-06-16 | 2017-10-27 | 华南理工大学 | A kind of image Dense crowd method of counting based on deep learning |
CN108717528A (en) * | 2018-05-15 | 2018-10-30 | 苏州平江历史街区保护整治有限责任公司 | A kind of global population analysis method of more strategies based on depth network |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2709065A1 (en) * | 2012-09-17 | 2014-03-19 | Lakeside Labs GmbH | Concept for counting moving objects passing a plurality of different areas within a region of interest |
CN104574352A (en) * | 2014-09-02 | 2015-04-29 | 重庆大学 | Crowd density grade classification method based on foreground image |
CN107301387A (en) * | 2017-06-16 | 2017-10-27 | 华南理工大学 | A kind of image Dense crowd method of counting based on deep learning |
CN108717528A (en) * | 2018-05-15 | 2018-10-30 | 苏州平江历史街区保护整治有限责任公司 | A kind of global population analysis method of more strategies based on depth network |
Non-Patent Citations (1)
Title |
---|
杨杰, 张翔: 《视频目标检测和跟踪及其应用》 (Video Object Detection, Tracking and Their Applications), page 2 *
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111767881A (en) * | 2020-07-06 | 2020-10-13 | 中兴飞流信息科技有限公司 | Self-adaptive crowd density estimation device based on AI technology |
CN112115862A (en) * | 2020-09-18 | 2020-12-22 | 广东机场白云信息科技有限公司 | Crowded scene pedestrian detection method combined with density estimation |
CN112115862B (en) * | 2020-09-18 | 2023-08-29 | 广东机场白云信息科技有限公司 | Congestion scene pedestrian detection method combined with density estimation |
CN113762219A (en) * | 2021-11-03 | 2021-12-07 | 恒林家居股份有限公司 | Method, system and storage medium for identifying people in mobile conference room |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110348319B (en) | Face anti-counterfeiting method based on face depth information and edge image fusion | |
CN105809693B (en) | SAR image registration method based on deep neural network | |
CN104318548B (en) | Rapid image registration implementation method based on space sparsity and SIFT feature extraction | |
CN113065558A (en) | Lightweight small target detection method combined with attention mechanism | |
CN110059558A (en) | A kind of orchard barrier real-time detection method based on improvement SSD network | |
CN103345758B (en) | Jpeg image region duplication based on DCT statistical nature distorts blind checking method | |
CN111611874B (en) | Face mask wearing detection method based on ResNet and Canny | |
CN109409190A (en) | Pedestrian detection method based on histogram of gradients and Canny edge detector | |
Yang et al. | Traffic sign recognition in disturbing environments | |
CN110232387B (en) | Different-source image matching method based on KAZE-HOG algorithm | |
CN110321869A (en) | Person detection and extraction method based on a multi-scale fusion network | |
CN107808161A (en) | A kind of Underwater targets recognition based on light vision | |
CN111445459A (en) | Image defect detection method and system based on depth twin network | |
CN106157308A (en) | Rectangular target object detecting method | |
Zhang et al. | Improved Fully Convolutional Network for Digital Image Region Forgery Detection. | |
CN106530271A (en) | Infrared image significance detection method | |
CN108257151A (en) | PCANet image change detection methods based on significance analysis | |
CN109800755A (en) | A kind of remote sensing image small target detecting method based on Analysis On Multi-scale Features | |
TWI765442B (en) | Method for defect level determination and computer readable storage medium thereof | |
CN110287862A (en) | Anti- detection method of taking on the sly based on deep learning | |
CN109685774A (en) | Varistor open defect detection method based on depth convolutional neural networks | |
CN110110618A (en) | A kind of SAR target detection method based on PCA and global contrast | |
Zhu et al. | Towards automatic wild animal detection in low quality camera-trap images using two-channeled perceiving residual pyramid networks | |
CN105069459B (en) | One kind is directed to High Resolution SAR Images type of ground objects extracting method | |
CN110930384A (en) | Crowd counting method, device, equipment and medium based on density information |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||