CN110321869A - Personnel detection and extraction method based on a multi-scale fusion network - Google Patents

Personnel detection and extraction method based on a multi-scale fusion network

Info

Publication number
CN110321869A
CN110321869A (application CN201910617365.1A)
Authority
CN
China
Prior art keywords
image
detection
pedestrian
density estimation
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910617365.1A
Other languages
Chinese (zh)
Inventor
王鑫
张良
鲁志宝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin Institute Of Fire Protection Ministry Of Emergency Management
Tianjin Fire Research Institute of MEM
Original Assignee
Tianjin Institute Of Fire Protection Ministry Of Emergency Management
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin Institute Of Fire Protection Ministry Of Emergency Management filed Critical Tianjin Institute Of Fire Protection Ministry Of Emergency Management
Priority to CN201910617365.1A priority Critical patent/CN110321869A/en
Publication of CN110321869A publication Critical patent/CN110321869A/en
Pending legal-status Critical Current

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content
    • G06V 20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/50 Context or environment of the image
    • G06V 20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V 20/53 Recognition of crowd images, e.g. recognition of crowd congestion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/07 Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to a personnel detection and extraction method based on a multi-scale fusion network, in which a computer detects pedestrian targets using a detection method based on crowd density estimation. With the computer as the analysis platform, the method extracts one frame of the fire video, inputs it into a trained crowd density estimation model, and outputs a crowd density estimation image of the same size. The crowd density estimation image reveals whether the frame contains pedestrian targets and where each target is located in the image, thereby completing the detection and extraction of pedestrian targets. The method applies a deep learning network directly to the original image without extra processing; compared with traditional feature extraction methods, it achieves higher detection efficiency and adapts better to the perspective distortion of different camera views, so it can effectively detect and quickly assess pedestrian targets in fire videos.

Description

Personnel detection and extraction method based on a multi-scale fusion network
Technical field
The present invention relates to a personnel detection and extraction method based on a multi-scale fusion network. It belongs to the field of computer vision and is a video personnel target detection and extraction method for fire cause investigation.
Background technique
During case investigation, pedestrians around the fire scene are screened to judge whether their behavior is related to the fire. In today's society, pedestrian target detection is a very important topic: the ability to detect pedestrian targets in images plays an important role in many fields. With advances in technology, and especially in artificial intelligence, pedestrian detection has made significant progress in recent years, gradually evolving from detection methods based on extracting pedestrian target features to detection methods based on deep learning. Deep-learning-based methods greatly outperform earlier methods in both pedestrian detection accuracy and detection efficiency.
In traditional pedestrian detection work, the main means of detection is the features of the pedestrian target. Faced with massive amounts of video data, investigators must spend a long time inspecting images, and such traditional pedestrian detection methods have drawbacks and limitations. First, conventional methods detect pedestrians by feature extraction: various features of pedestrian targets are extracted and then matched against the image under test, and the number of matched feature points corresponds to the total number of pedestrians in the image. This relies on the invariance of each pedestrian's feature points, but when a pedestrian in the image is partially occluded, so that only part of the person appears in the picture (for example a leg or a foot), the person cannot be judged to be a pedestrian; these methods cannot extract the corresponding pedestrian feature points and therefore cannot complete pedestrian detection. Second, as more and more places install video surveillance platforms, each camera shoots from a different angle, and different shooting angles appear in the image as different perspective distortions, so the images produced by different cameras differ. Feature-extraction-based methods are sensitive to perspective distortion: under the shooting angle and scene they were tuned for they achieve a relatively high detection rate, but after the scene changes, pedestrian detection performance drops sharply, so the detection quality of this approach varies greatly across shooting scenes.
Summary of the invention
In view of the state of the prior art and its shortcomings, the present invention provides a personnel detection and extraction method based on a multi-scale fusion network. When pedestrians appear in a fire video scene, the method uses a computer to detect pedestrian targets with a detection method based on crowd density estimation. With the computer as the analysis platform, one frame of the fire video is extracted and input into a trained crowd density estimation model, which outputs a crowd density estimation image of the same size. From the crowd density estimation image it can be determined whether the frame contains pedestrian targets and where each target is located in the image, thereby completing the detection and extraction of pedestrian targets.
To achieve the above objective, the technical solution adopted by the present invention is a personnel detection and extraction method based on a multi-scale fusion network, characterized in that a computer is used as the detection platform, and the methods of image processing and techniques based on deep learning are used to detect the personnel targets in the original image. A fire video image prepared for analysis is selected in advance, and the detection of pedestrian targets in the fire video image is completed in the following specific steps:
1. Training of the crowd density estimation model:
1) Preliminary processing of the training fire video images. The preliminary processing of an image includes annotating the regions of the image most likely to contain pedestrian targets and marking the positions of the pedestrians in the picture. The pedestrians in the image are annotated by means of Gaussian blurring: every head in the picture is marked, and the labelled head coordinates are then converted into a corresponding density map. A simple conversion is used, i.e. a normal-distribution kernel is generated at the center position of each pedestrian, and the values within the kernel region sum to 1. The Gaussian blur formula is as follows:
If there is an annotation point at image position xi, it can be represented by δ(x - xi); the corresponding ground-truth density map Y is then generated by convolution with a normalized Gaussian kernel Gσ:

Y(x) = Σ_{xi∈S} δ(x - xi) * Gσ(x),

where S denotes the set of annotation points; the integral of the entire density map is equal to the total number of people in the image;
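The ground-truth generation just described can be sketched in a few lines. This is not the patent's code; it is a minimal illustration of turning annotated head coordinates into a density map whose integral equals the head count, and the kernel size and sigma below are assumptions, not values from the patent:

```python
import numpy as np

def gaussian_kernel(size=15, sigma=4.0):
    """Normalized 2-D Gaussian kernel: its entries sum to 1."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return k / k.sum()

def density_map(shape, points, size=15, sigma=4.0):
    """Place one normalized Gaussian at each annotated head position.

    shape  -- (H, W) of the image
    points -- list of (row, col) head coordinates
    The integral of the returned map equals len(points) (up to
    border clipping), matching the description above.
    """
    h, w = shape
    pad = size // 2
    canvas = np.zeros((h + 2 * pad, w + 2 * pad))
    k = gaussian_kernel(size, sigma)
    for (r, c) in points:
        canvas[r:r + size, c:c + size] += k  # kernel centered on head
    return canvas[pad:h + pad, pad:w + pad]
```

Because each kernel is normalized, summing the map recovers the number of annotated heads, which is exactly the property the integral condition asks for.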
2) The annotated images are input into the crowd density estimation network to train the model. The annotated images need image augmentation, i.e. flipping, scaling, and random cropping; the images after this processing are input into the network as the training-set data, completing the training of the network parameters;
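The flip / scale / random-crop augmentation of step 2) might look like the sketch below. The crop size and scale range are illustrative assumptions, not values given in the patent:

```python
import random
import numpy as np

def augment(img, crop=64, scales=(0.8, 1.2)):
    """Return a randomly flipped, rescaled and cropped view of img."""
    if random.random() < 0.5:              # random horizontal flip
        img = img[:, ::-1]
    s = random.uniform(*scales)            # random rescale factor
    h, w = img.shape
    # nearest-neighbor resampling to the scaled size
    rows = np.clip((np.arange(int(h * s)) / s).astype(int), 0, h - 1)
    cols = np.clip((np.arange(int(w * s)) / s).astype(int), 0, w - 1)
    img = img[np.ix_(rows, cols)]
    h, w = img.shape
    top = random.randint(0, h - crop)      # random crop position
    left = random.randint(0, w - crop)
    return img[top:top + crop, left:left + crop]
```

In practice the same geometric transform would also be applied to the ground-truth density map so that images and labels stay aligned.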
3) The test fire video images are processed in the same way and input into the crowd density estimation model to test the performance of the model. Likewise, about 10% of the entire data set is taken as the validation set; validation is carried out after training is complete, and the training effect on the test set is assessed through the performance on the validation set;
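Holding out roughly 10% of the data as a validation set, as step 3) requires, could be done as follows; the patent only states the fraction, so the shuffling procedure here is an assumption:

```python
import random

def split_dataset(samples, val_fraction=0.1, seed=0):
    """Shuffle the samples and hold out ~val_fraction for validation."""
    rng = random.Random(seed)              # fixed seed for reproducibility
    idx = list(range(len(samples)))
    rng.shuffle(idx)
    n_val = max(1, int(len(samples) * val_fraction))
    val = [samples[i] for i in idx[:n_val]]
    train = [samples[i] for i in idx[n_val:]]
    return train, val
```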
4) Finally, based on the difference between the test results and the actual results, the loss function of the network is continually optimized until the result is optimal. The loss function formula is as follows:

L(θ) = 1/(2N) Σ_{i=1..N} ||F(Xi; θ) - Fi||²

where θ is the parameter set of the model, N is the number of training samples, Xi is the i-th training sample, Fi is the ground-truth density map corresponding to Xi, and F(Xi; θ) is the prediction of the model; L, the difference between the two, represents the loss;
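The pixel-wise Euclidean loss of step 4) translates directly into code. This is a sketch of the formula only; the model itself is not implemented here:

```python
import numpy as np

def density_loss(pred_maps, true_maps):
    """L(θ) = 1/(2N) · Σ_i ||F(Xi; θ) - Fi||², the mean squared
    Euclidean distance between predicted and ground-truth density maps."""
    n = len(pred_maps)
    return sum(np.sum((p - t) ** 2)
               for p, t in zip(pred_maps, true_maps)) / (2 * n)
```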
2. Application of the crowd density estimation model:
The fire video to be detected is preliminarily processed by computer, and the regions of the image where pedestrians may appear are marked. The processed image to be detected is input into the crowd density estimation model for prediction; the output is the crowd density estimation map, which completes the detection of whether the image contains pedestrian targets.
The beneficial effects of the present invention are: the present invention uses a pedestrian detection method based on density estimation. It is a video personnel detection and extraction method applied to fire cause investigation, which performs crowd density analysis on fire video images to judge whether the video image contains pedestrian targets, and it is a method based on deep learning. The present invention trains the crowd density estimation model with existing fire video images and brings the model to its best performance by adjusting its parameters; the image to be detected is then input into the model, which generates a corresponding crowd density map. From the crowd density map it can be determined directly whether the image contains pedestrian targets and where each pedestrian target is located in the image.
The present invention uses deep learning technology to assist the analysis of video images, which offers great advantages for pedestrian detection in images. The present invention uses deep learning to generate the crowd density distribution map of the corresponding scene; by analyzing the generated crowd density distribution map, the pedestrian targets in the original image can be detected, improving both the efficiency and the accuracy of pedestrian detection.
First, this method is based on video image analysis, which is accurate and intuitive.
Second, unlike previous detection methods based on pedestrian features, this method is insensitive to occlusion by flames or objects in the image and to camera perspective distortion.
Third, given the extensive deployment of surveillance in every place today and its future growth, this method is universally applicable and has broad room for adoption.
In short, the pedestrian detection method based on crowd density estimation has many advantages over the traditional pedestrian detection methods based on feature extraction: a deep learning network can be used to detect pedestrian targets directly from the original image without extra processing. Compared with traditional feature extraction methods, detection efficiency is higher and adaptability to the perspective distortion of different images is better, so this method can effectively detect and quickly assess pedestrian targets in fire videos.
Detailed description of the invention
Fig. 1 is a crowd picture to be detected by the present invention;
Fig. 2 is the video image, obtained after preprocessing by the present invention, in which the regions where pedestrians may appear are marked;
Fig. 3 is the pedestrian annotation schematic diagram of the present invention;
Fig. 4 is the structure of the crowd density estimation model of the present invention;
Fig. 5 is a schematic diagram of the image generated by the crowd density estimation model of the present invention;
Fig. 6 is the flow chart of the present invention.
Specific embodiment
As shown in Figs. 1 to 6, the personnel detection and extraction method based on a multi-scale fusion network is a video personnel detection and extraction method applied to fire cause investigation. A computer is used as the detection platform, and the methods of image processing and techniques based on deep learning are used to detect the personnel targets in the original image. A fire video image prepared for analysis is selected in advance, and the detection of the pedestrian targets in the fire video image can then be completed. The specific steps are as follows:
1. Training of the crowd density estimation model:
1) Preliminary processing of the training fire video images. The preliminary processing of an image includes annotating the regions of the image most likely to contain pedestrian targets and marking the positions of the pedestrians in the picture. The pedestrians in the image are annotated by means of Gaussian blurring: every head in the picture is marked, and the labelled head coordinates are then converted into a corresponding density map. We use a simple conversion, i.e. a normal-distribution kernel is generated at the center position of each pedestrian, and the values within the kernel region sum to 1. The Gaussian blur formula is as follows:
If there is an annotation point at image position xi, it can be represented by δ(x - xi). The corresponding ground-truth density map Y is then generated by convolution with a normalized Gaussian kernel Gσ:

Y(x) = Σ_{xi∈S} δ(x - xi) * Gσ(x),

where S denotes the set of annotation points. The integral of the entire density map is equal to the total number of people in the image.
2) The annotated images are input into the crowd density estimation network to train the model. The annotated images need image augmentation, i.e. flipping, scaling, and random cropping. The images after this processing are input into the network as the training-set data, completing the training of the network parameters.
3) The test fire video images are processed in the same way and input into the crowd density estimation model to test the performance of the model. Likewise, we take about 10% of the entire data set as the validation set; validation is carried out after training is complete, and the training effect on the test set is assessed through the performance on the validation set.
4) Finally, based on the difference between the test results and the actual results, the loss function of the network is continually optimized until the result is optimal. The loss function formula is as follows:

L(θ) = 1/(2N) Σ_{i=1..N} ||F(Xi; θ) - Fi||²

where θ is the parameter set of the model, N is the number of training samples, Xi is the i-th training sample, Fi is the ground-truth density map corresponding to Xi, and F(Xi; θ) is the prediction of the model; L, the difference between the two, represents the loss.
2. Application of the crowd density estimation model:
The fire video to be detected is preliminarily processed by computer, and the regions of the image where pedestrians may appear are marked. The processed image to be detected is input into the crowd density estimation model for prediction; the output is the crowd density estimation map. This completes the detection of whether the image contains pedestrian targets.
Fig. 1 is the crowd picture to be detected. To suppress interference from regions of the image that contain no pedestrian targets, the regions where no pedestrians will appear are blurred, and only the regions where pedestrians may appear are used for training. Fig. 2 is the crowd picture after this processing.
Fig. 3 is the pedestrian annotation schematic diagram of the present invention, i.e. the figure produced after the counting neural network. A single pedestrian target is indicated by a dot, multiple gathered pedestrians are represented by stacking, and the personnel count of the whole image is finally completed by integration.
Fig. 4 is the structure of the crowd density estimation model of the present invention, i.e. the network structure. A dual-column network with different convolution kernels is used in order to extract image features of different scales: the first column extracts larger-scale image features, using convolution kernels from 11*11 down to 7*7, while the second column extracts smaller-scale image features, using only 3*3 convolution kernels. The outputs of the two columns are then fused. Since each of the two front columns contains two max-pooling layers, the output images have only one quarter of the original image resolution, so after fusion a column of deconvolution layers restores the output to the original image size, avoiding the loss of too much information in the image.
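The resolution bookkeeping in this architecture, two max-pooling stages per column followed by deconvolution back to full size, can be checked with a small shape calculation. The helper functions below are illustrative assumptions used only to show the quarter-resolution point; they are not the patent's network:

```python
def column_output_size(hw, n_pool=2):
    """Each 2x2 max-pooling halves height and width, so two pooling
    layers leave the feature map at 1/4 of the input resolution."""
    h, w = hw
    for _ in range(n_pool):
        h, w = h // 2, w // 2
    return h, w

def deconv_output_size(hw, stride=2, n_deconv=2):
    """A stride-2 deconvolution (transposed convolution) doubles the
    spatial size, so two of them restore the original resolution."""
    h, w = hw
    for _ in range(n_deconv):
        h, w = h * stride, w * stride
    return h, w
```

This is why the fused features must pass through the deconvolution column before the density map can be compared pixel-by-pixel with a full-resolution ground truth.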
Fig. 5 is a schematic diagram of the image generated by the crowd density estimation model of the present invention, i.e. the density image generated after the input image passes through the neural network model; the total number of people is the integral (sum) of this image.
Fig. 6 is the flow chart of the present invention, i.e. the flow chart of the algorithm. The fire video is first converted by processing into multi-frame video images; after denoising and enhancement, each image is input into the deep learning network model, whose operations generate a corresponding personnel density image of the same size as the original video image. The personnel count is completed by integrating the generated density image, and the personnel targets are obtained.
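The final counting step, integrating the generated density image, is simply a sum over its pixels; a sketch follows. The 0.5 detection threshold is an assumption for illustration, not a value from the patent:

```python
import numpy as np

def count_people(density_map, threshold=0.5):
    """Return (count, detected): the count is the integral (sum) of
    the density map, and the frame is flagged as containing pedestrian
    targets when the count reaches the threshold."""
    total = float(density_map.sum())
    return total, total >= threshold
```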

Claims (1)

1. A personnel detection and extraction method based on a multi-scale fusion network, characterized in that: a computer is used as the detection platform, and the methods of image processing and techniques based on deep learning are used to detect the personnel targets in the original image; a fire video image prepared for analysis is selected in advance, and the detection of the pedestrian targets in the fire video image can be completed; the specific steps are as follows:
1. Training of the crowd density estimation model:
1) preliminary processing is carried out on the training fire video images; the preliminary processing of an image includes annotating the regions of the image most likely to contain pedestrian targets and marking the positions of the pedestrians in the picture; the pedestrians in the image are annotated by means of Gaussian blurring, every head in the picture is marked, and the labelled head coordinates are then converted into a corresponding density map; a simple conversion is used, i.e. a normal-distribution kernel is generated at the center position of each pedestrian, and the values within the kernel region sum to 1; the Gaussian blur formula is as follows:
if there is an annotation point at image position xi, it can be represented by δ(x - xi); the corresponding ground-truth density map Y is then generated by convolution with a normalized Gaussian kernel Gσ: Y(x) = Σ_{xi∈S} δ(x - xi) * Gσ(x), where S denotes the set of annotation points, and the integral of the entire density map is equal to the total number of people in the image;
2) the annotated images are input into the crowd density estimation network and the model is trained; the annotated images require image augmentation, i.e. flipping, scaling and random cropping, and the images after this processing are input into the network as the training-set data, completing the training of the network parameters;
3) the test fire video images are processed in the same way and input into the crowd density estimation model to test the performance of the model; likewise, about 10% of the entire data set is taken as the validation set, validation is carried out after training is complete, and the training effect on the test set is assessed through the performance on the validation set;
4) finally, based on the difference between the test results and the actual results, the loss function of the network is continually optimized until the result is optimal; the loss function formula is as follows:

L(θ) = 1/(2N) Σ_{i=1..N} ||F(Xi; θ) - Fi||²

where θ is the parameter set of the model, N is the number of training samples, Xi is the i-th training sample, Fi is the ground-truth density map corresponding to Xi, and F(Xi; θ) is the prediction of the model; L, the difference between the two, represents the loss;
2. Application of the crowd density estimation model:
The fire video to be detected is preliminarily processed by computer, and the regions of the image where pedestrians may appear are marked; the processed image to be detected is input into the crowd density estimation model for prediction, the output is the crowd density estimation map of the image, and the detection of whether the image contains pedestrian targets is thereby completed.
CN201910617365.1A 2019-07-10 2019-07-10 Personnel's detection and extracting method based on Multiscale Fusion network Pending CN110321869A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910617365.1A CN110321869A (en) 2019-07-10 2019-07-10 Personnel's detection and extracting method based on Multiscale Fusion network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910617365.1A CN110321869A (en) 2019-07-10 2019-07-10 Personnel's detection and extracting method based on Multiscale Fusion network

Publications (1)

Publication Number Publication Date
CN110321869A true CN110321869A (en) 2019-10-11

Family

ID=68123160

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910617365.1A Pending CN110321869A (en) 2019-07-10 2019-07-10 Personnel's detection and extracting method based on Multiscale Fusion network

Country Status (1)

Country Link
CN (1) CN110321869A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767881A (en) * 2020-07-06 2020-10-13 中兴飞流信息科技有限公司 Self-adaptive crowd density estimation device based on AI technology
CN112115862A (en) * 2020-09-18 2020-12-22 广东机场白云信息科技有限公司 Crowded scene pedestrian detection method combined with density estimation
CN113762219A (en) * 2021-11-03 2021-12-07 恒林家居股份有限公司 Method, system and storage medium for identifying people in mobile conference room

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2709065A1 (en) * 2012-09-17 2014-03-19 Lakeside Labs GmbH Concept for counting moving objects passing a plurality of different areas within a region of interest
CN104574352A (en) * 2014-09-02 2015-04-29 重庆大学 Crowd density grade classification method based on foreground image
CN107301387A (en) * 2017-06-16 2017-10-27 华南理工大学 A kind of image Dense crowd method of counting based on deep learning
CN108717528A (en) * 2018-05-15 2018-10-30 苏州平江历史街区保护整治有限责任公司 A kind of global population analysis method of more strategies based on depth network

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2709065A1 (en) * 2012-09-17 2014-03-19 Lakeside Labs GmbH Concept for counting moving objects passing a plurality of different areas within a region of interest
CN104574352A (en) * 2014-09-02 2015-04-29 重庆大学 Crowd density grade classification method based on foreground image
CN107301387A (en) * 2017-06-16 2017-10-27 华南理工大学 A kind of image Dense crowd method of counting based on deep learning
CN108717528A (en) * 2018-05-15 2018-10-30 苏州平江历史街区保护整治有限责任公司 A kind of global population analysis method of more strategies based on depth network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Yang Jie, Zhang Xiang: "Video Object Detection and Tracking and Their Applications", pages: 2 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767881A (en) * 2020-07-06 2020-10-13 中兴飞流信息科技有限公司 Self-adaptive crowd density estimation device based on AI technology
CN112115862A (en) * 2020-09-18 2020-12-22 广东机场白云信息科技有限公司 Crowded scene pedestrian detection method combined with density estimation
CN112115862B (en) * 2020-09-18 2023-08-29 广东机场白云信息科技有限公司 Congestion scene pedestrian detection method combined with density estimation
CN113762219A (en) * 2021-11-03 2021-12-07 恒林家居股份有限公司 Method, system and storage medium for identifying people in mobile conference room

Similar Documents

Publication Publication Date Title
CN110348319B (en) Face anti-counterfeiting method based on face depth information and edge image fusion
CN105809693B (en) SAR image registration method based on deep neural network
CN104318548B (en) Rapid image registration implementation method based on space sparsity and SIFT feature extraction
CN113065558A (en) Lightweight small target detection method combined with attention mechanism
CN110059558A (en) A kind of orchard barrier real-time detection method based on improvement SSD network
CN103345758B (en) Jpeg image region duplication based on DCT statistical nature distorts blind checking method
CN111611874B (en) Face mask wearing detection method based on ResNet and Canny
CN109409190A (en) Pedestrian detection method based on histogram of gradients and Canny edge detector
Yang et al. Traffic sign recognition in disturbing environments
CN110232387B (en) Different-source image matching method based on KAZE-HOG algorithm
CN110321869A (en) Personnel's detection and extracting method based on Multiscale Fusion network
CN107808161A (en) A kind of Underwater targets recognition based on light vision
CN111445459A (en) Image defect detection method and system based on depth twin network
CN106157308A (en) Rectangular target object detecting method
Zhang et al. Improved Fully Convolutional Network for Digital Image Region Forgery Detection.
CN106530271A (en) Infrared image significance detection method
CN108257151A (en) PCANet image change detection methods based on significance analysis
CN109800755A (en) A kind of remote sensing image small target detecting method based on Analysis On Multi-scale Features
TWI765442B (en) Method for defect level determination and computer readable storage medium thereof
CN110287862A (en) Anti- detection method of taking on the sly based on deep learning
CN109685774A (en) Varistor open defect detection method based on depth convolutional neural networks
CN110110618A (en) A kind of SAR target detection method based on PCA and global contrast
Zhu et al. Towards automatic wild animal detection in low quality camera-trap images using two-channeled perceiving residual pyramid networks
CN105069459B (en) One kind is directed to High Resolution SAR Images type of ground objects extracting method
CN110930384A (en) Crowd counting method, device, equipment and medium based on density information

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination