CN105989614B - Dangerous object detection method fusing multi-source visual information - Google Patents
Abstract
The invention provides a dangerous object detection method fusing multi-source visual information, which comprises the following steps: 1, multi-source visual image acquisition; 2, incremental motion consistency consideration; 3, multi-source visual information fusion; and 4, detection rate calculation. The method solves the technical problems that, in the prior art, the detection categories of dangerous objects are limited and multiple sources of information are not effectively utilized.
Description
Technical Field
The invention belongs to the field of computer vision and image understanding, and particularly relates to a dangerous object detection method fusing multi-source visual information in video monitoring.
Background
The automatic prediction of dangerous objects that may appear during driving is a key technology in video monitoring. In general, detection of dangerous objects is difficult because of complicated object types, variable monitoring environments, and severe camera shake. At present, detection methods for dangerous objects fall into two main categories:
One is the detector-based approach, which trains pedestrian or vehicle detectors in advance using manually collected pedestrian or vehicle samples, and then detects the corresponding targets in the surveillance video. Xu et al., in the reference "Y. Xu, D. Xu, S. Lin, T. Han, X. Cao, and X. Li. Detection of Sudden Pedestrian Crossings for Driving Assistance Systems. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 42(3): 729-739, 2012", propose a method for detecting sudden pedestrian crossings for driving assistance. The novelty of this work is that it trains with partial-body pedestrian samples, so that a pedestrian can be detected as soon as part of the body appears. Rezaei and Terauchi propose a vehicle detection method that combines multiple feature clues with Dempster-Shafer fusion theory in the reference "M. Rezaei and M. Terauchi. Vehicle Detection Based on Multi-feature Clues and Dempster-Shafer Fusion Theory. In Proceedings of Pacific-Rim Symposium on Image and Video Technology, 2013, pp. 60-72". While these methods can detect dangerous objects to some extent, they have the disadvantages of requiring additional training samples and of not covering all object classes that may appear in front of the vehicle.
The other is the approach based on fusing saliency and color features, which introduces the attention-selection mechanism from psychology into danger detection by means of saliency detection. For example, Alonso et al., in the reference "J. D. Alonso, E. Ros Vidal, A. Rotter, and M. Muhlenberg. Lane-Change Decision Aid System Based on Motion-Driven Vehicle Tracking. IEEE Transactions on Vehicular Technology, 57(5): 2736-2746, 2008", propose a lane-change decision aid system based on motion-driven vehicle tracking. The disadvantage of this method is that it only considers motion saliency on one side of the field of view, whereas during real driving the position of a dangerous object is uncertain, i.e., it can appear in either the left or the right part of the captured field of view.
Disclosure of Invention
The invention provides a novel dangerous object detection method fusing multi-source visual information, and solves the technical problems that in the prior art, the detection category of dangerous objects is limited, and a plurality of kinds of information are not effectively utilized.
The technical solution of the invention is as follows:
a dangerous object detection method fusing multi-source visual information comprises the following steps:
1, multi-source visual image acquisition:
1.1, acquiring a color video image and a near-infrared video image in real time by using a multispectral camera;
1.2, obtaining a depth image corresponding to the color video image by using a single-sequence depth recovery method;
1.3, obtaining a motion image corresponding to the color video image by using a correlation optical flow method;
1.4, segmenting each frame image in the moving image by utilizing a linear iterative clustering method to obtain a superpixel grid;
1.5, overlapping the super pixel grids on the color video image, the near infrared video image and the depth image;
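To illustrate step 1.5, the sketch below assumes a single superpixel label grid (as produced by SLIC on the motion image) shared across all channels, and computes per-superpixel mean features on one channel; the function name and the toy 4x4 data are illustrative only, not part of the patent:

```python
import numpy as np

def superpixel_means(labels, channel):
    """Average a channel's values over each superpixel of the shared grid."""
    n = labels.max() + 1
    sums = np.bincount(labels.ravel(), weights=channel.ravel(), minlength=n)
    counts = np.bincount(labels.ravel(), minlength=n)
    return sums / np.maximum(counts, 1)

# Toy 4x4 image with 4 superpixels (2x2 blocks), shared across channels.
labels = np.repeat(np.repeat(np.arange(4).reshape(2, 2), 2, axis=0), 2, axis=1)
color = np.arange(16, dtype=float).reshape(4, 4)
feats = superpixel_means(labels, color)  # one feature per superpixel
```

The same `superpixel_means` call would be applied to the near-infrared and depth channels with the identical `labels` grid, which is what the overlay in step 1.5 enables.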
2, incremental motion consistency consideration:
2.1, dividing the moving image into a left moving video frame and a right moving video frame, wherein the dividing line is a central axis of the moving image;
2.2, training a normal motion pattern basis A, including a left normal motion pattern basis A_l and a right normal motion pattern basis A_r, using the superpixel motion patterns obtained by segmenting the initial F frames, wherein the value range of F is 5-20;
2.3, at time t, dividing the images of all channels into N superpixels and calculating the features y_i corresponding to the N superpixels in the motion image, wherein i = 1:N;
2.4, constructing the graph-regularized minimum soft-threshold squares objective model:

min_{X,U} (1/2)||Y - AX - U||_F^2 + λ1 ||U||_1 + λ2 Tr(X L X^T)

wherein U is the constructed Gaussian-Laplacian error term, Y is the matrix formed by all y_i, X is the sparse coefficient to be solved, L is the Laplacian matrix, λ1 is the constraint coefficient of the Gaussian-Laplacian noise sparse term, and λ2 is the constraint coefficient of the geometric manifold regularization term;
2.5, obtaining the danger confidence values of the motion image from the reconstruction errors E_l^t and E_r^t of the left and right sides, and integrating the calculation results of the two sides as E^t = β E_l^t + (1 - β) E_r^t, wherein β is the balance coefficient between the left-side and right-side errors;
2.6, obtaining the danger confidence S_t^M of the whole motion image under motion-information consideration, and normalizing S_t^M to [0,1] using the max-min normalization method;
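A minimal sketch of steps 2.5-2.6, assuming only that the left-side and right-side danger values are combined with the balance coefficient β (0.4 in the embodiments) and then max-min normalized; the function names are illustrative:

```python
import numpy as np

BETA = 0.4  # balance coefficient between left/right errors (0.4 in the embodiments)

def minmax_normalize(x):
    """Max-min normalization of a confidence map to [0, 1] (step 2.6)."""
    x = np.asarray(x, dtype=float)
    rng = x.max() - x.min()
    return np.zeros_like(x) if rng == 0 else (x - x.min()) / rng

def motion_confidence(err_left, err_right):
    """Combine left/right reconstruction errors into one danger confidence map."""
    combined = BETA * np.asarray(err_left, float) + (1 - BETA) * np.asarray(err_right, float)
    return minmax_normalize(combined)
```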
3, multi-source visual information fusion:
3.1, calculating the saliency results of the color video image, the near-infrared video image, and the depth image respectively using a graph-based saliency calculation method, obtaining the danger confidence S_t^C of the color video image, the danger confidence S_t^I of the near-infrared video image, and the danger confidence S_t^D of the depth image;
3.2, fusing the danger confidence maps of the motion image, the color video image, the near-infrared video image, and the depth image using a saliency Bayesian model:
3.2.1, calculating the prior probability Pr(O):
obtaining an element distribution map D_OPT according to the spatial distribution of the superpixel features in the image, wherein OPT is the element distribution map index; the prior probability is then Pr(O) = 1 - D_OPT, since a larger element distribution value indicates a lower probability of an object;
3.2.2, calculating the likelihood probabilities Pr(S(z)|O) and Pr(S(z)|B):
binarizing the obtained prior map to divide the image into a target region O and a background region B; for each value z in the corresponding original visual image, counting the number of pixels N_{O(z)} in the target region and the number of pixels N_{B(z)} in the background region taking that value; the likelihood probabilities of the target and the background are then Pr(S(z)|O) = N_{O(z)} / N_O and Pr(S(z)|B) = N_{B(z)} / N_B, wherein N_O and N_B are the total numbers of pixels in the target and background regions;
3.3, calculating the fused probability of all visual information after the Bayesian step:

Pr(O|S(z)) = Pr(O) Pr(S(z)|O) / (Pr(O) Pr(S(z)|O) + (1 - Pr(O)) Pr(S(z)|B)),

wherein Pr(O|S(z)) is the final fused danger confidence.
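The Bayesian combination in step 3.3 can be sketched per pixel as follows (illustrative names; the prior and likelihood maps would come from steps 3.2.1-3.2.2):

```python
import numpy as np

def bayes_fuse(prior, lik_obj, lik_bg):
    """Posterior Pr(O|S) = Pr(O)Pr(S|O) / (Pr(O)Pr(S|O) + (1-Pr(O))Pr(S|B)),
    applied elementwise to prior and likelihood maps."""
    prior, lik_obj, lik_bg = (np.asarray(a, float) for a in (prior, lik_obj, lik_bg))
    num = prior * lik_obj
    den = num + (1.0 - prior) * lik_bg
    den_safe = np.where(den > 0, den, 1.0)       # avoid division by zero
    return np.where(den > 0, num / den_safe, 0.0)
```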
and 4, calculating the detection rate:
repeating steps 2-3 for each frame of image until the whole video image is processed; marking the real dangerous object region in the t-th frame of the video as G_t; the detection rate is then:
TPR=TP/P,
FPR=FP/N.
wherein TP is the number of correctly detected pixels, FP is the number of wrongly detected pixels, P is the number of target pixels in G_t, and N is the number of background pixels in G_t.
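The detection-rate formulas of step 4 can be sketched directly from binary masks (illustrative names):

```python
import numpy as np

def detection_rates(pred, gt):
    """TPR = TP/P and FPR = FP/N from a binary prediction and ground truth G_t."""
    pred, gt = np.asarray(pred, bool), np.asarray(gt, bool)
    P = gt.sum()                 # target pixels in G_t
    N = (~gt).sum()              # background pixels in G_t
    TP = (pred & gt).sum()       # correctly detected pixels
    FP = (pred & ~gt).sum()      # wrongly detected pixels
    return TP / P, FP / N
```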
The element distribution map index OPT in step 3.2.1 above is:
and F in the step 2.2 is 10.
The invention has the advantages that:
the invention simultaneously and synergistically considers the complementarity and selectivity of the multi-source visual information, and the obtained road dangerous object detection result is obviously superior to other methods.
Drawings
FIG. 1 is a flow chart of a dangerous object detection method fusing multi-source visual information according to the present invention.
Detailed Description
Referring to fig. 1, the steps implemented by the present invention are as follows:
step 1, a multi-source visual image acquisition module.
(1a) A multispectral camera is used to acquire color and near-infrared video images in real time; a single-sequence depth recovery method is then used to obtain the depth images corresponding to the color video, and a correlation optical flow method is used to obtain the motion images corresponding to the color video. Each motion image is divided into a specified number of superpixels by the Simple Linear Iterative Clustering (SLIC) method, and the superpixel grid is superimposed on the near-infrared, depth, and color images to facilitate superpixel feature calculation.
And step 2, an incremental motion consistency consideration module.
(2a) Dividing the motion image into a left part and a right part, wherein the dividing line is a central axis of the motion image;
(2b) The left normal motion pattern basis A_l and the right normal motion pattern basis A_r are trained separately using the superpixel motion patterns obtained by segmenting the initial 10 frames. Dangerous motion information within the driver's field of view is then considered by the graph-regularized minimum soft-threshold squares incremental representation method. Since motion consistency is considered in the same manner for both sides, A_l and A_r are both denoted A for brevity. Suppose that at time t the images of all channels are divided into N superpixels; the features y_i corresponding to the N superpixels in the motion image are calculated, with i = 1:N, and the graph-regularized minimum soft-threshold squares objective model is constructed:

min_{X,U} (1/2)||Y - AX - U||_F^2 + λ1 ||U||_1 + λ2 Tr(X L X^T)

where U is the constructed Gaussian-Laplacian error term, Y is the matrix formed by all y_i, X is the sparse coefficient to be solved, L is the Laplacian matrix, λ1 (0.05 in all examples) is the constraint coefficient of the Gaussian-Laplacian noise sparse term, and λ2 (0.005 in all examples) is the constraint coefficient of the geometric manifold regularization term.
(2c) The danger confidence values of the motion image are obtained by integrating the calculation results of the left and right sides:

E^t = β E_l^t + (1 - β) E_r^t,

where E_l^t and E_r^t are the reconstruction errors of the left and right sides and β is the balance coefficient of the errors on the two sides (0.4 in all examples). This yields the danger confidence S_t^M of the whole motion image under motion-information consideration; S_t^M is normalized to [0,1] using the max-min normalization method.
And step 3, a multi-source visual information fusion module.
(3a) The saliency results of the color, near-infrared, and depth channel images are computed with the graph-based manifold ranking method of "C. Yang, L. Zhang, H. Lu, X. Ruan, and M. Yang. Saliency Detection via Graph-Based Manifold Ranking. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2013: 3166-3173", yielding S_t^C, S_t^I, and S_t^D. These results are also referred to as the danger confidence maps of the color, near-infrared, and depth images.
(3b) The danger confidence maps of the motion, color, near-infrared, and depth images are fused using a saliency Bayesian model, as follows:
1) Calculate the prior probability Pr(O). Unlike previous saliency fusion methods, the invention utilizes the more efficient element distribution map D_OPT to estimate the prior probability that the visual images contain dangerous objects. The element distribution map is calculated as in the literature "F. Perazzi, P. Krahenbuhl, Y. Pritch, and A. Hornung. Saliency Filters: Contrast Based Filtering for Salient Region Detection. In Proceedings of IEEE Conf. Computer Vision and Pattern Recognition, 2012: 733-740". The idea is to calculate the spatial variance of each superpixel feature over the image, i.e., the probability that the superpixel feature under examination appears at other positions of the image. The larger the value of the element distribution map, the less object-like the superpixel. The prior probability of the object is then Pr(O) = 1 - D_OPT, where OPT is the optimal element distribution map index.
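A simplified stand-in for the element-distribution prior described above, assuming per-superpixel features and normalized center positions; the Gaussian feature-similarity weighting and the `sigma` value are assumptions for illustration, not the patent's exact formulation:

```python
import numpy as np

def element_distribution(features, positions, sigma=0.25):
    """Spatial variance of each superpixel feature across the image:
    widely scattered features score high (less object-like)."""
    f = np.asarray(features, float)[:, None]
    p = np.asarray(positions, float)
    w = np.exp(-((f - f.T) ** 2) / (2 * sigma ** 2))   # feature similarity
    w /= w.sum(axis=1, keepdims=True)                  # row-normalize weights
    mu = w @ p                                          # weighted mean position
    return (w * ((p[None, :, :] - mu[:, None, :]) ** 2).sum(-1)).sum(1)

def object_prior(features, positions):
    """Pr(O) = 1 - normalized element distribution (larger spread, lower prior)."""
    d = element_distribution(features, positions)
    rng = d.max() - d.min()
    return 1.0 - (d - d.min()) / rng if rng > 0 else np.ones_like(d)
```

A spatially compact feature (e.g., two neighboring superpixels sharing a value) receives a high object prior, while the same value scattered across the image receives a low one.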
2) Calculate the likelihood probability Pr(S(z)|O). Binarize the obtained prior map, dividing the image into a target region and a background region; then count the number of pixels N_{O(z)} in the target region and the number of pixels N_{B(z)} in the background region whose corresponding original visual image takes the value z. The likelihood probabilities of the target and the background are:

Pr(S(z)|O) = N_{O(z)} / N_O, Pr(S(z)|B) = N_{B(z)} / N_B,

where N_O and N_B are the total numbers of pixels in the target and background regions.
(3c) Calculate the fused probability of all visual information after the Bayesian step:

Pr(O|S(z)) = Pr(O) Pr(S(z)|O) / (Pr(O) Pr(S(z)|O) + (1 - Pr(O)) Pr(S(z)|B)),

where Pr(O|S(z)) is the final fused danger confidence map.
and 4, calculating the detection rate.
Steps 2 and 3 are executed for each frame until the whole video is processed. The real dangerous object region in the t-th frame of the video is marked as G_t. The detection rate is computed with the ROC-curve formulas:
TPR=TP/P,FPR=FP/N.
wherein TP is the number of correctly detected pixels, FP is the number of wrongly detected pixels, P is the number of target pixels in G_t, and N is the number of background pixels in G_t.
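The ROC/AUC evaluation used below can be sketched by sweeping thresholds over the fused confidence map; the 64-threshold sweep is an arbitrary illustrative choice:

```python
import numpy as np

def roc_auc(scores, gt, thresholds=64):
    """Sweep thresholds over a confidence map, collect (FPR, TPR) pairs,
    and integrate the ROC curve with the trapezoidal rule."""
    scores = np.asarray(scores, float)
    gt = np.asarray(gt, bool)
    P, N = gt.sum(), (~gt).sum()
    tpr, fpr = [], []
    for t in np.linspace(scores.max(), scores.min(), thresholds):
        pred = scores >= t
        tpr.append((pred & gt).sum() / P)
        fpr.append((pred & ~gt).sum() / N)
    tpr, fpr = np.array(tpr), np.array(fpr)
    # trapezoidal integration; fpr is non-decreasing as the threshold descends
    return float(np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1]) / 2))
```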
A process flow for 12 segments of multi-source video images comprises the following processing steps:
1. determining simulation conditions
The invention was simulated using MATLAB software on a machine with an Intel Core i3-3240 3.3 GHz CPU, 4 GB of memory, and the Windows 7 operating system.
The data used in the simulation is an autonomously acquired video sequence of 12 real road scenes.
2. Emulated content
The method of the invention is used for dangerous target detection according to the following steps:
firstly, the calculated motion image, the original color image, the near-infrared image and the restored depth image are simultaneously input into the system, and step 2 and step 3 are executed.
Next, for each frame the detection result is compared with the real annotation using the receiver operating characteristic (ROC) curve and the area under the curve (AUC); the results are shown in Table 1.
The methods without PAD in Table 1 are the results of direct pointwise multiplication of multiple information sources; all combinations of information are enumerated in this example: single motion information (M), motion-color multiplication (MC), motion-near-infrared multiplication (MI), motion-depth multiplication (MD), motion-color-near-infrared multiplication (MCI), motion-color-depth multiplication (MCD), motion-near-infrared-depth multiplication (MID), and motion-color-near-infrared-depth multiplication (MCID). PAD-MCID denotes the method of the invention. In the visualization result image, (a) is an original color image frame; (b) is the ground truth; and (c)-(k) correspond to M, MC, MI, MD, MCI, MCD, MID, MCID, and PAD in Table 1, respectively. As can be seen from Table 1, the recognition rate of the invention is significantly higher than that of the simple multiplicative information fusion methods.
TABLE 1 comparison of area values under ROC curve for hazardous object detection
Claims (4)
1. A dangerous object detection method fusing multi-source visual information, characterized by comprising the following steps:
1, multi-source visual image acquisition:
1.1, acquiring a color video image and a near-infrared video image in real time by using a multispectral camera;
1.2, obtaining a depth image corresponding to the color video image by using a single-sequence depth recovery method;
1.3, obtaining a motion image corresponding to the color video image by using a correlation optical flow method;
1.4, segmenting each frame image in the moving image by utilizing a linear iterative clustering method to obtain a superpixel grid;
1.5, overlapping the super pixel grids on the color video image, the near infrared video image and the depth image;
2, incremental motion consistency consideration:
2.1, dividing the moving image into a left moving video frame and a right moving video frame, wherein the dividing line is a central axis of the moving image;
2.2, training a normal motion pattern basis A, including a left normal motion pattern basis A_l and a right normal motion pattern basis A_r, using the superpixel motion patterns obtained by segmenting the initial F frames, wherein the value range of F is 5-20;
2.3, at time t, dividing the images of all channels into N superpixels and calculating the features y_i corresponding to the N superpixels in the motion image, wherein i = 1:N;
2.4, constructing the graph-regularized minimum soft-threshold squares objective model:

min_{X,U} (1/2)||Y - AX - U||_F^2 + λ1 ||U||_1 + λ2 Tr(X L X^T)

wherein U is the constructed Gaussian-Laplacian error term, Y is the matrix formed by all y_i, X is the sparse coefficient to be solved, L is the Laplacian matrix, λ1 is the constraint coefficient of the Gaussian-Laplacian noise sparse term, and λ2 is the constraint coefficient of the geometric manifold regularization term;
2.5, obtaining the danger confidence values of the motion image from the reconstruction errors E_l^t and E_r^t of the left and right sides, and integrating the calculation results of the two sides as E^t = β E_l^t + (1 - β) E_r^t, wherein β is the balance coefficient between the left-side and right-side errors;
2.6, obtaining the danger confidence S_t^M of the whole motion image under motion-information consideration, and normalizing S_t^M to [0,1] using the max-min normalization method;
3, multi-source visual information fusion:
3.1, calculating the saliency results of the color video image, the near-infrared video image, and the depth image respectively using a graph-based saliency calculation method, obtaining the danger confidence S_t^C of the color video image, the danger confidence S_t^I of the near-infrared video image, and the danger confidence S_t^D of the depth image;
3.2, fusing the danger confidence maps of the motion image, the color video image, the near-infrared video image, and the depth image using a saliency Bayesian model:
3.2.1, calculating the prior probability Pr(O):
obtaining an element distribution map D_OPT according to the spatial distribution of the superpixel features in the image, wherein OPT is the element distribution map index; the prior probability is then Pr(O) = 1 - D_OPT, since a larger element distribution value indicates a lower probability of an object;
3.2.2, calculating the likelihood probabilities Pr(S(z)|O) and Pr(S(z)|B):
binarizing the obtained prior map to divide the image into a target region O and a background region B; for each value z in the corresponding original visual image, counting the number of pixels N_{O(z)} in the target region and the number of pixels N_{B(z)} in the background region taking that value; the likelihood probabilities of the target and the background are then Pr(S(z)|O) = N_{O(z)} / N_O and Pr(S(z)|B) = N_{B(z)} / N_B, wherein N_O and N_B are the total numbers of pixels in the target and background regions;
3.3, calculating the fused probability of all visual information after the Bayesian step:

Pr(O|S(z)) = Pr(O) Pr(S(z)|O) / (Pr(O) Pr(S(z)|O) + (1 - Pr(O)) Pr(S(z)|B)),

wherein Pr(O|S(z)) is the final fused danger confidence.
and 4, calculating the detection rate:
repeating steps 2-3 for each frame of image until the whole video image is processed; marking the real dangerous object region in the t-th frame of the video as G_t; the detection rate is then:
TPR=TP/P,
FPR=FP/N
wherein TP is the number of correctly detected pixels, FP is the number of wrongly detected pixels, P is the number of target pixels in G_t, and N is the number of background pixels in G_t.
4. The dangerous object detection method fusing multi-source visual information according to claim 3, characterized in that: F in step 2.2 is 10.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510080128.8A CN105989614B (en) | 2015-02-13 | 2015-02-13 | Dangerous object detection method fusing multi-source visual information |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105989614A CN105989614A (en) | 2016-10-05 |
CN105989614B true CN105989614B (en) | 2020-09-01 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20130042844A (en) * | 2011-10-19 | 2013-04-29 | 인하대학교 산학협력단 | Indoor exercise compensating position system |
CN103247074A (en) * | 2013-04-23 | 2013-08-14 | 苏州华漫信息服务有限公司 | 3D (three dimensional) photographing method combining depth information and human face analyzing technology |
CN103413151A (en) * | 2013-07-22 | 2013-11-27 | 西安电子科技大学 | Hyperspectral image classification method based on image regular low-rank expression dimensionality reduction |
CN103927551A (en) * | 2014-04-21 | 2014-07-16 | 西安电子科技大学 | Polarimetric SAR semi-supervised classification method based on superpixel correlation matrix |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||