CN112258403A

CN112258403A - Method for extracting suspected smoke area from dynamic smoke

Info

Publication number: CN112258403A
Application number: CN202011073167.2A
Authority: CN
Inventors: 刘明珠; 贺雅楠
Original assignee: Harbin University of Science and Technology
Current assignee: Harbin University of Science and Technology
Priority date: 2020-10-09
Filing date: 2020-10-09
Publication date: 2021-01-22

Abstract

A method for extracting a suspected smoke area from dynamic smoke, belonging to the field of image recognition. Existing video smoke detection algorithms suffer from inaccurate smoke detection. The method comprises: inputting a video image and processing it frame by frame; screening and cropping the video data set; normalizing the video to 320x240, taking the 32x24 size of the smoke data set as the standard, so that video image blocks can later be selected and fed into the recognition model; after normalization, selecting a suitable filter for denoising; then performing corner detection on the moving object after blocking, and judging the motion direction to extract a suspected smoke region; and finally extracting features within the region for identification. The suspected smoke area is extracted accurately, the miss rate is reduced by a factor of 2 to 5, and the smoke identification accuracy reaches 94-96%.

Description

Method for extracting suspected smoke area from dynamic smoke

Technical Field

The invention relates to a method for extracting a suspected smoke area from dynamic smoke.

Background

In video smoke detection, accurately extracting the dynamic smoke region while retaining the integrity and irregularity of the smoke has a crucial influence on smoke identification accuracy. Conventional moving object extraction methods are the Gaussian mixture model, the inter-frame difference method, and the optical flow method. On smoke video images, the inter-frame difference method is strongly affected by the environment and misses detections when the smoke motion area is not pronounced; the other algorithms are less affected. The dense optical flow method extracts a complete region and preserves the irregular shape of the smoke as far as possible, but it is computationally slow and struggles to meet real-time detection requirements. The Gaussian mixture model extracts the region well, but once smoke has been present continuously, slowly moving smoke regions tend to be absorbed into the background, which causes dynamic smoke to be missed.

Smoke is translucent, irregular, and non-rigid. As a result, conventional moving object extraction algorithms do not localize or extract dynamic smoke regions well, and their accuracy is especially poor under uncertain natural conditions. Existing video smoke detection algorithms therefore suffer from inaccurate smoke detection.

Disclosure of Invention

The invention aims to solve the problem that the existing video smoke detection algorithm is not accurate enough, and provides a method for extracting a suspected smoke area from dynamic smoke.

A method of extracting a suspected smoke region in dynamic smoke, the method comprising:

step one, preprocessing an input video image;

denoising an input video image, and improving the anti-interference capability of a target region through the processing steps of selecting a color space and extracting a key frame; the target area is a suspected smoke area to be extracted;

the method specifically comprises the following steps:

firstly, inputting a video image, and carrying out frame-by-frame processing on the video image;

then, screening and cropping the video data set to obtain video files of uniform format and size;

then, taking the size of the smoke data set as a standard, carrying out normalization processing on the video image, and sending the video image to an identification model;

then, denoising the video image by using a filter;

step two, combining a sparse optical flow method based on angular point detection and an algorithm based on video block motion direction judgment, judging the smoke motion direction, and extracting a suspected smoke area;

the method specifically comprises the following steps:

firstly, inputting smoke video images frame by frame;

then, carrying out angular point detection on the moving object;

then, carrying out optical flow vector estimation on the angular points of the moving object;

then, the area where the pixel point with the vector is located is segmented to form a video motion block;

then, determining an HSV color space suitable for the smoke image, setting a saturation threshold range, estimating the motion direction of the video motion blocks that meet the threshold, and determining the motion blocks whose motion direction is upward as the suspected smoke area; the extracted motion regions serve as the input to the video-block-based motion direction estimation;

step three, analyzing the movement characteristics of the smoke, and extracting static characteristics and dynamic characteristics of the smoke; the smoke movement characteristics comprise color and gray level analysis, so that the area to be detected is divided;

step four, taking the suspected smoke area obtained in the last step as an area to be detected, and extracting features in the area to be detected by using a convolutional neural network model for identification;

The beneficial effects of the invention are:

by estimating optical flow vectors only at corner points, the method avoids calculating the optical flow vectors of all pixel points as in the dense optical flow method, while still preserving the motion characteristics of moving objects in the smoke image. By studying the color characteristics of smoke images, an HSV color space suited to smoke images is selected and a saturation threshold is set; the motion direction of the video motion blocks that meet the threshold is then estimated, and the suspected smoke area is finally formed.

The method combines the traditional optical flow method with an algorithm that judges the motion direction on the basis of video blocks, achieves accurate extraction of the suspected smoke area, and reduces the miss rate by a factor of 2 to 5. The smoke identification accuracy reaches 94-96%.

Drawings

FIG. 1 is a flow chart of a method of extracting a suspected smoke region from dynamic smoke according to the present invention;

FIG. 2 is a diagram of a convolution network model according to the present invention;

FIG. 3 is an illustration of corner points within a smoke region to which the present invention relates;

FIG. 4 is an illustration of corner points within a smoke region to which the present invention relates;

FIG. 5 is a visualization of optical flow vectors for corner points in accordance with the present invention;

FIG. 6 is a graphical representation of the output of each channel component of the HSV color space to which the present invention relates;

FIG. 7 is a graphical representation of the output of each channel component of the RGB color space involved in the present invention;

FIG. 8 is a labeled diagram of the motion direction numbering of video blocks according to the present invention.

Detailed Description

The first embodiment is as follows:

as shown in FIG. 1, the method for extracting the suspected smoke area from the dynamic smoke in this embodiment is realized by the following steps:

step one, preprocessing an input video image;

denoising an input video image, and improving the anti-interference capability of a target region through the processing steps of selecting a color space and extracting a key frame;

the method specifically comprises the following steps:

then, taking the size of the smoke data set as the standard, the video image is normalized: with a smoke data set size of 32x24, the video is normalized to a 320x240 video file, which facilitates the later selection of video image blocks to be sent to the identification model;

then, denoising the video image by using a filter; the filter is a Gaussian filter;
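As a minimal sketch of the preprocessing described above, the snippet below resizes a frame to 320x240 and applies a Gaussian filter using NumPy only. The nearest-neighbour resize, the 5-tap kernel, `sigma = 1.0`, and all function names are illustrative assumptions; the text does not specify these parameters.

```python
import numpy as np

def resize_nearest(frame, out_w=320, out_h=240):
    """Nearest-neighbour resize of an H x W (x C) frame to out_h x out_w."""
    in_h, in_w = frame.shape[:2]
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return frame[rows][:, cols]

def gaussian_kernel(size=5, sigma=1.0):
    """Normalised 1-D Gaussian kernel (size and sigma are assumed values)."""
    ax = np.arange(size) - (size - 1) / 2.0
    k = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def gaussian_denoise(gray, size=5, sigma=1.0):
    """Separable Gaussian blur of a 2-D grayscale image with edge padding."""
    k = gaussian_kernel(size, sigma)
    pad = size // 2
    padded = np.pad(gray.astype(np.float64), pad, mode="edge")
    # Horizontal pass over every row, then vertical pass over every column.
    tmp = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, tmp)
```

A 320x240 frame divides evenly into a 10 x 10 grid of 32x24 blocks, which is presumably why the two sizes are paired.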

step two, combining a sparse optical flow method based on angular point detection and an algorithm based on video block motion direction judgment, judging the smoke motion direction, and extracting a suspected smoke area; the omission ratio of the smoke is reduced, and the accuracy of dynamic smoke identification is improved;

as shown in fig. 1, specifically:

firstly, inputting smoke video images frame by frame;

then, carrying out angular point detection on the moving object, and reducing the calculated amount;

then, carrying out optical flow vector estimation on the corner points of the moving object, so that the optical flow vectors can be prevented from being calculated for all pixel points in a dense optical flow method, meanwhile, the motion characteristics of the moving object in the smoke image can be kept, and the false detection rate can be reduced;

then, determining an HSV color space suitable for the smoke image by studying the color characteristics of the smoke image, setting a saturation threshold range, estimating the motion direction of the video motion blocks that meet the threshold, and determining the motion blocks whose motion direction is upward as the suspected smoke area; the extracted motion regions serve as the input to the video-block-based motion direction estimation;

step three, analyzing the movement characteristics of the smoke, and extracting static characteristics and dynamic characteristics of the smoke; the smoke movement characteristics comprise color and gray level analysis, so that the area to be detected is divided, and the calculation amount can be reduced;

step four, taking the suspected smoke area obtained in the previous step as the area to be detected, and extracting features in the area to be detected by using a convolutional neural network model for identification; the invention uses a CNN to learn the feature mapping automatically, which is superior to traditional manual feature extraction;

the target area in step one is the suspected smoke area to be extracted; it is determined from the motion areas (which may also contain cloud, haze, or the like) by estimating the motion direction and screening against the threshold.

The second embodiment is as follows:

Different from the first embodiment, in the method for extracting a suspected smoke region from dynamic smoke of this embodiment, the step of performing corner detection on the moving object in step two is specifically:

a corner point is a feature point at which the gray value changes sharply when the pixel point is moved in both the horizontal and vertical directions. The preprocessed smoke video images are input frame by frame, and corner points are detected in each frame:

in fig. 3 and 4, it can be clearly seen that there are corner points in the rectangular frame region due to the occurrence of smoke, and the corner points of the smoke region in the image are calculated and judged by the following method:

for an image I(x, y), the self-similarity at point (x, y) after a translation (Δx, Δy) is:

ω(x, y) is a window centered at point (x, y), i.e. a weighting function, for example a Gaussian weighting function. Based on the Taylor expansion, a first-order approximation is made to the image I(x, y) after the translation (Δx, Δy):

I(u+Δx,v+Δy)＝I(u,v)+I_x(u,v)Δx+I_y(u,v)Δy+O(Δx²,Δy²)

≈I(u,v)+I_x(u,v)Δx+I_y(u,v)Δy (13)

where I_x and I_y are the partial derivatives of the image I(x, y), which gives the approximation:

where M is:

Simplifying gives:

c(x,y；Δx,Δy)≈AΔx²+2CΔxΔy+BΔy² (16)
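The equations that originally accompanied this derivation (the similarity function, its approximation, and the matrix M) are not reproduced in the text. Assuming they follow the standard Harris corner derivation, which is consistent with equations (13), (16), and (18), they can be sketched as follows. The window notation W(x, y) is an assumption introduced here for illustration.

```latex
\begin{aligned}
c(x,y;\Delta x,\Delta y)
  &= \sum_{(u,v)\in W(x,y)} \omega(u,v)\,\bigl[I(u,v)-I(u+\Delta x,\,v+\Delta y)\bigr]^{2} \\
  &\approx \sum_{(u,v)\in W(x,y)} \omega(u,v)\,\bigl[I_x(u,v)\,\Delta x+I_y(u,v)\,\Delta y\bigr]^{2} \\
  &= \begin{bmatrix}\Delta x & \Delta y\end{bmatrix} M(x,y)
     \begin{bmatrix}\Delta x \\ \Delta y\end{bmatrix}, \\[4pt]
M(x,y)
  &= \sum_{(u,v)\in W(x,y)} \omega(u,v)
     \begin{bmatrix} I_x^{2} & I_x I_y \\ I_x I_y & I_y^{2} \end{bmatrix}
   = \begin{bmatrix} A & C \\ C & B \end{bmatrix}.
\end{aligned}
```

Expanding the quadratic form with this M yields AΔx² + 2CΔxΔy + BΔy², which matches equation (16).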

therefore, from equation (15), the eigenvalues λ₁ and λ₂ of the matrix M(x, y) are calculated to judge whether the point (x, y) is a corner point. The judgment uses the range of the corner response value, which is calculated as shown in formula (18):

R＝λ₁λ₂-α(λ₁+λ₂)² (18)

where α = 0.04 and R is the corner response value;

when R ≈ 0, the area is judged to be flat; when R < 0, it is judged to be an edge; and when R > 0, it is judged to be a corner point.
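The corner response in formula (18) can be sketched in NumPy as below. This is an illustration under stated assumptions rather than the patent's implementation: the window uses uniform weights over a 3 x 3 neighbourhood instead of a Gaussian, gradients come from `np.gradient`, and the function names and the `eps` tolerance are hypothetical. It relies on the identities λ₁λ₂ = det(M) and λ₁ + λ₂ = trace(M).

```python
import numpy as np

def harris_response(gray, alpha=0.04, win=3):
    """Corner response R = det(M) - alpha * trace(M)^2 for every pixel."""
    gray = gray.astype(np.float64)
    Iy, Ix = np.gradient(gray)  # derivatives along rows (y) and columns (x)

    def box_sum(a):
        # Uniform window weights: sum over a win x win neighbourhood.
        pad = win // 2
        p = np.pad(a, pad, mode="constant")
        out = np.zeros_like(a)
        for dy in range(win):
            for dx in range(win):
                out += p[dy:dy + a.shape[0], dx:dx + a.shape[1]]
        return out

    A = box_sum(Ix * Ix)
    B = box_sum(Iy * Iy)
    C = box_sum(Ix * Iy)
    det = A * B - C * C   # equals lambda1 * lambda2
    trace = A + B         # equals lambda1 + lambda2
    return det - alpha * trace ** 2

def classify(R, eps):
    """Label per the text: 0 = flat (|R| near 0), -1 = edge (R < 0), 1 = corner (R > 0)."""
    labels = np.zeros(R.shape, dtype=int)
    labels[R > eps] = 1
    labels[R < -eps] = -1
    return labels
```

On a synthetic image containing a bright square, the square's corners receive a positive response, the middles of its sides a negative one, and the interior a response of zero.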

The third embodiment is as follows:

Different from the second embodiment, in the method for extracting a suspected smoke region from dynamic smoke of this embodiment, the step of performing optical flow vector estimation on the corner points of the moving object in step two is specifically:

optical flow vector estimation is performed on the corner points detected in the smoke video image, and a corner point optical flow vector diagram is obtained through formula (4) to formula (11), as shown in FIG. 5;

the optical flow is defined as the instantaneous speed of the motion of a pixel point on an image plane, and the corresponding relation between the previous frame and the current frame is found according to the change of the pixel in an image sequence on a time domain and the correlation between adjacent frames, so that the motion information of an object between the adjacent frames is calculated;

the optical flow method is divided into a dense optical flow and a sparse optical flow according to the density degree of vectors in the optical flow field;

in 1981, Horn and Schunck combined a two-dimensional velocity field with gray scale, and through an optical flow constraint equation, a basic algorithm of optical flow calculation can be obtained. The information contained in the optical flow vector is the instantaneous motion velocity vector information of each pixel point; the optical flow calculation method is based on the following three assumption premises:

1) constant brightness: the brightness of the same pixel point does not jump over time;

2) small motion: the position does not change drastically over time;

3) spatial consistency: adjacent points in the scene project to adjacent points in the image, and the velocities of adjacent points are consistent.

According to the above three premises, suppose a pixel point on the image is located at (x, y), its brightness at time t is E(x, y, t), and u and v denote the horizontal and vertical displacement components of the optical flow at that point:

after a time interval Δt, the brightness of the corresponding point is E(x + Δx, y + Δy, t + Δt); when Δt is very small and approaches 0, the brightness of the point is considered unchanged, which gives equation (6):

E(x,y,t)＝E(x+Δx,y+Δy,t+Δt) (6)

When the brightness of the point changes, the brightness of the point after moving is expanded by the Taylor formula, and the formula (7) can be obtained:

ε denotes the higher-order terms, which can be ignored if the movement is small enough;

the brightness at the time t is E, and the displacement components of the point optical flow in the horizontal and vertical directions are simultaneously represented by u and v;

neglecting the second-order infinitesimal terms, as Δt approaches 0 we have:

where w = (u, v) and I is the identity matrix; the above equation is the basic optical flow constraint equation. Let

denote the gradients of the pixel gray level in the image along the x, y, and t directions; formula (9) can then be rewritten as:

E_xu+E_yv+E_t＝0 (10)
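The intermediate equations between (6) and (10) are not reproduced in the text. Assuming the usual brightness-constancy derivation, which is consistent with both (6) and (10), the missing steps can be sketched as follows; this is a reconstruction, not the patent's own typesetting.

```latex
\begin{aligned}
E(x+\Delta x,\,y+\Delta y,\,t+\Delta t)
  &= E(x,y,t) + \frac{\partial E}{\partial x}\Delta x
     + \frac{\partial E}{\partial y}\Delta y
     + \frac{\partial E}{\partial t}\Delta t + \varepsilon, \\[4pt]
\frac{\partial E}{\partial x}\frac{\Delta x}{\Delta t}
  + \frac{\partial E}{\partial y}\frac{\Delta y}{\Delta t}
  + \frac{\partial E}{\partial t}
  &= 0 \qquad (\Delta t \to 0,\ \varepsilon \text{ neglected}), \\[4pt]
u = \frac{dx}{dt},\quad v = \frac{dy}{dt},\quad
E_x = \frac{\partial E}{\partial x},\quad
E_y = \frac{\partial E}{\partial y},\quad
E_t &= \frac{\partial E}{\partial t}.
\end{aligned}
```

Substituting the Taylor expansion into (6) cancels E(x, y, t) on both sides; dividing by Δt and taking the limit gives the second line, and the substitutions in the third line turn it into equation (10).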

establishing a matrix equation of the following formula for n pixel points in the image by using an optical flow constraint equation to solve u and v;
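One way to read the matrix equation for n pixel points is as an overdetermined linear system solved by least squares, as in the Lucas-Kanade method. The sketch below assumes that reading; the function name, the use of `np.gradient` for spatial derivatives, and the unit time step are assumptions, not details from the text.

```python
import numpy as np

def solve_flow(frame0, frame1):
    """Least-squares (u, v) from E_x*u + E_y*v + E_t = 0 over all pixels of a window."""
    f0 = frame0.astype(np.float64)
    f1 = frame1.astype(np.float64)
    Ey, Ex = np.gradient(f0)   # spatial gradients (rows = y, columns = x)
    Et = f1 - f0               # temporal gradient for a unit time step
    # One constraint row [E_x, E_y] per pixel, with right-hand side -E_t.
    A = np.stack([Ex.ravel(), Ey.ravel()], axis=1)
    b = -Et.ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v
```

In the method described here the window would be a small neighbourhood around each detected corner point, so that only sparse flow vectors are computed.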

The fourth embodiment is as follows:

Different from the third embodiment, in the method for extracting a suspected smoke region from dynamic smoke of this embodiment, the step of segmenting the region where the pixel points with vectors are located to form video motion blocks in step two is specifically:

the position information of the corner points that have optical flow vectors is recorded so that the motion areas can be accurately extracted from the original video image. Taking each corner point with an optical flow vector as the center point of a motion area, the original smoke image is sliced and the smoke motion block images are separated.

The fifth embodiment is as follows:

Different from the fourth embodiment, in the method for extracting a suspected smoke region from dynamic smoke of this embodiment, the size of the smoke motion block image in step two is 32 × 24.
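The slicing of 32 × 24 motion blocks centered on corner points can be sketched as follows. The border handling (shifting the block inward so it stays inside the frame) and the function name are assumptions; the text does not say how corners near the frame edge are treated.

```python
import numpy as np

BLOCK_W, BLOCK_H = 32, 24  # block size stated in the text

def extract_block(frame, corner_xy):
    """Cut a BLOCK_H x BLOCK_W block centred on a corner point (x, y).

    Near the frame border the block is shifted inward so that it stays
    inside the frame (an assumption, not stated in the text).
    Returns the block and its (left, top) position.
    """
    h, w = frame.shape[:2]
    x, y = corner_xy
    left = int(min(max(x - BLOCK_W // 2, 0), w - BLOCK_W))
    top = int(min(max(y - BLOCK_H // 2, 0), h - BLOCK_H))
    return frame[top:top + BLOCK_H, left:left + BLOCK_W], (left, top)
```

Returning the block position alongside the pixels keeps the recorded corner location available for the later motion direction estimate.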

In the invention, the principle of screening the moving block images which accord with the HSV color space threshold is as follows:

motion block images are screened against the HSV color space threshold before the suspected smoke extraction algorithm is applied, and the result of that algorithm is used to extract the suspected smoke area.

The smoke images are compared in the HSV and RGB color spaces: the channel components of both color spaces are extracted from 500 smoke images and output, as shown in FIGS. 6 and 7.

It can be seen that smoke images have no distinct features in the RGB color space but do have distinct features in the HSV color space, so the HSV color space is selected for processing the smoke images. When the H, S, and V components of the 500 smoke images were recorded, about 80% of the smoke images were found to have a saturation below 65. The saturation threshold is therefore set to 65: an image whose saturation is below 65 may be a smoke image.

Therefore, HSV color space conversion is performed on the obtained motion block images; the motion direction is detected for motion block images that meet the threshold condition, and those that do not meet it are discarded.
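A minimal sketch of the saturation screening is given below. The text does not state the scale on which the threshold of 65 is measured or how a block-level saturation is obtained, so the 0-255 scale and the use of the mean saturation over the block are both assumptions, as are the function names.

```python
import numpy as np

SAT_THRESHOLD = 65  # threshold from the text; a 0-255 saturation scale is assumed

def saturation_255(rgb_block):
    """Per-pixel HSV saturation of an RGB block, scaled to 0-255."""
    rgb = rgb_block.astype(np.float64)
    cmax = rgb.max(axis=2)
    cmin = rgb.min(axis=2)
    sat = np.zeros_like(cmax)
    nonzero = cmax > 0  # saturation is defined as 0 for pure black
    sat[nonzero] = (cmax[nonzero] - cmin[nonzero]) / cmax[nonzero]
    return sat * 255.0

def passes_saturation(rgb_block, threshold=SAT_THRESHOLD):
    """True when the mean saturation is below the threshold (assumed block-level rule)."""
    return float(saturation_255(rgb_block).mean()) < threshold
```

Grayish blocks, which have low saturation, pass the screen, while strongly colored blocks are discarded before any motion direction is estimated.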

Motion direction detection of the motion block image: the calculation first divides the motion direction into eight directions. In the two-dimensional plane, the 360-degree range is divided into directions of 45 degrees each, and the eight directions are numbered counterclockwise starting from No. 1, as shown in FIG. 8.

In FIG. 8, direction No. 3 points directly upward and direction No. 7 directly downward. The algorithm calculates the difference between the central image block and each of the eight neighborhood image blocks at the corresponding coordinate positions in the next frame, and then selects the position with the minimum difference value as the motion direction of the central image block.

Let I_k(x, y) be a motion block in the k-th frame of the smoke video. Then I_{k+1}(x+1, y), I_{k+1}(x+1, y+1), I_{k+1}(x, y+1), I_{k+1}(x-1, y+1), I_{k+1}(x-1, y), I_{k+1}(x-1, y-1), I_{k+1}(x, y-1), and I_{k+1}(x+1, y-1) denote the eight neighborhood image blocks around the central motion block in the (k+1)-th frame. The difference value B between the central motion block image I_k(x, y) and each corresponding position in frame k+1 is then calculated using equation (19).

where w and h are the width and height of the motion block image, respectively; B denotes the size of the difference; I(i, j) denotes the pixel value at coordinate (i, j); and k denotes the frame number of the motion block image in the video. The eight values B₁ to B₈ are calculated and compared, and the minimum difference gives the main motion direction of the motion block I_k(x, y). If the motion direction is No. 2, No. 3, or No. 4, the main motion direction of the area is judged to be upward; the block is recorded as image G_k(x, y) and, as the final suspected smoke area, can be sent to the recognition model for identification. Motion blocks with other directions are discarded.
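The direction estimate can be sketched as below. Because equation (19) is not reproduced in the text, the difference B is taken here as the sum of absolute pixel differences, and the neighbour offset `step` is expressed in pixels; both are assumptions, as are the function names. The direction numbering follows the text: counterclockwise from No. 1, with No. 3 straight up.

```python
import numpy as np

# Direction number -> (dx, dy) in the text's coordinates (y grows upward).
DIRECTIONS = {1: (1, 0), 2: (1, 1), 3: (0, 1), 4: (-1, 1),
              5: (-1, 0), 6: (-1, -1), 7: (0, -1), 8: (1, -1)}
UPWARD = {2, 3, 4}

def block_direction(frame_k, frame_k1, top, left, h=24, w=32, step=1):
    """Return (direction number, B values) for the block at (top, left) in frame k."""
    block = frame_k[top:top + h, left:left + w].astype(np.float64)
    diffs = {}
    for num, (dx, dy) in DIRECTIONS.items():
        r = top - dy * step   # image rows grow downward, so "up" is a smaller row
        c = left + dx * step
        if r < 0 or c < 0 or r + h > frame_k1.shape[0] or c + w > frame_k1.shape[1]:
            continue          # neighbour falls outside the frame
        neighbour = frame_k1[r:r + h, c:c + w].astype(np.float64)
        diffs[num] = float(np.abs(block - neighbour).sum())
    best = min(diffs, key=diffs.get)  # direction with the minimum difference
    return best, diffs

def is_suspected_smoke(direction):
    """Blocks moving in direction No. 2, 3, or 4 are kept as suspected smoke."""
    return direction in UPWARD
```

For a frame whose content shifts straight up by `step` pixels, the neighbour in direction No. 3 matches the central block exactly, so that direction is selected and the block is kept.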

Identification comparison description:

contrast test of convolutional neural network model and SVM recognition model

Introduction of the recognition model: the network has three convolutional layers, denoted C1, C2, and C3, and three pooling layers, denoted S1, S2, and S3. F1 is a fully connected layer with 1024 neurons, connected to the pooling layer S2, with ReLU (Rectified Linear Unit) as the activation function. F2 is the output layer, with 2 neurons connected to F1 and SoftMax as the activation function.

The convolution kernel size in convolutional layer C1 is 5 × 5, the step size is 1, and the activation function is ReLU. The bias value is 1, and the input data is a color image of size 32 × 24. The weights of the convolution kernels are randomly generated from a normal distribution with a variance of 0.1, and 32 feature maps of size 32 × 24 are output. The pooling layer S1 resamples the output of convolutional layer C1 using max pooling, giving 16 × 12 × 32 dimensional data. The convolutional layers C2 and C3 both have a convolution kernel size of 5 × 5 and a step size of 1; C2 has 96 feature maps and C3 has 128, with output dimensions of 16 × 12 × 96 and 8 × 6 × 128, respectively. The output dimensions of the pooling layers S2 and S3 are 8 × 6 × 96 and 4 × 3 × 128, respectively. The structure of the convolutional network model is shown in FIG. 2.
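The stated layer dimensions can be checked with a few lines of shape arithmetic. The sketch below assumes that each convolution uses padding that preserves width and height, that pooling is 2 × 2 with stride 2, and that the color input has 3 channels; these are inferred from the stated sizes rather than given in the text. Shapes follow the text's ordering (width, height, channels).

```python
def conv_same(shape, out_channels):
    """Stride-1 convolution with size-preserving padding keeps width and height."""
    w, h, _ = shape
    return (w, h, out_channels)

def max_pool_2x2(shape):
    """2 x 2 max pooling with stride 2 halves width and height."""
    w, h, c = shape
    return (w // 2, h // 2, c)

def layer_shapes(input_shape=(32, 24, 3)):
    """Output shape of every convolutional and pooling layer, keyed by layer name."""
    shapes = {}
    x = input_shape
    for conv, pool, channels in (("C1", "S1", 32), ("C2", "S2", 96), ("C3", "S3", 128)):
        x = conv_same(x, channels)
        shapes[conv] = x
        x = max_pool_2x2(x)
        shapes[pool] = x
    return shapes

def flat_size(shape):
    """Number of values seen by a fully connected layer attached to this output."""
    w, h, c = shape
    return w * h * c

def conv_params(kernel, in_channels, out_channels):
    """Kernel weights plus one bias per output feature map."""
    return kernel * kernel * in_channels * out_channels + out_channels
```

Under these assumptions the computed shapes reproduce the dimensions quoted for C1 through S3, and `flat_size` gives the input width of F1 for whichever pooling layer it is attached to.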

The recognition stage is a two-class problem that distinguishes smoke from non-smoke in the image. In this section the convolutional neural network is compared with an HOG + SVM recognition model.

HOG (Histogram of Oriented Gradients) is a commonly used image feature descriptor. Its main principle is to divide the image into equal blocks, calculate the gradients in the 8-neighborhood directions for each block, and finally form a gradient vector that represents the features of the image. The SVM (Support Vector Machine) is a commonly used recognition model in machine learning; it classifies images by finding a separating hyperplane with a maximum-margin learning strategy. The comparison results are shown in Table 1.

TABLE 1 Recognition model comparison results

The evaluation indices of the recognition model [13] are mainly:

ACC＝(TP+TN)/N (20)

TPR＝TP/(TP+FN) (21)

TNR＝TN/(TN+FP) (22)

where ACC is the accuracy; TPR is the true positive rate; TNR is the true negative rate; and N is the total number of samples.

In the formulas:

1) TP is the number of true positive samples: smoke samples identified as smoke images.

2) FP is the number of false positive samples: non-smoke samples identified as smoke images.

3) TN is the number of true negative samples: non-smoke samples not identified as smoke images.

4) FN is the number of false negative samples: smoke samples not identified as smoke images.
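Equations (20) to (22) can be computed directly from the four counts, as in the sketch below. The label convention (1 = smoke, 0 = non-smoke) and the function names are assumptions for illustration, and the sample labels in the usage note are made up.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = smoke, 0 = non-smoke)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def evaluate(tp, fp, tn, fn):
    """Return (ACC, TPR, TNR) per equations (20)-(22)."""
    n = tp + fp + tn + fn
    acc = (tp + tn) / n
    tpr = tp / (tp + fn)
    tnr = tn / (tn + fp)
    return acc, tpr, tnr
```

For example, with true labels `[1, 1, 1, 0, 0, 0, 0, 1]` and predictions `[1, 0, 1, 0, 1, 0, 0, 1]`, the counts are TP = 3, FP = 1, TN = 3, FN = 1, so ACC, TPR, and TNR are all 0.75.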

Results

Accuracy: the proportion of frames that truly contain smoke among all alarm frames in all test videos, indicating the probability that the system raises a correct alarm. Miss rate: the proportion of undetected smoke frames among all smoke frames in all test videos. The accuracy and miss rate are obtained for each video's detection result, and then averaged over the 14 videos. The results are shown in Table 5.

TABLE 5 Accuracy and miss rate over the whole video data set

Metric	Gaussian mixture model	Inter-frame difference	Dense optical flow	Method of the invention
Accuracy	89.23%	82.49%	90.17%	94.52%
Miss rate	29.15%	67.09%	23.72%	12.39%

As can be seen from Table 5, among the above methods, the dynamic smoke recognition system built with the proposed method has a higher accuracy and a lower miss rate.
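The stated reduction of the miss rate by 2 to 5 times can be compared against the Table 5 figures with a short calculation. The dictionary keys below are shorthand labels for the table columns, and the function name is hypothetical.

```python
MISS_RATE = {  # miss rates in percent, taken from Table 5
    "Gaussian mixture model": 29.15,
    "Inter-frame difference": 67.09,
    "Dense optical flow": 23.72,
    "Method of the invention": 12.39,
}

def miss_rate_ratios(rates, ours="Method of the invention"):
    """Ratio of each baseline miss rate to the proposed method's miss rate."""
    return {name: value / rates[ours] for name, value in rates.items() if name != ours}

for name, ratio in miss_rate_ratios(MISS_RATE).items():
    print(f"{name}: {ratio:.2f}x")
```

The ratios come out at roughly 1.9 (dense optical flow), 2.4 (Gaussian mixture model), and 5.4 (inter-frame difference), broadly in line with the stated range.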

Claims

1. A method of extracting a suspected smoke region from dynamic smoke, comprising: the method is realized by the following steps:

firstly, preprocessing an input video image;

the method specifically comprises the following steps:

then, denoising the video image by using a filter;

the method specifically comprises the following steps:

firstly, inputting smoke video images frame by frame;

then, carrying out angular point detection on the moving object;

then, determining an HSV color space suitable for the smoke image, setting a saturation threshold range, estimating the motion direction of the video motion block meeting the threshold, and determining the motion block with the upward motion direction as a suspected smoke area; extracting the area of the motion direction as the input of estimating the motion direction based on the video block;

and step four, taking the suspected smoke area obtained in the last step as an area to be detected, and extracting features in the area to be detected by using a convolutional neural network model for identification.

2. The method of extracting a suspected smoke region from dynamic smoke as claimed in claim 1, wherein: the step of performing corner detection on the moving object in step two specifically comprises:

the angular points are characteristic points of which the gray value of the pixel point can be changed violently when the pixel point moves in the horizontal direction and the vertical direction, the preprocessed smoke video images are input frame by frame, and the angular points are searched after the angular points are detected frame by frame: calculating and judging the angular point of the smoke region in the image by adopting the following method:

ω (x, y) is a window centered at point (x, y), i.e. a weighting function, which performs a first order approximation on image I (x, y) after translation (Δ x, Δ y):

I(u+Δx,v+Δy)＝I(u,v)+I_x(u,v)Δx+I_y(u,v)Δy+O(Δx²,Δy²)≈I(u,v)+I_x(u,v)Δx+I_y(u,v)Δy （13）

wherein I_x and I_y are the partial derivatives of the image I(x, y), which gives the approximation:

wherein M:

the method is simplified and can be obtained:

c(x,y；Δx,Δy)≈AΔx²+2CΔxΔy+BΔy² (16)

R＝λ₁λ₂-α(λ₁+λ₂)² (18)

wherein, α is 0.04, and R is a corner response value;

3. A method of extracting suspected smoke areas from dynamic smoke as claimed in claim 2, wherein: the step of performing optical flow vector estimation on the angular points of the moving object described in the second step specifically includes:

performing optical flow vector estimation on the corner points detected in the smoke video image, and obtaining a corner point optical flow vector diagram from formula (4) to formula (11),

the optical flow is defined as the instantaneous speed of the motion of a pixel point on an image plane, and the corresponding relation between the previous frame and the current frame is found according to the change of the pixel in an image sequence on a time domain and the correlation between adjacent frames, so that the motion information of an object between the adjacent frames is calculated; the information contained in the optical flow vector is the instantaneous motion velocity vector information of each pixel point; the optical flow calculation method is based on the following three assumption premises:

3) the space is consistent: projecting adjacent points on a scene onto the image is also adjacent points, and the speeds of the adjacent points are consistent; according to the above three premises, it is assumed that a pixel point on the image is located as (x, y), the luminance at time t is E, and u and v are used to represent the displacement components of the optical flow of the point in the horizontal and vertical directions:

E(x,y,t)＝E(x+Δx,y+Δy,t+Δt) (6)

E_xu+E_yv+E_t＝0 (10)

4. A method of extracting suspected smoke areas from dynamic smoke as claimed in claim 3, wherein: the step in step two of segmenting the region where the pixel points with vectors are located to form video motion blocks specifically comprises:

and recording the position information of the corner points with the optical flow vectors, and taking the corner points with the optical flow vectors as the central points of the motion areas to perform slicing processing in the original smoke images so as to divide the smoke motion block images.

5. The method of extracting a suspected smoke region from dynamic smoke as claimed in claim 4, wherein: the size of the smoke motion block image in step two is 32 × 24.