Disclosure of Invention
To solve the above problems, the invention provides a method and a system for detecting flame and smoke anomalies. The method is a fire detection approach that combines color features, wavelet analysis and a convolutional neural network, and it can accurately locate fire and smoke in a video.
In some embodiments, the following technical scheme is adopted:
a method of detecting flame and smoke anomalies, comprising:
collecting video sequence images of a set area;
extracting a flame candidate region from a video sequence image;
establishing a background block differential model, judging the characteristics of the smoke shielding object, and extracting a smoke candidate region;
and respectively screening the extracted flame candidate region and smoke candidate region, and determining the specific position of flame and/or smoke in the image.
In other embodiments, the following technical solutions are adopted:
a flame and smoke anomaly detection system comprising:
means for acquiring original image information of a set area;
means for extracting a flame candidate region;
means for establishing a background block differential model, judging the characteristics of the smoke shielding object, and extracting a smoke candidate region;
and means for screening the extracted flame candidate region and smoke candidate region and determining the specific position of flame and/or smoke in the image.
In other embodiments, the following technical solutions are adopted:
video monitoring equipment that detects flame and smoke by using the above method for detecting flame and smoke anomalies.
In other embodiments, the following technical solutions are adopted:
an inspection robot that detects flame and smoke by using the above method for detecting flame and smoke anomalies.
Compared with the prior art, the invention has the beneficial effects that:
a flame region is identified by a color segmentation method, and a background block differential model is used to judge the characteristics of the smoke shielding object so as to detect the smoke region; this ensures the accuracy of flame and smoke detection while keeping the average detection time low, so that the real-time requirement of fire detection can be met;
to enhance the real-time performance of the flame and smoke detection algorithm, a convolutional neural network model is used to reduce the false detection rate of the algorithm and further improve the accuracy of flame and smoke detection.
Example one
In one or more embodiments, a method for detecting flame and smoke anomalies is disclosed. Video samples of a set area are acquired. For the acquired video samples, flame candidate regions are extracted by a color segmentation method, and smoke candidate regions are generated by a background block differential model based on a fully convolutional neural network. The candidate regions are then screened by a trained CNN model to determine the positions at which flame and smoke appear in the picture.
Finally, a large number of fire pictures in different scenes are used for testing the method. The test result shows that the method can accurately and quickly detect the positions of flame and smoke from the image or video, and can be practically applied to a fire detection task in a transformer substation scene.
In order to describe the algorithm, in this embodiment, a visible light camera is used to capture a flame video in an artificial simulated fire scene.
The method for detecting the abnormal conditions of the flame and the smoke specifically comprises the following steps:
(1) parsing the collected flame and smoke videos into images, analyzing the color characteristics of the flame images, and establishing a color segmentation model for generating flame candidate regions; obtaining the flame candidate regions of the current image by using the color segmentation model;
Flames typically appear red, and the RGB color model has lower computational complexity than other color models. However, the HSI and HSV color models are often used in flame image recognition because the way they describe color is closer to human perception of color in the real world. This embodiment identifies the flame region by combining an RGB color model with an HSV color model. Flame images are sampled in different scenes, and an RGB-space flame color feature model is established by analyzing the R, G and B mean values of the flame regions, specifically as follows:
(1-1) since the R channel has a relatively large value and the B channel has a minimum value in most flame regions, the RGB color feature model of the flame in this embodiment is represented by the following formula:
(i) R >= G >= B
(ii) R >= mean(R), G >= mean(G), B >= mean(B)
In the above formulas, mean(R), mean(G) and mean(B) respectively represent the average values of the R, G and B channels over all pixels in one picture.
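The two RGB rules above can be illustrated with a minimal sketch. The image is assumed to be a list of rows of (R, G, B) tuples, and the function names and the 2 × 2 demo image are hypothetical, introduced only for this illustration.

```python
def channel_means(image):
    """Return (mean R, mean G, mean B) over all pixels of the image."""
    pixels = [px for row in image for px in row]
    n = len(pixels)
    return tuple(sum(px[c] for px in pixels) / n for c in range(3))


def rgb_flame_mask(image):
    """Mark pixels satisfying rule (i) R >= G >= B and
    rule (ii) every channel >= its mean over the whole picture."""
    mean_r, mean_g, mean_b = channel_means(image)
    return [
        [
            (r >= g >= b) and r >= mean_r and g >= mean_g and b >= mean_b
            for (r, g, b) in row
        ]
        for row in image
    ]


# Made-up 2 x 2 picture: one bright orange pixel among dark pixels.
demo_image = [
    [(250, 180, 60), (20, 30, 40)],
    [(30, 30, 30), (10, 20, 30)],
]
demo_mask = rgb_flame_mask(demo_image)
```

Only the orange pixel passes both rules in this example; the gray pixel satisfies rule (i) but is darker than the picture mean, so rule (ii) rejects it.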
(1-2) The RGB color space is commonly used in display systems and is not well suited to image segmentation and analysis. Segmenting flames with only an RGB color model produces a large number of false flame regions, so the segmentation result is not ideal; the video image therefore needs to be converted into HSV space to extract the flame regions.
To reduce the loss caused by fire, the fire must be found in time and an alarm raised; it is therefore necessary to detect flame and smoke in the early stage of a fire and to nip the fire in the bud.
The flame detection proposed in this embodiment targets the initial stage of a fire. At this stage the flame area is small; it appears as R > G > B in RGB space, and in HSI color space it is characterized mainly by saturation, that is, the S value follows a certain rule. The color model of the flame image in RGB space is converted into HSI space to obtain a feature model of the flame image in the HSI color space, represented by the following formulas:
0<=H<=60°
S=1-3×(min(min(r,g),b)/(r+g+b));
H represents the H component of the image in HSI space, and S represents the S component of the image in HSI space. The number of flame candidate regions can be changed by adjusting the parameters rTh and sTh: the smaller the ratio of sTh to rTh, the more flame candidate regions are obtained and the more false detections occur. In order to detect all flame regions, this embodiment sets rTh to 200 and sTh to 5.
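As an illustration of the HSI feature model, the sketch below computes H and S for a single pixel and tests the hue range 0° to 60°. The saturation follows the formula given above; the hue uses the standard RGB-to-HSI conversion, which is an assumption because the text does not spell it out. The thresholds rTh and sTh are deliberately left out, since the inequalities they enter are not stated here. All function names are hypothetical.

```python
import math


def hsi_hue_saturation(r, g, b):
    """Return (H in degrees, S in [0, 1]) for one RGB pixel.

    S = 1 - 3 * min(r, g, b) / (r + g + b), as in the text.
    H uses the standard RGB-to-HSI conversion (assumed, not from the text).
    """
    total = r + g + b
    if total == 0:
        return 0.0, 0.0  # black pixel: hue undefined, reported as 0
    saturation = 1.0 - 3.0 * min(r, g, b) / total
    numerator = 0.5 * ((r - g) + (r - b))
    denominator = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if denominator == 0:
        return 0.0, saturation  # gray pixel: hue undefined, reported as 0
    ratio = max(-1.0, min(1.0, numerator / denominator))
    hue = math.degrees(math.acos(ratio))
    if b > g:
        hue = 360.0 - hue
    return hue, saturation


def in_flame_hue_range(hue):
    """Hue condition of the feature model: 0 <= H <= 60 degrees."""
    return 0.0 <= hue <= 60.0
```

For instance, an orange pixel (255, 128, 0) has a hue of roughly 30° and falls inside the range, while a pure blue pixel has a hue of about 240° and is rejected.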
(2) The smoke image is partitioned into blocks, and a small fully convolutional neural network is built to identify the partitioned image quickly. Coarse segmentation of the smoke region is achieved by multi-scale scaling of the image, so that the background region and the foreground region are separated, and background modeling is carried out using the background sub-blocks. Once the background has been constructed, the coarsely segmented motion region can be finely segmented by background subtraction to detect the smoke region. Each sub-block image is identified by the fully convolutional neural network, and a sub-block identified as a non-smoke region is judged to be background. The specific steps are as follows:
(2-1) Constructing a small fully convolutional neural network whose structure comprises 4 convolutional layers, 3 activation layers and 1 pooling layer, with no fully connected layer. The last convolutional layer has 2 output channels. Image blocks of size 12 × 12 are input into the network, and the output is a 2-dimensional vector whose components represent the probabilities that the image block is smoke and is not smoke, respectively.
(2-2) Parsing the collected smoke video into images and labeling them one by one. The smoke regions are then extracted according to the label files and placed in one folder, and the backgrounds of the video images are placed in another folder. The cropped smoke images are scaled into 12 × 12 image blocks, and the blocks with confidence lower than 0.5 are taken as hard samples of smoke; the background images are scaled into 12 × 12 image blocks, and the blocks with confidence greater than 0.6 are taken as hard negative samples. After the samples are prepared, the network is trained to obtain a trained model.
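The text fixes only the layer counts, the 12 × 12 input and the 2-value output. The sketch below traces feature-map sizes through one layer plan that satisfies those constraints; the kernel sizes, strides and channel counts are assumptions chosen for illustration, not values taken from this embodiment.

```python
def output_size(size, kernel, stride=1):
    """Side length after an unpadded convolution or pooling layer."""
    return (size - kernel) // stride + 1


# Hypothetical plan: 4 convolutions, 3 activations, 1 pooling layer.
# Each entry is (name, kernel, stride, output channels).
ASSUMED_LAYERS = [
    ("conv1 + activation", 3, 1, 10),
    ("max pooling", 2, 2, 10),
    ("conv2 + activation", 3, 1, 16),
    ("conv3 + activation", 3, 1, 32),
    ("conv4 (2 outputs)", 1, 1, 2),
]


def trace_shapes(input_size):
    """Return (name, height, width, channels) after every layer."""
    shapes = []
    size = input_size
    for name, kernel, stride, channels in ASSUMED_LAYERS:
        size = output_size(size, kernel, stride)
        shapes.append((name, size, size, channels))
    return shapes


block_shapes = trace_shapes(12)   # a single 12 x 12 image block
larger_shapes = trace_shapes(48)  # a larger image fed to the same network
```

Under these assumptions a 12 × 12 block collapses to a 1 × 1 × 2 output, and because there is no fully connected layer the same weights applied to a larger image yield a grid of 2-value outputs, one per receptive field.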
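The selection of hard samples in step (2-2) can be sketched as below. It is assumed here that the confidence is the smoke probability assigned to each 12 × 12 block by a previously trained model; the text does not say which model produces it, and the function name is hypothetical.

```python
def select_hard_samples(smoke_scores, background_scores,
                        smoke_threshold=0.5, background_threshold=0.6):
    """Return indices of hard smoke samples and hard negative samples.

    smoke_scores: smoke probabilities of blocks cropped from smoke regions.
    background_scores: smoke probabilities of blocks cropped from background.
    (Interpreting the confidence as a smoke probability is an assumption.)
    """
    hard_smoke = [i for i, p in enumerate(smoke_scores)
                  if p < smoke_threshold]
    hard_negative = [i for i, p in enumerate(background_scores)
                     if p > background_threshold]
    return hard_smoke, hard_negative


hard_smoke, hard_negative = select_hard_samples(
    smoke_scores=[0.9, 0.3, 0.45],
    background_scores=[0.1, 0.7, 0.6],
)
```

In this made-up example the second and third smoke blocks are kept as hard smoke samples, and only the second background block is kept as a hard negative.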
(2-3) To obtain a background image of the scene, the smoke image sequence is coarsely segmented using the fully convolutional neural network trained in the previous step. Assume that each image of the smoke sequence has size W × H. The image is first divided into m × n sub-blocks and then scaled to the size (W/m, H/n); the scaled image is input into the fully convolutional neural network, which finally yields the probability that each of the m × n sub-blocks is smoke. For example, for the k-th frame image, the probability value of each sub-block after division is $p_k(i, j)$, where $i = 1, 2, \dots, m$ and $j = 1, 2, \dots, n$, and the threshold T is set to 0.7. If $p_k(i, j) > T$, the sub-block is determined to be a foreground block; otherwise it is determined to be a background sub-block.
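The foreground/background decision of step (2-3) reduces to thresholding the grid of per-block smoke probabilities. A minimal sketch, assuming the probabilities are already available as an m × n nested list and that the comparison is strict (the exact comparison operator is an assumption):

```python
def split_blocks(prob_map, threshold=0.7):
    """Return a grid of booleans: True for foreground (smoke) blocks,
    False for background blocks, using the threshold T = 0.7 by default."""
    return [[p > threshold for p in row] for row in prob_map]


# Made-up 2 x 3 probability grid for one frame.
demo_probs = [
    [0.10, 0.85, 0.70],
    [0.95, 0.20, 0.71],
]
demo_foreground = split_blocks(demo_probs)
```

Blocks flagged False are the ones that later feed the background model; blocks flagged True are passed on to the fine segmentation.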
(2-4) Background construction and update. Let the background image to be constructed be $B(x, y)$. As with the image frames above, the background image is divided into m × n sub-blocks, each denoted $B^{(i,j)}(x, y)$. Initially, the gray value of every pixel of the background image is set to -1, that is, $B_0(x, y) = -1$, indicating that the background has not yet been updated. Based on the coarse segmentation of each image in the video sequence, the sub-blocks $I_k^{(i,j)}(x, y)$ determined to be background are used to construct or update the background image. Specifically, if the corresponding background sub-block has not been updated, i.e. its pixel value is -1, it is directly replaced by the gray values of the sub-block determined to be background. Otherwise, it is updated with the background sub-block of the current frame:
$$B_k^{(i,j)}(x, y) = \alpha\, B_{k-1}^{(i,j)}(x, y) + (1 - \alpha)\, I_k^{(i,j)}(x, y)$$
where $0 \le \alpha \le 1$ is the update coefficient that weights the existing background model against the background sub-block of the current frame. When $\alpha = 1$, the background model is not updated; when $\alpha = 0$, the gray values of the background sub-block of the current frame directly replace the corresponding region of the background model.
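The background update of step (2-4) can be sketched for one sub-block as follows. The linear blend is an assumption inferred from the two limiting cases described above (α = 1 leaves the model unchanged, α = 0 replaces it with the current frame); the function name and the nested-list block format are hypothetical.

```python
UNSET = -1  # gray value marking background pixels that were never updated


def update_background_block(background_block, frame_block, alpha):
    """Update one background sub-block from a frame sub-block judged
    to be background. Blocks are lists of rows of gray values."""
    updated = []
    for bg_row, frame_row in zip(background_block, frame_block):
        row = []
        for bg, cur in zip(bg_row, frame_row):
            if bg == UNSET:
                row.append(cur)  # first observation: copy the frame value
            else:
                # assumed linear blend between old model and current frame
                row.append(alpha * bg + (1 - alpha) * cur)
        updated.append(row)
    return updated


fresh = update_background_block([[UNSET, UNSET]], [[40, 60]], alpha=0.9)
blended = update_background_block([[100, 100]], [[50, 150]], alpha=0.5)
```

A value of α close to 1 makes the model adapt slowly, which may help keep thin smoke from being absorbed into the background, at the cost of slower reaction to lighting changes.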
(2-5) Extracting the smoke region. Each foreground sub-block is differenced against the corresponding background to construct a binarization template of the motion region of that foreground sub-block in the k-th frame image. The binarization templates of all sub-blocks are merged to obtain the moving-target binarization template of the whole k-th frame image, and an AND operation between this template and the original image extracts the smoke region.
The foreground sub-block is an image block only containing smoke, and the background sub-block is an image area not containing smoke.
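The fine segmentation of step (2-5) can be sketched for a single foreground sub-block as below. The binarization threshold `diff_threshold` is a hypothetical parameter, since the text does not state how the difference is binarized, and gray-value blocks are assumed to be nested lists.

```python
def motion_template(frame_block, background_block, diff_threshold=15):
    """Binarize the absolute difference between a foreground sub-block
    and the corresponding background sub-block (1 = moving pixel)."""
    return [
        [1 if abs(cur - bg) > diff_threshold else 0
         for cur, bg in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame_block, background_block)
    ]


def apply_template(image_block, template):
    """AND the template with the original block: keep pixels where the
    template is 1 and set all other pixels to 0."""
    return [
        [pixel if keep else 0 for pixel, keep in zip(img_row, tpl_row)]
        for img_row, tpl_row in zip(image_block, template)
    ]


demo_frame = [[100, 52], [200, 50]]
demo_background = [[50, 50], [50, 50]]
demo_template = motion_template(demo_frame, demo_background)
demo_smoke = apply_template(demo_frame, demo_template)
```

Merging the per-block templates into a whole-frame template is then a matter of writing each block back at its grid position.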
(3) Establishing a small convolutional neural network model for filtering out false candidate regions, wherein the convolutional neural network model comprises 3 convolutional layers, 3 pooling layers and 2 fully connected layers, and the size of the model is 391k, as shown in fig. 3;
Owing to the complexity of the image background, the extracted target candidate regions may contain some falsely detected image blocks. The flame and smoke candidate regions are therefore screened by the designed CNN classifier to filter out the false candidate regions. The construction process of the CNN comprises the following steps:
(3-1) Building the convolutional layers, wherein the convolution kernels of the first two convolutional layers are 3 × 3 and the convolution kernel of the third convolutional layer is 2 × 2.
(3-2) Building the pooling layers, wherein each convolutional layer is followed by one pooling layer and a PReLU activation function. The first two pooling layers use max pooling (3 × 3) and the last one uses max pooling (2 × 2).
(3-3) Building the fully connected layers, wherein the output feature dimension of the first fully connected layer is 128 and that of the second fully connected layer is 3, the three outputs representing the predicted values for the categories of the input image. The final softmax layer calculates the probabilities that the input image block belongs to flame, smoke and background, respectively.
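The final softmax step can be illustrated with a short sketch that turns the three outputs of the second fully connected layer into class probabilities. The ordering of the classes in `CLASSES` is an assumption made for this illustration.

```python
import math

CLASSES = ("flame", "smoke", "background")  # assumed output order


def softmax(logits):
    """Numerically stable softmax over a list of raw network outputs."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits):
    """Return the most probable class name and all class probabilities."""
    probabilities = softmax(logits)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return CLASSES[best], probabilities


demo_label, demo_probabilities = predict([2.0, 0.5, -1.0])
```

Subtracting the largest logit before exponentiating does not change the result but avoids overflow for large outputs.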
(4) Preparing flame and smoke samples respectively, setting the network parameters, and training the convolutional neural network model with the prepared samples; specifically:
(4-1) Preparing training samples, wherein the training samples are divided into three classes: flame samples, smoke samples and background samples. First, the flame and smoke regions of the collected fire images are labeled with an image labeling tool;
(4-2) image blocks of arbitrary size are randomly cropped from the fire pictures according to the labeling information and scaled to 24 × 24;
(4-3) for the background samples, a certain number of image blocks are randomly cropped from pictures that contain neither flame nor smoke and scaled to 24 × 24. The training set contains 6 million images, and the ratio of flame samples to smoke samples to background samples is 1:2:3.
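The cropping and scaling of a background sample can be sketched as follows. The image is assumed to be a nested list of pixel values, nearest-neighbor scaling is an assumption (the text does not name an interpolation method), and `min_size` and the function names are hypothetical.

```python
import random


def resize_nearest(block, out_height=24, out_width=24):
    """Scale a block to out_height x out_width by nearest-neighbor
    sampling (the interpolation method is an assumption)."""
    in_height, in_width = len(block), len(block[0])
    return [
        [block[y * in_height // out_height][x * in_width // out_width]
         for x in range(out_width)]
        for y in range(out_height)
    ]


def random_background_block(image, rng, min_size=8):
    """Crop a block of random size and position from an image without
    flame or smoke, then scale it to 24 x 24."""
    in_height, in_width = len(image), len(image[0])
    height = rng.randint(min_size, in_height)
    width = rng.randint(min_size, in_width)
    top = rng.randint(0, in_height - height)
    left = rng.randint(0, in_width - width)
    crop = [row[left:left + width] for row in image[top:top + height]]
    return resize_nearest(crop)


# Made-up 40 x 60 gray image standing in for a fire-free picture.
demo_picture = [[(y * 60 + x) % 256 for x in range(60)] for y in range(40)]
demo_block = random_background_block(demo_picture, random.Random(0))
```

Flame and smoke samples would be produced the same way, except that the crop window is drawn inside a labeled region rather than anywhere in the picture.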
(4-4) Training the network. During training, 60% of the images are used as the training set, 20% as the validation set and 20% as the test set. The CNN is trained by stochastic gradient descent (SGD) with a batch size of 256, and the weights in the network are initialized randomly. The initial learning rate is 0.01 and the momentum is 0.9. To prevent the CNN from overfitting during training, a Dropout layer is added after the two fully connected layers, with a dropout_ratio of 0.5. The network is trained for a total of 100,000 iterations.
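Two parts of step (4-4) lend themselves to a short illustration: the 60/20/20 split and a single SGD step with momentum. The sketch below uses the learning rate 0.01 and momentum 0.9 given above; the update rule shown is one common formulation of momentum SGD and is an assumption, since the text does not write it out. Function names are hypothetical.

```python
import random


def split_indices(count, train_fraction=0.6, val_fraction=0.2, seed=0):
    """Shuffle sample indices and split them into training, validation
    and test sets (60% / 20% / 20% by default)."""
    indices = list(range(count))
    random.Random(seed).shuffle(indices)
    n_train = round(count * train_fraction)
    n_val = round(count * val_fraction)
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])


def sgd_momentum_step(weights, gradients, velocity,
                      learning_rate=0.01, momentum=0.9):
    """One update: v <- momentum * v - lr * g, then w <- w + v."""
    new_velocity = [momentum * v - learning_rate * g
                    for v, g in zip(velocity, gradients)]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


train_ids, val_ids, test_ids = split_indices(100)
w1, v1 = sgd_momentum_step([1.0], [2.0], [0.0])
w2, v2 = sgd_momentum_step(w1, [2.0], v1)
```

With a constant gradient the second step is larger than the first, which is the accumulation effect that momentum is meant to provide.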
(5) Inputting the images of the flame candidate regions and the smoke candidate regions into the convolutional neural network model trained with the corresponding samples to obtain accurate position information of flame and smoke in the current image.
Because ideal experimental video data are difficult to obtain, self-collected samples are used for testing. The specific steps include:
(5-1) For the flame detection experiment, a self-built flame data set is used, comprising a total of 3000 flame pictures collected in different environments and scenes, including some substation flame pictures, forest flame pictures, grassland flame pictures, city flame pictures and the like.
(5-2) For the smoke detection experiment, tests are performed on 8 smoke video clips. Four clips contain smoke and are used to test the recognition accuracy of the algorithm for smoke; the other four clips contain no smoke and are used to test the false detection rate of the algorithm.
(6) Testing, in real time, the flame and smoke samples collected by the visible light camera with the color segmentation model, the background block differential model and the convolutional neural network (CNN) model, and checking the detection effect; fig. 1 and fig. 2 respectively show the flame detection results and the smoke detection results obtained in different scenes with the method of this embodiment, from which it can be seen that the method accurately identifies the positions of flame and smoke in the pictures.
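The screening of step (5) can be sketched as a filter over candidate boxes. The classifier is passed in as a callable so that the sketch stays independent of any particular model; the (x, y, width, height) box format and the stub classifier are hypothetical and serve only to make the example runnable.

```python
def screen_candidates(candidates, classify):
    """Keep candidate boxes that the classifier labels as flame or smoke.

    candidates: list of (x, y, width, height) boxes.
    classify: callable mapping a box to "flame", "smoke" or "background".
    Returns a list of (box, label) pairs giving the detected positions.
    """
    detections = []
    for box in candidates:
        label = classify(box)
        if label in ("flame", "smoke"):
            detections.append((box, label))
    return detections


# Stub standing in for the trained CNN (made-up labels).
STUB_LABELS = {
    (0, 0, 24, 24): "flame",
    (30, 30, 24, 24): "background",
    (5, 40, 24, 24): "smoke",
}
demo_detections = screen_candidates(list(STUB_LABELS), STUB_LABELS.get)
```

Candidates labeled as background are simply dropped, which is how false candidate regions from the color segmentation and background difference stages are removed.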