CN115131622B - Night open fire detection method based on video time sequence correlation - Google Patents


Info

Publication number
CN115131622B
CN115131622B CN202210559156.8A
Authority
CN
China
Prior art keywords
open fire
detection
frame
open fire detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210559156.8A
Other languages
Chinese (zh)
Other versions
CN115131622A (en)
Inventor
刘鹏宇
袁静
刘天禹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology filed Critical Beijing University of Technology
Priority to CN202210559156.8A
Publication of CN115131622A
Application granted
Publication of CN115131622B
Legal status: Active

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/60Analysis of geometric attributes
    • G06T7/62Analysis of geometric attributes of area, perimeter, diameter or volume
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/22Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Quality & Reliability (AREA)
  • Geometry (AREA)
  • Fire-Detection Mechanisms (AREA)

Abstract

The invention discloses a night open fire detection method based on video time-sequence correlation, belonging to the technical field of target detection. The method constructs an open-flame image dataset for training, builds an open fire detection network, and builds a time-sequence judging module. By fully exploiting the advantages of deep learning in target detection, the method is not limited by scene, distance, or type of open fire. The time-sequence judging module, designed around the video sequence and the differences between open fire and artificial light, analyzes the state of each detected fire and effectively reduces false detection of night-time lights without lowering the detection accuracy for real open fire.

Description

Night open fire detection method based on video time sequence correlation
Technical Field
The invention relates to the field of intelligent security, and in particular to a night open fire detection method based on video time-sequence correlation.
Background
At present there are three main approaches to open fire detection: manual on-site monitoring, detection with temperature and smoke sensors, and detection with image processing techniques. The first requires personnel to watch fire-prone scenes through camera feeds or on site, and is costly in human resources. The second is limited by the detection range of the sensors and performs poorly at medium and long distances. The third, image processing, divides into traditional feature-extraction methods and deep-learning object detection. Traditional feature extraction judges fire by its color and contour; it generalizes poorly, is affected by lighting, and has a high false-detection rate in outdoor night scenes. Deep-learning object detection trains a neural network on large amounts of open fire image data, extracts fire features across distances, scenes, and categories, and completes the detection task well, but it still falsely detects night-time lights as fire.
Analysis shows that night-time lights and open fire share similar color and shape features. Deep learning builds a mapping from image features, and when detection is performed on a single image it ignores the difference in temporal correlation between fire video and night-light video, so from this working principle false detection is unavoidable. Extensive test data likewise show that when warm-colored lights appear in an image, existing methods easily identify them as open fire; reducing false detection of night lights is therefore the bottleneck of open fire detection. This invention considers the different dynamic characteristics of fire and light in video while still using the static features within each image: on top of existing deep-learning detection, it analyzes the morphological change of the target across the video and uses that trend to further constrain the detection result, achieving robust night-time open fire detection. The problem to be solved divides into two parts. The first is to build a high-performance deep-learning fire detection model, fundamentally improving detection accuracy; the second is to add a temporal-correlation judging module on top of detection, reducing false detections caused by light interference. The invention accurately detects open fire with existing deep-learning image processing, then links the frames of the surveillance video into a sequence, evaluates the state of each detected fire over time, and outputs the detection result after this calibration constraint.
Disclosure of Invention
The invention mainly solves the following technical problem: night open fire detection suffers from interference by night-time lights and is prone to false detection. To address this, a night open fire detection method based on video time-sequence correlation is provided, comprising the following steps:
step 1: an open flame image dataset for training is constructed.
The quality of the dataset is critical to the result of deep learning, so an original open-flame image dataset is constructed from several sources: open-source image data, web crawling, and photography. The original dataset contains fire images of various scenes and distances, by day and by night. Data-labeling software is then used to frame each fire in the original images, yielding boxes with the fire's center coordinates, width, and height, which are used to train the open fire detection network.
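The box annotation described above can be stored in the normalized "class cx cy w h" label format widely used for training such detectors. A minimal sketch follows; the normalization convention and the single `fire` class id are assumptions for illustration, not specified by the patent:

```python
def box_to_label(cx, cy, w, h, img_w, img_h, class_id=0):
    """Convert an absolute-pixel fire box (center x, y, width, height)
    into one normalized label line: 'class cx cy w h' in [0, 1]."""
    return (f"{class_id} {cx / img_w:.6f} {cy / img_h:.6f} "
            f"{w / img_w:.6f} {h / img_h:.6f}")

# Example: a 100x60 px fire box centered at (320, 240) in a 640x480 frame.
line = box_to_label(320, 240, 100, 60, 640, 480)
print(line)  # -> "0 0.500000 0.500000 0.156250 0.125000"
```

One such line per annotated fire, in a text file per image, is enough to train a standard single-class detector on this dataset.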
Step 2: and constructing an open fire detection network.
The open fire detection network consists mainly of a feature extraction network, convolution modules, and up-sampling modules. The feature extraction network is built from convolution layers: it compresses the incoming video frames along width and height, expands the number of channels so that objects with the same features are assigned to corresponding channels, and after processing produces a feature map containing fire information. This feature map is passed to the convolution and up-sampling modules. The inputs of the second and third convolution modules are combined along the channel dimension with branch outputs from the feature extraction network; each convolution module then condenses its feature map into 5 channels, where the first 4 channels encode the fire's center x, y coordinates and its width and height, and the 5th channel encodes the fire confidence. Each up-sampling module doubles the width and height of its feature map. Combining the two up-sampling modules with the outputs of the two convolution modules and the output of the first convolution module yields three outputs at three sizes in the ratio 4:2:1; the largest feature map is responsible for small fire targets and the smallest for large fire targets, giving good detection for both large and small fires. Among the three outputs, the one with the highest confidence is selected as the final detection and mapped back onto the original video, producing a sequence containing fire detection boxes.
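The final selection step — keeping the highest-confidence box among the three scale outputs and mapping it back to original-frame coordinates — can be sketched as follows. The per-scale output tuple format `(cx, cy, w, h, conf)` in normalized coordinates is an assumption for illustration:

```python
def select_detection(outputs, img_w, img_h):
    """Each entry of `outputs` is one candidate box (cx, cy, w, h, conf)
    with coordinates normalized to [0, 1]; return the highest-confidence
    box scaled back to the original frame size."""
    cx, cy, w, h, conf = max(outputs, key=lambda o: o[4])
    return (cx * img_w, cy * img_h, w * img_w, h * img_h, conf)

# Three candidate boxes, one from each of the 4:2:1 scale outputs.
best = select_detection(
    [(0.5, 0.5, 0.10, 0.10, 0.62),
     (0.5, 0.5, 0.12, 0.10, 0.91),
     (0.5, 0.5, 0.20, 0.20, 0.30)],
    640, 480)
```

Repeating this per frame yields the sequence of detection boxes that feeds the time-sequence judging module in step 3.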
Step 3: and constructing a time sequence judging module.
The sequence of detection boxes obtained in step 2 may contain boxes around real fire or boxes around falsely detected lights. The sequence is sent to the time-sequence judging module. The detection-box information of the first frame is recorded and stored; the corresponding box is then located in the second frame, and the IoU (intersection over union) between the two boxes is computed as the area of their intersection divided by the area of their union — the larger the IoU, the greater the overlap between the two boxes. The same computation is repeated between the third and second frames, and so on through the whole video sequence, yielding the IoU between matching fire boxes in every pair of consecutive frames; a line graph of IoU value against frame index is then constructed. Compared with open fire, a night-time light is stable: its state does not change across the sequence, so its line graph tends toward a flat straight line, whereas the state of a real fire changes continuously and its line graph shows a varying trend. Accordingly, an object whose polyline stays close to a flat straight line is judged non-fire, and an object whose polyline oscillates is judged to be open fire.
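The IoU-between-consecutive-frames computation described above can be sketched as follows; boxes are given as `(x1, y1, x2, y2)` corners, a corner convention assumed here for clarity:

```python
def iou(a, b):
    """Intersection over union of two boxes (x1, y1, x2, y2):
    intersection area divided by union area, in [0, 1]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def iou_series(track):
    """IoU between each pair of consecutive boxes in one detection track —
    the y-values of the line graph described in step 3."""
    return [iou(track[i - 1], track[i]) for i in range(1, len(track))]

# A perfectly still light source yields a flat series of 1.0.
series = iou_series([(10, 10, 50, 50)] * 4)
```

A flickering flame instead produces boxes whose size and position drift frame to frame, so its series oscillates well below 1.0.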
Compared with the prior art, the invention has the following advantages:
1. The invention adopts a deep-learning image processing method; compared with traditional detection by color and contour, it detects open fire more accurately, is not limited by scene, distance, or fire type, and can process the captured video in real time.
2. A time-sequence judging module is designed on top of the existing deep-learning fire detection; combining the differences between fire and light, it analyzes the motion state of each detected fire and effectively reduces false detection of night-time lights.
Drawings
Fig. 1 is a schematic flow chart of a night open fire detection method based on video time sequence correlation in the invention.
Fig. 2 is the framework of the open fire detection method in the present invention.
Detailed Description
The invention mainly realizes the detection of open fire at night; the specific method adopted is described in detail below with reference to the accompanying drawings.
Specifically, the overall flow of the night open fire detection method based on video time-sequence correlation is shown in Fig. 1 and includes the following steps. S1, construct an open flame image dataset for training. S2, construct an open fire detection network. S3, construct a time-sequence judging module.
(1) For S1, an open flame image dataset for network training is constructed.
The original open fire image dataset is obtained from public resources and contains fires of various scenes and distances, by day and by night, improving the diversity available for fire detection. The data are augmented by random rotation and brightness adjustment to improve generalization. Fires in the images are then annotated with a data-labeling method for training the detection network.
(2) And for S2, constructing an open fire detection network.
The open fire detection network is shown in the upper part of Fig. 2. The designed network divides into a feature extraction network, convolution modules, and up-sampling modules. The feature extraction network, composed mainly of convolution layers, extracts features from the video frames passed to the network and produces a feature map containing fire information. The feature map is then conveyed to the convolution and up-sampling modules: the convolution modules fuse features with two branches of the feature extraction network and condense the feature map into 5 channels, the first 4 encoding the fire's center x, y coordinates and width and height, and the 5th its confidence. The up-sampling layers double the width and height of the feature map, yielding three outputs at multiple scales; the large feature map is responsible for small targets and the small feature map for large targets, so both large and small fires are detected well. Among the three outputs, the one with the highest confidence is selected as the final result and mapped back onto the original video, producing a sequence containing fire detection boxes.
(3) And for S3, constructing a time sequence judging module.
The structure of the time-sequence judging module is shown in the lower half of Fig. 2. Since the output of the detection network may contain false detections of night-time lights, the result is sent to the time-sequence judging module. The detection-box information of the first frame in the video sequence is first recorded and stored; the corresponding box is then located in the second frame, and the IoU (intersection over union) between the two boxes is computed — the larger the value, the greater the overlap, with 1 denoting complete coincidence and 0 no overlap. The same computation is repeated between the third and second frames, and so on through the whole sequence, yielding the IoU between matching fire boxes in every pair of consecutive frames; a line graph of IoU value against frame index is then constructed. Compared with open fire, a night-time light is stable: its state does not change across the sequence, so its line graph tends toward a flat straight line, whereas the state of a real fire changes continuously and its line graph shows a varying trend. The state of this polyline determines whether the object in the box is an open fire.
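The final polyline judgment — a flat line means a stable light, an oscillating line means fire — can be expressed as a simple threshold on the spread of the IoU series. The threshold value and spread measure below are illustrative assumptions; the patent does not specify a numeric criterion:

```python
def judge_track(iou_values, flat_tol=0.05):
    """Classify one detection track from its per-frame IoU polyline.
    If the polyline oscillates by more than flat_tol around its mean,
    the target changes state over time and is judged 'open fire';
    otherwise it is a stable source and judged 'non-fire'."""
    mean = sum(iou_values) / len(iou_values)
    spread = max(abs(v - mean) for v in iou_values)
    return "open fire" if spread > flat_tol else "non-fire"

# A stable street light: near-constant IoU.  A flame: oscillating IoU.
light = judge_track([0.98, 0.99, 0.97, 0.99])
fire = judge_track([0.90, 0.55, 0.75, 0.40])
```

This constraint only suppresses boxes the detector already produced; it never raises the confidence of a missed fire, so it reduces false positives without lowering true-fire recall.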
The above embodiments only illustrate the technical solution of the present invention and are not limiting. Those skilled in the art will appreciate that the above embodiments do not limit the present invention in any way, and all similar technical solutions obtained by equivalent substitution or equivalent transformation fall within the protection scope of the present invention.

Claims (1)

1. A night open fire detection method based on video time-sequence correlation, characterized by comprising the following steps:
step 1: constructing an open flame image dataset for training;
constructing an original open flame image data set by means of the open source image data, the crawlers and shooting various approaches; the original open fire image data set comprises open fire images of various scenes and different distances in daytime and at night; then, adopting data labeling software to frame open fire in the original image data set to obtain a square frame containing open fire center coordinates and length and width, wherein the square frame is used for training an open fire detection network; the open fire detection network training is to calculate the loss of the network training result and the marked square frame and optimize the network so that the result is continuously approximate to the marked value;
step 2: constructing an open fire detection network;
the open fire detection network consists of a feature extraction network, convolution modules, and up-sampling modules; the feature extraction network is built from convolution layers: it compresses the incoming video frames along width and height, expands the number of channels so that objects with the same features are assigned to corresponding channels, and after processing produces a feature map containing fire information;
the feature map containing fire information is conveyed to the convolution and up-sampling modules; the inputs of the second and third convolution modules are combined along the channel dimension with branch outputs from the feature extraction network; each convolution module then condenses its feature map into 5 channels, where the first 4 channels encode the fire's center x, y coordinates and its width and height, and the 5th channel encodes the fire confidence; each up-sampling module doubles the width and height of its feature map; combining the two up-sampling modules with the outputs of the two convolution modules and the output of the first convolution module yields three outputs at three sizes in the ratio 4:2:1, the largest feature map being responsible for small fire targets and the smallest for large fire targets; the output with the highest confidence among the three is selected as the final detection and mapped back onto the original video, producing a sequence containing fire detection boxes;
step 3: constructing a time sequence judging module;
the sequence of detection boxes obtained in step 2 may contain boxes around real fire or boxes around falsely detected lights; the sequence is sent to the time-sequence judging module: the detection-box information of the first frame is recorded and stored, the corresponding box is located in the second frame, and the IoU (intersection over union) between the two boxes is computed as the area of their intersection divided by the area of their union, a larger IoU meaning greater overlap between the two boxes; the same computation is then repeated between the third and second frames, and so on through the whole video sequence, yielding the IoU between matching fire boxes in every pair of consecutive frames, from which a line graph of IoU value against frame index is constructed; according to the state of the polyline, an object whose polyline stays close to a flat straight line is judged non-fire, and an object whose polyline oscillates is judged to be open fire.
CN202210559156.8A 2022-05-22 2022-05-22 Night open fire detection method based on video time sequence correlation Active CN115131622B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210559156.8A CN115131622B (en) 2022-05-22 2022-05-22 Night open fire detection method based on video time sequence correlation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210559156.8A CN115131622B (en) 2022-05-22 2022-05-22 Night open fire detection method based on video time sequence correlation

Publications (2)

Publication Number Publication Date
CN115131622A CN115131622A (en) 2022-09-30
CN115131622B (en) 2024-03-29

Family

ID=83376947

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210559156.8A Active CN115131622B (en) 2022-05-22 2022-05-22 Night open fire detection method based on video time sequence correlation

Country Status (1)

Country Link
CN (1) CN115131622B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020181685A1 (en) * 2019-03-12 2020-09-17 南京邮电大学 Vehicle-mounted video target detection method based on deep learning
CN112861635A (en) * 2021-01-11 2021-05-28 西北工业大学 Fire and smoke real-time detection method based on deep learning

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020181685A1 (en) * 2019-03-12 2020-09-17 南京邮电大学 Vehicle-mounted video target detection method based on deep learning
CN112861635A (en) * 2021-01-11 2021-05-28 西北工业大学 Fire and smoke real-time detection method based on deep learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Fire detection and recognition based on improved YOLOv3; 任嘉锋, 熊卫华, 吴之昊, 姜明; 计算机系统应用 (Computer Systems & Applications); 2019-12-15 (12); full text *
Research on driving circuit technology of back-illuminated CCD; 龚志鹏, 于正阳, 苏健, 申才立, 蔡帅, 毛岩; 电子测量技术 (Electronic Measurement Technology); 2018-04-08 (07); full text *

Also Published As

Publication number Publication date
CN115131622A (en) 2022-09-30

Similar Documents

Publication Publication Date Title
CN111209810B (en) Boundary frame segmentation supervision deep neural network architecture for accurately detecting pedestrians in real time through visible light and infrared images
CN113052835B (en) Medicine box detection method and system based on three-dimensional point cloud and image data fusion
CN107610175A (en) The monocular vision SLAM algorithms optimized based on semi-direct method and sliding window
CN106897720A (en) A kind of firework detecting method and device based on video analysis
CN103886107B (en) Robot localization and map structuring system based on ceiling image information
CN106384106A (en) Anti-fraud face recognition system based on 3D scanning
CN112686928B (en) Moving target visual tracking method based on multi-source information fusion
WO2022067668A1 (en) Fire detection method and system based on video image target detection, and terminal and storage medium
US11361534B2 (en) Method for glass detection in real scenes
CN112257526B (en) Action recognition method based on feature interactive learning and terminal equipment
CN111274921A (en) Method for recognizing human body behaviors by utilizing attitude mask
CN112257554B (en) Forest fire recognition method, system, program and storage medium based on multiple spectra
CN109887029A (en) A kind of monocular vision mileage measurement method based on color of image feature
CN115170792B (en) Infrared image processing method, device and equipment and storage medium
CN112561973A (en) Method and device for training image registration model and electronic equipment
Zhang et al. An algorithm for automatic identification of multiple developmental stages of rice spikes based on improved Faster R-CNN
CN109741307A (en) Veiling glare detection method, veiling glare detection device and the veiling glare detection system of camera module
Cao et al. EFFNet: Enhanced feature foreground network for video smoke source prediction and detection
CN111461036A (en) Real-time pedestrian detection method using background modeling enhanced data
CN116416576A (en) Smoke/flame double-light visual detection method based on V3-YOLOX
CN113298177B (en) Night image coloring method, device, medium and equipment
CN109684982B (en) Flame detection method based on video analysis and combined with miscible target elimination
CN113901931A (en) Knowledge distillation model-based behavior recognition method for infrared and visible light videos
CN114399734A (en) Forest fire early warning method based on visual information
CN114022820A (en) Intelligent beacon light quality detection method based on machine vision

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant