CN114445729B - Small target fire detection method based on improved YOLO algorithm - Google Patents

Small target fire detection method based on improved YOLO algorithm

Info

Publication number
CN114445729B
CN114445729B CN202111162490.1A
Authority
CN
China
Prior art keywords
data set
target
yolo
fire
algorithm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111162490.1A
Other languages
Chinese (zh)
Other versions
CN114445729A (en)
Inventor
郑文
要媛媛
武心南
崔佳梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Taiyuan University of Technology
Original Assignee
Taiyuan University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Taiyuan University of Technology filed Critical Taiyuan University of Technology
Priority to CN202111162490.1A priority Critical patent/CN114445729B/en
Publication of CN114445729A publication Critical patent/CN114445729A/en
Application granted granted Critical
Publication of CN114445729B publication Critical patent/CN114445729B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)
  • Fire-Detection Mechanisms (AREA)

Abstract

The invention belongs to the technical field of computer vision and forest fire prevention, and particularly relates to a small-target fire detection method based on an improved YOLO algorithm. The method comprises the following steps: S100, establishing a fire data set by collecting images of fire and smoke, including flames and smoke under different weather and lighting conditions; S200, labeling the data set; S300, building an improved YOLO-V3 algorithm framework; S400, pre-training the network, pre-training the improved YOLO-V3 model on the fire image training data set to obtain prediction boxes at different scales. The invention addresses the insufficient speed and accuracy of the conventional YOLO-V3, designed for general-purpose target detection, when detecting small targets.

Description

Small target fire detection method based on improved YOLO algorithm
Technical Field
The invention belongs to the technical field of computer vision and forest fire prevention, and particularly relates to a small target fire detection method based on an improved YOLO algorithm.
Background
Currently, fire detection relies largely on various sensors, including smoke alarms, temperature alarms, and infrared alarms. While these alarms function, they have significant drawbacks. First, a certain concentration of particles in the air must be reached before an alarm is triggered; by that time the fire may already be too strong to control, defeating the purpose of early warning. Second, most alarms work only in closed environments and cannot operate in open-air spaces or public places. Third, false alarms are possible: when the concentration of non-fire particles reaches the alarm concentration, an alarm sounds automatically.
Small targets occupy a relatively small visual area compared with their surroundings, so detecting them is more challenging than general detection tasks, and detector performance on small targets is often unsatisfactory. These problems pose a significant challenge for detecting small targets in fire scenarios.
Two-stage detection is represented by the R-CNN series, of which Faster R-CNN is currently the most widely used framework: an algorithm generates a series of candidate boxes as samples, which are then classified by a convolutional neural network. Typical single-stage detection frameworks include the YOLO (You Only Look Once) series, SSD (Single Shot MultiBox Detector), and RetinaNet. These methods skip the region proposal stage and produce the final localization and class prediction in a single pass; that is, the region proposal network is fully integrated with classification and localization. This reduces redundant computation to some extent and improves speed.
Disclosure of Invention
The invention provides a small-target fire detection method based on an improved YOLO algorithm, aiming to solve the insufficient speed and accuracy of the conventional YOLO-V3, designed for general-purpose target detection, when detecting small targets.
The invention adopts the following technical scheme: a small-target fire detection method based on an improved YOLO algorithm, comprising the following steps.
S100, establishing a fire data set by collecting images of fire and smoke, including flames and smoke under different weather and lighting conditions; S200, labeling the data set; S300, building an improved YOLO-V3 algorithm framework: using the Darknet-53 architecture as the base network, first scaling the input image, dividing the input fire image into S×S grids, then using EfficientNet to extract image features and detecting whether flame or smoke is present in each grid cell; S400, pre-training the network: pre-training the improved YOLO-V3 model on the fire image training data set, up-sampling the feature maps learned by three depth-separable convolutions, deriving prediction boxes at different scales through a feature pyramid, having each grid cell predict 3 bounding boxes with confidence scores, and finally screening the candidate boxes by non-maximum suppression. The confidence reflects whether a grid cell contains an object and, when it does, the accuracy of the predicted bounding box; when multiple bounding boxes detect the same target, the YOLO network uses non-maximum suppression to select the best bounding box, i.e., the one whose confidence meets the threshold.
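The non-maximum suppression step described above can be sketched as follows. This is an illustrative Python sketch, not the patent's implementation; the [x1, y1, x2, y2] box format and the 0.5 IoU threshold are assumptions for the example.

```python
def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Keep the highest-confidence box, drop boxes overlapping it beyond
    the threshold, and repeat; returns indices of surviving boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
    return keep
```

For example, two heavily overlapping flame boxes collapse to the higher-scoring one, while a distant box survives.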
Step S200 comprises the following steps:
S201, scaling the collected photographs of the target object to half of their original size;
S202, centering the scaled small-target data set picture on an 1850×1850 pixel canvas;
S203, labeling the small-target data set with LabelImg software, recording for each labeled picture the annotation box name, path, pixel position, and label category in XML format; the final data set is stored in PASCAL VOC format;
S204, to prevent overfitting and compare the performance of different algorithms fairly, the correspondence between labels and data is taken into account, the data set is kept uniformly distributed, and the labeled data set is randomly divided into a training set, a validation set, and a test set in a 70%/20%/10% ratio. Positive samples with unclear pixel regions are not labeled, in order to prevent overfitting in the neural network. The training set is used to fit the model, training the classification model by setting the classifier parameters; multiple classifiers are then fitted with modified parameter values in conjunction with the validation set; the validation set measures each model's accuracy on the same data, after which the parameters of the best-performing model are selected; once the optimal model is obtained from the training and validation sets, the test set is used for prediction to measure the performance and classification ability of the optimal model.
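Steps S201 and S202 above can be sketched in Python without any imaging library: the image is modeled as a list of pixel rows, and the scaled picture is pasted centered onto a square canvas. The zero fill value and the list-of-rows representation are assumptions for the example; the patent does not specify a background color or library.

```python
def paste_centred(img, canvas_size=1850, fill=0):
    """Return a canvas_size x canvas_size grid (list of rows) with the
    given image (also a list of rows) pasted at the center, as in S202."""
    h, w = len(img), len(img[0])
    top = (canvas_size - h) // 2
    left = (canvas_size - w) // 2
    board = [[fill] * canvas_size for _ in range(canvas_size)]
    for r in range(h):
        board[top + r][left:left + w] = img[r]
    return board
```

A small canvas makes the centering easy to check by eye; with the default `canvas_size=1850` the same logic reproduces step S202.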
In step S300, the specific method by which each grid cell detects flame or smoke is to perform a difference operation, pixel by pixel in four directions, on the center points of the separated high-risk area and flame area, computing the gradient as grad u = a_x(∂u/∂x) + a_y(∂u/∂y) + a_z(∂u/∂z). The series of gradient values computed along the flame boundary is stored in a singly linked list Q1; when the next frame is collected, the steps are repeated and the result recorded as Q2. If the difference between Q1 and Q2 exceeds a preset flame gradient threshold, flame is judged to be present.
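The frame-to-frame gradient comparison above can be sketched as follows. The central-difference approximation and the sum-of-absolute-differences measure are assumptions for the example; the patent states only that the difference between the gradient lists Q1 and Q2 is compared against a preset threshold.

```python
def central_gradient(values):
    """Central-difference approximation of the gradient along a 1-D line
    of samples (e.g. intensities along the flame boundary)."""
    return [(values[i + 1] - values[i - 1]) / 2.0
            for i in range(1, len(values) - 1)]

def is_flame(q1, q2, threshold):
    """True if the boundary gradients of two consecutive frames differ by
    more than the preset flame gradient threshold (assumed L1 measure)."""
    diff = sum(abs(a - b) for a, b in zip(q1, q2))
    return diff > threshold
```

A static region (identical gradient lists) is rejected, while a flickering boundary whose gradients change between frames exceeds the threshold.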
In step S300, the Loss computed during training in the improved YOLO-V3 algorithm framework is shown in formula (1):

$$
\begin{aligned}
Loss = {}& \lambda_{coord}\sum_{i=0}^{S^2}\sum_{j=0}^{B}\mathbb{1}_{ij}^{obj}\left[(x_i-\hat{x}_i)^2+(y_i-\hat{y}_i)^2\right]\\
&+\lambda_{coord}\sum_{i=0}^{S^2}\sum_{j=0}^{B}\mathbb{1}_{ij}^{obj}\left[\left(\sqrt{w_i}-\sqrt{\hat{w}_i}\right)^2+\left(\sqrt{h_i}-\sqrt{\hat{h}_i}\right)^2\right]\\
&+\sum_{i=0}^{S^2}\sum_{j=0}^{B}\mathbb{1}_{ij}^{obj}\left(C_i-\hat{C}_i\right)^2+\lambda_{noobj}\sum_{i=0}^{S^2}\sum_{j=0}^{B}\mathbb{1}_{ij}^{noobj}\left(C_i-\hat{C}_i\right)^2\\
&+\sum_{i=0}^{S^2}\mathbb{1}_{i}^{obj}\sum_{c\in classes}\left(p_i(c)-\hat{p}_i(c)\right)^2
\end{aligned}\qquad(1)
$$

where S denotes the grid size; x_i denotes a predicted value and the hatted quantities the corresponding label values; S² takes the values 13×13, 26×26, and 52×52; B denotes a box; 1_ij^obj is 1 if the box at (i, j) contains a target and 0 otherwise; 1_ij^noobj is 1 if the box at (i, j) contains no target and 0 otherwise; w and h denote the width and height of the Ground Truth box; C_i denotes the parameter confidence; p_i(c) denotes the classification error term. If the center of a detected target falls in a grid cell, that cell is responsible for detecting the target, and features are extracted through successive depth-separable convolutions, global average pooling, feature compression, and feature expansion.
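The grid-responsibility rule just stated can be illustrated in a few lines: on an input scaled to 416×416 and divided into S×S cells, the cell containing the target's center is the one responsible for detecting it. The 416-pixel input size and the 13×13 scale mirror values named elsewhere in the text; the function itself is an illustrative sketch.

```python
def responsible_cell(cx, cy, img_size=416, s=13):
    """Return (col, row) of the grid cell responsible for a target whose
    center is at pixel (cx, cy) on an img_size x img_size input."""
    cell = img_size / s  # 32 pixels per cell at the 13x13 scale
    return int(cx // cell), int(cy // cell)
```

At the 13×13 scale each cell covers 32×32 pixels, so a flame centered at (100, 250) falls to cell (3, 7).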
In step S400, the confidence is defined as: Confidence = P_r(Object) × IoU_pred^truth, with P_r(Object) ∈ {0, 1}: P_r = 1 when the target is in the grid cell, and 0 otherwise; IoU denotes the overlap between the predicted bounding box and the ground-truth bounding box.
Compared with the prior art, the invention has the following beneficial effects: when the target to be detected in a service scenario is a small target, the proposed small-target fire detection method based on the improved YOLO algorithm converges faster than traditional detection models, has higher precision and robustness, and achieves better accuracy and speed during detection.
Drawings
FIG. 1 is a network architecture diagram of the improved YOLO-V3 algorithm of the invention;
FIG. 2 shows the detection steps of the improved YOLO-V3 algorithm of the invention;
FIG. 3 is the P-R curve of the improved YOLO-V3 algorithm of the invention compared with unmodified YOLO-V3 and the Faster R-CNN algorithm;
FIG. 4 is a schematic representation of detection by the improved YOLO-V3 algorithm of the invention under different lighting conditions and scenes;
FIG. 5 is a schematic representation of detection by the improved YOLO-V3 algorithm of the invention on the small-target data set.
Detailed Description
In order to make the solution of the embodiment of the present invention better understood by those skilled in the art, the embodiment of the present invention is further described in detail below with reference to the accompanying drawings and embodiments.
According to fig. 1 to 5, the present embodiment provides a small target fire detection method based on an improved YOLO algorithm, which includes the following steps:
s100, a fire data set is established, and the collected images of fire and smoke, including images of flames and smoke under different weather and light, are established through a large number of websites; the original image 19819 of the dataset, the large number of images can make the model better learn features.
S200, marking a data set:
In step S200, the collected photographs of the target object are scaled to half of their original size. The scaled small-target data set picture is centered on an 1850×1850 pixel canvas. The small-target data set is labeled with LabelImg software, recording for each labeled picture the annotation box name, path, pixel position, and label category in XML format; the final data set is stored in PASCAL VOC format.
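One such PASCAL VOC annotation can be read back with the standard library. The XML field names (`filename`, `object/name`, `bndbox/xmin` …) follow the standard VOC layout produced by LabelImg; the sample annotation string is illustrative, not taken from the patent's data set.

```python
import xml.etree.ElementTree as ET

def read_voc(xml_text):
    """Parse one VOC annotation; return (filename, [(label, xmin, ymin, xmax, ymax), ...])."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        b = obj.find("bndbox")
        boxes.append((obj.findtext("name"),
                      int(b.findtext("xmin")), int(b.findtext("ymin")),
                      int(b.findtext("xmax")), int(b.findtext("ymax"))))
    return root.findtext("filename"), boxes

# Illustrative annotation in the VOC layout described above.
SAMPLE = """<annotation>
  <filename>fire_0001.jpg</filename>
  <object><name>fire</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>60</xmax><ymax>80</ymax></bndbox>
  </object>
</annotation>"""
```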
To prevent overfitting and compare the performance of different algorithms fairly, the correspondence between labels and data is taken into account, the data set is kept uniformly distributed, and the labeled data set is randomly divided into a training set, a validation set, and a test set in a 70%/20%/10% ratio. The training set contains 210 pictures, the validation set 60 pictures, and the test set 30 pictures. Positive samples with unclear pixel regions are not labeled, in order to prevent overfitting in the neural network. The training set is used to fit the model, training the classification model by setting the classifier parameters; multiple classifiers are then fitted with modified parameter values in conjunction with the validation set; the validation set measures each model's accuracy on the same data, after which the parameters of the best-performing model are selected; once the optimal model is obtained from the training and validation sets, the test set is used for prediction to measure the performance and classification ability of the optimal model.
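The 70%/20%/10% random split can be sketched as follows; note that 300 samples split this way give exactly the 210/60/30 counts quoted above. The fixed seed is an assumption for reproducibility of the example.

```python
import random

def split_dataset(items, seed=0):
    """Shuffle annotated samples and split them 70% / 20% / 10% into
    training, validation, and test sets, as in step S204."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_train, n_val = int(n * 0.7), int(n * 0.2)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])
```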
S300: an improved YOLO-V3 algorithm framework is established:
As a preferred embodiment of the invention, in S300 the Darknet-53 architecture is used as the base network: the input image is first scaled to 416×416 pixels and divided into S×S grids; EfficientNet is then used for image feature extraction, and each grid cell is checked for flame or smoke. Specifically, a difference operation is performed pixel by pixel in four directions on the center points of the separated high-risk area and flame area to approximate the gradient in those four directions. Taking the horizontal direction as an example, the gradient value is computed in a 3×3 area centered on R5, and the gradient value of R6 is computed with the same template. The series of gradient values computed along the flame boundary is stored in a singly linked list Q1; when the next frame is acquired, the steps are repeated and recorded as Q2; if the difference between Q1 and Q2 exceeds a preset flame gradient threshold, flame is judged to be present. The Loss computed during training in the improved YOLO-V3 algorithm framework is as follows:
This is the Loss of formula (1) above, where S denotes the grid size; x_i denotes a predicted value and the hatted quantities the corresponding label values; S² takes the values 13×13, 26×26, and 52×52; B denotes a box; 1_ij^obj is 1 if the box at (i, j) contains a target and 0 otherwise; 1_ij^noobj is 1 if the box at (i, j) contains no target and 0 otherwise; w and h denote the width and height of the Ground Truth box; C_i denotes the parameter confidence; p_i(c) denotes the classification error term. If the center of a detected target falls in a grid cell, that cell is responsible for detecting the target, and features are extracted through successive depth-separable convolutions, global average pooling, feature compression, and feature expansion.
S400, pre-training the network:
as a preferred technical scheme of the invention, in step S400, the improved YOLO-V3 model is pre-trained through the fire image training data set, in order to better process high-resolution images, firstly, an input image is scaled to 416×416 pixels, then, feature extraction of the images is carried out by using EfficientNet, after continuous depth separable convolution, global average pooling, feature compression and feature expansion, the feature images learned by three times of depth separable convolution are up-sampled, and prediction frames with different scales are obtained through pushing of feature pyramids.
As a preferred embodiment of the invention, the improved YOLO-V3 model proposed in step S400 predicts bounding boxes at three different scales: 13×13, 26×26, and 52×52. Fire detection is provided for the target classification. FIG. 3 shows the P-R curves of the improved YOLO-V3 algorithm compared with unmodified YOLO-V3 and the Faster R-CNN algorithm.
As a preferred technical scheme of the invention, in step S400 a series of experiments is performed on test images with the improved YOLO-V3 model (detection results shown in FIG. 4) to verify the performance of the algorithm; the effectiveness of the neural network model is evaluated with the related indices Precision, Recall, and F1 score, as follows:
precision and Recall are defined as follows:
(1)
(2)
The F1 score is defined as follows:

$$F1 = \frac{2 \times Precision \times Recall}{Precision + Recall} \qquad (3)$$
the study also used another evaluation index for object detection-Average Precision (AP). The definition is as follows:
(4)
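The four metrics above can be computed directly from TP/FP/FN counts and a sampled precision-recall curve. Approximating the AP integral by the trapezoidal rule over discrete (recall, precision) points is a simplification for this sketch; formally AP is the area under the continuous P-R curve.

```python
def precision(tp, fp):
    """Formula (1): fraction of detections that are correct."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Formula (2): fraction of ground-truth targets that are found."""
    return tp / (tp + fn)

def f1(p, r):
    """Formula (3): harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

def average_precision(pr_points):
    """Formula (4), approximated: trapezoidal area under a list of
    (recall, precision) points sorted by increasing recall."""
    ap = 0.0
    for (r0, p0), (r1, p1) in zip(pr_points, pr_points[1:]):
        ap += (r1 - r0) * (p0 + p1) / 2.0
    return ap
```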
the contents of the small target data set are all fires and smoke containing small targets, and the detected targets are very small in area in the image. By comparing the detection of the three models on the data set of the small target, the proposed model has better detection effect on the very small target than the fast R-CNN, the Yolo-V3 network, in the detection result, the Yolo 3-EfficientNet can detect all fires and cigarettes in the picture, and the Yolo-V3 and the fast R-CNN can only detect part of the targets in the image. The final detection result is shown in FIG. 5.
The present invention is not limited to the specific technical solutions described in the above embodiments, and other embodiments may be provided in addition to the above embodiments. Any modifications, equivalent substitutions, improvements, etc. made by those skilled in the art, which are within the spirit and principles of the present invention, are intended to be included within the scope of the present invention.

Claims (5)

1. A small-target fire detection method based on an improved YOLO algorithm, characterized by comprising the following steps:
s100, establishing a fire data set, and collecting images of fire and smoke, including images of flames and smoke under different weather and light;
s200, marking a data set;
S300, building an improved YOLO-V3 algorithm framework: using the Darknet-53 architecture as the base network, first scaling the input image, dividing the input fire image into S×S grids, then using EfficientNet to extract image features and detecting whether flame or smoke is present in each grid cell;
S400, pre-training the network: pre-training the improved YOLO-V3 model on the fire image training data set, up-sampling the feature maps learned by three depth-separable convolutions, deriving prediction boxes at different scales through a feature pyramid, having each grid cell predict 3 bounding boxes with confidence scores, and finally screening the candidate boxes by non-maximum suppression; the confidence reflects whether a grid cell contains an object and, when it does, the accuracy of the predicted bounding box; when multiple bounding boxes detect the same target, the YOLO network uses non-maximum suppression to select the best bounding box, i.e., the one whose confidence meets the threshold.
2. The small-target fire detection method based on the improved YOLO algorithm of claim 1, wherein step S200 comprises the following steps:
S201, scaling the collected photographs of the target object to half of their original size;
S202, centering the scaled small-target data set picture on an 1850×1850 pixel canvas;
S203, labeling the small-target data set with LabelImg software, recording for each labeled picture the annotation box name, path, pixel position, and label category in XML format, the final data set being stored in PASCAL VOC format;
S204, randomly dividing the labeled data set into a training set, a validation set, and a test set in a 70%/20%/10% ratio.
3. The small-target fire detection method based on the improved YOLO algorithm of claim 2, wherein: in step S300, the specific method by which each grid cell detects flame or smoke is to perform a difference operation, pixel by pixel in four directions, on the center points of the separated high-risk area and flame area, computing the gradient as grad u = a_x(∂u/∂x) + a_y(∂u/∂y) + a_z(∂u/∂z);
and storing the series of gradient values computed along the flame boundary in a singly linked list Q1; when the next frame is collected, repeating the steps and recording the result as Q2; if the difference between Q1 and Q2 exceeds a preset flame gradient threshold, judging that flame is present.
4. The small-target fire detection method based on the improved YOLO algorithm of claim 3, wherein: in step S300, the Loss computed during training in the improved YOLO-V3 algorithm framework is:

$$
\begin{aligned}
Loss = {}& \lambda_{coord}\sum_{i=0}^{S^2}\sum_{j=0}^{B}\mathbb{1}_{ij}^{obj}\left[(x_i-\hat{x}_i)^2+(y_i-\hat{y}_i)^2\right]\\
&+\lambda_{coord}\sum_{i=0}^{S^2}\sum_{j=0}^{B}\mathbb{1}_{ij}^{obj}\left[\left(\sqrt{w_i}-\sqrt{\hat{w}_i}\right)^2+\left(\sqrt{h_i}-\sqrt{\hat{h}_i}\right)^2\right]\\
&+\sum_{i=0}^{S^2}\sum_{j=0}^{B}\mathbb{1}_{ij}^{obj}\left(C_i-\hat{C}_i\right)^2+\lambda_{noobj}\sum_{i=0}^{S^2}\sum_{j=0}^{B}\mathbb{1}_{ij}^{noobj}\left(C_i-\hat{C}_i\right)^2\\
&+\sum_{i=0}^{S^2}\mathbb{1}_{i}^{obj}\sum_{c\in classes}\left(p_i(c)-\hat{p}_i(c)\right)^2
\end{aligned}
$$

where S denotes the grid size; S² takes the values 13×13, 26×26, and 52×52; B denotes a box; 1_ij^obj is 1 if the box at (i, j) contains a target and 0 otherwise; 1_ij^noobj is 1 if the box at (i, j) contains no target and 0 otherwise; w and h denote the width and height of the Ground Truth box; C_i denotes the parameter confidence; p_i(c) denotes the classification error term; if the center of a detected target falls in a grid cell, that cell is responsible for detecting the target, and features are extracted through successive depth-separable convolutions, global average pooling, feature compression, and feature expansion.
5. The small-target fire detection method based on the improved YOLO algorithm of claim 4, wherein: in step S400, the confidence is defined as Confidence = P_r(Object) × IoU_pred^truth, with P_r(Object) ∈ {0, 1}: P_r = 1 when the target is in the grid cell, and 0 otherwise; IoU denotes the overlap between the predicted bounding box and the ground-truth bounding box.
CN202111162490.1A 2021-09-30 2021-09-30 Small target fire detection method based on improved YOLO algorithm Active CN114445729B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111162490.1A CN114445729B (en) 2021-09-30 2021-09-30 Small target fire detection method based on improved YOLO algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111162490.1A CN114445729B (en) 2021-09-30 2021-09-30 Small target fire detection method based on improved YOLO algorithm

Publications (2)

Publication Number Publication Date
CN114445729A CN114445729A (en) 2022-05-06
CN114445729B true CN114445729B (en) 2024-03-29

Family

ID=81362710

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111162490.1A Active CN114445729B (en) 2021-09-30 2021-09-30 Small target fire detection method based on improved YOLO algorithm

Country Status (1)

Country Link
CN (1) CN114445729B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101869442B1 (en) * 2017-11-22 2018-06-20 공주대학교 산학협력단 Fire detecting apparatus and the method thereof
CN109147254A (en) * 2018-07-18 2019-01-04 武汉大学 A kind of video outdoor fire disaster smog real-time detection method based on convolutional neural networks
CN112233073A (en) * 2020-09-30 2021-01-15 国网山西省电力公司大同供电公司 Real-time detection method for infrared thermal imaging abnormity of power transformation equipment

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101869442B1 (en) * 2017-11-22 2018-06-20 공주대학교 산학협력단 Fire detecting apparatus and the method thereof
CN109147254A (en) * 2018-07-18 2019-01-04 武汉大学 A kind of video outdoor fire disaster smog real-time detection method based on convolutional neural networks
CN112233073A (en) * 2020-09-30 2021-01-15 国网山西省电力公司大同供电公司 Real-time detection method for infrared thermal imaging abnormity of power transformation equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Improved YOLO v3 fire detection algorithm embedding a DenseNet structure and dilated convolution modules; Zhang Wei; Wei Jingjing; Journal of Tianjin University (Science and Technology); 2020-06-29 (No. 09); full text *

Also Published As

Publication number Publication date
CN114445729A (en) 2022-05-06

Similar Documents

Publication Publication Date Title
CN112861635B (en) Fire disaster and smoke real-time detection method based on deep learning
CN112001339A (en) Pedestrian social distance real-time monitoring method based on YOLO v4
CN110135374A (en) It is identified using image block characteristics and returns the fire hazard smoke detecting method classified
CN115761627A (en) Fire smoke flame image identification method
CN114399719A (en) Transformer substation fire video monitoring method
CN116824335A (en) YOLOv5 improved algorithm-based fire disaster early warning method and system
CN115512387A (en) Construction site safety helmet wearing detection method based on improved YOLOV5 model
CN112711996A (en) System for detecting occupancy of fire fighting access
CN116206223A (en) Fire detection method and system based on unmanned aerial vehicle edge calculation
CN111986156A (en) Axe-shaped sharp tool detection method, system, device and storage medium
TWI696958B (en) Image adaptive feature extraction method and its application
CN114662605A (en) Flame detection method based on improved YOLOv5 model
CN112907138B (en) Power grid scene early warning classification method and system from local to whole perception
CN114445729B (en) Small target fire detection method based on improved YOLO algorithm
CN117456325A (en) Rice disease and pest detection method
CN114821978B (en) Method, device and medium for eliminating false alarm
CN116543308A (en) Landslide detection early warning model and early warning method based on multi-model fusion
CN107886049B (en) Visibility recognition early warning method based on camera probe
CN116188442A (en) High-precision forest smoke and fire detection method suitable for any scene
CN115311601A (en) Fire detection analysis method based on video analysis technology
CN116129343A (en) Fire-fighting channel occupation detection method and device and electronic equipment
CN111191575B (en) Naked flame detection method and system based on flame jumping modeling
CN114120208A (en) Flame detection method, device, equipment and storage medium
CN113706815A (en) Vehicle fire identification method combining YOLOv3 and optical flow method
CN113191182A (en) Violent abnormal behavior detection method based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant