CN116311078A - Forest fire analysis and monitoring method and system - Google Patents

Forest fire analysis and monitoring method and system

Info

Publication number
CN116311078A
Authority
CN
China
Prior art keywords
image
network model
cnn network
smoke
texture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310525215.4A
Other languages
Chinese (zh)
Inventor
薛方俊
蒋先勇
李志刚
魏长江
李财
胡晓晨
税强
曹尔成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Sanside Technology Co ltd
Original Assignee
Sichuan Sanside Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Sanside Technology Co ltd filed Critical Sichuan Sanside Technology Co ltd
Priority to CN202310525215.4A priority Critical patent/CN116311078A/en
Publication of CN116311078A publication Critical patent/CN116311078A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/50: Context or environment of the image
    • G06V 20/52: Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/20: Image preprocessing
    • G06V 10/40: Extraction of image or video features
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82: Arrangements using pattern recognition or machine learning using neural networks
    • G06V 20/10: Terrestrial scenes
    • G06V 20/13: Satellite images
    • G06V 20/17: Terrestrial scenes taken from planes or by drones

Landscapes

  • Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Remote Sensing (AREA)
  • Health & Medical Sciences (AREA)
  • Astronomy & Astrophysics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Fire-Detection Mechanisms (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a forest fire monitoring and protection method and system, in particular to a forest fire analysis and monitoring method and system. The method comprises the following operations: acquiring real-time images of a monitoring area; preprocessing the image set; performing feature extraction with a CNN network model and distinguishing smoke regions from non-smoke regions; extracting features of and calibrating the positions of identified smoke regions; and judging whether to send alarm information. Feature extraction by the CNN network model comprises: performing infrared, color and texture feature region enhancement and feature extraction on an input image, and optimizing the CNN network model using a confusion matrix and cross-validation; the optimized CNN network model then performs feature extraction and position calibration on the preprocessed images. By preprocessing the images and enhancing image characteristics in the infrared, color and texture feature regions, i.e., multi-dimensional processing, the method improves the distinction between smoke and non-smoke regions and between smoke and fog, and enables timely judgment and positioning.

Description

Forest fire analysis and monitoring method and system
Technical Field
The invention relates to forest fire monitoring and protection, in particular to a forest fire analysis and monitoring method and system, electronic equipment and a computer storage medium.
Background
Forest fires are serious disasters that cause great harm to human society and the natural environment. Traditional forest fire monitoring relies mainly on manual observation and patrol, which wastes human resources, covers a limited range, and has low accuracy. In recent years, with the rapid development of computer vision and artificial intelligence, research on forest fire monitoring using technologies such as image processing and deep learning has received increasing attention. However, existing forest fire monitoring methods suffer from low detection precision, high false alarm rates, and sensitivity to various kinds of image noise. When identifying mist and smoke over a large, distant monitoring area, the long acquisition distance and the limitations of the camera make details in the image very small; the resolution drops, the details are no longer clear enough, and finer analysis and processing become difficult. Mist and smoke are especially easy to confuse: water mist forms when water vapor in the air reaches saturation in the morning and evening, a phenomenon particularly common in humid climates such as rainforests and cloud forests. When such mist is present at the same time as smoke, simple machine identification is negatively affected, and monitoring is also difficult in areas with large non-fire heat sources such as geysers and hot springs. As a result, current machine identification only succeeds when the fire features are already very obvious, often when flames are visible, by which time the best window for controlling the fire early has been missed. A forest fire monitoring method that distinguishes smoke from fog more accurately and detects fires earlier and more reliably is therefore needed.
Disclosure of Invention
The invention aims to provide a forest fire analysis and monitoring method and system, which utilize deep learning to identify and position a smoke area by preprocessing an image set, extracting various different characteristics and adopting the characteristic enhancement technology of the scheme, thereby realizing accurate judgment, monitoring and early warning of forest fires.
The first aspect of the invention provides a forest fire analysis and monitoring method, which comprises the following operations:
acquiring real-time images of the monitoring area from ground cameras distributed at different positions in the forest zone, and acquiring real-time images of the monitoring area from unmanned aerial vehicles and/or satellites, to obtain a real-time image set to be preprocessed; preprocessing the image set; transmitting the preprocessed images to a preset CNN network model for feature extraction and distinguishing smoke from non-smoke regions; extracting features of and calibrating the positions of the identified smoke regions; and judging whether to send alarm information. Transmitting the preprocessed images to the CNN network model for feature extraction comprises: performing infrared enhancement, color feature region enhancement and texture feature region enhancement on an input image and extracting the corresponding features; optimizing the CNN network model using a confusion matrix and cross-validation according to at least two of the infrared enhanced, color feature region enhanced and texture feature region enhanced images; and performing feature extraction and position calibration on the preprocessed images with the optimized CNN network model.
When the currently used image acquisition equipment monitors from too far away or over too large a range, smoke appearing in the forest confuses current machine or manual identification, and waiting until the smoke and water vapor features are very obvious misses the window for controlling the fire as early as possible. The method preprocesses the images and enhances their features through infrared enhancement, color feature region enhancement and texture feature region enhancement, i.e., multi-dimensional, multi-level processing, improving the distinction between smoke and non-smoke regions. The CNN network model then performs feature extraction on the preprocessed images and distinguishes smoke from non-smoke regions, so smoke can be effectively identified and fire separated from non-smoke regions. Feature extraction and position calibration of the identified smoke regions are guaranteed, helping determine the position and extent of the fire. Optimizing the CNN network model with a confusion matrix and cross-validation improves the accuracy and robustness of the model, so smoke and fire regions are identified better, the false alarm and missed alarm rates are reduced, and relevant personnel can be notified in time to reduce the losses caused by the fire.
In some possible embodiments, images are acquired under different environmental conditions depending on the carrier platform, and different enhancement modes are applied to the images input into the CNN network model according to their characteristics, as follows:
CNN network models with different fully connected layer output functions are respectively configured for ground camera acquired images, unmanned aerial vehicle acquired images and satellite acquired images, the fully connected layer output functions comprising infrared weights, texture weights and color weights;
a first CNN network model is configured for unmanned aerial vehicle acquired images and extracts infrared, color and texture features, wherein in the fully connected layer output function of the first CNN network model the texture weight and the color weight are greater than the infrared weight;
a second CNN network model is configured for satellite acquired images and extracts infrared, color and texture features, wherein in the fully connected layer output function of the second CNN network model the infrared weight and the color weight are greater than the texture weight;
a third CNN network model is configured for ground camera acquired images and extracts infrared, color and texture features, wherein in the fully connected layer output function of the third CNN network model the texture weight and the color weight are greater than the infrared weight; the texture weights of the first and third CNN network models are greater than the texture weight of the second CNN network model; a global average pooling layer is set at the tail of each CNN network model to convert each feature map into a fixed-length vector; the output vector of each CNN network model is normalized; and either operation a or b is performed:
a. splicing the output vector of the first or the second CNN network model with the output vector of the third CNN network model to obtain a total comprehensive feature vector, which is twice the length of each normalized CNN output vector;
b. splicing the output vectors of the first, second and third CNN network models to obtain a total comprehensive feature vector, which is three times the length of each normalized CNN output vector;
and inputting the obtained total comprehensive feature vector into a classifier for classification, obtaining a judgment of whether the smoke-like region is smoke or fog.
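To illustrate operations a and b above, a minimal Python sketch follows (not part of the original disclosure; the 128-dimensional vectors, random data and the choice of L2 normalization are illustrative assumptions):

```python
import numpy as np

def l2_normalize(v):
    # Scale a feature vector to unit length; assumed normalization choice.
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

# Hypothetical fixed-length outputs of the global average pooling layers of
# the first (UAV), second (satellite) and third (ground camera) CNN models.
v1, v2, v3 = (np.random.rand(128) for _ in range(3))

# Operation a: splice one aerial output vector with the ground-camera vector;
# the result is twice the length of a single normalized output vector.
total_a = np.concatenate([l2_normalize(v1), l2_normalize(v3)])

# Operation b: splice all three output vectors; three times the length.
total_b = np.concatenate([l2_normalize(v1), l2_normalize(v2), l2_normalize(v3)])
```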
According to the scheme, three different platforms (image data acquisition sources) are adopted to configure different CNN network models, and feature extraction is carried out on all three features respectively, so that the comprehensive enhancement effect of each feature is achieved, and the advantages of different image acquisition sources are exerted.
Specifically, the infrared, texture and color vectors can be input into one fully connected layer, and three different weights are then output by the activation function. Assume the fully connected layer output is $f(x)$, with $f_{IR}(x)$, $f_{Texture}(x)$ and $f_{Color}(x)$ as its three output branches; the weights of the infrared, texture and color vectors, $w_{IR}$, $w_{Texture}$ and $w_{Color}$, can then be expressed as:

$$w_{k} = \frac{e^{f_{k}(x)}}{e^{f_{IR}(x)} + e^{f_{Texture}(x)} + e^{f_{Color}(x)}},\qquad k \in \{IR,\ Texture,\ Color\}$$

where the softmax function converts the output of the fully connected layer into a probability distribution.

The weights can then be multiplied by the infrared, texture and color vectors respectively to obtain the weighted feature vectors:

$$f'_{k}(x) = w_{k}\, f_{k}(x),\qquad k \in \{IR,\ Texture,\ Color\}$$

where $f'_{IR}(x)$, $f'_{Texture}(x)$ and $f'_{Color}(x)$ are the weighted infrared, texture and color vectors, respectively.
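A minimal NumPy sketch of this weighting computation follows (illustrative only; the branch outputs and the 128-dimensional feature vectors are hypothetical):

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax; converts branch outputs to a distribution.
    e = np.exp(z - np.max(z))
    return e / e.sum()

# Hypothetical scalar outputs f_IR(x), f_Texture(x), f_Color(x) of the
# three branches of the fully connected layer for one input image.
branch_outputs = np.array([1.2, 0.4, 0.9])
w_ir, w_texture, w_color = softmax(branch_outputs)

# Hypothetical infrared, texture and color feature vectors.
f_ir, f_texture, f_color = (np.random.rand(128) for _ in range(3))

# Weighted feature vectors f'_IR(x), f'_Texture(x), f'_Color(x).
f_ir_w = w_ir * f_ir
f_texture_w = w_texture * f_texture
f_color_w = w_color * f_color
```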
Images collected by ground cameras are mainly affected by environmental factors such as illumination and weather, and their color characteristics may be more obvious, so a higher weight can be set for the color features. Meanwhile, because a camera is placed at a fixed position, its texture features follow certain regularities, so the scheme also raises the weight of the texture features.
The unmanned aerial vehicle collected images generally have higher resolution and wider visual field, and the details and the local features can be better captured, so that the weight of the texture features and the color features is properly improved.
The satellite acquired image has wider coverage range, can comprehensively monitor the forest situation, but has relatively low resolution, and cannot obtain fine texture information, so that the weight of texture features is reduced.
The specific weight setting needs to be adjusted according to the data set, and is optimized through a cross-validation method.
By adopting this scheme, image characteristics from multiple sources are integrated, smoke and forest fire can be described more comprehensively, and classification precision is improved. Image features from different sources have different advantages: infrared images can detect the heat radiation of smoke or fire, color feature images can detect spectral information of the smoke and its surroundings, and texture feature images can detect texture information of the smoke and its surroundings; combining features from different sources brings these advantages into full play. For each source image, a CNN network model extracts its features; normalization puts the vectors output by each CNN network model in the same range; and different classifiers, such as logistic regression and SVM, can be used to classify the total comprehensive feature vector, chosen according to the specific situation and tuned to actual requirements to improve classification accuracy. Different CNN network model structures can also be selected according to actual conditions to obtain a better classification effect.
In some possible embodiments, for the color feature region enhanced image, the RGB color space is converted into the HSV color space and the color histogram and color statistical features are extracted, obtaining color feature information;
for the texture feature region enhanced image, the angular second moment (ASM) and Contrast are used to extract texture information of the image, obtaining texture feature information;
for the infrared enhanced image, heat radiation intensity and infrared radiation power characteristic information is extracted, obtaining infrared feature information;
each CNN network model is optimized using a confusion matrix and cross-validation according to at least two of the color feature information, texture feature information and infrared feature information;
and the optimized CNN network model performs feature extraction and position calibration on subsequently input preprocessed images.
According to this scheme, various image processing techniques are adopted to extract different feature information, which is transmitted to a deep convolutional neural network (CNN) model for comprehensive analysis, so forest fires can be effectively monitored and detected, especially under complex environmental conditions (such as water mist, smoke and the like).
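For the color-feature step described above, a minimal OpenCV sketch might look as follows (the file name, histogram bin counts and the choice of H/S channels are assumptions, not part of the original disclosure):

```python
import cv2
import numpy as np

# Hypothetical preprocessed frame from one of the image sources.
bgr = cv2.imread("frame.jpg")
hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)

# Color histogram over the H and S channels (V is illumination-sensitive).
hist = cv2.calcHist([hsv], [0, 1], None, [30, 32], [0, 180, 0, 256])
hist = cv2.normalize(hist, hist).flatten()

# Per-channel mean and standard deviation as color statistical features.
pixels = hsv.reshape(-1, 3).astype(np.float32)
stats = np.concatenate([pixels.mean(axis=0), pixels.std(axis=0)])

color_features = np.concatenate([hist, stats])
```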
In some possible embodiments, the first and third CNN network models are each configured as a CNN-based object detection model, and the second CNN network model is configured as a lightweight model; remote sensing image processing is performed in advance on images input into the second CNN network model, or remote sensing image processing is performed in advance on the images input into either or both of the first and third CNN network models.
The method takes factors such as the data source, processing speed and real-time requirements into consideration when selecting the model carrier. For real-time monitoring tasks, the model is deployed on edge equipment to realize near-real-time monitoring and response; for offline processing tasks, a high-performance server or cloud computing resources can be used to accelerate model training and inference.
In some possible embodiments, acquiring a real-time image of the monitoring area means obtaining real-time image data by combining two of satellite image acquisition, unmanned aerial vehicle image acquisition and ground camera image acquisition, the two always including ground camera image acquisition; a monitoring target area of the unmanned aerial vehicle or the ground camera is determined, and an image of the monitoring target area is acquired by the unmanned aerial vehicle or the ground camera; when smoke-like characteristics are detected, a satellite image and/or an unmanned aerial vehicle image is called, and the satellite and/or unmanned aerial vehicle image and the ground camera image are preprocessed; the preprocessed satellite and/or unmanned aerial vehicle image is input into the CNN network model for feature extraction to obtain at least two feature vectors among infrared, color and texture; and the at least two feature vectors are fused with the ground camera image to obtain the output vector of the CNN network model, completing feature extraction and position calibration of the monitoring target area according to the output vector.
According to this scheme, the satellite and/or unmanned aerial vehicle images are called only after a suspected area has been identified from ground camera images, reducing idle occupation of image processing resources. The ground camera images are preferred for real-time, fixed, full-range monitoring: they supply real-time image data and an approximate suspected area first, realizing real-time monitoring, guaranteeing the timeliness and precision of forest fire monitoring, and covering the designated area around the clock. When a suspected area is found, a nearby unmanned aerial vehicle is called for real-time image acquisition of the area; if no nearby unmanned aerial vehicle can acquire an image, satellite images can be used instead, or satellite images can be acquired at the same time as the unmanned aerial vehicle images to obtain multi-angle viewpoints. Through acquisition and processing of satellite images, image information of the forest fire target area can be obtained from different angles and viewpoints, improving the comprehensiveness and accuracy of monitoring. The CNN network model extracts features from the satellite and/or unmanned aerial vehicle images to obtain at least two feature vectors among infrared, color and texture; these feature vectors help characterize the forest fire target area and improve the accuracy of forest fire monitoring. Fusing the feature vectors of the satellite and/or unmanned aerial vehicle images with the ground camera image yields the output vector of the CNN network model, improving monitoring reliability.
In some possible embodiments, fusing the at least two feature vectors with the ground camera image includes: fusing the at least two feature vectors by weighted averaging and feature splicing to obtain a sub-comprehensive feature vector, which is the output vector of the CNN network model.
Combining the feature vectors improves classification performance and yields comprehensive feature information. Fusing them into a sub-comprehensive feature vector reduces the dimensionality of the feature vectors, cutting computation and memory overhead, and fusing several different feature vectors together reduces overfitting to any single feature vector, improving the generalization ability of the model. The computer therefore processes received suspected smoke region images more efficiently, avoiding time lost to data judgment before measures are taken against a forest fire.
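A minimal sketch of this fusion, assuming NumPy, illustrative 64-dimensional vectors and assumed fusion weights, is:

```python
import numpy as np

def weighted_average(vectors, weights):
    # Fuse same-type feature vectors (e.g. infrared from two sources).
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return np.tensordot(w, np.stack(vectors), axes=1)

# Hypothetical normalized feature vectors extracted by the CNN models.
ir_satellite, ir_uav = np.random.rand(64), np.random.rand(64)
texture_camera = np.random.rand(64)

ir_fused = weighted_average([ir_satellite, ir_uav], [0.6, 0.4])  # assumed weights

# Feature splicing yields the sub-comprehensive feature vector.
sub_comprehensive = np.concatenate([ir_fused, texture_camera])
```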
In some possible embodiments, the texture weights of the camera and the unmanned aerial vehicle are set higher; however, there is usually more than one ground camera in different directions able to photograph the target area, so to further improve the accuracy of texture recognition, the texture features from the image acquisition sources of ground cameras at at least two different positions are registered and fused, further enhancing the texture features of the similar-smoke area as seen by multiple cameras;
The method also comprises registering and fusing the feature vectors of the same type;
when a plurality of ground cameras all perform image acquisition and extract texture feature vectors, the reference image and the registered image are fused using an LPT (Laplacian pyramid transform) based method, comprising the following steps:
selecting two registered camera images and applying the LPT image fusion algorithm to obtain a fused image; then fusing the resulting image with the next camera image using the same fusion method, and so on, until all callable camera images of the similar-smoke area are fully fused, obtaining the LPT fusion image.
Because at least two feature vectors (infrared, texture and color) are fused with the ground camera images to obtain the output vector of the CNN network model, texture quality must be guaranteed. The LPT image fusion method of this embodiment produces a fused image of the smoke texture features photographed by the ground cameras, directly exploiting the multiple ground cameras distributed through the forest zone: the texture features are further enhanced on the same image plane, the acquired smoke texture information is richer, and more reliable quality is guaranteed for the subsequent splicing and fusion with the feature vectors acquired and extracted from other image sources.
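The Laplacian-pyramid fusion of two registered camera images might be sketched as follows (a common per-level max-magnitude fusion rule is assumed here; the original text does not specify one, and the pyramid depth is illustrative):

```python
import cv2
import numpy as np

def laplacian_pyramid(img, levels=4):
    # Build a Laplacian pyramid from a grayscale image.
    gp = [img.astype(np.float32)]
    for _ in range(levels):
        gp.append(cv2.pyrDown(gp[-1]))
    lp = [gp[i] - cv2.pyrUp(gp[i + 1], dstsize=(gp[i].shape[1], gp[i].shape[0]))
          for i in range(levels)]
    lp.append(gp[-1])
    return lp

def fuse_lpt(img_a, img_b, levels=4):
    # Per-level max-magnitude selection, then pyramid reconstruction.
    fused = [np.where(np.abs(a) >= np.abs(b), a, b)
             for a, b in zip(laplacian_pyramid(img_a, levels),
                             laplacian_pyramid(img_b, levels))]
    out = fused[-1]
    for lvl in reversed(fused[:-1]):
        out = cv2.pyrUp(out, dstsize=(lvl.shape[1], lvl.shape[0])) + lvl
    return np.clip(out, 0, 255).astype(np.uint8)

# Sequential fusion of several registered camera images, as described above:
# fused = fuse_lpt(cam1, cam2); fused = fuse_lpt(fused, cam3); ...
```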
In some feasible embodiments, when a similar-smoke area is captured by a ground camera and an unmanned aerial vehicle is called, a virtual control field is established centered on the position of the target area, so that unmanned aerial vehicles whose routes enter the virtual control field are called to pass through the target area and continue on their routes after the area is judged to be mist;
a control instruction is generated for each unmanned aerial vehicle entering the virtual control field, so that a virtual force field is established between the unmanned aerial vehicles inside it, and images from the unmanned aerial vehicles entering the virtual control field are acquired.
Thus, when a similar-smoke area is identified and the first unmanned aerial vehicle continues on its route after initially judging the area to be mist, a recognition error would mean the fire-extinguishing window is missed. The scheme therefore keeps unmanned aerial vehicle image acquisition coverage wide enough over a large forest area: a drone is not held back in the area merely because a similar-smoke area at some place was judged to be mist, which would prevent it from completing its preset cruising route. Instead, the virtual control field is established and the drone continues its preset route after the mist judgment; to guard against losses from misjudgment, other unmanned aerial vehicles whose routes enter the virtual control field also pass over the similar-smoke area and make a secondary judgment after a period of time. This provides double verification at two points in time and multiple verification by multiple drones, ensuring that a single suspected area does not consume excessive unmanned aerial vehicle resources while keeping identification reliable. Coordinated with the ground camera and satellite images, resource allocation is optimized, and timely recognition does not require the smoke features to be already obvious.
In some possible embodiments, a CNN network model is trained to obtain the preset CNN network model; the training comprises the following steps:
collecting real forest fire smoke images and synthetic images for training a GAN network model to obtain a data set; training the generator and discriminator of the GAN network model to generate synthetic images; expanding the data set with the synthetic images generated by the GAN network model and the real images; adaptively enhancing the data set; and joint training, in which the synthetic images generated by the GAN network model and the real images are mixed together and passed to the CNN network model for training, to improve the robustness and generalization ability of the preset CNN network model.
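A minimal sketch of the joint-training data pipeline, assuming PyTorch/torchvision and a hypothetical folder layout (neither is specified in the original text), is:

```python
from torch.utils.data import ConcatDataset, DataLoader
from torchvision import datasets, transforms

# Hypothetical layout: data/real/ and data/gan_synthetic/, each containing
# class subfolders such as smoke/, fog/ and background/.
tf = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
real = datasets.ImageFolder("data/real", transform=tf)
synthetic = datasets.ImageFolder("data/gan_synthetic", transform=tf)

# Mix GAN-generated and real images into one expanded training set.
train_loader = DataLoader(ConcatDataset([real, synthetic]),
                          batch_size=32, shuffle=True)

# train_loader then feeds the CNN network model's training loop as usual.
```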
The second aspect of the invention also provides a forest fire analysis and monitoring system, comprising an image data acquisition and communication module, a processing module and an alarm module. The image data acquisition and communication module is used to acquire image data collected by the cameras, unmanned aerial vehicles and satellites. The processing module, connected to the image data acquisition and communication module, preprocesses the received image data, performs feature extraction on it through a CNN network model, distinguishes smoke from non-smoke regions, performs feature extraction and position calibration on the identified smoke regions, and outputs a judgment result. The alarm module, connected to the processing module, sends alarm information according to the judgment result. The processing module is further used to perform infrared enhancement, color feature region enhancement and texture feature region enhancement on the input image; the CNN network model is optimized using a confusion matrix and cross-validation according to at least two of the infrared enhanced, color feature region enhanced and texture feature region enhanced images, and the optimized CNN network model performs feature extraction and position calibration on the preprocessed images.
By adopting the system, the image set is preprocessed and various different characteristics are extracted, and the deep learning is utilized to identify and position the smoke area, so that accurate judgment, monitoring and early warning of fire conditions of forests are realized.
The third aspect of the present invention also provides an electronic device comprising a processor and a memory; the memory is used for storing processor executable instructions; the processor is configured to perform the forest fire analysis and monitoring method according to the first aspect and its modified version.
A fourth aspect of the present invention also provides a computer readable storage medium comprising a stored computer program which when run performs a forest fire analysis monitoring method of the first aspect and its modifications.
Drawings
The accompanying drawings, which are included to provide a further understanding of embodiments of the invention and are incorporated in and constitute a part of this application, illustrate embodiments of the invention. In the drawings:
FIG. 1 is a schematic diagram for explaining a forest fire analysis monitoring method in an embodiment;
fig. 2 is a schematic diagram for explaining a multi-image source processing flow in a forest fire analysis and monitoring method according to an embodiment;
FIG. 3 is a schematic diagram for illustrating a forest fire analysis monitoring system in an embodiment;
Reference numerals: 1 - image data acquisition and communication module; 2 - processing module; 3 - alarm module; 4 - first CNN network model; 5 - second CNN network model; 6 - third CNN network model; 7 - unmanned aerial vehicle; 8 - satellite; 9 - ground camera; 10 - classifier.
Detailed Description
For the purpose of making apparent the objects, technical solutions and advantages of the present invention, the present invention will be further described in detail with reference to the following examples and the accompanying drawings, wherein the exemplary embodiments of the present invention and the descriptions thereof are for illustrating the present invention only and are not to be construed as limiting the present invention.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will be apparent to one of ordinary skill in the art that: no such specific details are necessary to practice the invention. In other instances, well-known structures, circuits, materials, or methods have not been described in detail in order not to obscure the invention.
Throughout the specification, references to "one embodiment," "an embodiment," "one example," or "an example" mean: a particular feature, structure, or characteristic described in connection with the embodiment or example is included within at least one embodiment of the invention. Thus, the appearances of the phrases "in one embodiment," "in an example," or "in an example" in various places throughout this specification are not necessarily all referring to the same embodiment or example. Furthermore, the particular features, structures, or characteristics may be combined in any suitable combination and/or sub-combination in one or more embodiments or examples. Moreover, those of ordinary skill in the art will appreciate that the illustrations provided herein are for illustrative purposes and that the illustrations are not necessarily drawn to scale. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
In the description of the present invention, it should be understood that the terms "front", "rear", "left", "right", "upper", "lower", "vertical", "horizontal", "high", "low", "inner", "outer", etc. indicate orientations or positional relationships based on the drawings, are merely for convenience in describing the present invention and simplifying the description, and do not indicate or imply that the apparatus or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and thus should not be construed as limiting the scope of the present invention.
Example 1
Referring to fig. 1, a forest fire analysis monitoring method includes the following operations:
S1, acquiring a real-time image of the monitoring area;
S2, preprocessing the image set, including removing noise and adjusting the size and color of the images;
S3, transmitting the preprocessed images to a preset CNN network model for feature extraction and distinguishing smoke from non-smoke regions, which comprises performing infrared enhancement, color feature region enhancement and texture feature region enhancement on an input image with corresponding feature extraction, and optimizing the CNN network model using a confusion matrix and cross-validation according to at least two of the infrared enhanced, color feature region enhanced and texture feature region enhanced images;
S4, performing feature extraction on the preprocessed images and position calibration of the identified smoke regions with the optimized CNN network model;
S5, judging whether to send alarm information.
When the image acquisition equipment monitors from too far away or over too large a range, smoke appearing in the forest confuses current machine or manual identification, and waiting until the smoke and water vapor features are obvious misses the window for controlling the fire early. The method preprocesses the images with infrared enhancement, color feature region enhancement and texture feature region enhancement, i.e., multi-dimensional processing, which helps enhance the image features and improve the distinction between smoke and non-smoke regions. The CNN network model then performs feature extraction on the preprocessed images and distinguishes smoke from non-smoke regions, effectively identifying smoke and separating fire from non-smoke regions, and performs feature extraction and position calibration on the identified smoke regions, helping determine the position and extent of the fire. Optimizing the CNN network model with a confusion matrix and cross-validation improves the accuracy and robustness of the model, so smoke and fire regions are identified better, the false alarm and missed alarm rates are reduced, and relevant personnel are notified early enough to reduce the losses caused by the fire.
For feature extraction and position calibration of the flame regions and smoke regions identified by the convolutional neural network model, the following operations can be adopted:
Extracting features of the flame or smoke region: features of the flame or smoke region are extracted from the output of the convolutional neural network model. Different feature extraction methods may be employed, such as global feature extraction using the model's global average pooling layer (GlobalAveragePooling), or local feature extraction using the output of one of its convolutional layers.
Performing data processing: the data needs to be preprocessed before feature extraction and position calibration. Image enhancement techniques such as contrast enhancement and histogram equalization can be employed to improve image quality, or operations such as image resizing and normalization performed, guaranteeing the consistency and stability of the data.
Calibrating the position of the flame or smoke region: the position of the region in the image is determined from its features. Different position calibration methods may be used, such as calibrating the position from the center point coordinates of the flame or smoke region, or from its bounding box.
Extracting static and dynamic characteristics of smoke and calibrating the position; for smoke areas, the static and dynamic characteristics of the smoke areas can be extracted, and the positions of the smoke areas can be calibrated. Static features may include information on the size, shape, etc. of the smoke; the dynamic characteristics may include information on the motion trajectory, speed, etc. of the smoke. The position calibration can be performed using the same method.
For example, for feature extraction and position calibration of smoke regions in forest fire monitoring images, a convolutional neural network model such as VGG can be adopted, extracting global features from the output of the network's global average pooling layer. The position of the smoke region in the image is determined by calculating the center point coordinates of the feature map. For static characteristics of the smoke, information such as its area and aspect ratio can be calculated; for dynamic characteristics, information such as its motion trajectory and speed. Based on this feature and position information, forest fire smoke can be monitored and analyzed comprehensively, providing a scientific basis for preventing and controlling forest fires.
In addition to the embodiments described above, another embodiment is as follows:
for the position calibration of the smoke area or the smoke area, the following steps can be adopted:
Extracting a binary image of the flame or smoke region: in the convolutional neural network model, the image is divided into flame and non-flame regions and into smoke and non-smoke regions by a binary classification network, so position calibration can be performed on the binary image of the flame or smoke region. For example, for a forest fire monitoring image containing flame and smoke regions, binary images of the flame region and the smoke region are obtained through the classification model of the convolutional neural network.
Calculating the center point coordinates of the flame or smoke region: for the binary image of the flame or smoke region, an image processing library such as OpenCV may be used to compute its contours and their center point coordinates, determining its location in the image.
Position calibration based on bounding boxes: for flame or smoke regions, a bounding box may be used to demarcate the location. The bounding box can be extracted with an image processing library such as OpenCV, and its center point coordinates, length and width calculated. Before position calibration, morphological processing such as dilation and erosion can be applied to the binary image to strengthen the connectivity of the flame or smoke region and reduce noise.
Data analysis and processing: based on the position and feature information, the calibrated flame or smoke region can be analyzed further, for example calculating its size and shape or analyzing its motion trajectory and speed. A programming language such as Python may be used for this analysis.
For example, bounding box information is extracted with an image processing library such as OpenCV to obtain a smoke region, which is then processed further for calibration. The image is converted into a black-and-white image by threshold-based binarization, where black represents the background and white the foreground (i.e., the calibrated region). The Otsu algorithm can automatically select a suitable binarization threshold based on the distribution of the image's gray values.
The binarized image of the smoke is obtained by this processing and then refined to eliminate noise and connect adjacent areas. Morphological processing is used here, comprising the two steps of dilation and erosion: dilation increases the size of the target region and fills voids in it, while erosion reduces the size of the target region and removes small fragments. Combining the two operations yields a more accurate smoke region.
An image processing library such as OpenCV is then used for contour detection and position calibration. For example, contours are found with the cv2.findContours function, and for each contour its bounding rectangle is computed with cv2.boundingRect. The upper-left and lower-right coordinates of the bounding rectangle are kept, giving the positions of the smoke and flame. Finally these positions are marked in the original image, realizing static and dynamic feature extraction and position calibration of the smoke.
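Combining these steps, a minimal OpenCV sketch (the file name and kernel size are illustrative assumptions) might be:

```python
import cv2

# Hypothetical grayscale input containing a suspected smoke/flame region.
gray = cv2.imread("suspected_region.png", cv2.IMREAD_GRAYSCALE)

# Otsu's method selects the binarization threshold automatically.
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Dilation fills voids in the target region; erosion removes small fragments.
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
binary = cv2.erode(cv2.dilate(binary, kernel), kernel)

# Contour detection and bounding-box position calibration.
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    cx, cy = x + w // 2, y + h // 2  # center point of the region
    print(f"region center ({cx}, {cy}), box ({x}, {y})-({x + w}, {y + h})")
```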
As shown in fig. 2, CNN network models with different fully connected layer output functions are respectively configured for ground camera acquired images, unmanned aerial vehicle acquired images and satellite acquired images, the fully connected layer output functions comprising infrared weights, texture weights and color weights. A first CNN network model is configured for unmanned aerial vehicle acquired images and extracts infrared, color and texture features, wherein in its fully connected layer output function the texture weight and the color weight are greater than the infrared weight; a second CNN network model is configured for satellite acquired images and extracts infrared, color and texture features, wherein in its fully connected layer output function the infrared weight and the color weight are greater than the texture weight; a third CNN network model is configured for ground camera acquired images and extracts infrared, color and texture features, wherein in its fully connected layer output function the texture weight and the color weight are greater than the infrared weight. The texture weights of the first and third CNN network models are greater than the texture weight of the second CNN network model. A global average pooling layer is set at the tail of each CNN network model, converting each feature map into a fixed-length vector; the output vector (output dimension) of each CNN network model is normalized; and either operation a or b is performed:
a. splicing the output vector 1 of the first CNN network model 4 or the output vector 2 of the second CNN network model 5 with the output vector 3 of the third CNN network model 6 to obtain a total comprehensive feature vector, which is twice the length of each normalized CNN output vector;
b. splicing the output vector 1 of the first CNN network model 4, the output vector 2 of the second CNN network model 5 and the output vector 3 of the third CNN network model 6 to obtain a total comprehensive feature vector, which is three times the length of each normalized CNN output vector;
and inputting the obtained total comprehensive feature vector into the classifier 10 for classification to obtain a judgment result.
The infrared, texture and color vectors can be input into one fully connected layer, and three different weights are then output by the activation function. Specifically, assume the fully connected layer output is $f(x)$, with $f_{IR}(x)$, $f_{Texture}(x)$ and $f_{Color}(x)$ as its three output branches; the weights of the infrared, texture and color vectors, $w_{IR}$, $w_{Texture}$ and $w_{Color}$, can then be expressed as:

$$w_{k} = \frac{e^{f_{k}(x)}}{e^{f_{IR}(x)} + e^{f_{Texture}(x)} + e^{f_{Color}(x)}},\qquad k \in \{IR,\ Texture,\ Color\}$$

where the softmax function converts the output of the fully connected layer into a probability distribution.

The weights can then be multiplied by the infrared, texture and color vectors respectively to obtain the weighted feature vectors:

$$f'_{k}(x) = w_{k}\, f_{k}(x),\qquad k \in \{IR,\ Texture,\ Color\}$$

where $f'_{IR}(x)$, $f'_{Texture}(x)$ and $f'_{Color}(x)$ are the weighted infrared, texture and color vectors, respectively.
Images collected by ground cameras are mainly affected by environmental factors such as illumination and weather, and their color characteristics may be more obvious, so a higher weight can be set for the color features. Meanwhile, because a camera is placed at a fixed position, its texture features follow certain regularities, so the scheme also raises the weight of the texture features.
The unmanned aerial vehicle collected images generally have higher resolution and wider visual field, and the details and the local features can be better captured, so that the weight of the texture features and the color features is properly improved.
The satellite acquired image has wider coverage range, can comprehensively monitor the forest situation, but has relatively low resolution, and cannot obtain fine texture information, so that the weight of texture features is reduced.
The specific weight setting needs to be adjusted according to the data set, and is optimized through a cross-validation method.
By implementing the method, a fire monitoring satellite can be adopted, capturing fire-related data such as smoke, smog and hot spots, so large-area forest fires can be monitored in real time; satellite images of the corresponding areas may also be obtained in real time by sending requests to a real-time satellite image database.
The unmanned aerial vehicle 7 can carry equipment such as high-definition and infrared cameras, patrol the forest area, capture data such as smoke, fire and flue gas, and transmit it from the air in real time, letting the terminal server or command center learn of the fire situation in time and respond quickly.
Monitoring stations equipped with meteorological instruments, infrared cameras and similar devices can monitor in real time, with cameras acquiring image data of designated areas continuously and around the clock. Camera image acquisition can also use a mobile application: tourists register and install the application when entering the forest, enabling them to help monitor the area in real time and supply timely ground image data.
The method can also be combined with a sensor network arranged in the forest, monitoring environmental parameters such as temperature, humidity and air pressure for comprehensive early warning.
Specifically, three images from different sources are respectively input into three different CNN network models, and their infrared, color and texture features are extracted.
For each CNN network model, each feature map is converted into a fixed-length vector by adding a global average pooling layer (GlobalAveragePooling) at its end, so each CNN network model outputs a fixed-length vector.
The output vector of each CNN network model may be normalized using MinMaxScaler or StandardScaler.
And splicing the vectors output by each CNN network model to obtain a comprehensive feature vector. The three vectors can be stitched into one vector of three times the length.
For the resulting composite feature vector, a classifier 10 (e.g., logistic regression, SVM, etc.) may be used to classify it and determine if it is a forest fire.
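A minimal scikit-learn sketch of this classification stage follows (the sample count, vector dimensions and labels are hypothetical; fitting the scaler on the full set is a simplification):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import MinMaxScaler

# Hypothetical GAP output vectors of the three CNN models for N images.
N = 200
X_uav, X_sat, X_cam = (np.random.rand(N, 128) for _ in range(3))
y = np.random.randint(0, 2, N)  # assumed labels: 1 = forest fire, 0 = not

# Normalize each model's outputs to a common range, then splice them.
X_total = np.hstack([MinMaxScaler().fit_transform(X)
                     for X in (X_uav, X_sat, X_cam)])

# Any classifier (logistic regression, SVM, ...) can consume the result.
clf = LogisticRegression(max_iter=1000).fit(X_total, y)
print(clf.score(X_total, y))
```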
By adopting this scheme, image characteristics from multiple sources are integrated, smoke and forest fire can be described more comprehensively, and classification precision is improved. Image features from different sources have different advantages: infrared images can detect the heat radiation of smoke or fire, color feature images can detect spectral information of the smoke and its surroundings, and texture feature images can detect texture information of the smoke and its surroundings; combining features from different sources brings these advantages into full play. For each source image, the CNN network model extracts its features; normalization puts the vectors output by each CNN network model in the same range; and different classifiers 10, such as logistic regression and SVM, can be used to classify the obtained total comprehensive feature vector, selected according to the specific situation and tuned to actual requirements to improve classification accuracy. Different CNN network model structures can also be selected according to actual conditions to obtain a better classification effect.
Besides the above fusion steps, images from different sources can be fused at different levels: for example, an infrared image and a color feature image are fused to obtain an intermediate feature, and the intermediate feature is then fused with a texture feature image. This better explores the dependency relationships between the features and provides a better basis for subsequent optimization of the CNN model.
Further, for the color feature region enhanced image, the RGB color space is converted into the HSV color space and the color histogram and color statistical features are extracted, obtaining color feature information;
for the texture feature region enhanced image, the angular second moment (ASM) and Contrast are used to extract texture information of the image, obtaining texture feature information;
for the infrared enhanced image, heat radiation intensity and infrared radiation power characteristic information is extracted, obtaining infrared feature information;
each CNN network model is optimized using a confusion matrix and cross-validation according to at least two of the color feature information, texture feature information and infrared feature information;
and the optimized CNN network model performs feature extraction and position calibration on subsequently input preprocessed images.
For the texture feature region enhanced image, the angular second moment (ASM) and Contrast are used to extract texture information of the image. Based on the gray-level co-occurrence matrix, the calculation formulas are:

Angular Second Moment (ASM):

$$ASM = \sum_{i}\sum_{j} p_{i,j}^{2}$$

Contrast:

$$Contrast = \sum_{i}\sum_{j} (i-j)^{2}\, p_{i,j}$$

where $p_{i,j}$ represents the probability of co-occurrence of pixels having gray values $i$ and $j$.
Specifically, during texture extraction, the image is converted into a grayscale image;
the image is divided into a number of small regions, say blocks of 16x16 pixels;
for each block, the angular second moment and contrast are calculated;
the angular second moments and contrasts of all blocks are averaged;
and the average angular second moment and average contrast are used as part of the texture feature vector to distinguish smoke from fog, as illustrated in the sketch below.
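A minimal sketch of these texture steps using scikit-image's gray-level co-occurrence matrix utilities (the block size and GLCM distance/angle parameters are assumptions) is:

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def block_texture_features(gray, block=16):
    # gray: 2-D uint8 grayscale image; returns mean ASM and mean contrast.
    asm_vals, contrast_vals = [], []
    h, w = gray.shape
    for r in range(0, h - block + 1, block):
        for c in range(0, w - block + 1, block):
            patch = gray[r:r + block, c:c + block]
            glcm = graycomatrix(patch, distances=[1], angles=[0],
                                levels=256, symmetric=True, normed=True)
            asm_vals.append(graycoprops(glcm, "ASM")[0, 0])
            contrast_vals.append(graycoprops(glcm, "contrast")[0, 0])
    return float(np.mean(asm_vals)), float(np.mean(contrast_vals))
```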
For example, consider an 8×4 gray image acquired by a camera (the example image matrix is not reproduced here).

First, the image is converted into a grayscale image. Then, the angular second moment sum and the contrast are calculated.

Assuming a gray value range of 0-255, each pixel value may be divided by 255 to obtain a normalized gray value. The angular second moment sum used in this example is computed as:

$$s = \frac{1}{N}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^{2}$$

where $x_i$ is the gray value of pixel $i$, $\bar{x}$ is the average gray value, and $N$ is the total number of pixels.

From the acquired gray image, the average gray value is calculated to be 1.5, and the angular second moment sum evaluates to $s = 0.25$.

Next, the contrast is calculated as:

$$Contrast = \frac{1}{N\sigma}\sum_{i=1}^{N}\left|x_i - \bar{x}\right|$$

where $\sigma$ is the standard deviation of the pixel gray values; to simplify the calculation, the average absolute difference of the pixel values may be used in place of the standard deviation. Assuming the standard deviation of the image is $\sigma = 0.7$, the contrast is $0.5 / 0.7 \approx 0.714$.

Thus, for this image region, the angular second moment sum of its texture features is 0.25 and the contrast is 0.714. These features are passed to the third CNN model, which, according to the pre-trained model and the matched texture weights, identifies whether they indicate smoke or fog.
According to this scheme, various image processing techniques are adopted to extract different feature information, which is transmitted to a deep convolutional neural network (CNN) model for comprehensive analysis, so forest fires can be effectively monitored and detected, especially under complex environmental conditions (such as water mist, smoke and the like). For color features, converting the RGB color space to the HSV color space captures color information more accurately; extracting the color histogram and statistical features further strengthens the expressiveness of the color features, so differently colored objects such as water mist and smoke are distinguished better. For texture features, the angular second moment (ASM) and Contrast can be used to extract the texture information of an image; such texture information is useful for distinguishing water mist, water vapor, smoke and other objects. For the infrared enhanced image, extracting characteristics such as heat radiation intensity and infrared radiation power helps distinguish fire sources from other objects, remaining usable at night or under low visibility. Finally, all the feature information is transmitted to the CNN model for comprehensive analysis. Optimizing the CNN model with confusion matrix and cross-validation techniques identifies forest fires more accurately. In addition, multi-source image acquisition allows the forest fire position to be determined more accurately from multiple directions, and the model calibrates that position, further improving the monitoring effect.
Specifically, the infrared image is used to distinguish smoke from water mist: an infrared image reflects the heat radiation of different objects or scenes, and the heat radiation characteristics of smoke and water mist differ. At night or during morning fog, water mist may exhibit relatively bright heat radiation, while smoke may exhibit darker heat radiation. By CNN feature extraction and classification of the infrared images, smoke and water mist can be preliminarily distinguished.
Color features are also used to distinguish smoke from water mist, as the two differ markedly in color: smoke is usually gray or black, whereas water mist tends to be white or light gray. By performing color space conversion on the image and classifying the extracted color features with a CNN network model, smoke and water mist can be further distinguished effectively, as sketched below.
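The following is a hedged sketch of such a color feature step with OpenCV: the image is converted to the HSV color space, and a normalized hue-saturation histogram plus simple per-channel statistics form the color feature vector. The bin counts and the use of mean and standard deviation are illustrative choices, not values fixed by the scheme.

```python
import cv2
import numpy as np

def color_features(bgr: np.ndarray) -> np.ndarray:
    """Histogram and statistical color features in HSV space."""
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    # 2-D histogram over hue (0-179 in OpenCV) and saturation (0-255).
    hist = cv2.calcHist([hsv], [0, 1], None, [30, 32], [0, 180, 0, 256])
    hist = cv2.normalize(hist, hist).flatten()
    # Per-channel mean and standard deviation as statistical color features.
    mean, std = cv2.meanStdDev(hsv)
    return np.concatenate([hist, mean.flatten(), std.flatten()])
```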
Texture features likewise help distinguish smoke from water mist: smoke often exhibits a relatively hazy texture, while water mist may exhibit a comparatively sharp texture. By extracting the texture features of the image and classifying them with a CNN, smoke and water mist are distinguished with further accuracy.
By fusing these different characteristics and establishing and applying a comprehensive CNN model, recognition and judgment are carried out through multi-dimensional, multi-angle image information even when the shooting distance is far, the image quality of a single dimension is insufficient, or the smoke amount is still small. This achieves a more accurate and robust distinction between smoke and water mist, so that a more reliable alarm is issued while the fire is still small or even in its incipient state.
In the scheme, the model is evaluated and optimized using the confusion matrix and cross-validation, which are important steps for improving the accuracy of smoke and water mist classification. The confusion matrix compares the real classification result with the predicted classification result, yielding evaluation indexes such as classification accuracy, precision and recall. The confusion matrix includes four classification results: true positive (TP), false positive (FP), true negative (TN) and false negative (FN). TP is the number of samples actually of the positive class that are correctly predicted as positive, FP the number of samples actually negative that are incorrectly predicted as positive, TN the number of samples actually negative that are correctly predicted as negative, and FN the number of samples actually positive that are incorrectly predicted as negative.
Cross-validation effectively evaluates the performance and generalization ability of the model. Common methods include K-fold cross-validation and leave-one-out cross-validation. K-fold cross-validation divides the data set into K parts, takes one part as the validation set and the rest as the training set each time, repeats this K times to obtain K models, and finally averages the results of the K models. Leave-one-out cross-validation takes each single sample as the validation set and the remaining samples as the training set, repeats this N times to obtain N models, and finally averages the results of the N models. The operation steps are as follows:
(1) Partitioning data sets
The data set is divided into a training set and a testing set in a certain proportion; the training set is used for model training, and the testing set is used for model evaluation.
(2) Feature extraction and model training
The smoke and water mist are classified using a convolutional neural network model, and the model is trained and optimized. Different feature extraction methods and network model structures can be used to obtain a better classification effect.
(3) Cross-validation and confusion matrix
The model is evaluated and optimized by K-fold or leave-one-out cross-validation. In each round of cross-validation, the training set is further divided into a training subset and a validation set in a certain proportion, and the confusion matrix is used to calculate evaluation indexes such as classification accuracy, precision and recall. The model may then be optimized according to the results of the cross-validation and the confusion matrix; this embodiment may use the following two methods:
Adjusting the network model structure and hyperparameters: the structure can be adjusted by increasing or decreasing the number of layers, adjusting the filter sizes, modifying the activation functions and so on, to improve the classification performance of the model.
Data enhancement: the training set can be augmented by operations such as random flipping, rotation and cropping, enlarging the scale of the training set and improving the generalization of the model. A sketch of the cross-validation evaluation loop described above follows.
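The following is a minimal sketch of such an evaluation loop, assuming scikit-learn is available; the variable names features and labels (0 = water mist, 1 = smoke), the choice of five folds, and the use of a simple logistic-regression stand-in instead of the CNN (purely to keep the sketch compact) are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import StratifiedKFold

def kfold_report(features: np.ndarray, labels: np.ndarray, k: int = 5):
    """Evaluate a classifier with K-fold CV and confusion-matrix metrics."""
    skf = StratifiedKFold(n_splits=k, shuffle=True, random_state=0)
    for fold, (tr, va) in enumerate(skf.split(features, labels)):
        clf = LogisticRegression(max_iter=1000).fit(features[tr], labels[tr])
        pred = clf.predict(features[va])
        # Binary layout: rows true, columns predicted -> [[TN, FP], [FN, TP]].
        tn, fp, fn, tp = confusion_matrix(labels[va], pred).ravel()
        accuracy = (tp + tn) / (tp + tn + fp + fn)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        print(f"fold {fold}: acc={accuracy:.3f} "
              f"precision={precision:.3f} recall={recall:.3f}")
```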
Further optimization is possible on the basis of the present embodiment. The first CNN network model 4 and the third CNN network model 6 are both configured as target detection models based on the CNN network model; the second CNN network model 5 is configured as a lightweight model; remote sensing image processing is performed in advance on the image input to the second CNN network model 5; alternatively, remote sensing image processing is performed in advance both on the image input to either or both of the first CNN network model 4 and the third CNN network model 6, and on the image input to the second CNN network model 5.
Aiming at different data sources and tasks, the scheme selects a suitable CNN network model for discrimination. Generally, the images collected by the ground camera 9 and the unmanned aerial vehicle 7 and the images collected by the satellite 8 differ in resolution, viewing angle, illumination conditions and so on. For images collected by the ground camera 9, the scheme uses a target detection model based on a convolutional neural network, such as Faster R-CNN or YOLO, to detect features such as fire sources, fire lines and smoke. For images acquired by the drone 7, a model similar to that of the ground camera 9 is used, and the input resolution and dimensions of the model can be adjusted appropriately to accommodate different viewing angles and heights. For images acquired by the satellite 8, a lighter-weight model such as MobileNet or ShuffleNet may be desirable because of their lower resolution. Meanwhile, remote sensing image processing methods, such as image enhancement and segmentation, can be considered to improve the accuracy of the model. A sketch of this source-dependent model choice follows.
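The following is an illustrative sketch of the source-dependent model choice using torchvision; the model constructors named are real torchvision APIs, but the mapping function, the weight selection and the source labels are assumptions of this example, not the patent's prescribed interface.

```python
import torchvision

def build_model(source: str):
    """Pick a network family according to the image source."""
    if source in ("ground_camera", "drone"):
        # Detection model for fire sources, fire lines and smoke regions.
        return torchvision.models.detection.fasterrcnn_resnet50_fpn(
            weights="DEFAULT")
    if source == "satellite":
        # Lightweight classifier for lower-resolution satellite imagery.
        return torchvision.models.mobilenet_v3_small(weights="DEFAULT")
    raise ValueError(f"unknown source: {source}")
```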
The scheme takes factors such as the data source, processing speed and real-time requirements into account when selecting the model carrier. For a real-time monitoring task, the model is deployed on edge equipment to realize near-real-time monitoring and response; for offline processing tasks, a high-performance server or cloud computing resources can be used to accelerate model training and inference.
On the basis of the above embodiment, the acquisition of real-time images of the monitoring area can be further optimized. Acquiring real-time images of the monitoring area further comprises acquiring satellite images, unmanned aerial vehicle images and ground camera images, combining at least two of these types, always including ground camera images, to obtain real-time image data; determining a monitoring target area for the unmanned aerial vehicle 7 or the ground camera 9, and acquiring an image of the monitoring target area through the unmanned aerial vehicle 7 or the ground camera 9; when smoke-like characteristics are detected, calling a satellite image or/and an image of the unmanned aerial vehicle 7, and preprocessing the satellite image or/and the image of the unmanned aerial vehicle 7 together with the image acquired by the ground camera 9; inputting the preprocessed satellite image or/and unmanned aerial vehicle 7 image into a CNN network model for feature extraction to obtain at least two feature vectors among infrared, color or texture; and fusing the at least two feature vectors with the image of the ground camera 9 to obtain an output vector of the CNN network model, completing feature extraction and position calibration of the monitoring target area according to the output vector.
According to the scheme, the satellite image or/and the unmanned aerial vehicle 7 image is only called after a suspected area has been determined through ground camera image acquisition, reducing unnecessary occupation of image data processing. The ground camera image is preferred for real-time, fixed, full-range monitoring: real-time image data and the approximate suspected area are acquired first, realizing real-time monitoring, guaranteeing the timeliness and precision of forest fire monitoring, and covering the designated area over the full time period. When a suspected area is found, a nearby unmanned aerial vehicle 7 performs real-time image acquisition of the area; if no nearby unmanned aerial vehicle 7 can acquire an image, a satellite image can be called instead, or a satellite image can be acquired simultaneously with the unmanned aerial vehicle 7 image, achieving a multi-angle view. Through the acquisition and processing of these images together with the ground camera image, image information of the forest fire target area can be acquired from different angles and viewpoints, improving the comprehensiveness and accuracy of monitoring. The scheme uses the CNN network model to perform feature extraction on the satellite image or/and the unmanned aerial vehicle 7 image, obtaining at least two feature vectors among infrared, color or texture; these feature vectors help determine the features of the forest fire target area and improve the accuracy of forest fire monitoring. The feature vectors of the satellite image or/and the unmanned aerial vehicle 7 image are fused with the ground camera 9 image to obtain the output vector of the CNN network model, improving monitoring reliability.
Fusing the at least two feature vectors with the ground camera 9 image includes: fusing at least two feature vectors by weighted averaging and feature splicing to obtain a sub-comprehensive feature vector, the sub-comprehensive feature vector being the output vector of the CNN network model.
Combining the feature vectors in this way improves classification performance and yields comprehensive feature information. Fusing the feature vectors into the sub-comprehensive feature vector also reduces the dimension of the feature vectors, lowering computation and memory overhead, and fusing several different feature vectors together reduces overfitting to any single feature vector, improving the generalization ability of the model. The computer therefore processes a received suspected-smoke-area image more efficiently, avoiding time lost to data judgment before measures are taken against a forest fire. A sketch of this fusion follows.
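The following is a minimal sketch of the weighted-average-plus-splicing fusion; the dictionary layout, the per-source weights and the function name fuse are illustrative assumptions rather than the patent's prescribed interface.

```python
import numpy as np

def fuse(feature_lists, source_weights):
    """Weighted average per feature type across sources, then splice types."""
    fused_per_type = []
    for vectors in feature_lists.values():
        stacked = np.stack(vectors)                    # (n_sources, dim)
        fused_per_type.append(
            np.average(stacked, axis=0, weights=source_weights))
    # Splicing (concatenation) yields the sub-comprehensive feature vector.
    return np.concatenate(fused_per_type)
```

For example, fuse({"infrared": [ir_sat, ir_uav], "texture": [tx_sat, tx_uav]}, source_weights=[0.4, 0.6]) averages each feature type across the two sources with the given weights and splices the results into one sub-comprehensive vector.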
A CNN network model is trained to obtain the preset CNN network model; the training comprises the following steps:

collecting real forest fire smoke images and synthetic images for training a GAN network model to obtain a data set; training the generator and discriminator models of the GAN network model to generate synthetic images; using the synthetic images generated by the GAN network model together with the real images to expand the data set; adopting adaptive enhancement of the data set; and joint training, namely mixing the synthetic images generated by the GAN network model with the real images and transmitting them to the CNN network model for training, so as to improve the robustness and generalization ability of the preset CNN network model.
When the convolutional neural network (CNN) model is used during training to perform feature extraction on the acquired images, so as to identify smoke and non-smoke areas and smoke and water vapor areas in the images, the following steps can be carried out:

Data processing: first, the acquired images are preprocessed, including image denoising, image enhancement, target area extraction, image registration, data normalization and the like.

Data marking: the acquired images are labeled, smoke and non-smoke areas are classified and annotated, and a training data set is generated.

Constructing a convolutional neural network model: according to the characteristics of the data set and the target requirements, a suitable convolutional neural network structure is selected, generally comprising convolutional layers, pooling layers, fully connected layers, a Softmax layer and the like.

Model training: the convolutional neural network model is trained with the labeled training data set, and parameters are optimized by stochastic gradient descent.

Model evaluation: the trained convolutional neural network model is evaluated on a test data set, calculating indexes such as accuracy, recall and F1 value.

Prediction: the trained convolutional neural network model classifies and predicts the real-time monitoring images, judging whether a smoke-like area exists in the current image and whether that area is smoke or water mist.
For example, a grayscale image of 256×256 pixels is acquired, in which the pixel values of the smoke region lie in the range [0, 128] and those of the non-smoke region in the range [128, 255].

First, the image set is preprocessed, including denoising, enhancement and extraction of the target region, to improve image quality and the visibility of the target region. The smoke and non-smoke areas are then classified and marked according to the data annotation, generating a training data set. Next, a convolutional neural network model is constructed and trained on the training data set, generally with stochastic gradient descent for parameter optimization. Finally, the trained convolutional neural network model classifies and predicts the real-time monitoring image, judging whether a smoke area exists in the current image. In the feature extraction process of the convolutional neural network model, convolutional layers and pooling layers perform feature extraction and dimensionality reduction to improve the accuracy and generalization ability of the model.
In constructing the convolutional neural network model, LeNet, AlexNet, VGG or ResNet, for example, may be employed. The scheme takes the VGG model as an example.
The VGG model comprises a plurality of convolutional layers and fully connected layers and can be used for deep feature extraction and classification of images. The convolutional layers of the VGG model extract the local features of the image, after which the fully connected layers perform global feature extraction and classification.
The specific operation is as follows:
A grayscale image of 256×256 pixels is input into the convolutional neural network model. The convolutional layers of the VGG model extract local features of the image, yielding a series of feature maps; each feature map represents the response of one convolution kernel to the original image, which can be understood as the extraction of features such as edges and textures in different directions and at different scales. The pooling layer performs a pooling operation on each feature map, reducing dimensionality while retaining the main features, using maximum pooling or average pooling. The fully connected layers expand the pooled feature maps into a one-dimensional vector and perform global feature extraction and classification; the fully connected part typically takes the form of a multi-layer perceptron (MLP) containing multiple neurons for feature transformation and classification. Finally, the output of the fully connected layers is sent to the output layer, where an activation function such as sigmoid or softmax is selected according to the requirements of the classification task. A sketch of this pipeline follows.
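The following is a hedged sketch of the described VGG pipeline with torchvision; the three output classes (non-smoke, smoke, water mist) and the replacement of the last fully connected layer are assumptions of this example.

```python
import torch
import torchvision

model = torchvision.models.vgg16(weights="DEFAULT")
model.classifier[6] = torch.nn.Linear(4096, 3)   # head for 3 assumed classes
model.eval()

# A 256x256 grayscale frame, replicated to 3 channels for the VGG input.
frame = torch.rand(1, 1, 256, 256).repeat(1, 3, 1, 1)
with torch.no_grad():
    logits = model(frame)             # conv feature maps -> pooling -> FC head
    probs = torch.softmax(logits, 1)  # output layer with softmax activation
print(probs)                          # [[p_non_smoke, p_smoke, p_water_mist]]
```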
Through the feature extraction of the convolutional neural network model, key information can be effectively extracted from the image, smoke and non-smoke areas and smoke and water mist areas can be identified, providing strong support for forest fire monitoring and early warning.

Feature extraction is performed on the image data preprocessed for forest fire monitoring. Because smoke differs only slightly from non-smoke, and smoke differs only slightly from mist, a model with high accuracy must be selected for feature extraction. Since forest fire monitoring requires processing a large number of images, computing resources are also an important consideration. Models such as VGG and ResNet can therefore be selected: they are relatively light, consume relatively few computing resources, and are suitable for feature extraction on forest fire monitoring image data.
On the basis of the above training, for the case where the smoke amount is small and the image quality may be low, the scheme trains the GAN network model and the CNN network model jointly, i.e. the synthetic images generated by the GAN network model are used together with the real images to train the CNN network model. The steps can be carried out as follows:
A dataset comprising synthetic images and real images is prepared. Some synthetic images may be generated using the GAN network model, and the dataset is supplemented with the real images acquired by the drone 7, the satellite 8 and the ground cameras 9. It must be ensured that the dataset contains a sufficient number of real images and synthetic images.
Training the GAN network model: the generator and discriminator models are trained using the GAN network model. The generator model receives a random vector as input and generates a synthetic image similar to a real image; the discriminator model receives synthetic and real images as input and judges whether an image is real or synthetic. By iterating the training of the generator and the discriminator, the generator gradually generates more realistic synthetic images. A minimal training-step sketch follows.
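The following is a compact sketch of one adversarial training step in PyTorch; the fully connected network shapes, the 64×64 images flattened to vectors, and the learning rates are illustrative assumptions, not the patent's prescribed architecture.

```python
import torch
from torch import nn

G = nn.Sequential(nn.Linear(100, 256), nn.ReLU(),
                  nn.Linear(256, 64 * 64), nn.Tanh())        # generator
D = nn.Sequential(nn.Linear(64 * 64, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1), nn.Sigmoid())           # discriminator
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

def train_step(real: torch.Tensor):
    """One adversarial step; `real` is a (batch, 64*64) tensor of smoke images."""
    b = real.size(0)
    fake = G(torch.randn(b, 100))
    # Discriminator: push real images toward label 1, synthetic toward 0.
    loss_d = bce(D(real), torch.ones(b, 1)) \
           + bce(D(fake.detach()), torch.zeros(b, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # Generator: fool the discriminator into labeling synthetic images as real.
    loss_g = bce(D(fake), torch.ones(b, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()
```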
The real-image classification model is trained using the CNN network model, on a dataset containing real images. Standard CNN architectures such as AlexNet, VGG or ResNet may be used.
The synthetic images generated by the GAN network model are then used together with the real images for training the CNN network model. This can be achieved by blending the synthetic and real images and passing them to the CNN network model; the synthetic images generated by the GAN network model may be mixed with the real images in proportion, e.g. 70% real images and 30% synthetic images, as in the sketch below.
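The following is a sketch of the 70%/30% mixing, assuming real_ds and synthetic_ds are existing labeled PyTorch datasets; the helper name and batch size are illustrative.

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader, Subset

def mixed_loader(real_ds, synthetic_ds, batch_size=32, synth_frac=0.3):
    """Mix real and GAN-generated samples so synthetic ones form synth_frac."""
    n_real = len(real_ds)
    # Number of synthetic samples so they make up synth_frac of the mixture.
    n_synth = int(n_real * synth_frac / (1.0 - synth_frac))
    n_synth = min(n_synth, len(synthetic_ds))
    idx = torch.randperm(len(synthetic_ds))[:n_synth].tolist()
    mixed = ConcatDataset([real_ds, Subset(synthetic_ds, idx)])
    return DataLoader(mixed, batch_size=batch_size, shuffle=True)
```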
Finally, a separate test data set is used to evaluate the model. It must be ensured that the test dataset contains both real and synthetic images, and these images are classified using the trained CNN network model. The evaluation index may be accuracy or another metric such as precision or recall.
In this embodiment, the above cameras and unmanned aerial vehicle may be further optimized. Their texture weights are set higher, and more than one ground camera, distributed in different directions and able to shoot the target area, is often available. To further improve the accuracy of texture recognition, the scheme therefore registers and fuses the texture features of at least two different image acquisition sources, so that texture feature enhancement with multiple cameras covering the same suspected smoke area can be realized.
The method therefore also comprises registering and fusing feature vectors of the same type: when the ground cameras acquire images and texture feature vectors are extracted, the reference image and the images to be registered are first registered, and the registered images are then fused by a method based on the LPT (Laplacian pyramid transform).
Specifically, a plurality of camera images of the same suspected smoke area taken from different directions are collected as the images to be registered, and an SIFT-based image registration algorithm applies a registration transformation to the preprocessed images, yielding multi-camera registration result images of the smoke area in a common coordinate system and completing the registration; a sketch follows.
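The following is a minimal sketch of such SIFT-based registration with OpenCV: keypoints are matched between the reference image and the image to be registered, a homography is estimated with RANSAC, and the moving image is warped into the reference coordinate system. The ratio-test threshold and RANSAC tolerance are illustrative.

```python
import cv2
import numpy as np

def register(reference, moving):
    """Warp `moving` into the coordinate system of `reference` (gray images)."""
    sift = cv2.SIFT_create()
    kp_ref, des_ref = sift.detectAndCompute(reference, None)
    kp_mov, des_mov = sift.detectAndCompute(moving, None)
    # Lowe's ratio test keeps only distinctive keypoint matches.
    good = [m for m, n in cv2.BFMatcher().knnMatch(des_mov, des_ref, k=2)
            if m.distance < 0.75 * n.distance]
    src = np.float32([kp_mov[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp_ref[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    # Robust homography estimation with RANSAC (5-pixel reprojection tolerance).
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    h, w = reference.shape[:2]
    return cv2.warpPerspective(moving, H, (w, h))
```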
The LPT image fusion algorithm mainly comprises three parts: pyramid decomposition, fusion and reconstruction. The finer information of the images to be fused is retained by performing up-sampling and Gaussian convolution on the residual information between the current layer of the Gaussian pyramid decomposition and the image of the adjacent layer.
Equation (1) is the calculation process of the Gaussian pyramid image; the Laplacian pyramid consists of LP_0, LP_1, LP_2, ..., LP_N. Each image of the Gaussian pyramid subtracts the predicted image obtained by up-sampling and Gaussian convolution of the next-layer image, giving a series of difference images, i.e. the LP decomposition images. The calculation formulas are:

G_k(i, j) = Σ_{m=-2..2} Σ_{n=-2..2} w(m, n) G_{k-1}(2i + m, 2j + n)    (1)

G_{k+1}*(i, j) = 4 Σ_{m=-2..2} Σ_{n=-2..2} w(m, n) G_{k+1}((i + m)/2, (j + n)/2)    (2)

LP_k = G_k − G_{k+1}* = G_k − Expand(G_{k+1}), 0 ≤ k < N;  LP_N = G_N    (3)

In the above, G_k is the Gaussian pyramid image of the k-th layer, with k an integer; w(m, n) is a two-dimensional separable 5×5 window function; the image is enlarged with the expansion operator Expand shown in formula (2), G_{k+1}* denoting the expanded (k+1)-th layer image; and the decomposed images are finally calculated according to formula (3), LP_k being the k-th layer image of the Laplacian pyramid decomposition.
The source images to be fused are decomposed into the multi-scale pyramid image sequence shown in formula (4); a fusion rule is then selected, such as taking the larger or the smaller coefficient, to fuse the pyramid images of the corresponding layers; the Gaussian pyramid is then recovered according to formula (5), and the fusion result image is obtained by reconstruction:

{LP_0, LP_1, ..., LP_N}    (4)

G_N = LP_N;  G_k = LP_k + Expand(G_{k+1}),  k = N−1, N−2, ..., 0    (5)

where N is the layer number of the top layer of the Laplacian pyramid.
Two registered camera images are selected and fused with the LPT image fusion algorithm to obtain a fused image; the fusion result image is then fused with the next camera image by the same method, and so on, until all callable camera images that can shoot the same suspected smoke area are fully fused to obtain the LPT fusion image; a sketch of one fusion step is given below.
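The following is a sketch of one such LPT fusion step implemented with OpenCV's pyrDown/pyrUp, following equations (1)-(5); the four-level depth and the larger-magnitude fusion rule are illustrative choices.

```python
import cv2
import numpy as np

def lpt_fuse(a: np.ndarray, b: np.ndarray, levels: int = 4) -> np.ndarray:
    """Fuse two registered images via their Laplacian pyramids."""
    def laplacian_pyramid(img):
        g = [img.astype(np.float32)]
        for _ in range(levels):                  # Gaussian pyramid, eq. (1)
            g.append(cv2.pyrDown(g[-1]))
        lp = []
        for k in range(levels):                  # LP_k = G_k - Expand(G_k+1)
            up = cv2.pyrUp(g[k + 1], dstsize=(g[k].shape[1], g[k].shape[0]))
            lp.append(g[k] - up)
        lp.append(g[-1])                         # top level LP_N = G_N
        return lp

    la, lb = laplacian_pyramid(a), laplacian_pyramid(b)
    # Fusion rule: keep the coefficient with the larger magnitude per layer.
    fused = [np.where(np.abs(x) >= np.abs(y), x, y) for x, y in zip(la, lb)]
    out = fused[-1]
    for k in range(levels - 1, -1, -1):          # reconstruct via eq. (5)
        out = cv2.pyrUp(out, dstsize=(fused[k].shape[1], fused[k].shape[0])) \
              + fused[k]
    return np.clip(out, 0, 255).astype(np.uint8)
```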
Because at least two feature vectors (infrared, texture and color) are fused with the ground camera images to obtain the output vector of the CNN network model, the LPT fusion method of this embodiment is used to obtain a fused image of the smoke texture features shot by the ground cameras in order to guarantee texture quality. The advantages of the many ground cameras distributed through a forest zone can thus be exploited directly, texture features are further enhanced on the same image plane, the acquired smoke texture information becomes richer, mist and smoke textures can be discriminated better, and more reliable quality assurance is provided for the subsequent splicing and fusion of the feature vectors acquired and extracted from the other image sources.
Example 2
Referring to fig. 3, a forest fire analysis and monitoring system comprises an image data acquisition and communication module 1, a processing module 2 and an alarm module 3. The image data acquisition and communication module 1 is used for acquiring image data collected by the camera, the unmanned aerial vehicle 7 and the satellite. The processing module 2 is connected with the image data acquisition and communication module 1 and is used for preprocessing the received image data, extracting features through a CNN network model, distinguishing smoke and non-smoke areas, performing feature extraction and position calibration on the recognized smoke areas, and outputting a judgment result. The alarm module 3 is connected with the processing module 2 and is used for sending out alarm information according to the judgment result. The processing module 2 is further configured to perform infrared enhancement, color feature region enhancement and texture feature region enhancement on the input image; the CNN network model is optimized using a confusion matrix and cross-validation according to at least two of the infrared enhanced, color feature region enhanced and texture feature region enhanced images, and the optimized CNN network model performs feature extraction and position calibration on the preprocessed image.
By adopting the system, the image set is preprocessed, various different features are extracted, and deep learning is used to identify and locate the smoke area, realizing accurate judgment, monitoring and early warning of forest fires.
In practice, the system may be implemented using essentially the following devices and components.
High-definition cameras and the monitoring unmanned aerial vehicle 7 are arranged for real-time shooting and data collection over the site or area to be monitored; the trained convolutional neural network model CNN is preset for classifying and identifying the acquired images. Communication with a remote sensing database is prepared, covering the acquisition of multi-source remote sensing data such as real-time satellite images and high-precision topographic data. Communication equipment is configured, including satellite communication and alarm communication. The required server is configured for storing and processing the acquired data.
The ground camera 9 (arranged at the highest building position of the monitoring area as far as possible) and the unmanned aerial vehicle 7 transmit the acquired images and data to the server through wireless signals for storage and processing. The convolutional neural network model is connected through the server with the ground camera 9, the unmanned aerial vehicle 7 and the remote sensing database to realize data classification and recognition. The communication equipment is connected with the server for transmitting and receiving alarm information. Remote sensing data acquisition extracts features from satellite and high-precision topographic data, and multi-source remote sensing technology is used to identify and extract features of forest land, or of terrain interleaved with grassland or agricultural land. The CNN network model classifies and identifies the images acquired by the ground camera 9 and the unmanned aerial vehicle 7 together with the acquired satellite images, judging whether smoke exists in the images. Smoke and mist are distinguished by analyzing and classifying their characteristic differences. Feature extraction and position calibration are performed on the smoke image through the CNN, realizing static and dynamic feature extraction and position calibration of the smoke.
The scheme adopts the combination of multi-source remote sensing and a CNN network model, enabling comprehensive and accurate analysis and monitoring of forest fires and improving forest fire prediction and control capability. By using satellite communication and alarm communication, alarm information can be sent to the relevant personnel in time, reducing the losses caused by forest fires.
When the unmanned aerial vehicle is called for image acquisition, and a smoke-like area is captured by a ground camera, a virtual control field is established with the position of the target area as its center, so that the route of an unmanned aerial vehicle entering the virtual control field is regulated to pass through the target area, and the vehicle resumes its course once the area is judged to be mist. Control instructions are generated for the unmanned aerial vehicles entering the virtual control field, so that a virtual force field is established between them, and the images of the unmanned aerial vehicles entering the virtual control field are acquired.
When the unmanned aerial vehicles 7 entering the virtual control field are concentrated and enter from multiple directions, a virtual force field between the unmanned aerial vehicles is established in the virtual control field. Let the current position coordinate of an unmanned aerial vehicle be X = (x, y, z)^T and the center coordinate of the control field be X_g = (x_g, y_g, z_g)^T. The attraction potential field generated by the target point is

U_att(X) = (1/2) k_att d²(X, X_g)    (6)

In formula (6), d(X, X_g) = ‖X_g − X‖ is the Euclidean distance from the unmanned aerial vehicle to the target point, and k_att is the attraction potential field constant. The attractive force F_att(X) on the unmanned aerial vehicle is the negative gradient of the attraction potential field U_att(X), i.e.

F_att(X) = −∇U_att(X) = k_att (X_g − X)    (7)

The direction of the attractive force is along the straight line between the unmanned aerial vehicle and the target point, pointing toward the target point.
Let the obstacle coordinate be X_o = (x_o, y_o, z_o)^T. The repulsive potential field generated by an obstacle and borne by the unmanned aerial vehicle is

U_rep(X) = (1/2) k_rep (1/d(X, X_o) − 1/d_o)²,  d(X, X_o) ≤ d_o;  U_rep(X) = 0,  d(X, X_o) > d_o    (8)

In formula (8), d(X, X_o) = ‖X_o − X‖ is the Euclidean distance from the unmanned aerial vehicle to the obstacle, k_rep is the repulsive potential field constant, and d_o is the radius of the influence range of the repulsive potential field. The direction of the repulsive force is along the straight line between the unmanned aerial vehicle and the obstacle, pointing toward the unmanned aerial vehicle.
The repulsive force F_rep(X) on the unmanned aerial vehicle in the repulsive field is the negative gradient of U_rep(X), i.e.

F_rep(X) = k_rep (1/d(X, X_o) − 1/d_o) · (1/d²(X, X_o)) · (X − X_o)/d(X, X_o),  d(X, X_o) ≤ d_o;  F_rep(X) = 0,  d(X, X_o) > d_o    (9)
Synthesizing the potential field functions:

U(X) = U_att(X) + U_rep(X)    (10)
When the unmanned aerial vehicle moves toward the target point, it is under the combined action of the attraction field and the repulsion field according to

F(X) = −∇U(X) = −∇U_att(X) − ∇U_rep(X) = F_att(X) + F_rep(X)    (11)

F_rep is the total repulsive force of all obstacles and F_att the attractive force generated by the target point; by the force superposition principle, the resultant force F is the resultant of the repulsive and attractive forces borne by the unmanned aerial vehicle.
A distance correction factor is added to the repulsive potential field function. The factor balances the changes of the two forces, in particular when the repulsive force on the unmanned aerial vehicle grows rapidly, so that the repulsive force gradually decreases as the unmanned aerial vehicle approaches the center point of the target area, i.e. the center of the virtual control field (in practice, the center point can be a suitable observation point near the center of the suspected smoke area). The distance correction factor ensures that the resultant potential field at the target point is the global minimum. With the attraction potential field unchanged, the repulsive potential field function can be defined as

U_rep(X) = (1/2) k_rep (1/d(X, X_o) − 1/d_o)² d^n(X, X_g),  d(X, X_o) ≤ d_o;  U_rep(X) = 0,  d(X, X_o) > d_o    (12)
Formula (12) adds the factor d^n(X, X_g) to formula (8); d^n(X, X_g) is the distance from the unmanned aerial vehicle to the target position raised to the power n, where n is any real number greater than zero used for parameter adjustment. After entering the virtual control field, an unmanned aerial vehicle may meet as obstacles not only the other concentrated unmanned aerial vehicles but also trees on its path. As the unmanned aerial vehicle approaches the target point, the repulsive force it bears tends to 0 instead of infinity. The improved repulsive force is

F_rep(X) = F_rep1(X) + F_rep2(X),  d(X, X_o) ≤ d_o;  F_rep(X) = 0,  d(X, X_o) > d_o    (13)

where F_rep1(X) and F_rep2(X) are

F_rep1(X) = k_rep (1/d(X, X_o) − 1/d_o) · d^n(X, X_g) / d²(X, X_o)    (14)

F_rep2(X) = (n/2) k_rep (1/d(X, X_o) − 1/d_o)² · d^(n−1)(X, X_g)    (15)

F_rep1 is directed from the obstacle toward the unmanned aerial vehicle, and F_rep2 from the unmanned aerial vehicle toward the target point; for several obstacles, the total repulsive force is

F_rep(X) = Σ_i [F_rep1,i(X) + F_rep2,i(X)]    (16)
According to the above formulas, the attractive force, repulsive force and resultant force on the unmanned aerial vehicle at its current position are first calculated;

when F_rep exceeds F_att by more than a fixed value, the unmanned aerial vehicle moves to the next position toward the center coordinate of the target area;

when F_rep is greater than F_att by no more than that value, guidance control continues, including calculating the attractive force, repulsive force and resultant force on the unmanned aerial vehicle at the current position;

when F_rep = F_att, it is judged whether the current coordinate is the center coordinate of the target area; if it is not, guidance control continues, including calculating the attractive force, repulsive force and resultant force on the unmanned aerial vehicle at the current position.
When the unmanned aerial vehicle approaches the target point, d(X, X_g) decreases and approaches zero, so the repulsive force component F_rep1 on the unmanned aerial vehicle also tends to zero; with k_rep constant, the unmanned aerial vehicle then moves in the target direction under the action of the attraction alone.
When n > 1, as the unmanned aerial vehicle approaches the target point, d^n(X, X_g) and d^(n−1)(X, X_g) approach zero, so the total repulsive force on the unmanned aerial vehicle approaches zero. Under this repulsive behavior, the unmanned aerial vehicles can reach the target area in an orderly manner, and loss of control of the concentrated unmanned aerial vehicles caused by overly dense forest obstacles during flight is prevented.
After the virtual control field is established in this way, even if the smoke situation is complex, the trees are dense, or unmanned aerial vehicles are called in concentration from different directions, unmanned aerial vehicle accidents caused by a sudden concentration of called unmanned aerial vehicles within a short time can be avoided. A numerical sketch of this guidance law is given below.
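The following is a numerical sketch of the guidance force of equations (6)-(16); the gains k_att and k_rep, the influence radius d_o and the exponent n are illustrative tuning parameters, not values fixed by the scheme.

```python
import numpy as np

def apf_force(X, X_g, obstacles, k_att=1.0, k_rep=100.0, d_o=20.0, n=2.0):
    """Resultant guidance force on a drone at X toward target X_g."""
    X, X_g = np.asarray(X, float), np.asarray(X_g, float)
    F = k_att * (X_g - X)                       # attraction, eq. (7)
    d_goal = np.linalg.norm(X_g - X)
    for X_o in obstacles:
        X_o = np.asarray(X_o, float)
        d = np.linalg.norm(X - X_o)
        if d >= d_o or d == 0.0:
            continue                            # outside the influence radius
        term = 1.0 / d - 1.0 / d_o
        # F_rep1: pushes away from the obstacle, scaled by d_goal**n (eq. 14).
        F += k_rep * term * (d_goal ** n / d ** 2) * (X - X_o) / d
        # F_rep2: pulls toward the goal so repulsion vanishes there (eq. 15).
        if d_goal > 0.0:
            F += 0.5 * n * k_rep * term ** 2 * d_goal ** (n - 1) \
                 * (X_g - X) / d_goal
    return F                                    # resultant force, eq. (11)
```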
Example 3
The embodiment provides an electronic device, which comprises a processor and a memory; the memory is used for storing processor executable instructions; the processor is configured to perform the forest fire analysis and monitoring method of embodiment 1.
The processor is configured to support the electronic device in performing the operational steps of the method in the above embodiments. The memory is used to store a computer program that performs the methods in the above embodiments. The memory is coupled to the processor.
In particular, the processor may include a central processing unit (CPU) or an application specific integrated circuit (ASIC), or may be configured as one or more integrated circuits implementing the forest fire analysis and monitoring method described above.
The memory may include mass storage for data or instructions. By way of example and not limitation, the memory may comprise a hard disk drive (HDD), floppy disk drive, flash memory, optical disk, magneto-optical disk, magnetic tape, or universal serial bus (USB) drive, or a combination of two or more of these. The memory may include removable or non-removable (or fixed) media, where appropriate. The memory may be internal or external to the data processing apparatus, where appropriate. In a particular embodiment, the memory is a non-volatile solid-state memory. In a particular embodiment, the memory includes read-only memory (ROM). Where appropriate, the ROM may be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically rewritable ROM (EAROM), or flash memory, or a combination of two or more of these.
The processor reads and executes the computer program instructions stored in the memory to realize the forest fire analysis and monitoring method.
In a further embodiment of the present electronic device, the present electronic device may further comprise a communication interface and a bus. The processor, the memory and the communication interface are connected through a bus and complete communication with each other.
The communication interface is mainly used for realizing communication among the modules, apparatuses, units and/or devices required by the forest fire analysis and monitoring method. The bus comprises hardware, software, or both, coupling the components of the electronic device to one another. By way of example and not limitation, the bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCI-X) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association local bus (VLB), or another suitable bus, or a combination of two or more of these. The bus may include one or more buses, where appropriate. Although a particular bus is described and illustrated, the present invention contemplates any suitable bus or interconnect.
Example 4
The present embodiment provides a computer-readable storage medium comprising a stored computer program which, when run, performs the forest fire analysis and monitoring method of embodiment 1 above.

A computer storage medium containing a computer program for executing the forest fire analysis and monitoring method of embodiment 1 is provided for the electronic apparatus of embodiment 3; the program executing the forest fire analysis and monitoring method in the computer storage medium is called and applied to a database in the electronic apparatus or electronic device, and data are stored in the database in real time while the program runs.
It will be appreciated from the description of the above embodiments that, in order to implement the above functions, the electronic device includes corresponding hardware structures and/or software modules for performing the respective functions. Those skilled in the art will readily appreciate that the various illustrative steps of the methods described in connection with the embodiments disclosed herein may be implemented as hardware or as a combination of hardware and computer software. Whether a function is implemented as hardware or as software-driven hardware of the electronic device depends on the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The functions implemented by the electronic device may be implemented in a form of hardware, or may be implemented in a form of software functional modules or functional units. In this embodiment of the present application, the division of the functional components or devices and apparatuses is merely a logic function division, and other division manners may be used in actual implementation.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative, and the partitioning of an electronic apparatus or a computer storage medium, for example, is merely a logical function partitioning, and may be implemented in other ways, for example, multiple units or components may be combined or integrated into another apparatus, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The foregoing detailed description of the invention has been presented for purposes of illustration and description, and it should be understood that the invention is not limited to the particular embodiments disclosed, but is intended to cover all modifications, equivalents, alternatives, and improvements within the spirit and principles of the invention.

Claims (10)

1. A forest fire analysis and monitoring method, which is characterized by comprising the following operations:
acquiring real-time images of a monitoring area from ground cameras distributed in different positions of a forest zone, and acquiring real-time images of the monitoring area from unmanned aerial vehicles or/and satellites to obtain a real-time image set to be preprocessed;
preprocessing an image set;
transmitting the preprocessed image to a preset CNN network model for feature extraction, and distinguishing smoke and non-smoke areas;
extracting features and calibrating positions of the identified smoke areas;
judging whether to send out alarm information; wherein,
transmitting the preprocessed image to the CNN network model for feature extraction comprises the following steps:
performing infrared enhancement, color feature area enhancement and texture feature area enhancement on an input image, and performing corresponding feature extraction, wherein the CNN network model is optimized by using a confusion matrix and cross-validation according to at least two of the infrared enhanced, color feature area enhanced and texture feature area enhanced images, so as to judge whether the type of smoke in a smoke-like area is smoke or fog;
and carrying out feature extraction and position calibration on the preprocessed image by the optimized CNN network model.
2. A forest fire analysis and monitoring method according to claim 1, wherein,
CNN network models with different full-connection layer output functions are respectively configured for the ground camera acquired image, the unmanned aerial vehicle acquired image and the satellite acquired image, wherein the full-connection layer output functions comprise infrared weights, texture weights and color weights;
configuring a first CNN network model for an unmanned aerial vehicle acquired image, extracting infrared, color and texture characteristics, wherein the texture weight and the color weight are greater than the infrared weight in a full-connection layer output function of the first CNN network model;
configuring a second CNN network model for the satellite acquired image, and extracting infrared, color and texture characteristics, wherein in the output function of a full-connection layer of the second CNN network model, the infrared weight and the color weight are greater than the texture weight;
configuring a third CNN network model for the ground camera acquired image, extracting infrared, color and texture characteristics, wherein the texture weight and the color weight are greater than the infrared weight in the output function of the full-connection layer of the third CNN network model; the texture weight of the first CNN network model and the texture weight of the third CNN network model are larger than the texture weight of the second CNN network model;
setting a global average pooling layer at the tail of each CNN network model, and converting each feature map into a vector with a fixed length;
Carrying out normalization processing on the output vector of the CNN network model;
and performing either operation a or operation b:
a. splicing the output vector of the first CNN network model or the second CNN network model with the output vector of the third CNN network model to obtain a total comprehensive feature vector; the total comprehensive feature vector is 2 times of the output vector of each CNN network model after normalization processing;
b. splicing the output vectors of the first CNN network model, the second CNN network model and the third CNN network model to obtain a total comprehensive feature vector; the total comprehensive feature vector is 3 times of the output vector of each CNN network model after normalization processing;
and inputting the obtained total comprehensive feature vector into a classifier for classification, and obtaining a judgment result that the type of the smoke in the smoke area is smoke or fog.
3. A forest fire analysis and monitoring method according to claim 2, wherein,
for the color feature area enhanced image, converting the RGB color space into the HSV color space, and extracting a color histogram and color statistical features to obtain color feature information;
extracting texture information of the image by adopting an angular second moment and contrast for the texture feature area enhanced image to obtain texture feature information;
Extracting heat radiation intensity and infrared radiation power characteristic information from the infrared enhanced image to obtain infrared characteristic information;
each CNN network model optimizes the CNN network model by using a confusion matrix and cross verification according to at least two kinds of information of color characteristic information, texture characteristic information and infrared characteristic information;
and carrying out feature extraction and position calibration on the post-input preprocessing image by using the optimized CNN network model.
4. A forest fire analysis and monitoring method according to claim 2, wherein,
the first CNN network model and the third CNN network model are configured as target detection models based on the CNN network model;

the second CNN network model is configured as a lightweight model;

remote sensing image processing is carried out in advance on an image input into the second CNN network model; or,

remote sensing image processing is carried out in advance both on the image input into either or both of the first CNN network model and the third CNN network model, and on the image input into the second CNN network model.
5. A forest fire analysis and monitoring method according to claim 1, wherein,
acquiring real-time images of a monitoring area comprises a combination of at least two of satellite image acquisition, unmanned aerial vehicle image acquisition and ground camera image acquisition, the combination at least including ground camera image acquisition, so as to obtain real-time image data;
Determining a monitoring target area of the unmanned aerial vehicle or the ground camera, and acquiring an image of the monitoring target area through the unmanned aerial vehicle or the ground camera;
when the smoke-like characteristics are monitored, a satellite image or/and an unmanned aerial vehicle image is called, and the satellite image or/and the unmanned aerial vehicle image and a ground camera acquisition image are preprocessed;
inputting the preprocessed satellite image or/and unmanned aerial vehicle image into a CNN network model for feature extraction to obtain at least two feature vectors in infrared, color or texture;
and fusing the at least two feature vectors with the ground camera image to obtain an output vector of the CNN network model, and completing feature extraction and position calibration of the monitoring target area according to the output vector.
6. A forest fire analysis and monitoring method according to claim 5, wherein,
fusing the at least two feature vectors with the ground camera image includes:
and fusing at least two feature vectors by adopting weighted average and feature splicing to obtain a sub-comprehensive feature vector, wherein the sub-comprehensive feature vector is an output vector of the CNN network model.
7. A forest fire analysis and monitoring method according to claim 5, wherein,
when a smoke-like area is captured by a ground camera and the unmanned aerial vehicle is called, a virtual control field is established with the position of the target area as its center, so that the route of a called unmanned aerial vehicle entering the virtual control field is regulated to pass through the target area, and the vehicle continues its course after the area is judged to be mist;
generating a control instruction to the unmanned aerial vehicle entering the virtual control field, so that a virtual force field is established between the unmanned aerial vehicles entering the virtual control field, and an unmanned aerial vehicle image entering the virtual control field is acquired.
8. A forest fire analysis monitoring system comprising:
the image data acquisition and communication module is used for acquiring image data acquired by the camera, the unmanned aerial vehicle and the satellite;
the processing module is connected with the image data acquisition and communication module and is used for preprocessing the received image data, extracting the characteristics of the received image data through a CNN network model, distinguishing smoke areas from non-smoke areas, extracting the characteristics of the identified smoke areas, calibrating the positions of the identified smoke areas and outputting a judging result;
the alarm module is connected with the processing module and used for sending out alarm information according to the judgment result; wherein,
the processing module is also used for carrying out infrared enhancement, color feature area enhancement and texture feature area enhancement on the input image, the CNN network model optimizes the CNN network model by using a confusion matrix and cross verification according to at least two of the infrared enhancement, the color feature area enhancement and the texture feature area enhancement images, and the optimized CNN network model carries out feature extraction and position calibration on the preprocessed image.
9. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
the processor is configured to perform a forest fire analysis monitoring method as claimed in any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that: comprising a stored computer program which when run performs a forest fire analysis monitoring method as claimed in any one of claims 1 to 7.
CN202310525215.4A 2023-05-11 2023-05-11 Forest fire analysis and monitoring method and system Pending CN116311078A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310525215.4A CN116311078A (en) 2023-05-11 2023-05-11 Forest fire analysis and monitoring method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310525215.4A CN116311078A (en) 2023-05-11 2023-05-11 Forest fire analysis and monitoring method and system

Publications (1)

Publication Number Publication Date
CN116311078A 2023-06-23

Family ID: 86796186

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310525215.4A Pending CN116311078A (en) 2023-05-11 2023-05-11 Forest fire analysis and monitoring method and system

Country Status (1)

Country Link
CN (1) CN116311078A (en)


Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117142687A (en) * 2023-08-29 2023-12-01 江苏国强环保集团有限公司 Intelligent online monitoring system and intelligent online monitoring process for desulfurization wastewater
CN116883764A (en) * 2023-09-07 2023-10-13 武汉船用电力推进装置研究所(中国船舶集团有限公司第七一二研究所) Battery system fault identification method and device
CN116883764B (en) * 2023-09-07 2023-11-24 武汉船用电力推进装置研究所(中国船舶集团有限公司第七一二研究所) Battery system fault identification method and device
CN117115614A (en) * 2023-10-20 2023-11-24 深圳市猿人创新科技有限公司 Object identification method, device, equipment and storage medium for outdoor image
CN117115614B (en) * 2023-10-20 2024-01-16 深圳市猿人创新科技有限公司 Object identification method, device, equipment and storage medium for outdoor image
CN117690273A (en) * 2024-02-01 2024-03-12 领航技术股份有限公司 Method, device, equipment and storage medium for resetting smoke alarm
CN117690273B (en) * 2024-02-01 2024-04-09 领航技术股份有限公司 Method, device, equipment and storage medium for resetting smoke alarm


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20230623)