CN117876874A - Forest fire detection and positioning method and system based on high-point monitoring video - Google Patents

Forest fire detection and positioning method and system based on high-point monitoring video

Info

Publication number
CN117876874A
Authority
CN
China
Prior art keywords
smoke
flame
forest fire
forest
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410055348.4A
Other languages
Chinese (zh)
Inventor
谢亚坤
朱庆
朱军
冯德俊
刘子琛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest Jiaotong University
Original Assignee
Southwest Jiaotong University
Application filed by Southwest Jiaotong University filed Critical Southwest Jiaotong University
Priority to CN202410055348.4A
Publication of CN117876874A
Legal status: Pending


Landscapes

  • Fire-Detection Mechanisms (AREA)

Abstract

The invention discloses a forest fire detection and positioning method and system based on high-point monitoring video, belongs to the technical field of forest fire detection and positioning, and solves the problems of low accuracy and low efficiency in extracting and positioning the smoke of early forest fires under complex background conditions. The method constructs a forest fire detection data set from the acquired high-point monitoring video and trains a constructed multi-scale and multi-dimensional feature extraction network that fuses video spatial features and time sequence features to identify smoke and flame in the high-point monitoring video to be identified, finally obtaining the detection frames of smoke objects and flame objects and the central two-dimensional pixel coordinates of the flame object in each frame image; the forest fire, driven by a video stereoscopic grid, is then accurately positioned based on these detection frames and the central two-dimensional pixel coordinates of the flame object in each frame image. The method is used for accurately detecting early forest fires from high-point monitoring video.

Description

Forest fire detection and positioning method and system based on high-point monitoring video
Technical Field
The invention relates to a forest fire detection and positioning method and system based on high-point monitoring video, which is used for accurately detecting early forest fires from high-point monitoring video and belongs to the technical field of forest fire detection and positioning.
Background
Forests are a precious natural resource, and the forest ecosystem is one of the most important ecosystems on earth; it plays an irreplaceable role in human survival and development, biodiversity, the maintenance and improvement of the ecological environment, and the response to global climate warming.
Among the many natural disasters affecting forests, forest fires are the most damaging to the forest ecosystem. Forest fires destroy forest structure, affect the living environments of wild animals and plants, cause soil erosion, and can trigger other natural disasters such as debris flows. Forest fires are characterized by long latency, large burned areas, fast spread and severe damage; they not only destroy forestry resources but also greatly harm the national economy, personal safety and the ecological environment. Effectively monitoring and positioning forest fires so that they are discovered in time therefore plays an important role in rescue and fire-extinguishing work and in reducing the losses caused by forest fires.
At present, with the continuous development of high-point video monitoring, computer vision and artificial intelligence technology, monitoring forest fires with high-point video has become one of the important means of forest fire prevention. This fire monitoring mode has unique advantages such as strong timeliness, stability and reliability, and flexible and diverse information, and occupies an important position in forest fire detection. It aims to control fire from the source, enabling early discovery, early prevention, early treatment and early extinguishing so that fires do not occur or are contained soon after they occur, and it is an effective way to improve forest fire prevention. In recent years, with the successful application of artificial intelligence technologies represented by big data, machine learning and deep learning in fields such as speech recognition, computer vision and recommendation systems, artificial intelligence has made great progress in algorithms, models and architectures. The artificial intelligence technology represented by deep learning also provides new ideas for forest fire identification, and forest fire identification algorithms, as the core component of high-point video monitoring systems for forest fires, have attracted wide attention from researchers in recent years. Moreover, compared with traditional computer vision methods, deep learning avoids complex manual feature engineering and can learn complex representations from large image data sets.
In addition, since the fire source is difficult to observe directly in an early forest fire, the large amount of smoke generated by smoldering branches and leaves is the main feature of an early fire. Early warning of forest fires by detecting smoke in high-point monitoring video images is therefore a hot spot of current research. However, the prior art has the following technical problems:
1. Under complex background conditions, the extraction and positioning of early forest fire smoke suffer from low accuracy and low efficiency;
2. Under severe weather conditions, particularly heavy fog and haze, detection performance degrades, resulting in poor positioning accuracy and low efficiency;
3. Existing forest fire detection methods perform poorly on small, long-distance targets;
4. Most existing forest fire detection methods target daytime scenes and cannot achieve efficient, real-time monitoring at night.
Disclosure of Invention
In view of the above problems, the invention aims to provide a forest fire detection and positioning method and system based on high-point monitoring video, which solve the prior-art problems of low accuracy and low efficiency in extracting and positioning the smoke of early forest fires under complex background conditions.
In order to achieve the above purpose, the invention adopts the following technical scheme:
a forest fire detection and positioning method based on high-point monitoring video comprises the following steps:
step 1, constructing a forest fire detection data set based on the acquired high-point monitoring video, and training a constructed multi-scale and multi-dimensional feature extraction network that fuses video spatial features and time sequence features to identify the detection frames of smoke and flame in the high-point monitoring video to be identified, finally obtaining the detection frames of smoke objects and flame objects and the central two-dimensional pixel coordinates of the flame object in each frame image;
step 2, accurately positioning the forest fire, driven by the video stereoscopic grid, based on the detection frames of the smoke objects and flame objects and the central two-dimensional pixel coordinates of the flame object in each frame image.
Further, the multi-scale and multi-dimensional feature extraction network fusing video spatial features and time sequence features constructed in step 1 comprises an input layer, a Focus module, a first Conv module, a global-local feature extraction module, a second Conv module, a deep-shallow feature extraction module, a third Conv module, a time sequence neural unit, a fourth Conv module, a pooling module, a depth and receptive field enhancement module and an output layer which are sequentially connected, wherein the first to fourth Conv modules are convolution modules each consisting of a sequentially connected Conv2d, BN layer and SiLU.
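By way of illustration, the Conv modules named above can be sketched in PyTorch as follows; the channel counts, kernel size and stride are illustrative assumptions rather than the patent's configuration, and the other named modules (Focus, the feature extraction modules, the time sequence neural unit, the pooling and C3 modules) are defined separately.

```python
import torch.nn as nn

class ConvModule(nn.Module):
    """Conv2d -> BN layer -> SiLU, the structure stated for the first
    to fourth Conv modules; c_in, c_out, k and s are illustrative."""
    def __init__(self, c_in, c_out, k=3, s=2):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, k, s, padding=k // 2, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

# Claimed backbone order (each named module defined elsewhere):
# input -> Focus -> Conv -> global-local -> Conv -> deep-shallow
#   -> Conv -> time sequence unit -> Conv -> pooling -> C3 -> output
```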
Further, the global-local feature extraction module comprises a local feature extraction module and a global feature extraction module each connected to the first Conv module, a 3×3 convolution layer that receives and adds the outputs of the local and global feature extraction modules, and a 1×1 convolution layer sequentially connected after the 3×3 convolution layer. The local feature extraction module comprises a 3×3 convolution layer and a 1×1 convolution layer that each convolve the input feature map, and a batch normalization layer connected after each of them; the outputs of the two batch normalization layers are added to obtain the local features. The global feature extraction module sequentially performs size transformation, a linearization operation and a projection layer mapping on the input feature map to obtain feature maps Q, K and V, multiplies the depth-normalized Q with the depth-normalized K, applies a size transformation to the product, multiplies the result with the size-transformed feature map V, and applies a scaling operation in which each pixel of the resulting feature map corresponds to several pixels of the original input image, obtaining the global features in the final feature map;
The deep-shallow feature extraction module comprises a fifth Conv module connected to the second Conv module, and two parallel chains each consisting of a 3×3 convolution layer with step length 1, a 3×3 convolution layer with step length 2 and a 3×3 convolution layer with step length 5, sequentially connected after the fifth Conv module; the output of each chain's step-length-5 layer is multiplied with the output of the fifth Conv module, each product is then added to the output of the fifth Conv module, and the two addition results are concatenated in series to give the output, wherein the fifth Conv module is a convolution module consisting of a sequentially connected Conv2d, BN layer and SiLU;
the time sequence neural unit receives the activated hidden layer state vector a_{t-1} at time t-1 output by the third Conv module and the input vector x_t at time t, multiplies a_{t-1} with V_ah and x_t with V_xh, and adds the two products to b_h to obtain the hidden layer state h_t, i.e. h_t = V_xh·x_t + V_ah·a_{t-1} + b_h; the hidden layer state is then processed by the hyperbolic tangent function tanh to obtain the activated hidden layer state vector a_t = tanh(h_t) at time t, which is also output; a_t is multiplied with V_ao and b_o is added to obtain the state vector of the output node, c_t = V_ao·a_t + b_o, and c_t is converted into the output label vector by a softmax calculation; wherein V_xh denotes the weight matrix from the K input nodes to the N hidden nodes, V_ah denotes the weight matrix connecting the N hidden nodes at time t-1 to the N hidden nodes at time t, b_h denotes the hidden layer bias before activation, b_o denotes the output bias, and V_ao denotes the weight matrix from the activated hidden nodes to the output nodes.
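Read as equations, the recurrence above is h_t = V_xh·x_t + V_ah·a_{t-1} + b_h, a_t = tanh(h_t) and c_t = V_ao·a_t + b_o, followed by a softmax. A minimal NumPy sketch of such a recurrent cell follows; the random initialization and shapes are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

class TemporalUnit:
    """Plain recurrent cell following the claimed recurrence with
    K input nodes, N hidden nodes and L output nodes."""
    def __init__(self, K, N, L, rng=None):
        rng = rng or np.random.default_rng(0)
        self.V_xh = rng.normal(0.0, 0.1, (N, K))   # input -> hidden
        self.V_ah = rng.normal(0.0, 0.1, (N, N))   # hidden(t-1) -> hidden(t)
        self.V_ao = rng.normal(0.0, 0.1, (L, N))   # activated hidden -> output
        self.b_h, self.b_o = np.zeros(N), np.zeros(L)

    def step(self, x_t, a_prev):
        h_t = self.V_xh @ x_t + self.V_ah @ a_prev + self.b_h
        a_t = np.tanh(h_t)                          # activated hidden state
        c_t = self.V_ao @ a_t + self.b_o            # output-node state vector
        return a_t, softmax(c_t)                    # a_t feeds the next step
```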
Further, the specific steps of step 1 are as follows:
step 1.1, based on the acquired high-point monitoring video, an all-weather forest fire database based on a forest fire classification system is established through rough labeling, rendering, training, feedback, fine tuning and enhancement, and the method comprises the following specific steps:
step 1.11, manually analyzing the expression in video images of the features of smoke and fire in forest fire scenes in the high-point monitoring video, and preliminarily establishing a coarse annotation database through manual annotation, wherein the features comprise color, shape and texture features; color is analyzed through the corresponding color histogram, color set, color moments and color coherence vector; shape is obtained using the boundary feature method, Fourier shape descriptors, shape geometric parameters and the finite element method; and texture is analyzed using the gray-level co-occurrence matrix, energy spectrum function, random field model, autoregressive texture model and wavelet transform;
step 1.12, manually analyzing the heterogeneous feature expression caused by different types and interference in forest fire scenes in the high-point monitoring video, and performing diversified rendering on the data in the coarse annotation database, wherein the types are divided into coniferous forest fires, mixed coniferous and broadleaf forest fires and broadleaf forest fires by forest land type; into surface fires, crown fires and underground fires by fire position; into forest fires, general forest fires, major forest fires and extra-large forest fires by damaged forest area; and into daytime forest fires and night forest fires by time of occurrence; and the feature heterogeneity comprises light intensity, scale difference, smoke concentration and smoke-like and fire-like objects;
step 1.13, learning the knowledge in the coarse annotation database with a neural network model, detecting unlabeled data, feeding back misclassification cases, finely annotating the data by fine tuning, and enhancing the data with the album tool; after enhancement, if the requirements are met, a forest fire detection data set with diversified features is obtained; otherwise, return to step 1.11;
step 1.2, training the multi-scale and multi-dimensional feature extraction network on the forest fire detection data set, and inputting the high-point monitoring video to be identified into the trained network to obtain the smoke and flame object detection frames; at the same time, the two-dimensional pixel coordinates of the vertices of the flame detection frame are calculated, and the central two-dimensional pixel coordinates of the detection frame are calculated from those vertex coordinates, finally obtaining the detection frames of smoke objects and flame objects and the central two-dimensional pixel coordinates of the flame object.
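For the center coordinate calculation in step 1.2, a minimal sketch follows, assuming an axis-aligned detection frame given by two opposite vertex pixel coordinates (x1, y1) and (x2, y2):

```python
def box_center(x1, y1, x2, y2):
    """Central two-dimensional pixel coordinate of an axis-aligned
    detection frame from two opposite vertex pixel coordinates."""
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

# e.g. box_center(100, 40, 180, 120) -> (140.0, 80.0)
```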
Further, the specific steps of step 2 are as follows:
step 2.1, establishing different initial positioning methods of the fire points based on the diffusion characteristics of the smoke and the flame of different forest fires, wherein the specific steps are as follows:
step 2.11, analyzing the diffusion characteristics of smoke and flame in the forest fire detection data set by the optical flow method, that is, dividing the diffusion models of forest fire smoke and flame into a triangular diffusion model, a diffuse diffusion model and a radiation diffusion model by calculating histograms of the optical flow intensity and the optical flow direction angle;
the formulas of the optical flow intensity and the optical flow direction angle are as follows:
L(i,j) = u(i,j)² + v(i,j)²
α(i,j) = arctan(v(i,j) / u(i,j))
wherein L(i,j) denotes the optical flow intensity, α(i,j) denotes the optical flow direction angle, and u(i,j) and v(i,j) denote the transverse and longitudinal optical flow vectors at pixel (i,j), respectively; a concentrated distribution of optical flow intensity and direction angle corresponds to the triangular diffusion model, an irregular distribution to the diffuse diffusion model, and a uniform distribution to the radiation diffusion model (see the computational sketch after step 2.12);
step 2.12, establishing different initial fire point positioning methods for the different diffusion models, including a boundary-centerline feature line positioning method for the triangular diffusion model, a centroid movement offset positioning method for the diffuse diffusion model, and a discrete seed point positioning method for the radiation diffusion model;
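The optical flow statistics of step 2.11 can be sketched as follows; OpenCV's Farneback dense optical flow is used here as a stand-in for whichever optical flow method is actually employed, and mapping the histogram shape (concentrated, irregular or uniform) to the three diffusion models is left as the downstream decision described above.

```python
import cv2
import numpy as np

def flow_statistics(prev_gray, next_gray, bins=36):
    """Compute L(i,j) = u(i,j)^2 + v(i,j)^2 and the direction angle
    alpha(i,j) = arctan(v/u) per pixel, plus a direction histogram
    weighted by intensity, from two consecutive grayscale frames."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    u, v = flow[..., 0], flow[..., 1]
    intensity = u ** 2 + v ** 2
    angle = np.arctan2(v, u)                 # direction angle in (-pi, pi]
    hist, _ = np.histogram(angle, bins=bins, range=(-np.pi, np.pi),
                           weights=intensity)
    return intensity, angle, hist
```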
step 2.2, providing a forest fire positioning method combining the forest fire smoke diffusion model and the video three-dimensional grid, wherein the method comprises the following specific steps:
step 2.21, establishing a different simulated motion model for each diffusion model, and mapping the diffusion of smoke and flame in the real world onto a digital three-dimensional model in terms of diffusion speed and direction, diffusion mode and spatial topological relation constraints; that is, the diffusion speed and direction of the simulated motion model are obtained by mathematical calculation combined with image processing, and the forest fire smoke and flame in the real world are then mapped onto the digital three-dimensional model under the diffusion mode and spatial topological relation constraints of the different diffusion models;
step 2.22, analyzing the adjacency, association and containment topological relations between smoke and flame and the forest structure in the digital three-dimensional model based on a geospatial topological relation analysis method, and giving a spatial topological semantic description of these relations;
step 2.23, performing camera calibration and distortion correction according to the high-point monitoring camera parameters; on the premise of being constrained by the detection frames of the smoke and flame objects, the boundaries of the smoke and flame objects are extracted with the bwboundaries edge extraction function, the smoke centerline is obtained from the boundary, and the initial positioning of the smoke object in the two-dimensional image space is realized using the diffusion model and the corresponding initial fire point positioning method, while initial positioning is also performed based on the two-dimensional coordinates of the center point of the flame object; the camera parameters comprise the pitch angle, yaw angle and height of the camera; the smoke centerline is obtained by extracting the smoke edge, determining the smoke diffusion motion trend, taking the center point of all edge points, forming a straight line from this center point with the slope of the smoke motion direction, separating the smoke edge points into left and right sets with this line, and fitting the two edges; the final fitted straight line is the centerline of the smoke (see the centerline sketch after step 2.25);
step 2.24, converting the image coordinate system by using the imaging mechanism of the high-point monitoring camera from three-dimensional space to the two-dimensional plane, combined with a digital elevation model, to obtain position information in the camera and world coordinate systems; setting a reference point in physical space, establishing a pixel-image-camera-world coordinate back-calculation model through the reference coordinates, and combining the constraints of the spatial topological semantic description with the preliminary positioning of the smoke and flame objects, thereby realizing the three-dimensional spatial positioning of the forest fire smoke and flame objects;
step 2.25, based on the three-dimensional spatial positioning of the forest fire smoke and flame objects, establishing a mapping from the longitude, latitude and height of the smoke and flame objects to three-dimensional grid position codes, that is, realizing the correspondence and mutual conversion from video pixel coordinates to three-dimensional position codes; after conversion, the forest fire driven by the video three-dimensional grid is accurately positioned.
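The smoke centerline construction of step 2.23 can be sketched as below; cv2.findContours stands in for MATLAB's bwboundaries, the smoke motion direction is assumed to be known from the diffusion analysis, and averaging the two fitted edge lines is one simple reading of the "final straight line".

```python
import cv2
import numpy as np

def smoke_centerline(mask, motion_dir):
    """Extract smoke boundary points, split them into left/right sets
    about the line through their centroid along motion_dir, fit each
    set as y = a*x + b, and return the averaged (a, b) centerline."""
    contours, _ = cv2.findContours(mask.astype(np.uint8),
                                   cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    pts = np.vstack([c.reshape(-1, 2) for c in contours]).astype(float)
    centroid = pts.mean(axis=0)
    d = np.asarray(motion_dir, float)
    d /= np.linalg.norm(d)
    n = np.array([-d[1], d[0]])              # normal to the motion direction
    side = (pts - centroid) @ n              # signed side of the split line
    left, right = pts[side < 0], pts[side >= 0]
    fit_l = np.polyfit(left[:, 0], left[:, 1], 1)   # assumes non-vertical edges
    fit_r = np.polyfit(right[:, 0], right[:, 1], 1)
    return (fit_l + fit_r) / 2.0             # slope and intercept of centerline
```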
A forest fire detection and positioning system based on high-point monitoring video comprises the following modules:
A network construction and detection module: constructing a forest fire detection data set based on the acquired high-point monitoring video, and training a constructed multi-scale and multi-dimensional feature extraction network fusing video spatial features and time sequence features to identify the detection frames of smoke and flame in the high-point monitoring video to be identified, finally obtaining the detection frames of smoke objects and flame objects and the central two-dimensional pixel coordinates of the flame object in each frame image;
A forest fire accurate positioning module: accurately positioning the forest fire, driven by the video stereoscopic grid, based on the detection frames of the smoke objects and flame objects and the central two-dimensional pixel coordinates of the flame object in each frame image.
Further, the multi-scale and multi-dimensional feature extraction network fusing video spatial features and time sequence features constructed in the network construction and detection module comprises an input layer, a Focus module, a first Conv module, a global-local feature extraction module, a second Conv module, a deep-shallow feature extraction module, a third Conv module, a time sequence neural unit, a fourth Conv module, a pooling module, a depth and receptive field enhancement module and an output layer which are sequentially connected, wherein the first to fourth Conv modules are convolution modules each consisting of a sequentially connected Conv2d, BN layer and SiLU.
Further, the global-local feature extraction module comprises a local feature extraction module and a global feature extraction module each connected to the first Conv module, a 3×3 convolution layer that receives and adds the outputs of the local and global feature extraction modules, and a 1×1 convolution layer sequentially connected after the 3×3 convolution layer. The local feature extraction module comprises a 3×3 convolution layer and a 1×1 convolution layer that each convolve the input feature map, and a batch normalization layer connected after each of them; the outputs of the two batch normalization layers are added to obtain the local features. The global feature extraction module sequentially performs size transformation, a linearization operation and a projection layer mapping on the input feature map to obtain feature maps Q, K and V, multiplies the depth-normalized Q with the depth-normalized K, applies a size transformation to the product, multiplies the result with the size-transformed feature map V, and applies a scaling operation in which each pixel of the resulting feature map corresponds to several pixels of the original input image, obtaining the global features in the final feature map;
The deep-shallow feature extraction module comprises a fifth Conv module connected to the second Conv module, and two parallel chains each consisting of a 3×3 convolution layer with step length 1, a 3×3 convolution layer with step length 2 and a 3×3 convolution layer with step length 5, sequentially connected after the fifth Conv module; the output of each chain's step-length-5 layer is multiplied with the output of the fifth Conv module, each product is then added to the output of the fifth Conv module, and the two addition results are concatenated in series to give the output, wherein the fifth Conv module is a convolution module consisting of a sequentially connected Conv2d, BN layer and SiLU;
the time sequence neural unit receives the activated hidden layer state vector a_{t-1} at time t-1 output by the third Conv module and the input vector x_t at time t, multiplies a_{t-1} with V_ah and x_t with V_xh, and adds the two products to b_h to obtain the hidden layer state h_t, i.e. h_t = V_xh·x_t + V_ah·a_{t-1} + b_h; the hidden layer state is then processed by the hyperbolic tangent function tanh to obtain the activated hidden layer state vector a_t = tanh(h_t) at time t, which is also output; a_t is multiplied with V_ao and b_o is added to obtain the state vector of the output node, c_t = V_ao·a_t + b_o, and c_t is converted into the output label vector by a softmax calculation; wherein V_xh denotes the weight matrix from the K input nodes to the N hidden nodes, V_ah denotes the weight matrix connecting the N hidden nodes at time t-1 to the N hidden nodes at time t, b_h denotes the hidden layer bias before activation, b_o denotes the output bias, and V_ao denotes the weight matrix from the activated hidden nodes to the output nodes.
Further, the specific implementation steps of the network construction and detection module are as follows:
step 1.1, based on the acquired high-point monitoring video, an all-weather forest fire database based on a forest fire classification system is established through rough labeling, rendering, training, feedback, fine tuning and enhancement, and the method comprises the following specific steps:
step 1.11, manually analyzing the expression in video images of the features of smoke and fire in forest fire scenes in the high-point monitoring video, and preliminarily establishing a coarse annotation database through manual annotation, wherein the features comprise color, shape and texture features; color is analyzed through the corresponding color histogram, color set, color moments and color coherence vector; shape is obtained using the boundary feature method, Fourier shape descriptors, shape geometric parameters and the finite element method; and texture is analyzed using the gray-level co-occurrence matrix, energy spectrum function, random field model, autoregressive texture model and wavelet transform;
step 1.12, manually analyzing the heterogeneous feature expression caused by different types and interference in forest fire scenes in the high-point monitoring video, and performing diversified rendering on the data in the coarse annotation database, wherein the types are divided into coniferous forest fires, mixed coniferous and broadleaf forest fires and broadleaf forest fires by forest land type; into surface fires, crown fires and underground fires by fire position; into forest fires, general forest fires, major forest fires and extra-large forest fires by damaged forest area; and into daytime forest fires and night forest fires by time of occurrence; and the feature heterogeneity comprises light intensity, scale difference, smoke concentration and smoke-like and fire-like objects;
step 1.13, learning the knowledge in the coarse annotation database with a neural network model, detecting unlabeled data, feeding back misclassification cases, finely annotating the data by fine tuning, and enhancing the data with the album tool; after enhancement, if the requirements are met, a forest fire detection data set with diversified features is obtained; otherwise, return to step 1.11;
step 1.2, training the multi-scale and multi-dimensional feature extraction network on the forest fire detection data set, and inputting the high-point monitoring video to be identified into the trained network to obtain the smoke and flame object detection frames; at the same time, the two-dimensional pixel coordinates of the vertices of the flame detection frame are calculated, and the central two-dimensional pixel coordinates of the detection frame are calculated from those vertex coordinates, finally obtaining the detection frames of smoke objects and flame objects and the central two-dimensional pixel coordinates of the flame object.
Further, the specific implementation steps of the forest fire accurate positioning module are as follows:
step 2.1, establishing different initial positioning methods of the fire points based on the diffusion characteristics of the smoke and the flame of different forest fires, wherein the specific steps are as follows:
step 2.11, analyzing the diffusion characteristics of smoke and flame in the forest fire detection data set by the optical flow method, that is, dividing the diffusion models of forest fire smoke and flame into a triangular diffusion model, a diffuse diffusion model and a radiation diffusion model by calculating histograms of the optical flow intensity and the optical flow direction angle;
the formulas of the optical flow intensity and the optical flow direction angle are as follows:
L(i,j) = u(i,j)² + v(i,j)²
α(i,j) = arctan(v(i,j) / u(i,j))
wherein L(i,j) denotes the optical flow intensity, α(i,j) denotes the optical flow direction angle, and u(i,j) and v(i,j) denote the transverse and longitudinal optical flow vectors at pixel (i,j), respectively; a concentrated distribution of optical flow intensity and direction angle corresponds to the triangular diffusion model, an irregular distribution to the diffuse diffusion model, and a uniform distribution to the radiation diffusion model;
step 2.12, establishing different initial fire point positioning methods for the different diffusion models, including a boundary-centerline feature line positioning method for the triangular diffusion model, a centroid movement offset positioning method for the diffuse diffusion model, and a discrete seed point positioning method for the radiation diffusion model;
step 2.2, providing a forest fire positioning method combining the forest fire smoke diffusion model and the video three-dimensional grid, wherein the method comprises the following specific steps:
step 2.21, establishing a different simulated motion model for each diffusion model, and mapping the diffusion of smoke and flame in the real world onto a digital three-dimensional model in terms of diffusion speed and direction, diffusion mode and spatial topological relation constraints; that is, the diffusion speed and direction of the simulated motion model are obtained by mathematical calculation combined with image processing, and the forest fire smoke and flame in the real world are then mapped onto the digital three-dimensional model under the diffusion mode and spatial topological relation constraints of the different diffusion models;
step 2.22, analyzing the adjacency, association and containment topological relations between smoke and flame and the forest structure in the digital three-dimensional model based on a geospatial topological relation analysis method, and giving a spatial topological semantic description of these relations;
step 2.23, performing camera calibration and distortion correction according to the high-point monitoring camera parameters; on the premise of being constrained by the detection frames of the smoke and flame objects, the boundaries of the smoke and flame objects are extracted with the bwboundaries edge extraction function, the smoke centerline is obtained from the boundary, and the initial positioning of the smoke object in the two-dimensional image space is realized using the diffusion model and the corresponding initial fire point positioning method, while initial positioning is also performed based on the two-dimensional coordinates of the center point of the flame object; the camera parameters comprise the pitch angle, yaw angle and height of the camera; the smoke centerline is obtained by extracting the smoke edge, determining the smoke diffusion motion trend, taking the center point of all edge points, forming a straight line from this center point with the slope of the smoke motion direction, separating the smoke edge points into left and right sets with this line, and fitting the two edges; the final fitted straight line is the centerline of the smoke;
step 2.24, converting the image coordinate system by using the imaging mechanism of the high-point monitoring camera from three-dimensional space to the two-dimensional plane, combined with a digital elevation model, to obtain position information in the camera and world coordinate systems; setting a reference point in physical space, establishing a pixel-image-camera-world coordinate back-calculation model through the reference coordinates, and combining the constraints of the spatial topological semantic description with the preliminary positioning of the smoke and flame objects, thereby realizing the three-dimensional spatial positioning of the forest fire smoke and flame objects;
step 2.25, based on the three-dimensional spatial positioning of the forest fire smoke and flame objects, establishing a mapping from the longitude, latitude and height of the smoke and flame objects to three-dimensional grid position codes, that is, realizing the correspondence and mutual conversion from video pixel coordinates to three-dimensional position codes; after conversion, the forest fire driven by the video three-dimensional grid is accurately positioned.
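For the pixel-image-camera-world back-calculation of step 2.24, the following is a minimal pinhole-model sketch; the intrinsic matrix K and pose (R, t) are assumed to come from the camera calibration of step 2.23, and dem_height is a hypothetical interface to the digital elevation model.

```python
import numpy as np

def pixel_to_world_ray(u, v, K, R, t):
    """Back-project pixel (u, v) to a world-space viewing ray under the
    pinhole model s*[u, v, 1]^T = K (R X_w + t)."""
    d_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])  # ray in camera frame
    d_world = R.T @ d_cam                             # rotate to world frame
    origin = -R.T @ t                                 # camera center in world
    return origin, d_world / np.linalg.norm(d_world)

def intersect_dem(origin, direction, dem_height, step=1.0, max_range=5000.0):
    """March along the ray until it reaches the terrain surface;
    dem_height(x, y) -> elevation is an assumed DEM lookup."""
    s = 0.0
    while s < max_range:
        p = origin + s * direction
        if p[2] <= dem_height(p[0], p[1]):
            return p                                  # ground position
        s += step
    return None
```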
Compared with the prior art, the invention has the beneficial effects that:
1. The invention fully considers the dimensional difference and scale change characteristics of forest fire smoke from the three dimensions of depth, width and resolution, and establishes a target recognition network model combining multi-dimensional feature extraction and multi-scale feature fusion to realize high-precision detection of early forest fire smoke under complex interference conditions, specifically:
(1) The all-weather forest fire database (i.e. the forest fire detection data set) based on a forest fire classification system is established with a cyclic iterative optimization mechanism of rough labeling, rendering, training, feedback, fine tuning and enhancement; this mechanism is used to build a large-scale forest fire disaster information detection database covering all-weather scenes with diversified features;
(2) The establishment of a multi-modal target detection network model fusing multi-scale features (i.e. the multi-scale and multi-dimensional feature extraction network fusing video spatial features and time sequence features) is the core algorithm for accurately identifying forest fires driven by video data. Its core is to comprehensively consider global-local, deep-shallow and video time sequence features, realizing multi-modal information extraction for forest fire smoke and flame objects in multiple scenes. The model establishes a global-local feature extraction module, a deep-shallow feature extraction module and a time sequence neural unit from the perspectives of video global-local features, deep-shallow features and video time sequence. Compared with other methods, it comprehensively acquires the multi-modal features of forest fire smoke and flame objects in video and comprehensively explores the saliency characterization of smoke and flame objects in complex forest fire scenes, so that the objects can be clearly described and rapidly and accurately detected when a forest fire occurs. On the constructed data set, the model improves detection accuracy over prior state-of-the-art methods while remaining computationally efficient, making it well suited for real-time deployment on edge devices.
2. The invention analyzes the spatial topological relation between geographic position and camera pose and considers the diffusion characteristics of early forest fire smoke to construct a forest fire positioning method combining smoke diffusion characteristics with the video three-dimensional grid, breaking through the limitations of terrain conditions and observation point visibility and realizing accurate positioning of early forest fires in video images, specifically:
(1) Preliminary positioning methods suitable for different conditions are formulated according to the type of smoke and flame diffusion model, and different diffusion models of smoke and flame are developed to adapt to various terrain and climate conditions, for example models designed for the influence of variables such as wind speed, humidity and temperature;
(2) Three-dimensional accurate positioning is performed by combining the smoke and flame diffusion model with the video three-dimensional grid: a high-precision three-dimensional spatial model is constructed using video three-dimensional grid technology to more accurately simulate and analyze the diffusion of fire smoke and flame, and the diffusion model is combined with the grid to realize more accurate three-dimensional positioning.
Drawings
FIG. 1 illustrates the general idea of the invention;
FIG. 2 is a diagram of the logical relationship in the present invention;
FIG. 3 is a schematic diagram of the precise positioning of a forest fire driven by a video stereoscopic grid based on the detection frames of smoke objects and flame objects and the central two-dimensional pixel coordinates of the flame objects in each frame of image in the invention;
FIG. 4 is a schematic structural diagram of the multi-scale and multi-dimensional feature extraction network fusing video spatial features and time sequence features, wherein the Focus module performs a slicing operation on the picture: an original 640×640×3 image input to the Focus module first becomes a 320×320×12 feature map through slicing; the Conv module is a convolution module consisting of Conv2d (a two-dimensional convolution layer), a BN layer (batch normalization) and SiLU (an activation function); SPP is the pooling module; and C3 is the depth and receptive field enhancement module;
FIG. 5 is a schematic diagram of the global-local feature extraction module, where D and D' denote the depth before and after transformation, H and H' denote the height before and after transformation, and W denotes the width; N denotes the batch size, i.e. the number of pictures in one batch; DepthNorm denotes depth normalization; Reshape transforms the size of the feature map; Linear denotes the linearization operation; Project denotes the projection layer mapping; Scale indicates how many pixels of the original input image each pixel of the layer's feature map corresponds to; Conv1×1 denotes a convolution kernel of size 1×1, Conv3×3 a convolution kernel of size 3×3, and Batch Norm batch normalization;
FIG. 6 is a schematic structural diagram of the deep-shallow feature extraction module, where C denotes the number of channels, H the height, W the width, and BN batch normalization; ReLU is a common activation function in deep learning; s denotes the step size; and Concate denotes the serial concatenation realizing multi-feature extraction and fusion;
FIG. 7 is a schematic diagram of the time sequence neural unit, where the recurrent neural network has K input nodes, N hidden layer nodes and L output nodes; for time t, the input vector is x_t and the hidden layer state is h_t; after transformation by the activation function and the fully connected layer, the output is obtained through the softmax function; V_xh denotes the weight matrix between the K input nodes and the N hidden nodes, and V_ah denotes the weight matrix connecting the N hidden nodes at time t-1 to the N hidden nodes at time t; a_t is the activated hidden layer state vector, and c_t is the state vector of the output node, which is converted into the output label vector after softmax calculation;
FIG. 8 is an exemplary diagram of the present invention employing boundary-centerline feature line localization;
FIG. 9 is an exemplary diagram of an application of centroid shifting in the present invention;
FIG. 10 is an exemplary diagram of the application of the discrete seed point method in the present invention.
Detailed Description
The invention will be further described with reference to the drawings and detailed description.
Aiming at the requirements of automatic detection and positioning of early forest fires, and at the problems of poor extraction and positioning accuracy and low efficiency for forest fire smoke under complex background conditions, an efficient forest fire smoke recognition algorithm combining multi-dimensional feature extraction and multi-scale feature fusion is designed, and a video three-dimensional grid method for accurately positioning the forest fire point based on forest fire smoke diffusion characteristics is provided, realizing accurate recognition and positioning of early forest fires and supporting forest fire emergency response and disaster relief.
As shown in fig. 1 and 2, the overall scheme comprises: 1) constructing a forest fire detection data set based on the acquired high-point monitoring video, and training a constructed multi-scale and multi-dimensional feature extraction network fusing video spatial features and time sequence features to identify the detection frames of smoke and flame in the high-point monitoring video to be identified, finally obtaining the detection frames of smoke objects and flame objects and the central two-dimensional pixel coordinates of the flame object in each frame image; 2) accurately positioning the forest fire, driven by the video stereoscopic grid, based on these detection frames and central coordinates. In the data set construction stage, various complex background conditions are fully considered: the visual similarity of forest fire smoke to clouds, fog and other objects, small long-distance targets in the early stage, different illumination conditions, weather conditions, dynamic backgrounds, and so on. In the forest fire smoke feature analysis stage, the triangular diffusion characteristics and the motion direction and speed characteristics of forest fire smoke are fully considered and analyzed. In the stage of forest fire smoke recognition under complex background conditions, considering that high-point monitoring video is shot at a distance, that the detection precision of existing target detection networks for small long-distance targets needs improvement, and that clouds and fog are easily misidentified as fire smoke, a multi-scale and multi-dimensional feature extraction network is constructed for these problems. In the forest fire accurate positioning stage, aiming at the problems that forest fire positioning based on infrared images is affected by season and climate and that single-point and double-point positioning methods have low accuracy, a forest fire detection and positioning method based on high-point monitoring video is provided, improving forest fire positioning accuracy.
The background of forest fire video data is complex, and high-point monitoring video is affected by cloud, rain, fog and illumination, which seriously interferes with the smoke features of early forest fires. How to reveal the dimensional difference and scale change characteristics of early forest fire smoke in video images and realize high-precision recognition of forest fire smoke is therefore a key problem. The scheme fully considers these characteristics from the three dimensions of depth, width and resolution, and establishes a target recognition network model combining multi-dimensional feature extraction and multi-scale feature fusion to realize high-precision detection of early forest fire smoke under complex interference conditions. A forest fire detection data set is constructed based on the acquired high-point monitoring video, and the constructed multi-scale and multi-dimensional feature extraction network fusing video spatial features and time sequence features is trained to identify the detection frames of smoke and flame in the high-point monitoring video to be identified, finally obtaining the detection frames of smoke objects and flame objects and the central two-dimensional pixel coordinates of the flame object in each frame image. The specific steps are as follows:
step 1.1, based on the acquired high-point monitoring video, an all-weather forest fire database based on a forest fire classification system is established through rough labeling, rendering, training, feedback, fine tuning and enhancement, and the method comprises the following specific steps:
step 1.11, manually analyzing the expression in video images of the features of smoke and fire in forest fire scenes in the high-point monitoring video, and preliminarily establishing a coarse annotation database through manual annotation, wherein the features comprise color, shape and texture features; color is analyzed through the corresponding color histogram, color set, color moments and color coherence vector; shape is obtained using the boundary feature method, Fourier shape descriptors, shape geometric parameters and the finite element method; and texture is analyzed using the gray-level co-occurrence matrix, energy spectrum function, random field model, autoregressive texture model and wavelet transform;
step 1.12, manually analyzing the heterogeneous feature expression caused by different types and interference in forest fire scenes in the high-point monitoring video, and performing diversified rendering on the data in the coarse annotation database, wherein the types are divided into coniferous forest fires, mixed coniferous and broadleaf forest fires and broadleaf forest fires by forest land type; into surface fires, crown fires and underground fires by fire position; into forest fires, general forest fires, major forest fires and extra-large forest fires by damaged forest area; and into daytime forest fires and night forest fires by time of occurrence; and the feature heterogeneity comprises light intensity, scale difference, smoke concentration and smoke-like and fire-like objects;
step 1.13, learning the knowledge in the coarse annotation database with a neural network model, detecting unlabeled data, feeding back misclassification cases, finely annotating the data by fine tuning, and enhancing the data with the album tool; after enhancement, if the requirements are met, a forest fire detection data set with diversified features is obtained; otherwise, return to step 1.11;
step 1.2, training the multi-scale and multi-dimensional feature extraction network on the forest fire detection data set, and inputting the high-point monitoring video to be identified into the trained network to obtain the smoke and flame object detection frames; at the same time, the two-dimensional pixel coordinates of the vertices of the flame detection frame are calculated, and the central two-dimensional pixel coordinates of the detection frame are calculated from those vertex coordinates, finally obtaining the detection frames of smoke objects and flame objects and the central two-dimensional pixel coordinates of the flame object.
As shown in fig. 4, the constructed multi-scale and multi-dimensional feature extraction network fusing video spatial features and time sequence features comprises an input layer, a Focus module, a first Conv module, a global-local feature extraction module, a second Conv module, a deep-shallow feature extraction module, a third Conv module, a time sequence neural unit, a fourth Conv module, a pooling module, a depth and receptive field enhancement module and an output layer which are sequentially connected, wherein the first to fourth Conv modules are convolution modules each consisting of a sequentially connected Conv2d, BN layer and SiLU.
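The Focus slicing can be written down directly; the sketch below reproduces the stated behaviour (an N×3×640×640 input becomes N×12×320×320 by stacking the four pixel-parity sub-images along the channel axis) and is not the patent's own code.

```python
import torch
import torch.nn as nn

class Focus(nn.Module):
    """Slice a feature map into its four pixel-parity sub-images and
    concatenate them along the channel axis: (C, H, W) -> (4C, H/2, W/2)."""
    def forward(self, x):
        return torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2],
                          x[..., ::2, 1::2], x[..., 1::2, 1::2]], dim=1)

# Focus()(torch.zeros(1, 3, 640, 640)).shape -> torch.Size([1, 12, 320, 320])
```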
As shown in fig. 5, the global-local feature extraction module comprises a local feature extraction module and a global feature extraction module each connected to the first Conv module, a 3×3 convolution layer that receives and adds the outputs of the local and global feature extraction modules, and a 1×1 convolution layer sequentially connected after the 3×3 convolution layer. The local feature extraction module comprises a 3×3 convolution layer and a 1×1 convolution layer that each convolve the input feature map, and a batch normalization layer connected after each of them; the outputs of the two batch normalization layers are added to obtain the local features. The global feature extraction module sequentially performs size transformation, a linearization operation and a projection layer mapping on the input feature map to obtain feature maps Q, K and V, multiplies the depth-normalized Q with the depth-normalized K, applies a size transformation to the product, multiplies the result with the size-transformed feature map V, and applies a scaling operation in which each pixel of the resulting feature map corresponds to several pixels of the original input image, obtaining the global features in the final feature map;
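A hedged sketch of this module follows: the local branch uses parallel 3×3 and 1×1 convolutions with batch normalization, the global branch a simplified single-head self-attention over flattened pixels, fused by 3×3 and 1×1 convolutions. The head count, the exact normalization (DepthNorm) and the fusion order are assumptions where the description is ambiguous.

```python
import torch
import torch.nn as nn

class GlobalLocalBlock(nn.Module):
    """Local conv branch plus a simplified Q/K/V attention branch,
    summed and fused by 3x3 then 1x1 convolutions."""
    def __init__(self, c):
        super().__init__()
        self.local3 = nn.Sequential(nn.Conv2d(c, c, 3, padding=1),
                                    nn.BatchNorm2d(c))
        self.local1 = nn.Sequential(nn.Conv2d(c, c, 1), nn.BatchNorm2d(c))
        self.qkv = nn.Linear(c, 3 * c)       # joint projection to Q, K, V
        self.fuse = nn.Sequential(nn.Conv2d(c, c, 3, padding=1),
                                  nn.Conv2d(c, c, 1))

    def forward(self, x):
        n, c, h, w = x.shape
        local = self.local3(x) + self.local1(x)
        seq = x.flatten(2).transpose(1, 2)   # N x HW x C token sequence
        q, k, v = self.qkv(seq).chunk(3, dim=-1)
        attn = torch.softmax(q @ k.transpose(1, 2) / c ** 0.5, dim=-1)
        glob = (attn @ v).transpose(1, 2).reshape(n, c, h, w)
        return self.fuse(local + glob)       # add, then 3x3 and 1x1 conv
```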
As shown in fig. 6, the deep-shallow feature extraction module comprises a fifth Conv module connected to the second Conv module, and two parallel chains each consisting of a 3×3 convolution layer with step length 1, a 3×3 convolution layer with step length 2 and a 3×3 convolution layer with step length 5, sequentially connected after the fifth Conv module; the output of each chain's step-length-5 layer is multiplied with the output of the fifth Conv module, each product is then added to the output of the fifth Conv module, and the two addition results are concatenated in series to give the output, wherein the fifth Conv module is a convolution module consisting of a sequentially connected Conv2d, BN layer and SiLU;
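A hedged sketch of this module follows. The claim's "step length" values 1, 2 and 5 are read here as dilation rates so that all feature maps keep the same size and the elementwise multiplication and addition are well defined; that reading, and the channel counts, are assumptions.

```python
import torch
import torch.nn as nn

class DeepShallowBlock(nn.Module):
    """Two parallel 3x3 convolution chains (rates 1, 2, 5) gate the
    fifth Conv output by elementwise multiplication, a residual
    addition follows, and the two branches are concatenated."""
    def __init__(self, c):
        super().__init__()
        def chain():
            return nn.Sequential(
                nn.Conv2d(c, c, 3, padding=1, dilation=1),
                nn.Conv2d(c, c, 3, padding=2, dilation=2),
                nn.Conv2d(c, c, 3, padding=5, dilation=5))
        self.branch_a, self.branch_b = chain(), chain()

    def forward(self, x5):                   # x5: output of the fifth Conv module
        a = self.branch_a(x5) * x5 + x5      # multiply, then add
        b = self.branch_b(x5) * x5 + x5
        return torch.cat([a, b], dim=1)      # series (channel) concatenation
```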
As shown in FIG. 7, the time sequence neural unit receives the activated hidden layer state vector a_{t-1} at time t-1 output by the third Conv module and the input vector x_t at time t, multiplies a_{t-1} with V_ah and x_t with V_xh, and adds the two products to b_h to obtain the hidden layer state h_t, i.e. h_t = V_xh·x_t + V_ah·a_{t-1} + b_h; the hidden layer state is then processed by the hyperbolic tangent function tanh to obtain the activated hidden layer state vector a_t = tanh(h_t) at time t, which is also output; a_t is multiplied with V_ao and b_o is added to obtain the state vector of the output node, c_t = V_ao·a_t + b_o, and c_t is converted into the output label vector by a softmax calculation; wherein V_xh denotes the weight matrix from the K input nodes to the N hidden nodes, V_ah denotes the weight matrix connecting the N hidden nodes at time t-1 to the N hidden nodes at time t, b_h denotes the hidden layer bias before activation, b_o denotes the output bias, and V_ao denotes the weight matrix from the activated hidden nodes to the output nodes.
Finally, efficient identification of early forest fires under complex background conditions is realized, information about the smoke and flame objects is obtained at the same time, and a comprehensive description of the early forest fire situation is finally obtained.
Forest fire positioning based on single-point or double-point mobile shooting has low accuracy, and forest fire positioning from infrared images is affected by season and climate. How to ascertain the spatial topological relation between geographic position and camera pose, and to construct a monitoring video stereoscopic grid to improve forest fire positioning accuracy, is therefore a key problem. By analyzing the spatial topological relation between geographic position and camera pose and considering the diffusion characteristics of early forest fire smoke, a forest fire positioning method combining smoke diffusion characteristics with the video three-dimensional grid is constructed, breaking through the limitations of terrain conditions and observation point visibility and realizing accurate positioning of early forest fires in video images. The forest fire driven by the video stereoscopic grid is accurately positioned based on the detection frames of the smoke and flame objects and the central two-dimensional pixel coordinates of the flame object in each frame image. As shown in fig. 3, the specific steps are as follows:
Step 2.1, establishing different initial positioning methods of the fire points based on the diffusion characteristics of the smoke and the flame of different forest fires, wherein the specific steps are as follows:
step 2.11, analyzing the diffusion characteristics of smoke and flame in the forest fire detection data set by an optical flow method: by computing histograms of optical flow intensity and optical flow direction angle, the diffusion models of forest fire smoke and flame are divided into a triangular diffusion model, a diffuse diffusion model and a radiation diffusion model;
the optical flow intensity and the optical flow direction angle are calculated as follows:

L(i,j) = u(i,j)² + v(i,j)²,  α(i,j) = arctan(v(i,j)/u(i,j))

wherein L(i,j) represents the optical flow intensity, α(i,j) the optical flow direction angle, and u(i,j) and v(i,j) the transverse and longitudinal optical flow vectors at pixel (i,j); a concentrated distribution of optical flow intensity and direction angle corresponds to the triangular diffusion model, an irregular distribution to the diffuse diffusion model, and a uniform distribution to the radiation diffusion model;
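As an illustration of this step, the sketch below computes dense Farneback optical flow with OpenCV and builds the two histograms. The Farneback parameters and bin count are arbitrary choices, and the model assignment in the closing comment paraphrases the rule above rather than implementing a calibrated classifier.

```python
import cv2
import numpy as np

def flow_histograms(prev_gray, curr_gray, bins=36):
    """Dense Farneback optical flow between two 8-bit grayscale frames,
    then histograms of L = u^2 + v^2 and alpha = arctan(v/u)."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    u, v = flow[..., 0], flow[..., 1]
    hist_l, _ = np.histogram(u ** 2 + v ** 2, bins=bins)
    hist_a, _ = np.histogram(np.arctan2(v, u), bins=bins, range=(-np.pi, np.pi))
    return hist_l, hist_a

# Paraphrasing the rule above: a sharply concentrated direction-angle histogram
# suggests the triangular model, an irregular one the diffuse model, and a
# near-uniform one the radiation model.
```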
step 2.12, establishing a different initial fire-point positioning method for each diffusion model: a boundary-centerline characteristic-line positioning method for the triangular diffusion model, a centroid-movement-offset positioning method for the diffuse diffusion model, and a discrete-seed-point positioning method for the radiation diffusion model, as shown in figures 8-10 (a minimal sketch of the centroid-offset idea follows below);
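The following is a hedged sketch of the centroid-movement-offset idea for the diffuse model: it estimates the per-frame drift of the smoke-mask centroid and extrapolates backwards against the drift towards the source. The function names, the binary-mask input format and the steps_back parameter are hypothetical.

```python
import numpy as np

def centroid(mask):
    """Centroid (x, y) of a binary smoke mask."""
    ys, xs = np.nonzero(mask)
    return np.array([xs.mean(), ys.mean()])

def centroid_offset_fire_point(masks, steps_back=5):
    """Estimate the per-frame drift of the smoke-mask centroid over a short
    clip and extrapolate backwards against the drift towards the source."""
    cs = np.stack([centroid(m) for m in masks])   # centroids, frame by frame
    drift = (cs[-1] - cs[0]) / (len(cs) - 1)      # mean per-frame motion
    return cs[-1] - steps_back * drift            # offset opposite the drift
```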
Step 2.2, providing a forest fire positioning method combining a forest fire smoke diffusion model and a video three-dimensional grid, wherein the method comprises the following specific steps of:
step 2.21, building a different simulated motion model for each diffusion model and mapping the real-world diffusion of smoke and flame onto a digital three-dimensional model in terms of diffusion speed and direction, diffusion mode, and spatial topological relation constraints; that is, the diffusion speed and direction of the simulated motion model are obtained by mathematical calculation combined with image processing, after which the real-world forest fire smoke and flame are mapped onto the digital three-dimensional model under the diffusion-mode and spatial topological constraints of each diffusion model;
step 2.22, analyzing the adjacency, association and containment spatial topological relations between the smoke/flame and the forest structure in the digital three-dimensional model by a geospatial topological analysis method, and giving these relations a spatial topological semantic description;
Step 2.23, performing camera calibration and distortion correction according to the high-point monitoring camera parameters (pitch angle, yaw angle and camera height); with the detection frames of the smoke and flame objects as constraints, extracting the boundaries of the smoke and flame objects with the bwboundaries edge-extraction function and deriving the smoke centerline from them; then, using the diffusion model and the corresponding initial fire-point positioning method, realizing initial positioning of the smoke object in the two-dimensional image space, while the flame object is initially positioned from its center-point two-dimensional coordinates. Here the smoke centerline is obtained as follows: after the smoke edge is extracted and the smoke diffusion trend determined, the midpoint of all fire edge points is taken; a straight line is formed through this midpoint with the slope of the smoke motion direction, the smoke edge points are separated by it into left and right fitting edges, and the final fitted line is the smoke centerline (a minimal sketch follows below);
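A minimal NumPy sketch of this centerline construction, under the assumption that the smoke drift direction has already been estimated (e.g., from the optical flow of step 2.11) and that edge points come as an (N, 2) array of pixel coordinates:

```python
import numpy as np

def smoke_centerline(edge_points, drift):
    """edge_points: (N, 2) array of (x, y) smoke edge pixels; drift: (2,)
    smoke motion direction. Returns a point on the centerline and its
    unit direction."""
    mid = edge_points.mean(axis=0)              # midpoint of all edge points
    d = drift / np.linalg.norm(drift)           # unit smoke-motion direction
    rel = edge_points - mid
    side = rel[:, 0] * d[1] - rel[:, 1] * d[0]  # signed side of the split line
    left, right = edge_points[side > 0], edge_points[side <= 0]
    anchor = 0.5 * (left.mean(axis=0) + right.mean(axis=0))  # balance the two edges
    return anchor, d                            # final fitted centerline
```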
Step 2.24, using the three-dimensional-space-to-two-dimensional-plane imaging mechanism of the high-point monitoring camera together with a digital elevation model, converting the image coordinate system to obtain position information in the camera and world coordinate systems; setting reference points in physical space and, from these reference coordinates, establishing a coordinate back-calculation model over the pixel-image-camera-world coordinate systems; combining the constraints of the spatial topological semantic description with the preliminary positioning of the smoke and flame objects then realizes three-dimensional spatial positioning of the forest fire smoke and flame objects (a simplified back-projection sketch follows below);
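The back-calculation chain can be illustrated with a simplified pinhole model. The sketch below intersects a pixel ray with a single horizontal plane as a stand-in for the DEM lookup (a real implementation would march the ray across the elevation model); K, R and cam_pos are assumed to come from the calibration of step 2.23.

```python
import numpy as np

def pixel_to_world(px, py, K, R, cam_pos, ground_z):
    """Back-project pixel (px, py) through a pinhole camera with intrinsics K,
    world-to-camera rotation R and camera position cam_pos, intersecting the
    ray with the horizontal plane z = ground_z (stand-in for a DEM lookup)."""
    ray_cam = np.linalg.inv(K) @ np.array([px, py, 1.0])  # ray in camera frame
    ray_world = R.T @ ray_cam                             # rotate into world frame
    t = (ground_z - cam_pos[2]) / ray_world[2]            # plane-intersection depth
    return cam_pos + t * ray_world                        # world coordinates
```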
Step 2.25, based on the three-dimensional spatial positioning of the smoke and flame objects, establishing a mapping from their longitude, latitude and altitude to three-dimensional grid position codes, i.e. realizing the correspondence and mutual conversion between video pixel coordinates and stereoscopic position codes (after the mapping is completed, the longitude, latitude and altitude information can be converted into video stereoscopic grid codes that the high-point monitoring equipment can identify and easily store); after the conversion, the video-stereoscopic-grid-driven forest fire is accurately positioned.
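The patent does not spell out its grid-code format, so the following is an illustrative encoding only: latitude and longitude quantised into a 2^level × 2^level grid and altitude into fixed-height layers, packed into a single integer key.

```python
def grid_code(lat, lon, alt, level=16, alt_step=10.0):
    """Quantise longitude/latitude into a 2^level x 2^level grid and altitude
    into alt_step-metre layers, then pack the three indices into one key."""
    n = 1 << level
    i = int((lon + 180.0) / 360.0 * n) % n        # column index
    j = int((lat + 90.0) / 180.0 * n) % n         # row index
    k = max(int(alt / alt_step), 0)               # altitude layer (clamped)
    return (k << (2 * level)) | (j << level) | i  # packed integer code

code = grid_code(30.6, 103.9, 1520.0)  # arbitrary example point
```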
In conclusion, the method is applicable under complex backgrounds and severe weather conditions and improves the accuracy and efficiency of smoke extraction and positioning for early forest fires; it is also well suited to detecting small, long-distance targets, with good detection performance; and the invention supports not only daytime scenes but also efficient, real-time monitoring at night.
To further improve detection and positioning stability, the system also integrates a real-time monitoring data set of the surrounding environment (wind speed, temperature, wind direction, humidity, air pressure, etc.), dynamically adjusts model parameters with these readings to reach the best monitoring and positioning effect, and updates the three-dimensional visualization of the forest fire in the system in time. The method comprises the following steps (a hedged sketch of the adjustment loop follows the list below):
real-time data integration: environmental data (such as wind speed, humidity, etc.) are monitored in real time, and model parameters are adjusted according to the data so as to improve the accuracy of fire detection.
Dynamic tracking and analysis: by tracking the dynamic changes of smoke and flame in real time and dynamically adjusting the three-dimensional grid, continuous and accurate fire source positioning is realized.
Visualization and aid decision making: an intuitive three-dimensional visual interface is provided, so that a decision maker is helped to quickly understand the development of fire and to formulate an effective coping strategy.
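A toy sketch of such environment-driven tuning is given below. The specific coefficients, the clamp range and the idea of adjusting the detector's confidence threshold are all hypothetical illustrations of the adjustment loop, not values from the patent.

```python
def adjusted_confidence_threshold(base=0.5, wind_speed=0.0, humidity=50.0):
    """Lower the detection confidence threshold in dry, windy conditions
    (favour recall when fires spread fast); raise it when humid and calm."""
    thr = base - 0.02 * min(wind_speed, 10.0)   # up to -0.2 for strong wind
    thr += 0.002 * max(humidity - 50.0, 0.0)    # up to +0.1 for high humidity
    return min(max(thr, 0.1), 0.9)              # clamp to a sane range

print(adjusted_confidence_threshold(wind_speed=8.0, humidity=20.0))  # 0.34
```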
The above are merely representative examples of the numerous specific applications of the present invention and should not be construed as limiting its scope in any way. All technical schemes formed by transformation or equivalent substitution fall within the protection scope of the invention.

Claims (10)

1. A forest fire detection and positioning method based on a high-point monitoring video is characterized by comprising the following steps:
Step 1, constructing a forest fire detection data set based on an acquired high-point monitoring video, and training a constructed multi-scale and multi-dimensional feature extraction network which fuses video space features and time sequence features to identify a detection frame of smoke and flame in the high-point monitoring video to be identified, so as to finally obtain a detection frame of a smoke object and a flame object and a central two-dimensional pixel coordinate of the flame object in each frame image;
and 2, accurately positioning the forest fire driven by the video stereoscopic grid based on the detection frames of the smoke object and the flame object and the central two-dimensional pixel coordinates of the flame object in each frame of image.
2. The forest fire detection and positioning method based on the high-point surveillance video according to claim 1, wherein the multi-scale and multi-dimensional feature extraction network for fusing the spatial features and the time sequence features of the video constructed in the step 1 comprises an input layer, a Focus module, a first Conv module, a global-local feature extraction module, a second Conv module, a deep-shallow feature extraction module, a third Conv module, a time sequence neural unit, a fourth Conv module, a pooling module, a depth and receptive field enhancement module and an output layer which are sequentially connected, wherein the first Conv module, the second Conv module, the third Conv module and the fourth Conv module are convolution modules consisting of sequentially connected Conv2d, BN layer and SiLU.
3. The forest fire detection and positioning method based on the high-point monitoring video according to claim 2, characterized in that the global-local feature extraction module comprises a local feature extraction module and a global feature extraction module which are each connected with the first Conv module, a 3×3 convolution layer which receives and adds the outputs of the local feature extraction module and the global feature extraction module, and a 1×1 convolution layer connected after the 3×3 convolution layer, wherein the local feature extraction module comprises a 3×3 convolution layer and a 1×1 convolution layer which each convolve the input feature map, with a batch normalization layer connected to each of them, the outputs of the two batch normalization layers being added to obtain the local features; the global feature extraction module sequentially performs size conversion, linearization and projection-layer mapping on the input feature map to obtain feature maps Q, K and V, multiplies the depth-normalized feature maps Q and K together, multiplies the result with the feature map V, and converts the product back to the original feature-map size, so that each pixel of the resulting feature map aggregates global context, yielding the global features;
The deep-shallow feature extraction module comprises a fifth Conv module connected with the second Conv module and two parallel branches, each consisting of 3×3 convolution layers with step lengths of 1, 2 and 5 sequentially connected to the fifth Conv module; the output of each branch's final (step length 5) 3×3 convolution layer is multiplied with the output of the fifth Conv module, the product is added to the output of the fifth Conv module, and the two resulting sums are connected in series to form the output, wherein the fifth Conv module is a convolution module composed of sequentially connected Conv2d, BN layer and SiLU;
the time sequence neural unit receives the activated hidden-layer state vector a_{t-1} at time t-1 output by the third Conv module and the input vector x_t at time t, multiplies a_{t-1} by V_{ah} and x_t by V_{xh}, and adds the two products to b_h to obtain the hidden-layer state h_t = V_{ah} a_{t-1} + V_{xh} x_t + b_h; the hidden-layer state is processed by the hyperbolic tangent function to obtain the activated hidden-layer state vector a_t = tanh(h_t); a_t is multiplied by V_{ao} and added to b_o to obtain the output-node state vector c_t = V_{ao} a_t + b_o, and c_t is converted by softmax into the output label vector ŷ_t = softmax(c_t), wherein V_{xh} is the weight matrix from the K input nodes to the N hidden nodes, V_{ah} is the weight matrix connecting the N hidden nodes at time t-1 to the N hidden nodes at time t, b_h is the hidden-layer bias before activation, b_o is the output-node bias, and V_{ao} is the weight matrix from the activated hidden nodes to the output nodes.
4. The forest fire detection and positioning method based on the high-point monitoring video according to claim 3, wherein the specific steps of the step 1 are as follows:
step 1.1, based on the acquired high-point monitoring video, an all-weather forest fire database based on a forest fire classification system is established through rough labeling, rendering, training, feedback, fine tuning and enhancement, and the method comprises the following specific steps:
step 1.11, manually analyzing how smoke and fire in forest fire scenes of the high-point monitoring video are expressed in video-image features, and preliminarily establishing a coarse annotation database through manual annotation, wherein the features comprise color, shape and texture features; color is obtained by analyzing the corresponding color histograms, color sets, color moments and color coherence vectors; shape is obtained by the boundary feature method, the Fourier shape descriptor method, the shape geometric parameter method and the finite element method; and texture is obtained by gray-level co-occurrence matrix, energy spectrum function, random field model, autoregressive texture model and wavelet transform analysis;
Step 1.12, manually analyzing the feature heterogeneity caused by different types of interference in forest fire scenes of the high-point monitoring video, and performing diversified rendering of the data in the coarse annotation database, wherein the fires are classified by forest land type into coniferous forest fires, mixed coniferous-broadleaf forest fires and broadleaf forest fires; by fire position into surface fires, crown fires and underground fires; by damaged forest area into general, large, major and extraordinarily large forest fires; and by time of occurrence into daytime and nighttime forest fires, and wherein the feature heterogeneity comprises light intensity, scale difference, smoke concentration and smoke-like or fire-like objects;
step 1.13, learning the coarse annotation database knowledge with a neural network model, detecting unlabeled data, feeding back misclassification cases, finely annotating the data by fine tuning, and enhancing the data with an image augmentation tool; if the enhanced data meet the requirements, a forest fire detection data set with diversified characteristics is obtained, otherwise execution returns to step 1.11;
Step 1.2, training a multi-scale and multi-dimensional feature extraction network based on a forest fire detection data set, inputting a high-point monitoring video to be identified into the trained multi-scale and multi-dimensional feature extraction network to obtain a smoke and flame object detection frame, simultaneously calculating the vertex two-dimensional pixel coordinates of the flame detection frame, calculating the center two-dimensional pixel coordinates of the detection frame based on the vertex two-dimensional pixel coordinates of the flame, and finally obtaining the detection frame of the smoke object and the flame object and the center two-dimensional pixel coordinates of the flame object.
5. The forest fire detection and positioning method based on the high-point monitoring video according to claim 4, wherein the specific steps of the step 2 are as follows:
step 2.1, establishing different initial positioning methods of the fire points based on the diffusion characteristics of the smoke and the flame of different forest fires, wherein the specific steps are as follows:
step 2.11, analyzing the diffusion characteristics of smoke and flame in the forest fire detection data set by an optical flow method: by computing histograms of optical flow intensity and optical flow direction angle, the diffusion models of forest fire smoke and flame are divided into a triangular diffusion model, a diffuse diffusion model and a radiation diffusion model;
The optical flow intensity and the optical flow direction angle are calculated as follows:

L(i,j) = u(i,j)² + v(i,j)²,  α(i,j) = arctan(v(i,j)/u(i,j))

wherein L(i,j) represents the optical flow intensity, α(i,j) the optical flow direction angle, and u(i,j) and v(i,j) the transverse and longitudinal optical flow vectors at pixel (i,j); a concentrated distribution of optical flow intensity and direction angle corresponds to the triangular diffusion model, an irregular distribution to the diffuse diffusion model, and a uniform distribution to the radiation diffusion model;
step 2.12, establishing a different initial fire-point positioning method for each diffusion model: a boundary-centerline characteristic-line positioning method for the triangular diffusion model, a centroid-movement-offset positioning method for the diffuse diffusion model, and a discrete-seed-point positioning method for the radiation diffusion model;
step 2.2, providing a forest fire positioning method combining a forest fire smoke diffusion model and a video three-dimensional grid, wherein the method comprises the following specific steps of:
step 2.21, building a different simulated motion model for each diffusion model and mapping the real-world diffusion of smoke and flame onto a digital three-dimensional model in terms of diffusion speed and direction, diffusion mode, and spatial topological relation constraints; that is, the diffusion speed and direction of the simulated motion model are obtained by mathematical calculation combined with image processing, after which the real-world forest fire smoke and flame are mapped onto the digital three-dimensional model under the diffusion-mode and spatial topological constraints of each diffusion model;
Step 2.22, analyzing the adjacency, association and containment spatial topological relations between the smoke/flame and the forest structure in the digital three-dimensional model by a geospatial topological analysis method, and giving these relations a spatial topological semantic description;
Step 2.23, performing camera calibration and distortion correction according to the high-point monitoring camera parameters, the camera parameters comprising the pitch angle, yaw angle and height of the camera; with the detection frames of the smoke and flame objects as constraints, extracting the boundaries of the smoke and flame objects with the bwboundaries edge-extraction function and deriving the smoke centerline from them; then, using the diffusion model and the corresponding initial fire-point positioning method, realizing initial positioning of the smoke object in the two-dimensional image space, while the flame object is initially positioned from its center-point two-dimensional coordinates, wherein the smoke centerline is obtained as follows: after the smoke edge is extracted and the smoke diffusion trend determined, the midpoint of all fire edge points is taken, a straight line is formed through this midpoint with the slope of the smoke motion direction, the smoke edge points are separated by it into left and right fitting edges, and the final fitted line is the smoke centerline;
Step 2.24, using the three-dimensional-space-to-two-dimensional-plane imaging mechanism of the high-point monitoring camera together with a digital elevation model, converting the image coordinate system to obtain position information in the camera and world coordinate systems; setting reference points in physical space and, from these reference coordinates, establishing a coordinate back-calculation model over the pixel-image-camera-world coordinate systems; combining the constraints of the spatial topological semantic description with the preliminary positioning of the smoke and flame objects then realizes three-dimensional spatial positioning of the forest fire smoke and flame objects;
Step 2.25, based on the three-dimensional spatial positioning of the forest fire smoke and flame objects, establishing a mapping from their longitude, latitude and altitude to three-dimensional grid position codes, i.e. realizing the correspondence and mutual conversion between video pixel coordinates and stereoscopic position codes; after the conversion, the video-stereoscopic-grid-driven forest fire is accurately positioned.
6. A forest fire detection and positioning system based on a high-point monitoring video is characterized by comprising the following steps:
and a network construction and detection module: constructing a forest fire detection data set based on the acquired high-point monitoring video, and training a constructed multi-scale and multi-dimensional feature extraction network integrating the video spatial features and the time sequence features to identify a detection frame of smoke and flame in the high-point monitoring video to be identified, so as to finally obtain a detection frame of a smoke object and a flame object and a central two-dimensional pixel coordinate of the flame object in each frame of image;
Accurate positioning module of forest fire: and (3) accurately positioning the forest fire driven by the video stereoscopic grid based on the detection frames of the smoke object and the flame object and the central two-dimensional pixel coordinates of the flame object in each frame of image.
7. The forest fire detection and positioning system based on the high-point surveillance video according to claim 6, wherein the multi-scale and multi-dimensional feature extraction network for fusing the spatial features and the time sequence features of the video constructed by the network construction and detection module comprises an input layer, a Focus module, a first Conv module, a global-local feature extraction module, a second Conv module, a deep-shallow feature extraction module, a third Conv module, a time sequence neural unit, a fourth Conv module, a pooling module, a depth and receptive field enhancement module and an output layer which are sequentially connected, wherein the first Conv module, the second Conv module, the third Conv module and the fourth Conv module are convolution modules consisting of sequentially connected Conv2d, BN layer and SiLU.
8. The forest fire detection and positioning system based on the high-point monitoring video according to claim 7, wherein the global-local feature extraction module comprises a local feature extraction module and a global feature extraction module which are each connected with the first Conv module, a 3×3 convolution layer which receives and adds the outputs of the local feature extraction module and the global feature extraction module, and a 1×1 convolution layer connected after the 3×3 convolution layer, wherein the local feature extraction module comprises a 3×3 convolution layer and a 1×1 convolution layer which each convolve the input feature map, with a batch normalization layer connected to each of them, the outputs of the two batch normalization layers being added to obtain the local features; the global feature extraction module sequentially performs size conversion, linearization and projection-layer mapping on the input feature map to obtain feature maps Q, K and V, multiplies the depth-normalized feature maps Q and K together, multiplies the result with the feature map V, and converts the product back to the original feature-map size, so that each pixel of the resulting feature map aggregates global context, yielding the global features;
The deep-shallow feature extraction module comprises a fifth Conv module connected with the second Conv module and two parallel branches, each consisting of 3×3 convolution layers with step lengths of 1, 2 and 5 sequentially connected to the fifth Conv module; the output of each branch's final (step length 5) 3×3 convolution layer is multiplied with the output of the fifth Conv module, the product is added to the output of the fifth Conv module, and the two resulting sums are connected in series to form the output, wherein the fifth Conv module is a convolution module composed of sequentially connected Conv2d, BN layer and SiLU;
the time sequence neural unit receives the activated hidden-layer state vector a_{t-1} at time t-1 output by the third Conv module and the input vector x_t at time t, multiplies a_{t-1} by V_{ah} and x_t by V_{xh}, and adds the two products to b_h to obtain the hidden-layer state h_t = V_{ah} a_{t-1} + V_{xh} x_t + b_h; the hidden-layer state is processed by the hyperbolic tangent function to obtain the activated hidden-layer state vector a_t = tanh(h_t); a_t is multiplied by V_{ao} and added to b_o to obtain the output-node state vector c_t = V_{ao} a_t + b_o, and c_t is converted by softmax into the output label vector ŷ_t = softmax(c_t), wherein V_{xh} is the weight matrix from the K input nodes to the N hidden nodes, V_{ah} is the weight matrix connecting the N hidden nodes at time t-1 to the N hidden nodes at time t, b_h is the hidden-layer bias before activation, b_o is the output-node bias, and V_{ao} is the weight matrix from the activated hidden nodes to the output nodes.
9. The forest fire detection and positioning system based on the high-point monitoring video according to claim 8, wherein the specific implementation steps of the network construction and detection module are as follows:
step 1.1, based on the acquired high-point monitoring video, an all-weather forest fire database based on a forest fire classification system is established through rough labeling, rendering, training, feedback, fine tuning and enhancement, and the method comprises the following specific steps:
step 1.11, manually analyzing how smoke and fire in forest fire scenes of the high-point monitoring video are expressed in video-image features, and preliminarily establishing a coarse annotation database through manual annotation, wherein the features comprise color, shape and texture features; color is obtained by analyzing the corresponding color histograms, color sets, color moments and color coherence vectors; shape is obtained by the boundary feature method, the Fourier shape descriptor method, the shape geometric parameter method and the finite element method; and texture is obtained by gray-level co-occurrence matrix, energy spectrum function, random field model, autoregressive texture model and wavelet transform analysis;
Step 1.12, manually analyzing the feature heterogeneity caused by different types of interference in forest fire scenes of the high-point monitoring video, and performing diversified rendering of the data in the coarse annotation database, wherein the fires are classified by forest land type into coniferous forest fires, mixed coniferous-broadleaf forest fires and broadleaf forest fires; by fire position into surface fires, crown fires and underground fires; by damaged forest area into general, large, major and extraordinarily large forest fires; and by time of occurrence into daytime and nighttime forest fires, and wherein the feature heterogeneity comprises light intensity, scale difference, smoke concentration and smoke-like or fire-like objects;
step 1.13, learning the coarse annotation database knowledge with a neural network model, detecting unlabeled data, feeding back misclassification cases, finely annotating the data by fine tuning, and enhancing the data with an image augmentation tool; if the enhanced data meet the requirements, a forest fire detection data set with diversified characteristics is obtained, otherwise execution returns to step 1.11;
Step 1.2, training a multi-scale and multi-dimensional feature extraction network based on a forest fire detection data set, inputting a high-point monitoring video to be identified into the trained multi-scale and multi-dimensional feature extraction network to obtain a smoke and flame object detection frame, simultaneously calculating the vertex two-dimensional pixel coordinates of the flame detection frame, calculating the center two-dimensional pixel coordinates of the detection frame based on the vertex two-dimensional pixel coordinates of the flame, and finally obtaining the detection frame of the smoke object and the flame object and the center two-dimensional pixel coordinates of the flame object.
10. The forest fire detection and positioning system based on high-point monitoring video according to claim 9, wherein the specific implementation steps of the forest fire accurate positioning module are as follows:
step 2.1, establishing different initial positioning methods of the fire points based on the diffusion characteristics of the smoke and the flame of different forest fires, wherein the specific steps are as follows:
step 2.11, analyzing the diffusion characteristics of smoke and flame in the forest fire detection data set by an optical flow method: by computing histograms of optical flow intensity and optical flow direction angle, the diffusion models of forest fire smoke and flame are divided into a triangular diffusion model, a diffuse diffusion model and a radiation diffusion model;
The optical flow intensity and the optical flow direction angle are calculated as follows:

L(i,j) = u(i,j)² + v(i,j)²,  α(i,j) = arctan(v(i,j)/u(i,j))

wherein L(i,j) represents the optical flow intensity, α(i,j) the optical flow direction angle, and u(i,j) and v(i,j) the transverse and longitudinal optical flow vectors at pixel (i,j); a concentrated distribution of optical flow intensity and direction angle corresponds to the triangular diffusion model, an irregular distribution to the diffuse diffusion model, and a uniform distribution to the radiation diffusion model;
step 2.12, establishing a different initial fire-point positioning method for each diffusion model: a boundary-centerline characteristic-line positioning method for the triangular diffusion model, a centroid-movement-offset positioning method for the diffuse diffusion model, and a discrete-seed-point positioning method for the radiation diffusion model;
step 2.2, providing a forest fire positioning method combining a forest fire smoke diffusion model and a video three-dimensional grid, wherein the method comprises the following specific steps of:
step 2.21, building a different simulated motion model for each diffusion model and mapping the real-world diffusion of smoke and flame onto a digital three-dimensional model in terms of diffusion speed and direction, diffusion mode, and spatial topological relation constraints; that is, the diffusion speed and direction of the simulated motion model are obtained by mathematical calculation combined with image processing, after which the real-world forest fire smoke and flame are mapped onto the digital three-dimensional model under the diffusion-mode and spatial topological constraints of each diffusion model;
Step 2.22, analyzing the adjacency, association and containment spatial topological relations between the smoke/flame and the forest structure in the digital three-dimensional model by a geospatial topological analysis method, and giving these relations a spatial topological semantic description;
Step 2.23, performing camera calibration and distortion correction according to the high-point monitoring camera parameters, the camera parameters comprising the pitch angle, yaw angle and height of the camera; with the detection frames of the smoke and flame objects as constraints, extracting the boundaries of the smoke and flame objects with the bwboundaries edge-extraction function and deriving the smoke centerline from them; then, using the diffusion model and the corresponding initial fire-point positioning method, realizing initial positioning of the smoke object in the two-dimensional image space, while the flame object is initially positioned from its center-point two-dimensional coordinates, wherein the smoke centerline is obtained as follows: after the smoke edge is extracted and the smoke diffusion trend determined, the midpoint of all fire edge points is taken, a straight line is formed through this midpoint with the slope of the smoke motion direction, the smoke edge points are separated by it into left and right fitting edges, and the final fitted line is the smoke centerline;
Step 2.24, using the three-dimensional-space-to-two-dimensional-plane imaging mechanism of the high-point monitoring camera together with a digital elevation model, converting the image coordinate system to obtain position information in the camera and world coordinate systems; setting reference points in physical space and, from these reference coordinates, establishing a coordinate back-calculation model over the pixel-image-camera-world coordinate systems; combining the constraints of the spatial topological semantic description with the preliminary positioning of the smoke and flame objects then realizes three-dimensional spatial positioning of the forest fire smoke and flame objects;
Step 2.25, based on the three-dimensional spatial positioning of the forest fire smoke and flame objects, establishing a mapping from their longitude, latitude and altitude to three-dimensional grid position codes, i.e. realizing the correspondence and mutual conversion between video pixel coordinates and stereoscopic position codes; after the conversion, the video-stereoscopic-grid-driven forest fire is accurately positioned.
CN202410055348.4A 2024-01-15 2024-01-15 Forest fire detection and positioning method and system based on high-point monitoring video Pending CN117876874A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410055348.4A CN117876874A (en) 2024-01-15 2024-01-15 Forest fire detection and positioning method and system based on high-point monitoring video

Publications (1)

Publication Number Publication Date
CN117876874A (en) 2024-04-12

Family

ID=90576980

Country Status (1)

Country Link
CN (1) CN117876874A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118097058A (en) * 2024-04-26 2024-05-28 应急管理部沈阳消防研究所 Fire point positioning method based on digital twinning
CN118135424A (en) * 2024-05-06 2024-06-04 中科星图智慧科技有限公司 Ecological environment supervision method and device based on remote sensing image and GIS
CN118097058B (en) * 2024-04-26 2024-07-09 应急管理部沈阳消防研究所 Fire point positioning method based on digital twinning

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114399734A (en) * 2022-01-17 2022-04-26 三峡大学 Forest fire early warning method based on visual information
CN114399719A (en) * 2022-03-25 2022-04-26 合肥中科融道智能科技有限公司 Transformer substation fire video monitoring method
CN116563699A (en) * 2023-03-28 2023-08-08 西南交通大学 Forest fire positioning method combining sky map and mobile phone image
CN116824335A (en) * 2023-06-26 2023-09-29 中国科学院上海微系统与信息技术研究所 YOLOv5 improved algorithm-based fire disaster early warning method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination