CN116503763A - Unmanned aerial vehicle cruising forest fire detection method based on binary cooperative feedback


Info

Publication number: CN116503763A
Application number: CN202310483913.2A
Authority: CN (China)
Prior art keywords: smoke, detection, flame, target, binary
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 张晖, 马博文, 赵海涛, 朱洪波
Current assignee: Nanjing University of Posts and Telecommunications
Original assignee: Nanjing University of Posts and Telecommunications
Application filed by Nanjing University of Posts and Telecommunications
Publication of CN116503763A
Priority to JP2023131096A (JP7475745B1)


Classifications

    • G06V 20/17: Terrestrial scenes taken from planes or by drones
    • G06N 3/0464: Convolutional networks [CNN, ConvNet]
    • G06N 3/08: Learning methods for neural networks
    • G06V 10/764: Recognition using classification, e.g. of video objects
    • G06V 10/766: Recognition using regression, e.g. by projecting features on hyperplanes
    • G06V 10/774: Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V 10/80: Fusion, i.e. combining data from various sources at the sensor, preprocessing, feature extraction or classification level
    • G06V 10/82: Recognition using neural networks
    • G06V 20/188: Vegetation
    • G06V 20/40: Scene-specific elements in video content

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Remote Sensing (AREA)
  • Fire Alarms (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an unmanned aerial vehicle cruising forest fire detection method based on binary cooperative feedback, which comprises two parts: the construction of a binary detection network, and a cooperative feedback mechanism based on the smoke-fire relationship. For the binary detection network, the FCOS network is first improved and an adaptive training method for determining the regression range of each FPN layer is proposed: the regression range of each FPN layer is adjusted adaptively in combination with target semantic characteristics, so that each kind of target performs gradient back-propagation learning on a suitable FPN layer during the training stage. For the cooperative feedback mechanism, the correlation between smoke and fire is fully exploited as prior knowledge to continuously optimize and reinforce the proposed binary cooperative network, so that continuous and accurate detection and tracking of the fire situation is achieved in scenes where smoke and flame change rapidly. In forest fire inspection scenarios, the technical scheme of the invention can accurately identify and locate fire positions and greatly reduce the manpower and material investment in forest fire control and management.

Description

Unmanned aerial vehicle cruising forest fire detection method based on binary cooperative feedback
Technical Field
The invention relates to an unmanned aerial vehicle cruising forest fire detection method based on binary cooperative feedback, and belongs to the fields of target detection, target recognition, and computer vision.
Background
Because the environment in a forest is changeable, fires can start for many reasons, which greatly complicates forest fire prevention and detection. Early forest fire detection relied mainly on manual work, which not only required a large investment of manpower, material, and financial resources but also suffered from problems of detection efficiency and safety. With the development of target detection technology and the falling cost of unmanned aerial vehicle products, forest fire inspection by unmanned aerial vehicle has become the main means. However, because smoke and flame in forest fires have different dynamic and spatial characteristics, and the two kinds of targets adhere to and occlude each other severely in aerial video sequence images, traditional target detection models cannot detect and track smoke and flame effectively.
In summary, how to accurately and effectively detect smoke and flame in aerial forest images is an urgent problem for those skilled in the art.
Disclosure of Invention
The technical problem to be solved by the invention is to provide an unmanned aerial vehicle inspection forest fire detection method based on binary cooperative feedback, addressing the low target recognition accuracy of the prior art in unmanned aerial vehicle inspection images caused by large differences in target scale and the mutual occlusion of smoke and flame targets during inspection.
The invention adopts the following technical scheme for solving the technical problems:
An unmanned aerial vehicle cruising forest fire detection method based on binary cooperative feedback comprises the following steps:
step 1, acquiring target area images through aerial photographing by an unmanned aerial vehicle, and respectively performing smoke labeling and flame labeling on the acquired images to construct a smoke training set and a flame training set;
step 2, constructing and training a binary detection network using the smoke training set and the flame training set of step 1, according to the characteristics of the two kinds of targets, smoke and flame, in the target area images, wherein the binary detection network comprises a smoke detection branch and a flame detection branch, each built on an improved FCOS network;
step 3, detecting the current frame of the target area video acquired in real time with the trained binary detection network, and outputting the fused detection result of the smoke detection branch and the flame detection branch;
step 4, repeating step 3 to detect the next frame of the target area video until the target area video ends.
Further, step 2 uses the improved FCOS network as a reference, where the improved FCOS network is constructed as follows: a CBAM attention module is added between C3 and C4 and between C4 and C5 of the backbone network ResNet, and the conventional convolution kernels in the regression branch are replaced with deformable convolutions.
Further, the improved FCOS network adopts an FPN layer regression adaptive training determination method during training, and the specific process is as follows:
step 2.1, creating an FPN hierarchical regression vector set $\Psi = [\Psi_1, \Psi_2, \ldots, \Psi_i, \ldots, \Psi_N]$, N being the total number of label categories; $\Psi_i$ is a seven-tuple: the first five positions are the default regression ranges of the FPN levels, the sixth position is the label category, and the seventh position is a range-modification flag bit;
step 2.2, for a given target of pixel size k×k in the current frame, determining its default level $P_{l'}$ according to the default regression scale range;
step 2.3, carrying out minimum-value judgment on the loss values of different FPN levels, with the judgment formula

$$L_{\min} = \min\left\{L_{all}(l-2),\ L_{all}(l-1),\ L_{all}(l),\ L_{all}(l+1),\ L_{all}(l+2)\right\}$$

where $L_{\min}$ is the minimum loss across levels, $l$ denotes the FPN level index running from 3 to 7, $L_{all}(l-2)$ through $L_{all}(l+2)$ denote the losses of levels $P_{l-2}$, $P_{l-1}$, $P_l$, $P_{l+1}$, $P_{l+2}$, and $L_{all}(\cdot)$ is the loss function at an FPN level;
step 2.4, if $L_{\min}$ coincides with the loss value of level $P_{l'}$, setting the seventh position of the hierarchical regression vector corresponding to the given target class to 1; thereafter, different input scales of the given target directly perform gradient back-propagation learning according to the default regression scale range;
step 2.5, if $L_{\min}$ coincides with the loss value of level $P_{l'-1}$ or $P_{l'+1}$, setting the seventh position of the hierarchical regression vector corresponding to the given target class to 0, modifying the default regression scale range as follows, and then returning to step 2.2:
if $L_{\min}$ coincides with the loss value of level $P_{l'-1}$, the default regression scale range corresponding to level $P_{l'-1}$ is expanded, i.e., the interval $(2^{8(l'-3)}, k)$ is stripped out of the default regression scale range corresponding to level $P_{l'}$ and merged into that of level $P_{l'-1}$;
if $L_{\min}$ coincides with the loss value of level $P_{l'+1}$, the default regression scale range corresponding to level $P_{l'+1}$ is expanded, i.e., the interval $(k, 2^{8(l'-2)})$ is stripped out of the default regression scale range corresponding to level $P_{l'}$ and merged into that of level $P_{l'+1}$;
step 2.6, if $L_{\min}$ coincides with the loss value of level $P_{l'-2}$ or $P_{l'+2}$, setting the seventh position of the hierarchical regression vector corresponding to the given target class to -1; for such a target class, the level for gradient back-propagation learning is selected automatically, directly according to the loss degree of each FPN level.
Further, the fusion in step 3 is specifically: the position frame information and the recognized target probabilities in the detection results of the smoke detection branch and the flame detection branch are marked at the corresponding positions of the current frame.
Further, before detecting the next frame of the target area video in step 4, the detection result of the current frame is used as prior knowledge to choose between two feedback mechanisms, collaborative optimization feedback and collaborative reinforcement feedback, according to the following criteria:
if neither the smoke detection branch nor the flame detection branch detects a target in the detection result of the current frame, i.e., the no-smoke-no-fire case, collaborative reinforcement feedback is selected;
if the smoke detection branch detects a target in the detection result of the current frame and the flame detection branch does not, i.e., the smoke-without-fire case, collaborative optimization feedback is selected;
if the smoke detection branch detects no target and the flame detection branch detects a target, i.e., the fire-without-smoke case, collaborative reinforcement feedback is selected;
if both branches detect targets, the minimum smoke target area detected in the smoke detection branch is recorded as $S_{\min}^{sm}$ and the minimum flame target area detected in the flame detection branch as $S_{\min}^{fir}$, and the selection is made according to the size relation between $S_{\min}^{sm}$, $S_{\min}^{fir}$, and a set threshold $\eta_{\min}$:
if the relation indicates big smoke and a small fire, collaborative optimization feedback and collaborative reinforcement feedback are both selected;
if the relation indicates comparable smoke and fire, collaborative optimization feedback is selected;
if the relation indicates small smoke and a big fire, collaborative reinforcement feedback is selected.
Further, the specific process of the collaborative optimization feedback is as follows:
1) According to the detection information of smoke and flame in the detection result of the current frame $I_{t-1}(x, y)$, the smoke and flame target areas in the next frame $I_t(x, y)$ are predicted with a Kalman filtering method and recorded respectively as a smoke target area set $A_{sm}$ and a flame target area set $A_{fir}$, whose element counts are the numbers of smoke and flame target areas detected in $I_{t-1}(x, y)$;
2) A first pixel discriminant function is constructed, and the pixel points in the smoke and flame target areas of $A_{sm}$ and $A_{fir}$ whose first-pixel-discriminant value is 0 are eliminated, yielding a new smoke target area set $A'_{sm}$ and flame target area set $A'_{fir}$; in the first pixel discriminant function, separate discriminants are defined for the flame and smoke targets, $(x, y)$ represents pixel coordinates, and $f_H(x, y)$, $f_S(x, y)$, and $f_I(x, y)$ represent the values of pixel $(x, y)$ on the H, S, and I channels of HSI space;
3) The inter-frame difference method is used to locate the smoke and flame targets in $I_t(x, y)$ lying outside $A'_{sm}$ and $A'_{fir}$, as follows:
S31, respectively obtain the images of $I_t(x, y)$ and $I_{t-1}(x, y)$ that exclude the $A'_{sm}$ region, and the images that exclude the $A'_{fir}$ region;
S32, acquire the frame difference regions $D_{sm}(x, y)$ and $D_{fir}(x, y)$ as the absolute inter-frame differences of the corresponding image pairs;
S33, for the obtained $D_{sm}(x, y)$ and $D_{fir}(x, y)$, construct a second pixel discriminant function and eliminate the pixel points in $D_{sm}(x, y)$ and $D_{fir}(x, y)$ whose second-pixel-discriminant value is 0, obtaining the smoke and flame regions $U_{sm}$ and $U_{fir}$ deviating from the $A'_{sm}$ or $A'_{fir}$ areas; in the second pixel discriminant function, $T$ is the motion area judgment threshold;
S34, in $I_t(x, y)$, mask the $A'_{sm} \cup U_{sm}$ region and the $A'_{fir} \cup U_{fir}$ region respectively, obtaining the smoke-masked image $\tilde{I}_t^{sm}(x, y)$ and the flame-masked image $\tilde{I}_t^{fir}(x, y)$ corresponding to $I_t(x, y)$;
4) If $I_{t-1}(x, y)$ is the smoke-without-fire case, then when detecting $I_t(x, y)$, the smoke-masked image $\tilde{I}_t^{sm}(x, y)$ is used as the input of the flame detection branch of the binary detection network, and $I_t(x, y)$ itself as the input of the smoke detection branch;
if $I_{t-1}(x, y)$ is the big-smoke-small-fire case, then when detecting $I_t(x, y)$, the flame-masked image $\tilde{I}_t^{fir}(x, y)$ is used as the input of the smoke detection branch of the binary detection network, and the smoke-masked image $\tilde{I}_t^{sm}(x, y)$ as the input of the flame detection branch;
if $I_{t-1}(x, y)$ is the comparable-smoke-and-fire case, the inputs when detecting $I_t(x, y)$ are assigned in the same way as in the big-smoke-small-fire case.
Further, according to the detection information of smoke and flame in the detection result of the current frame $I_{t-1}(x, y)$, the smoke and flame target areas in the next frame $I_t(x, y)$ are predicted with a Kalman filtering method based on speed correction, where, in the speed-correction calculation, $v_{uav}$ denotes the speed of the unmanned aerial vehicle, $w'$ and $h'$ denote the width and height of the target area image, $L$ denotes the receptive-field diameter of the aerial lens of the unmanned aerial vehicle, a scaling parameter is applied, $\Delta h$ denotes the height difference during the ascent or descent of the unmanned aerial vehicle, $\Delta t_h$ denotes the time taken for the ascent or descent, $w_{t-1}$ and $h_{t-1}$ are the width and height of the target detection frame in $I_{t-1}(x, y)$, $\Delta t$ is the frame interval, and $v_{tx}$, $v_{ty}$, $v_{tw}$, $v_{th}$ denote the velocity values of the center coordinates and of the width and height of the target detection frame at time $t$.
Further, the specific process of the collaborative reinforcement feedback is as follows:
(1) If $I_{t-1}(x, y)$ is the no-smoke-no-fire case, then when detecting $I_t(x, y)$, the feature weights $\omega_l$ of each FPN layer in the smoke detection branch and the flame detection branch of the binary detection network are adjusted adaptively, using a level weight adjustment factor based on altitude variation, where $h_{uav}$ denotes the current aerial photographing altitude of the unmanned aerial vehicle, $\bar{h}$ denotes the average aerial photographing altitude during binary detection network training, $l$ denotes the level number of the current FPN layer, and $\hat{L}$ is the total number of FPN layers involved in feature map fusion;
(2) If $I_{t-1}(x, y)$ is the fire-without-smoke case, then when detecting $I_t(x, y)$, the feature weights $\omega_l$ of each FPN layer in the flame detection branch of the binary detection network are adjusted adaptively, with the following steps:
first, the recognition probabilities of the flame targets detected in $I_{t-1}(x, y)$ are sorted, and the flame target with the smallest recognition probability is marked as $S_{fir}$, with corresponding recognition probability $p_{fir}$;
second, $S_{fir}$ is located to an FPN layer according to the current target scale regression ranges, the located layer being denoted $\hat{l}_{fir}$;
subsequently, the feature weights of each FPN layer in the flame detection branch are corrected using the recognition probability $p_{fir}$ and a set target expected probability $p_E$;
(3) If $I_{t-1}(x, y)$ is the big-smoke-small-fire case, then when detecting $I_t(x, y)$, the feature fusion weights $\omega_l$ of each FPN layer and the standard deviation of the Gaussian weighting function $G_{x,y,l}$ in the smoke detection branch and the flame detection branch are adjusted respectively, with the following steps:
correction of $G_{x,y,l}$ in the smoke detection branch: the regional standard deviation of the $A'_{sm} \cup U_{sm}$ area obtained through the cooperative feedback replaces the original standard deviation, thereby correcting $G_{x,y,l}$;
correction of $\omega_l$ in the smoke detection branch: first, the smoke target $S_{sm}$ with the smallest recognition probability, its corresponding probability $p_{sm}$, and the corresponding FPN layer $\hat{l}_{sm}$ are obtained; when $p_{sm} \ge p_E$, no weight adjustment is needed; if $p_{sm} < p_E$, the weights are adjusted with the same form of correction as in case (2);
finally, the corrected $G_{x,y,l}$ and the adjusted FPN layer fusion weights $\omega'_l$ are used to adjust the smoke detection branch;
following the same adjustment steps as for the smoke detection branch, $G_{x,y,l}$ and $\omega'_l$ in the flame detection branch are adjusted;
(4) If $I_{t-1}(x, y)$ is the small-smoke-big-fire case, then when detecting $I_t(x, y)$, the level fusion weights $\omega_l$ of the smoke detection branch in the binary detection network are adjusted following the way $\omega_l$ of the flame detection branch is modified in the fire-without-smoke case, and $\omega_l$ of the flame detection branch is adjusted following the way $\omega_l$ of the flame detection branch is modified in the big-smoke-small-fire case.
The present invention provides a computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by a computing device, cause the computing device to perform the method as described above.
The invention also provides an unmanned aerial vehicle cruising forest fire detection device based on binary cooperative feedback, comprising one or more processors, one or more memories and one or more programs, wherein the one or more programs are stored in the one or more memories and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing the method as described above.
Compared with the prior art, the binary collaborative feedback unmanned aerial vehicle inspection forest fire detection method has the following technical effects:
1. The improved FCOS model provided by the invention has a better recognition effect on smoke and flame targets;
2. The binary detection network built on the improved FCOS model treats smoke detection and flame detection as two independent problems, avoiding the missed detections and inaccurate positioning caused by smoke and flame adhering to and occluding each other;
3. The two cooperative feedback mechanisms improve the robustness of the binary detection network at different cruising heights and in different forest fire inspection scenes.
Drawings
FIG. 1 is a flow chart of the forest fire detection method based on the binary cooperative network;
FIG. 2 shows the regression branch improvement based on deformable convolution;
FIG. 3 shows the improved FCOS network.
Detailed Description
The technical scheme of the invention is further described below with reference to the accompanying drawings and examples.
The invention discloses an unmanned aerial vehicle cruising forest fire detection method based on binary cooperative feedback; the overall structure is shown in FIG. 1, and the specific steps are as follows:
Step 1: since the FCOS network has a poor detection effect on smoke and flame targets, the invention improves the backbone network, the regression branch, and the training strategy of the FCOS model according to the target characteristics of smoke and flame.
The FCOS algorithm adopts a pixel-by-pixel regression strategy. A point $(x, y)$ on an FPN feature map with stride $s$ is mapped back to the input image at the coordinates

$$\left(\left\lfloor \frac{s}{2} \right\rfloor + xs,\ \left\lfloor \frac{s}{2} \right\rfloor + ys\right).$$

When the mapped point of $(x, y)$ falls inside any ground-truth box of the input image, the point $(x, y)$ is judged to be a positive sample; otherwise it is marked as a negative sample. If the sample point $(x, y)$ maps back to an input image point $(x', y')$ falling within a ground-truth box $(x_0, y_0, x_1, y_1)$, the distances from this point to the left, top, right, and bottom sides of the box are recorded as the target regression offsets $(l^*, t^*, r^*, b^*)$, computed as

$$l^* = x' - x_0,\quad t^* = y' - y_0,\quad r^* = x_1 - x',\quad b^* = y_1 - y'.$$

If $(x', y')$ falls within multiple ground-truth boxes simultaneously, the point would be marked as an ambiguous sample. To avoid this, FCOS mitigates the overlap of detection frames by adding a box regression range limit to the feature layer of each scale, as follows:
(1) compute the regression targets $l^*$, $t^*$, $r^*$, $b^*$ in the current level;
(2) judge whether $\max(l^*, t^*, r^*, b^*) > m_l$ or $\max(l^*, t^*, r^*, b^*) < m_{l-1}$ holds;
(3) if so, no regression prediction is performed for the bounding box at this level.
Here $m_l$ is the maximum regression range of the current feature layer; the ranges $(-1, 64)$, $(64, 128)$, $(128, 256)$, $(256, 512)$, $(512, +\infty)$ are assigned in turn. With this limitation, targets of different sizes are separated into different feature layers for regression learning, avoiding the generation of too many ambiguous samples.
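As an illustration of the mapping and range-limit rules just described, the following is a minimal Python sketch; the box format and the helper names are assumptions used only to make the rules concrete, not the patent's code.

```python
# Minimal sketch of FCOS sample assignment as described above (assumed helper
# names and (x0, y0, x1, y1) box format; not the patent's implementation).

def map_to_input(x, y, s):
    """Map feature-map cell (x, y) at stride s to input-image coordinates."""
    return (s // 2 + x * s, s // 2 + y * s)

def regression_offsets(px, py, box):
    """Offsets (l*, t*, r*, b*) from mapped point (px, py) to the box sides."""
    x0, y0, x1, y1 = box
    return (px - x0, py - y0, x1 - px, y1 - py)

def keep_at_level(offsets, m_prev, m_cur):
    """Range limit: regress at this level only if max offset lies in (m_prev, m_cur]."""
    m = max(offsets)
    return m_prev < m <= m_cur

# Example: stride-8 cell (10, 6) against a ground-truth box on level P3 (m in (-1, 64]).
px, py = map_to_input(10, 6, 8)                # -> (84, 52)
offs = regression_offsets(px, py, (60, 30, 140, 100))
print(offs, keep_at_level(offs, -1, 64))       # (24, 22, 56, 48) True
```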
The overall architecture of the improved FCOS network is shown in FIG. 3; the improvements relative to the FCOS network are as follows:
a CBAM attention module is added between C3 and C4 and between C4 and C5 of the backbone network ResNet, to strengthen feature extraction for smaller-scale smoke or flame targets; second, the conventional convolution kernels in the regression branch are replaced with deformable convolutions to address the inaccurate positioning of some flame targets and improve the target detection effect; the specific structure is shown in FIG. 2.
In terms of the training strategy, an FPN layer regression range adaptive training determination method is proposed, with the following steps:
Step 1.1, create a hierarchical regression vector set $\Psi = [\Psi_1, \Psi_2, \ldots, \Psi_i, \ldots, \Psi_N]$, where N is the total number of label categories; $\Psi_i$ is a seven-tuple: the first five positions are the default regression ranges of the FPN levels, the sixth position is the label category, and the seventh position is a range-modification flag bit, NULL by default;
Step 1.2, for a given target of pixel size k×k in the current frame, determine its default level $P_{l'}$ according to the default regression scale range;
Step 1.3, carry out minimum-value judgment on the loss values of different FPN levels, with the judgment formula

$$L_{\min} = \min\left\{L_{all}(l-2),\ L_{all}(l-1),\ L_{all}(l),\ L_{all}(l+1),\ L_{all}(l+2)\right\}$$

where $L_{\min}$ is the minimum loss across levels, $l$ denotes the FPN level index running from 3 to 7, $L_{all}(l-2)$ through $L_{all}(l+2)$ denote the losses of levels $P_{l-2}$, $P_{l-1}$, $P_l$, $P_{l+1}$, $P_{l+2}$, and $L_{all}(\cdot)$ is the loss function at an FPN level.
The loss degree of level $P_l$ is calculated as follows:

$$L_{all}(l) = L_{cls}(l) + L_{reg}(l) + L_{cnt}(l)$$

where the classification score, predicted frame position, and predicted centerness at position $(x, y)$ on level $P_l$ enter the respective terms, supervised by the classification target, the labeled frame coordinates, and the centerness of the ground-truth frame; $N_{pos}$ is the number of positive samples; and the sample indicator function equals 1 for positive samples and 0 for negative samples. $L_{all}(l)$ is the total loss value of a given target pixel point $(x, y)$ at level $P_l$; $L_{cls}(l)$ is the classification loss, using focal loss to handle the imbalance of positive and negative samples; $L_{cnt}(l)$ is the centerness loss, using binary cross-entropy; $L_{reg}(l)$ is the regression loss, expressed with the IoU loss.
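A minimal sketch of this per-level loss follows, assuming the sigmoid focal loss from torchvision and the standard FCOS-style IoU loss over (l, t, r, b) offsets; the tensor layout and the helper name are assumptions, not the patent's code.

```python
import torch
import torch.nn.functional as F
from torchvision.ops import sigmoid_focal_loss

def level_loss(cls_logits, cls_targets, ltrb, ltrb_targets, ctr_logits, ctr_targets, pos):
    """L_all(l) = L_cls(l) + L_reg(l) + L_cnt(l) for one FPN level.
    cls_*: (N, C) logits / one-hot targets; ltrb*: (N, 4) positive offsets
    (l, t, r, b); ctr_*: (N,) centerness logits / targets; pos: (N,) bool mask."""
    n_pos = pos.sum().clamp(min=1).float()
    l_cls = sigmoid_focal_loss(cls_logits, cls_targets, reduction="sum") / n_pos
    p, g = ltrb[pos], ltrb_targets[pos]
    # IoU of two boxes sharing one anchor point, expressed via (l, t, r, b)
    mins = torch.min(p, g)
    inter = (mins[:, 0] + mins[:, 2]) * (mins[:, 1] + mins[:, 3])
    union = (p[:, 0] + p[:, 2]) * (p[:, 1] + p[:, 3]) \
          + (g[:, 0] + g[:, 2]) * (g[:, 1] + g[:, 3]) - inter
    l_reg = -torch.log(inter / union.clamp(min=1e-6) + 1e-6).sum() / n_pos
    l_cnt = F.binary_cross_entropy_with_logits(
        ctr_logits[pos], ctr_targets[pos], reduction="sum") / n_pos
    return l_cls + l_reg + l_cnt
```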
Step 1.4, if $L_{\min}$ coincides with the loss value of level $P_{l'}$, the semantic feature extraction performance of the target on the FPN levels is considered strongly correlated with the scale regression ranges of the levels; the seventh position of the hierarchical regression vector $\Psi_i$ corresponding to the input target class is set to 1, and subsequent inputs of this target at different scales directly perform gradient back-propagation learning according to the FPN regression hierarchy ranges.
Step 1.5, if $L_{\min}$ is inconsistent with the loss value of level $P_{l'}$, separate judgments are needed:
when $L_{\min}$ coincides with the loss value of level $P_{l'-1}$ or $P_{l'+1}$, the target is regarded as weakly correlated with the scale regression of the FPN levels; the seventh position of the hierarchical regression vector $\Psi_i$ of the corresponding target class is set to 0, and the scale limits are modified as follows:
if $L_{\min}$ coincides with the loss value of $P_{l'-1}$, the regression range corresponding to $P_{l'-1}$ is expanded, i.e., the interval $(2^{8(l'-3)}, k)$ is stripped out of the regression range of $P_{l'}$ and merged into that of $P_{l'-1}$;
if $L_{\min}$ coincides with the loss value of $P_{l'+1}$, the regression range corresponding to $P_{l'+1}$ is expanded, i.e., the interval $(k, 2^{8(l'-2)})$ is stripped out of the regression range of $P_{l'}$ and merged into that of $P_{l'+1}$.
Step 1.6, if $L_{\min}$ coincides with the loss value of level $P_{l'-2}$ or $P_{l'+2}$, i.e., the feature layer with the minimum loss is more than one level away from the current feature layer, the hierarchical scale regression strategy of the FPN is considered unsuitable for such targets; the seventh position of the hierarchical regression vector $\Psi_i$ of the corresponding target class is set to -1, and for such targets the level is selected automatically for gradient back-propagation learning, directly according to the loss degree of each FPN level.
Step 1.7, for targets judged to be weakly correlated (step 1.5), the scale range is corrected repeatedly according to steps 1.2, 1.3, and 1.5 until the condition of step 1.4 or step 1.6 occurs, after which the iteration terminates.
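To make steps 1.1 to 1.7 concrete, here is a hedged Python sketch of the per-class bookkeeping; the data layout, the concrete default ranges, and the interval arithmetic at the range boundaries are assumptions about details the text leaves open.

```python
# Hedged sketch of the per-class bookkeeping in steps 1.1-1.7 (assumed data
# layout; the patent only partly specifies the interval arithmetic).
DEFAULT_RANGES = [(-1, 64), (64, 128), (128, 256), (256, 512), (512, float("inf"))]

def make_vector(label):
    """Seven-tuple: five level ranges, label category, modification flag (NULL)."""
    return {"ranges": list(DEFAULT_RANGES), "label": label, "flag": None}

def default_level(k, ranges):
    """Index (0..4, i.e. P3..P7) of the level whose range contains size k."""
    for i, (lo, hi) in enumerate(ranges):
        if lo < k <= hi:
            return i
    return len(ranges) - 1

def update_flag(vec, k, losses):
    """losses: L_all values for the candidate levels, indexed like vec['ranges']."""
    l = default_level(k, vec["ranges"])
    best = min(range(len(losses)), key=losses.__getitem__)
    if best == l:
        vec["flag"] = 1                      # strongly correlated: keep defaults
    elif abs(best - l) == 1:
        vec["flag"] = 0                      # weakly correlated: widen neighbour
        lo, hi = vec["ranges"][l]
        if best < l:                         # strip the lower part into P_{l'-1}
            vec["ranges"][best] = (vec["ranges"][best][0], k)
            vec["ranges"][l] = (k, hi)
        else:                                # strip the upper part into P_{l'+1}
            vec["ranges"][l] = (lo, k)
            vec["ranges"][best] = (k, vec["ranges"][best][1])
    else:
        vec["flag"] = -1                     # unsuitable: select level by loss
    return vec["flag"]
```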
Step 2: build and train the binary detection network with the improved FCOS network as the reference: the smoke detection branch and the flame detection branch of the binary detection network are each an improved FCOS network as described in step 1. For training, the smoke detection branch is trained only on the smoke training set, i.e., it performs only smoke detection after training; likewise, the flame detection branch is trained only on the flame training set and performs only flame detection after training.
Step 3: detect the current frame with the trained binary detection network, and feed the target frame information and class probabilities obtained by the two detection branches into step 4 and step 5 respectively for subsequent processing.
Step 4: select between the two feedback mechanisms, collaborative optimization feedback and collaborative reinforcement feedback, according to the smoke-flame target relation obtained in step 3. Collaborative optimization feedback mainly uses the position information of the smoke frames or flame frames obtained in step 3 as prior knowledge, jointly predicts the approximate positions of smoke and flame in the next frame image with a speed-corrected Kalman filtering method and the inter-frame difference method, and masks the predicted approximate smoke or flame target positions in the next frame in advance for the different detection branches of the binary cooperative network, improving the detection precision of the binary detection network for smoke or flame. Collaborative reinforcement feedback serves as a post-processing operation that mainly adjusts the feature map weights in the detection branches of the binary network, so that the result passed on from step 3 strengthens the detection accuracy of the next frame for targets of the same type.
For the choice between the two feedback mechanisms: if neither the smoke detection branch nor the flame detection branch of the binary detection network detects a target, i.e., the no-smoke-no-fire case, collaborative reinforcement feedback is selected; if the smoke detection branch detects a target and the flame detection branch does not, i.e., the smoke-without-fire case, collaborative optimization feedback is selected; if the smoke detection branch detects no target and the flame detection branch detects a target, i.e., the fire-without-smoke case, collaborative reinforcement feedback is selected; if both branches detect targets, the minimum smoke target area detected in the smoke detection branch is recorded as $S_{\min}^{sm}$ and, likewise, the minimum flame target area detected in the flame detection branch as $S_{\min}^{fir}$, and the judgment is made according to the size relation between $S_{\min}^{sm}$, $S_{\min}^{fir}$, and a set threshold $\eta_{\min}$ (a hedged selection routine is sketched after this list):
if the relation indicates big smoke and a small fire, both feedback mechanisms are selected;
if the relation indicates comparable smoke and fire, collaborative optimization feedback is selected;
if the relation indicates small smoke and a big fire, collaborative reinforcement feedback is selected.
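A hedged sketch of this selection logic follows; since the exact inequality relating $S_{\min}^{sm}$, $S_{\min}^{fir}$, and $\eta_{\min}$ is not given in this text, a simple area-ratio test with a symmetric band is assumed.

```python
# Hedged sketch of the feedback-mechanism selection. The exact inequality on
# S_min^sm, S_min^fir and eta_min is not stated here, so an area ratio with a
# symmetric tolerance band (eta_min < 1 assumed) stands in for it.
def select_feedback(smoke_areas, flame_areas, eta_min):
    """Return the set of mechanisms to run before the next frame."""
    if not smoke_areas and not flame_areas:
        return {"reinforce"}                       # no smoke, no fire
    if smoke_areas and not flame_areas:
        return {"optimize"}                        # smoke without fire
    if not smoke_areas and flame_areas:
        return {"reinforce"}                       # fire without smoke
    r = min(flame_areas) / min(smoke_areas)        # S_min^fir / S_min^sm
    if r < eta_min:
        return {"optimize", "reinforce"}           # big smoke, small fire
    if r <= 1.0 / eta_min:
        return {"optimize"}                        # comparable smoke and fire
    return {"reinforce"}                           # small smoke, big fire

print(select_feedback([400.0], [30.0], eta_min=0.25))   # {'optimize', 'reinforce'}
```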
Collaborative optimization feedback mainly takes the position information of the smoke or flame frames detected in the current frame as prior knowledge, jointly predicts the approximate positions of smoke and flame in the next frame image by combining speed-corrected Kalman filtering with the inter-frame difference method, and masks the predicted approximate smoke or flame target positions in the next frame in advance for the different detection branches of the binary cooperative network, improving the detection precision of the binary detection network for smoke or flame. The specific steps are as follows:
Step one: construct a Kalman filtering model based on speed correction to predict where smoke or flame targets may occur in the next frame.
Kalman filtering models the motion of a target as uniform motion. The motion state of the target is expressed as $(p_t, v_t)$, where $p_t$ represents the target position at time $t$ and $v_t$ represents the speed of each parameter of $p_t$; the target state can be expressed in the following vector form:

$$\gamma_t = (x_{tc},\ y_{tc},\ w_t,\ h_t,\ v_{tx},\ v_{ty},\ v_{tw},\ v_{th})^T$$

where $x_{tc}$, $y_{tc}$ denote the center coordinates of the target detection frame at time $t$; $w_t$ denotes the width of the detection frame; $h_t$ denotes the height of the detection frame; and $v_{tx}$, $v_{ty}$, $v_{tw}$, $v_{th}$ denote the corresponding speed change values.
However, in the actual forest fire inspection process, the unmanned aerial vehicle often accelerates or decelerates suddenly, so the target prediction frame deviates considerably under a uniform-speed model. The current aerial speed and altitude change rate of the unmanned aerial vehicle are therefore used to correct the speed parameters of $v_t$, where $v_{uav}$ represents the speed of the unmanned aerial vehicle, $w'$ and $h'$ represent the width and height of the aerial image, $L$ represents the receptive-field diameter of the aerial lens of the unmanned aerial vehicle, a scaling parameter is applied, $\Delta h$ represents the height difference during the ascent or descent of the unmanned aerial vehicle, $\Delta t_h$ represents the time taken for the ascent or descent, $w_{t-1}$ and $h_{t-1}$ are the width and height of the target detection frame of the previous frame, and $\Delta t$ is the frame interval.
After the speed correction, the prediction equations of the target state and its covariance under Kalman filtering are:

$$\hat{\gamma}_{t|t-1} = F_s\,\hat{\gamma}_{t-1|t-1} + W_{t-1}, \qquad P_{t|t-1} = F_s\,P_{t-1|t-1}\,F_s^T + Q_W$$

where $\hat{\gamma}_{t|t-1}$ represents the predicted state of the target at time $t$, $\hat{\gamma}_{t-1|t-1}$ represents the optimal estimate at time $t-1$, $F_s$ represents the state transition matrix, i.e., the motion parameter matrix of the target, and $W_{t-1}$ is the motion noise at time $t-1$, generally taken as Gaussian white noise with zero mean. $P_{t|t-1}$, $P_{t-1|t-1}$, and $Q_W$ are the covariances of $\hat{\gamma}_{t|t-1}$, $\hat{\gamma}_{t-1|t-1}$, and $W_{t-1}$, respectively.
In the track state update of Kalman filtering, the associated track state is corrected on the basis of the detection at the current moment to obtain a more accurate state estimate; the state update and gain equations are:

$$\hat{\gamma}_{t|t} = \hat{\gamma}_{t|t-1} + K_t\left(z_t - H_\times\,\hat{\gamma}_{t|t-1}\right), \qquad K_t = P_{t|t-1}\,H_\times^T\left(H_\times\,P_{t|t-1}\,H_\times^T + R_\times\right)^{-1}$$

where $\hat{\gamma}_{t|t}$ represents the optimal estimate of the target at time $t$, $z_t = (x_z, y_z, w_z, h_z)$ is the detection mean vector at time $t$, $H_\times$ represents the observation transfer matrix, and $K_t$ is the Kalman gain. $R_\times$, the noise matrix of the detector, is a 4×4 diagonal matrix whose diagonal values are the noises of the center point coordinates and of the width and height. With the above formulas, Kalman filtering obtains the optimal estimate of the current target motion and updates the covariance matrix $P_{t|t}$ of $\hat{\gamma}_{t|t}$; proceeding recursively, the position of the target at the next moment is estimated from the target detection information at the current moment.
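The predict/update cycle described above, with the speed correction folded in, can be sketched as follows; the state layout matches $\gamma_t$, while the function names and the treatment of the correction term as a precomputed 4-vector are assumptions.

```python
import numpy as np

def make_F(dt):
    """Constant-velocity transition matrix for gamma = (xc, yc, w, h, v...)."""
    F = np.eye(8)
    F[:4, 4:] = dt * np.eye(4)
    return F

def kalman_step(gamma, P, z, F, H, Q, R, v_corr):
    """One predict/update cycle, correcting the velocity components first.
    v_corr is the 4-vector correction computed from the UAV speed and
    altitude change rate (its formula is described in the text above)."""
    gamma = gamma.copy()
    gamma[4:] += v_corr                              # speed correction
    g_pred = F @ gamma                               # state prediction
    P_pred = F @ P @ F.T + Q                         # covariance prediction
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)              # Kalman gain
    g_new = g_pred + K @ (z - H @ g_pred)            # update with detection z
    P_new = (np.eye(8) - K @ H) @ P_pred
    return g_new, P_new

# Observation model: the detector measures (xc, yc, w, h).
H = np.hstack([np.eye(4), np.zeros((4, 4))])
```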
The invention marks the image of the current frame as $I_{t-1}(x, y)$ and the image of the next frame as $I_t(x, y)$. After the speed-corrected Kalman filtering, the sets of predicted smoke and flame target areas in $I_t(x, y)$ are recorded as $A_{sm}$ and $A_{fir}$, whose elements correspond to the smoke and flame targets detected in $I_{t-1}(x, y)$. However, unlike rigid objects such as trees, smoke and flame are non-rigid targets that may deform greatly in a short time, i.e., parts of a smoke or flame target may deviate from the track predicted by the Kalman filtering because of non-rigid deformation, so the predicted areas need to be refined and compensated; the details are given in step two and step three.
Step two: construct a first pixel discriminant function according to the color space characteristics of smoke and flame, and refine the predicted positions obtained by Kalman filtering, as follows:
2.1) Define the first pixel discriminant function:
In forest scenes, the color characteristics of smoke and flame are prominent: smoke generally appears white, gray, or black, while flame generally appears brown or orange. Therefore, according to the color space characteristics of smoke and flame, the invention provides a first pixel discriminant function for flame and smoke targets, in which $(x, y)$ represents pixel coordinates and $f_H(x, y)$, $f_S(x, y)$, and $f_I(x, y)$ respectively represent the values of the pixel on the H, S, and I channels of HSI space, with separate discriminants for the flame and smoke targets.
2.2 Traversing a set of regionsAnd->Judging the pixel points in each region by using the first pixel discriminant function in the step 2.1, and eliminating A sm And A fir The first pixel in the medium smoke and flame target area judges the pixel point with the function value of 0, thus obtaining a new smoke and fire target areaAnd->
Step three: use the inter-frame difference method to locate the parts of smoke or flame targets deviating from the $A'_{sm}$ or $A'_{fir}$ areas, as follows:
3.1) Based on $A'_{sm}$ and $A'_{fir}$ obtained in step two, acquire the images of $I_t(x, y)$ and $I_{t-1}(x, y)$ that exclude the $A'_{sm}$ region, and the images that exclude the $A'_{fir}$ region;
3.2) Acquire the frame difference regions $D_{sm}(x, y)$ and $D_{fir}(x, y)$ as the absolute inter-frame differences of the corresponding image pairs obtained in 3.1);
3.3) Based on the obtained $D_{sm}(x, y)$ and $D_{fir}(x, y)$, construct a second pixel discriminant function through threshold judgment together with the pixel judgment rule proposed in step 2.1, and eliminate the pixel points in $D_{sm}(x, y)$ and $D_{fir}(x, y)$ whose second-pixel-discriminant value is 0, obtaining the smoke and flame regions $U_{sm}$ and $U_{fir}$ deviating from the $A'_{sm}$ or $A'_{fir}$ areas, where $T$ is the motion area judgment threshold.
Step four: in $I_t(x, y)$, mask the $A'_{sm} \cup U_{sm}$ region or the $A'_{fir} \cup U_{fir}$ region to obtain the corresponding smoke-masked image $\tilde{I}_t^{sm}(x, y)$ or flame-masked image $\tilde{I}_t^{fir}(x, y)$.
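A hedged OpenCV sketch of steps two through four follows; the HSV color bands stand in for the patent's HSI thresholds (not stated here), and the (x, y, w, h) region format is an assumption.

```python
import cv2
import numpy as np

def color_keep_mask(frame_bgr, lo, hi):
    """Pixels whose colour lies in [lo, hi]; stands in for the first pixel
    discriminant function (assumed HSV bounds, since the patent's HSI
    thresholds are not given in this excerpt)."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    return cv2.inRange(hsv, lo, hi) > 0

def refine_regions(frame_bgr, regions, lo, hi):
    """Step 2.2, simplified: keep regions retaining at least one valid pixel."""
    keep = color_keep_mask(frame_bgr, lo, hi)
    return [(x, y, w, h) for (x, y, w, h) in regions
            if keep[y:y + h, x:x + w].any()]

def motion_outside(cur, prev, refined, T):
    """Steps 3.1-3.3: blank the predicted areas, difference the frames,
    then threshold with T to get the deviating region U."""
    cur_m, prev_m = cur.copy(), prev.copy()
    for (x, y, w, h) in refined:
        cur_m[y:y + h, x:x + w] = 0
        prev_m[y:y + h, x:x + w] = 0
    diff = cv2.absdiff(cur_m, prev_m).max(axis=2)
    return diff > T

def mask_out(frame_bgr, regions, motion_mask):
    """Step four: mask A' union U in the next frame before feeding a branch."""
    out = frame_bgr.copy()
    for (x, y, w, h) in regions:
        out[y:y + h, x:x + w] = 0
    out[motion_mask] = 0
    return out
```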
The specific application of the collaborative optimization feedback flow in the smoke-without-fire, big-smoke-small-fire, and comparable-smoke-and-fire cases is as follows:
(1) Smoke without fire. This case shows that in $I_{t-1}(x, y)$ only smoke targets were detected and no flame target was found. This does occur in real scenes, but flame targets may also have been missed because of smoke interference. To avoid the latter, when detecting the image $I_t(x, y)$, the smoke-masked image $\tilde{I}_t^{sm}(x, y)$ is taken as the input of the flame detection branch of the binary network, while the input of the smoke detection branch remains the unchanged $I_t(x, y)$ image.
(2) Big smoke and a small fire. This case shows that in $I_{t-1}(x, y)$ smoke and flame targets were detected simultaneously and the smoke area is larger than the flame area, which matches the smoke-fire relationship of real scenes. To ensure subsequent detection stability, when detecting the image $I_t(x, y)$, the flame-masked image $\tilde{I}_t^{fir}(x, y)$ is taken as the input of the smoke detection branch of the binary network and the smoke-masked image $\tilde{I}_t^{sm}(x, y)$ as the input of the flame detection branch.
(3) Comparable smoke and fire. This case shows that in $I_{t-1}(x, y)$ smoke and flame targets were detected simultaneously but the areas they occupy do not differ much. This does exist in real scenes, but it may also arise from inaccurate smoke positioning. To strengthen the accuracy of smoke identification while ensuring the subsequent detection stability of flame targets, the detection of $I_t(x, y)$ is performed in the same manner as in the big-smoke-small-fire case.
Collaborative reinforcement feedback serves as a post-processing operation that mainly adjusts the feature map weights in the detection branches of the binary network, so that the result passed on from the current frame corrects errors that might occur in the next frame for targets of the same type. During target detection, the FCOS network performs weighted fusion of several FPN feature maps, adopting a feature fusion method based on Gaussian weighting. The fusion process is:

$$F'(x, y) = \sum_{l=1}^{\hat{L}} \omega_l\, G_{x,y,l}\, F_l(x, y)$$

where $F'$ is the fused feature map, $\hat{L}$ is the total number of fused FPN layers, $F_l(x, y)$ is the feature vector of layer $F_l$ at $(x, y)$, and $\omega_l$ is a trainable weight, generated deterministically by the training strategy and used to control the importance of each feature map. $G_{x,y,l}$ is a Gaussian weighting function for weighting each feature map, of the form

$$G_{x,y,l} = \exp\left(-\frac{(x - x_l)^2 + (y - y_l)^2}{2\sigma_G^2}\right)$$

where $x_l$ and $y_l$ are the center position of the $P_l$ layer feature map in the spatial dimension and $\sigma_G$ is the standard deviation of the Gaussian distribution. The function assigns smaller weights to feature maps farther from the location $(x, y)$ and larger weights to those closer.
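A sketch of this Gaussian-weighted fusion follows, under stated assumptions: all maps are resized to a common resolution before weighting, and one sigma is kept per level so the later per-branch correction can be expressed.

```python
import torch
import torch.nn.functional as Fnn

def gaussian_map(h, w, sigma):
    """G_{x,y,l}: weight falls off with distance from the map centre (x_l, y_l)."""
    ys = torch.arange(h, dtype=torch.float32).view(-1, 1)
    xs = torch.arange(w, dtype=torch.float32).view(1, -1)
    return torch.exp(-(((xs - w / 2) ** 2) + ((ys - h / 2) ** 2)) / (2 * sigma ** 2))

def fuse_fpn(features, omega, sigma):
    """F'(x,y) = sum_l omega_l * G_{x,y,l} * F_l(x,y); resizing every level to
    the first level's resolution is an assumed detail."""
    h, w = features[0].shape[-2:]
    fused = torch.zeros_like(features[0])
    for l, f in enumerate(features):
        f = Fnn.interpolate(f, size=(h, w), mode="bilinear", align_corners=False)
        fused = fused + omega[l] * gaussian_map(h, w, sigma[l]) * f
    return fused

# Example: three FPN maps with trainable level weights omega.
feats = [torch.randn(1, 256, s, s) for s in (64, 32, 16)]
omega = torch.nn.Parameter(torch.ones(3) / 3)
out = fuse_fpn(feats, omega, sigma=[16.0, 16.0, 16.0])   # (1, 256, 64, 64)
```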
The specific application of collaborative reinforcement feedback in the four cases of no smoke and no fire, fire without smoke, big smoke and a small fire, and small smoke and a big fire is as follows:
(1) If $I_{t-1}(x, y)$ is judged to be the no-smoke-no-fire case, then when detecting the image $I_t(x, y)$, the FPN layer feature weights $\omega_l$ generated by network training in the smoke detection branch and the flame detection branch of the binary detection network need to be adjusted adaptively according to the inspection height of the unmanned aerial vehicle, as follows (a hedged sketch follows this case):
first, a level weight adjustment factor based on altitude variation is defined, where $h_{uav}$ represents the current aerial photographing altitude of the unmanned aerial vehicle, $\bar{h}$ represents the average acquisition altitude of the unmanned aerial vehicle during binary cooperative network training, and $l$ represents the level number of the current FPN layer;
then, according to the obtained factor, the layer feature fusion weights $\omega_l$ in the smoke detection branch and the flame detection branch of the binary detection network are re-weighted, where $\hat{L}$ is the total number of FPN layers involved in feature map fusion;
finally, the binary detection network with the adjusted weights is used to detect smoke or flame targets in subsequent frames until a frame presents one of the smoke-without-fire, fire-without-smoke, big-smoke-small-fire, or small-smoke-big-fire cases;
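The altitude-based reweighting can be sketched as follows; since this excerpt omits the adjustment-factor formula, the rule below (shift weight toward the shallow, high-resolution levels when flying higher than during training) is an assumption, not the patent's formula.

```python
def adjust_level_weights(omega, h_uav, h_train):
    """Hedged sketch only: the patent's adjustment-factor formula is not in
    this excerpt. Assumed rule: flying higher than during training shrinks
    apparent target size, so weight shifts toward the higher-resolution
    (shallow) FPN levels, and vice versa; weights are then renormalised."""
    delta = h_uav / max(h_train, 1e-6)        # >1 when flying higher than training
    raw = [w * delta ** (-l) for l, w in enumerate(omega)]   # l = 0 is P3
    s = sum(raw)
    return [r / s for r in raw]

# Example: cruising at 150 m after training at around 100 m.
print(adjust_level_weights([0.2, 0.2, 0.2, 0.2, 0.2], 150.0, 100.0))
```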
(2) If $I_{t-1}(x, y)$ is judged to be the fire-without-smoke case, which does not satisfy the smoke-fire relationship of real scenes, then when detecting the image $I_t(x, y)$, feedback reinforcement must be applied to the flame branch of the binary detection network according to the detection result of the current flame targets, with the following steps (a hedged sketch follows these steps):
first, the recognition probabilities of the detected flame targets are sorted, and the flame target with the smallest recognition probability is marked as $S_{fir}$, with corresponding recognition probability $p_{fir}$;
second, $S_{fir}$ is located to an FPN layer according to the current target scale regression ranges, the located layer being denoted $\hat{l}_{fir}$;
subsequently, the weights are corrected using the recognition probability $p_{fir}$ and a target expected probability $p_E$;
finally, the corrected FPN level fusion weights $\omega'_l$ replace the weights in the original flame detection branch, strengthening the detection capability of the flame detection branch for the flame targets of the next frame.
(3) If $I_{t-1}(x, y)$ is judged to be the big-smoke-small-fire case, then when detecting the image $I_t(x, y)$, the FPN layer feature fusion weights $\omega_l$ and the standard deviation of the Gaussian weighting function $G_{x,y,l}$ in the smoke detection branch and the flame detection branch of the binary detection network need to be adjusted respectively, as follows:
Correction of $G_{x,y,l}$ in the smoke detection branch: first, the probable smoke area $A'_{sm} \cup U_{sm}$ in frame $I_t(x, y)$ is obtained from the cooperative feedback; then the regional standard deviation $\sigma_{SM}$ of the $A'_{sm} \cup U_{sm}$ region is calculated; finally, $\sigma_{SM}$ replaces the original standard deviation $\sigma_G$, thereby correcting $G_{x,y,l}$.
Correction of $\omega_l$ in the smoke detection branch: first, as in the fire-without-smoke case, the smoke target $S_{sm}$ with the smallest recognition probability, its corresponding probability $p_{sm}$, and the corresponding FPN layer $\hat{l}_{sm}$ are obtained. The correction, however, depends on the relation between the expected probability $p_E$ and the recognition probability $p_{sm}$: when $p_{sm} \ge p_E$, the worst smoke recognition already exceeds the expected probability, i.e., the fusion weights of the FPN layers are suitable and no additional adjustment is needed; if $p_{sm} < p_E$, the fusion effect of the feature map dominated by the current feature layer $\hat{l}_{sm}$ still needs strengthening, so the weights are adjusted with the same form of correction as in case (2).
Finally, the corrected $G_{x,y,l}$ and the adjusted FPN layer fusion weights $\omega'_l$ are used to adjust the smoke detection branch.
Similarly, $G_{x,y,l}$ and $\omega'_l$ in the flame detection branch are corrected in the same way as in the smoke detection branch.
(4) If $I_{t-1}(x, y)$ is judged to be the small-smoke-big-fire case, which is usually caused by inaccurate smoke positioning, i.e., insufficient detection, then when detecting the image $I_t(x, y)$, the level fusion weights $\omega_l$ of the smoke detection branch in the binary detection network are adjusted following the way $\omega_l$ of the flame detection branch is modified in the fire-without-smoke case, and $\omega_l$ of the flame detection branch is adjusted following the way $\omega_l$ of the flame detection branch is modified in the big-smoke-small-fire case.
Step 5: fuse the detection results of the two detection branches obtained in step 3, output the detection categories and corresponding probabilities of the current frame, and then cycle through steps 3 to 5 for the subsequent frames until the video ends.
Based on the same technical scheme, the invention also discloses a computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by a computing device, cause the computing device to execute the unmanned aerial vehicle cruising forest fire detection method based on binary collaborative feedback.
The invention also provides an unmanned aerial vehicle cruising forest fire detection device based on binary collaborative feedback, comprising one or more processors, one or more memories and one or more programs, wherein the one or more programs are stored in the one or more memories and configured to be executed by the one or more processors, and the one or more programs comprise instructions for executing the unmanned aerial vehicle cruising forest fire detection method based on binary collaborative feedback.
The embodiments of the present invention have been described in detail with reference to the drawings, but the present invention is not limited to the above embodiments, and various changes can be made within the knowledge of those skilled in the art without departing from the spirit of the present invention.

Claims (10)

1. The unmanned aerial vehicle cruising forest fire detection method based on binary cooperative feedback is characterized by comprising the following steps of:
step 1, acquiring target area images through aerial photographing by an unmanned aerial vehicle, and respectively performing smoke labeling and flame labeling on the acquired images to construct a smoke training set and a flame training set;
step 2, constructing and training a binary detection network using the smoke training set and the flame training set of step 1, according to the characteristics of the two kinds of targets, smoke and flame, in the target area images, wherein the binary detection network comprises a smoke detection branch and a flame detection branch, each built on an improved FCOS network;
step 3, detecting the current frame of the target area video acquired in real time with the trained binary detection network, and outputting the fused detection result of the smoke detection branch and the flame detection branch;
step 4, repeating step 3 to detect the next frame of the target area video until the target area video ends.
2. The method for detecting the forest fire of the unmanned aerial vehicle cruising based on the binary cooperative feedback according to claim 1, wherein in the step 2, an improved FCOS network is taken as a reference, and the improved FCOS network construction process is as follows: adding a CBAM attention module between C3 and C4 and between C4 and C5 of a backbone network ResNet respectively; the original convolution kernel is replaced by the deformable convolution in the regression branch.
3. The unmanned aerial vehicle cruising forest fire detection method based on binary cooperative feedback according to claim 2, wherein the improved FCOS network adopts an FPN layer regression adaptive training determination method during training, and the specific process is as follows:
step 2.1, creating an FPN hierarchical regression vector set $\Psi = [\Psi_1, \Psi_2, \ldots, \Psi_i, \ldots, \Psi_N]$, N being the total number of label categories; $\Psi_i$ is a seven-tuple: the first five positions are the default regression ranges of the FPN levels, the sixth position is the label category, and the seventh position is a range-modification flag bit;
step 2.2, for a given target of pixel size k×k in the current frame, determining its default level $P_{l'}$ according to the default regression scale range;
step 2.3, carrying out minimum-value judgment on the loss values of different FPN levels, with the judgment formula

$$L_{\min} = \min\left\{L_{all}(l-2),\ L_{all}(l-1),\ L_{all}(l),\ L_{all}(l+1),\ L_{all}(l+2)\right\}$$

where $L_{\min}$ is the minimum loss across levels, $l$ denotes the FPN level index running from 3 to 7, $L_{all}(l-2)$ through $L_{all}(l+2)$ denote the losses of levels $P_{l-2}$, $P_{l-1}$, $P_l$, $P_{l+1}$, $P_{l+2}$, and $L_{all}(\cdot)$ is the loss function at an FPN level;
step 2.4, if the minimum loss is attained at the default level l̂ alone, set the 7th position of the hierarchical regression vector corresponding to the given target category to 1, and perform gradient inversion learning directly on the different input scales of the given target according to the default regression scale ranges;
step 2.5, if the loss value of level l̂ is consistent with that of level l̂−1 or l̂+1, set the 7th position of the hierarchical regression vector corresponding to the given target category to 0, modify the default regression scale ranges as follows, and return to step 2.2:
if the loss values of l̂ and l̂−1 are consistent, expand the default regression scale range corresponding to level l̂−1, i.e., strip the k×k scale out of the default regression scale range corresponding to level l̂ and incorporate it into that of level l̂−1;
if the loss values of l̂ and l̂+1 are consistent, expand the default regression scale range corresponding to level l̂+1, i.e., strip the k×k scale out of the default regression scale range corresponding to level l̂ and incorporate it into that of level l̂+1;
step 2.6, if the minimum loss is attained at level l̂−1 or l̂+1, set the 7th position of the hierarchical regression vector corresponding to the given target category to −1, and for that target category automatically select the level for gradient inversion learning directly from the loss degree of each FPN level.
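A minimal sketch of the flag decision in steps 2.3-2.6 follows, assuming per-level losses are available as a dict and that "consistent" loss values means equality up to a small tolerance; both are assumptions made for illustration:

```python
# Hedged sketch of the claim-3 flag decision (steps 2.3-2.6); the tie test,
# the tolerance and the data layout are illustrative assumptions.
def update_regression_flag(losses: dict, l_hat: int, tol: float = 1e-6) -> int:
    """losses maps FPN level (3..7) -> loss for the given k x k target;
    returns the value written to the 7th position of the hierarchy vector."""
    neighbors = [l for l in (l_hat - 1, l_hat, l_hat + 1) if l in losses]
    l_star = min(neighbors, key=lambda l: losses[l])
    if l_star == l_hat:
        ties = [l for l in neighbors if l != l_hat
                and abs(losses[l] - losses[l_hat]) <= tol]
        if ties:
            return 0   # tie with a neighbour: expand that level's scale range
        return 1       # default level is strictly best: keep default range
    return -1          # a neighbour is strictly better: auto-select per loss
```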
4. The unmanned aerial vehicle cruising forest fire detection method based on binary cooperative feedback according to claim 1, wherein the fusion in step 3 specifically comprises: marking the bounding-box information and the recognized target probabilities from the detection results of the smoke detection branch and the flame detection branch at the corresponding positions of the current frame.
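One possible rendering of this fusion step with OpenCV is sketched below; the (x1, y1, x2, y2, probability) box format and the colors are assumptions for illustration:

```python
# A possible rendering of the claim-4 fusion step with OpenCV; the box
# format (x1, y1, x2, y2, probability) is an assumption for illustration.
import cv2

def draw_fused(frame, smoke_dets, flame_dets):
    for (x1, y1, x2, y2, p), color, name in (
            [(d, (0, 255, 255), "smoke") for d in smoke_dets] +
            [(d, (0, 0, 255), "flame") for d in flame_dets]):
        cv2.rectangle(frame, (int(x1), int(y1)), (int(x2), int(y2)), color, 2)
        cv2.putText(frame, f"{name} {p:.2f}", (int(x1), int(y1) - 4),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 1)
    return frame
```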
5. The unmanned aerial vehicle cruising forest fire detection method based on binary cooperative feedback according to claim 1, wherein before the next frame of the target area video is detected in step 4, the detection result of the current frame is used as prior knowledge to select between two feedback mechanisms, cooperative optimization feedback and cooperative reinforcement feedback, the selection basis being as follows:
if neither the smoke detection branch nor the flame detection branch detects a target in the detection result of the current frame, i.e., the no-smoke-and-no-fire case, cooperative reinforcement feedback is selected;
if the smoke detection branch detects a target in the detection result of the current frame and the flame detection branch does not, i.e., the smoke-and-no-fire case, cooperative optimization feedback is selected;
if the smoke detection branch does not detect a target and the flame detection branch does, i.e., the fire-and-no-smoke case, cooperative reinforcement feedback is selected;
if both the smoke detection branch and the flame detection branch detect targets, record the minimum smoke target area detected in the smoke detection branch as s_sm^min and the minimum flame target area detected in the flame detection branch as s_fir^min, and select according to the size relation between s_sm^min, s_fir^min and a set threshold η_min:
if the size relation indicates the big-smoke-and-small-fire case, cooperative optimization feedback and cooperative reinforcement feedback are selected simultaneously;
if the size relation indicates the smoke-fire-equivalent case, cooperative optimization feedback is selected;
if the size relation indicates the small-smoke-and-big-fire case, cooperative reinforcement feedback is selected.
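The selection rule of claim 5 can be sketched as follows; since the exact inequality linking the minimum areas to η_min appears as an image in the original filing and is not reproduced above, the ratio test below (η_min and its reciprocal as the two cutoffs) is an assumption for illustration, as is the (cx, cy, w, h, p) detection format:

```python
# Sketch of the claim-5 feedback selection; the eta_min ratio test and the
# detection format (cx, cy, w, h, p) are illustrative assumptions.
def select_feedback(smoke_dets, flame_dets, eta_min: float = 0.5):
    if not smoke_dets and not flame_dets:
        return {"reinforce"}                      # no smoke, no fire
    if smoke_dets and not flame_dets:
        return {"optimize"}                       # smoke, no fire
    if flame_dets and not smoke_dets:
        return {"reinforce"}                      # fire, no smoke
    s_sm = min(w * h for (_, _, w, h, _) in smoke_dets)   # min smoke area
    s_fir = min(w * h for (_, _, w, h, _) in flame_dets)  # min flame area
    ratio = s_fir / s_sm
    if ratio < eta_min:
        return {"optimize", "reinforce"}          # big smoke, small fire
    if ratio <= 1.0 / eta_min:
        return {"optimize"}                       # smoke and fire comparable
    return {"reinforce"}                          # small smoke, big fire
```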
6. The unmanned aerial vehicle cruising forest fire detection method based on binary cooperative feedback according to claim 5, wherein the specific process of the cooperative optimization feedback is as follows:
1) according to the smoke and flame detection information in the detection result of the current frame I_{t−1}(x, y), predict the smoke and flame target areas in the next frame I_t(x, y) by a Kalman filtering method, denoting them respectively as the smoke target area set A_sm and the flame target area set A_fir, where N_sm and N_fir are respectively the numbers of smoke and flame target areas in the detection result of I_{t−1}(x, y);
2) construct a first pixel discriminant function, and eliminate from the smoke and flame target areas in A_sm and A_fir the pixel points whose first-pixel-discriminant-function value is 0, obtaining the new smoke target area set A′_sm and flame target area set A′_fir; the first pixel discriminant function takes the value 1 when f_H(x, y), f_S(x, y) and f_I(x, y) fall within the respective flame or smoke color ranges, and 0 otherwise,
where F_fir(x, y) and F_sm(x, y) are the pixel discriminant functions of flame and smoke respectively, (x, y) denotes pixel coordinates, and f_H(x, y), f_S(x, y) and f_I(x, y) respectively denote the values of pixel (x, y) on the H, S and I channels of HSI space;
3) use the inter-frame difference method to locate the smoke and flame targets of I_t(x, y) lying outside A′_sm and A′_fir, as follows:
S31, obtain the images of I_t(x, y) and I_{t−1}(x, y) with the regions A′_sm removed, denoted Î_t^sm(x, y) and Î_{t−1}^sm(x, y), and the images with the regions A′_fir removed, denoted Î_t^fir(x, y) and Î_{t−1}^fir(x, y);
S32, acquire the frame-difference regions D_sm(x, y) and D_fir(x, y):

D_sm(x, y) = |Î_t^sm(x, y) − Î_{t−1}^sm(x, y)|,  D_fir(x, y) = |Î_t^fir(x, y) − Î_{t−1}^fir(x, y)|;
S33, for the obtained D_sm(x, y) and D_fir(x, y), construct a second pixel discriminant function, and eliminate from D_sm(x, y) and D_fir(x, y) the pixel points whose second-pixel-discriminant-function value is 0, obtaining the smoke and flame regions U_sm and U_fir that deviate from the regions A′_sm and A′_fir; the second pixel discriminant function takes the value 1 where the frame difference is not less than T, and 0 otherwise,
where T is the motion-area judgment threshold, and U_sm and U_fir are respectively the smoke and flame regions deviating from A′_sm and A′_fir;
S34, in I_t(x, y), mask the region A′_sm ∪ U_sm and the region A′_fir ∪ U_fir respectively, obtaining the smoke-masked image M_t^sm(x, y) and the flame-masked image M_t^fir(x, y) corresponding to I_t(x, y);
4) if I_{t−1}(x, y) corresponds to the smoke-and-no-fire case, then when detecting I_t(x, y), the smoke-masked image M_t^sm(x, y) is used as input to the flame detection branch of the binary detection network, and I_t(x, y) is used as input to the smoke detection branch;
if I_{t−1}(x, y) corresponds to the big-smoke-and-small-fire case, then when detecting I_t(x, y), the flame-masked image M_t^fir(x, y) is used as input to the smoke detection branch of the binary detection network, and the smoke-masked image M_t^sm(x, y) as input to the flame detection branch;
if I_{t−1}(x, y) corresponds to the smoke-fire-equivalent case, then when detecting I_t(x, y), the flame-masked image M_t^fir(x, y) is used as input to the smoke detection branch of the binary detection network, and the smoke-masked image M_t^sm(x, y) as input to the flame detection branch.
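Steps S31-S34 of claim 6 can be sketched in NumPy as below, assuming rectangular regions (x, y, w, h), an absolute-difference form for D, and zero-filling as the masking operation; all three are illustrative assumptions:

```python
# Hedged NumPy sketch of claim 6, steps S31-S34: frame differencing outside
# the predicted regions, motion thresholding, and masking.
import numpy as np

def region_mask(shape, regions):
    """Boolean map that is True inside the given (x, y, w, h) rectangles."""
    m = np.zeros(shape[:2], dtype=bool)
    for (x, y, w, h) in regions:
        m[y:y + h, x:x + w] = True
    return m

def motion_outside(frame_t, frame_t1, regions, T=25):
    """Return a boolean map of moving pixels outside `regions` (the U set)."""
    keep = ~region_mask(frame_t.shape, regions)            # S31: remove A'
    diff = np.abs(frame_t.astype(np.int16) - frame_t1.astype(np.int16))
    if diff.ndim == 3:
        diff = diff.max(axis=2)
    return (diff >= T) & keep                              # S32-S33

def masked_input(frame_t, regions, motion):
    """S34: zero out A' union U before feeding the detection branch."""
    out = frame_t.copy()
    out[region_mask(frame_t.shape, regions) | motion] = 0
    return out
```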
7. The unmanned aerial vehicle cruising forest fire detection method based on binary cooperative feedback according to claim 6, wherein the specific process of cooperative reinforcement feedback is as follows:
(1) if I_{t−1}(x, y) corresponds to the no-smoke-and-no-fire case, then when detecting I_t(x, y), the fusion weight ω_l of each FPN layer in the smoke detection branch and the flame detection branch of the binary detection network is adaptively adjusted, the adjusted weight being determined by a level weight adjustment factor computed from the aerial altitudes and the level index, where h_uav denotes the current UAV aerial photographing altitude, h̄_uav denotes the corresponding average aerial photographing altitude during binary detection network training, l denotes the index of the current FPN level, and L_fpn is the total number of FPN layers participating in feature map fusion;
(2) if I_{t−1}(x, y) corresponds to the fire-and-no-smoke case, then when detecting I_t(x, y), the weight ω_l of each FPN layer in the flame detection branch of the binary detection network is adaptively adjusted, the specific steps being as follows:
first, sort the recognition probabilities of the flame targets detected in I_{t−1}(x, y), recording the flame target with the smallest recognition probability as S_fir and its corresponding recognition probability as p_fir;
second, perform FPN-layer scale positioning of S_fir according to the current target scale regression ranges, recording the located FPN layer as l_fir;
subsequently, use the recognition probability p_fir and a set target expected probability p_E to correct the feature weights of each FPN layer in the flame detection branch of the binary detection network, obtaining the corrected weights;
(3) if I_{t−1}(x, y) corresponds to the big-smoke-and-small-fire case, then when detecting I_t(x, y), the feature fusion weight ω_l of each FPN layer in the smoke detection branch and the flame detection branch of the binary detection network and the standard deviation of the Gaussian weighting function are adjusted respectively, the specific adjustment steps being as follows:
correction of the Gaussian weighting function in the smoke detection branch: the local standard deviation of the region A′_sm ∪ U_sm obtained through the cooperative feedback replaces the original standard deviation, thereby realizing the correction of the Gaussian weighting function;
correction of the fusion weights in the smoke detection branch: first, obtain the smoke target S_sm with the smallest recognition probability and its corresponding probability p_sm, and perform the corresponding FPN-layer positioning l_sm; when p_sm ≥ p_E, no weight adjustment is needed; if p_sm < p_E, the weights are adjusted;
finally, the corrected Gaussian weighting function and the adjusted per-layer FPN fusion weights ω_l are used to adjust the smoke detection branch;
based on the same adjustment steps as for the smoke detection branch, the Gaussian weighting function and fusion weights of the flame detection branch are adjusted;
(4) if I_{t−1}(x, y) corresponds to the small-smoke-and-big-fire case, then when detecting I_t(x, y), the per-level fusion weights of the smoke detection branch in the binary detection network are adjusted according to the modification applied to the flame detection branch in the fire-and-no-smoke case, and the flame detection branch is adjusted according to the modification applied to the smoke detection branch in the big-smoke-and-small-fire case.
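Because the weight-adjustment formulas of claim 7 appear as images in the original filing and are not reproduced above, the sketch below substitutes a clearly hypothetical renormalization driven by the altitude ratio h_uav / h̄_uav, merely to illustrate the kind of per-level FPN re-weighting the claim describes:

```python
# Hypothetical stand-in for the claim-7 altitude-driven weight adjustment;
# the tilt formula below is an assumption, not the patented formula.
def adjust_fpn_weights(weights, h_uav, h_bar, strength=1.0):
    """weights: fusion weight per FPN level P3..P7, shallow to deep.
    Flying higher than the training average shrinks targets, so weight is
    shifted toward shallower, higher-resolution levels (an assumption)."""
    n = len(weights)
    if n < 2:
        return list(weights)
    ratio = h_uav / h_bar
    # linear tilt: ratio > 1 favours early levels, ratio < 1 favours late ones
    tilt = [1.0 + strength * (ratio - 1.0) * (1.0 - 2.0 * i / (n - 1))
            for i in range(n)]
    raw = [w * max(t, 1e-3) for w, t in zip(weights, tilt)]
    s = sum(raw)
    return [r / s for r in raw]
```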
8. The unmanned aerial vehicle cruising forest fire detection method based on binary cooperative feedback according to claim 6, wherein, according to the smoke and flame detection information in the detection result of the current frame I_{t−1}(x, y), a speed-corrected Kalman filtering method is used to predict the smoke and flame target areas in the next frame I_t(x, y), the speed correction of the Kalman filtering being computed from the following quantities:
v_uav denotes the speed of the unmanned aerial vehicle; w′ and h′ denote the width and height of the target area image; L denotes the receptive-field diameter of the UAV aerial lens, together with a scaling parameter; Δh denotes the altitude difference during UAV ascent or descent, and Δt_h the time taken by the UAV to ascend or descend; w_{t−1} and h_{t−1} are the width and height of the target detection frame in I_{t−1}(x, y); Δt is the frame interval; and v_tx, v_ty, v_tw and v_th respectively denote the velocity values of the center coordinates of the target detection frame and of the width and height of the detection frame at time t.
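A minimal constant-velocity Kalman prediction step is sketched below as a hedged stand-in for the speed-corrected filter of claim 8; the patent's UAV-motion correction (v_uav, Δh, Δt_h, the lens receptive field) is folded into the initial velocity v0 here, and the state layout and noise settings are assumptions:

```python
# Constant-velocity Kalman prediction over an 8-dim box state; the
# UAV-motion correction of claim 8 is assumed folded into v0.
import numpy as np

def predict_box(box, v0, dt, P=None, q=1e-2):
    """box: (cx, cy, w, h); v0: (vx, vy, vw, vh) incl. UAV-motion correction.
    Returns the predicted box, velocity and covariance for frame t."""
    if P is None:
        P = np.eye(8)                    # e.g. start from identity covariance
    x = np.array(list(box) + list(v0), dtype=float)   # 8-dim state
    F = np.eye(8)
    F[:4, 4:] = dt * np.eye(4)           # position/size advance by v * dt
    Q = q * np.eye(8)                    # assumed process noise
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    return tuple(x_pred[:4]), tuple(x_pred[4:]), P_pred
```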
9. A computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by a computing device, cause the computing device to perform the method of any of claims 1-8.
10. An unmanned aerial vehicle cruising forest fire detection apparatus based on binary cooperative feedback, comprising one or more processors, one or more memories and one or more programs, wherein the one or more programs are stored in the one or more memories and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing the method of any of claims 1 to 8.
CN202310483913.2A 2023-04-27 2023-04-27 Unmanned aerial vehicle cruising forest fire detection method based on binary cooperative feedback Pending CN116503763A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202310483913.2A CN116503763A (en) 2023-04-27 2023-04-27 Unmanned aerial vehicle cruising forest fire detection method based on binary cooperative feedback
JP2023131096A JP7475745B1 (en) 2023-04-27 2023-08-10 A smart cruise detection method for unmanned aerial vehicles based on binary cooperative feedback

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310483913.2A CN116503763A (en) 2023-04-27 2023-04-27 Unmanned aerial vehicle cruising forest fire detection method based on binary cooperative feedback

Publications (1)

Publication Number Publication Date
CN116503763A true CN116503763A (en) 2023-07-28

Family

ID=87326196

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310483913.2A Pending CN116503763A (en) 2023-04-27 2023-04-27 Unmanned aerial vehicle cruising forest fire detection method based on binary cooperative feedback

Country Status (2)

Country Link
JP (1) JP7475745B1 (en)
CN (1) CN116503763A (en)


Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11295131B1 (en) 2021-06-15 2022-04-05 Knoetik Solutions, Inc. Smoke and fire recognition, fire forecasting, and monitoring
CN114842185A (en) 2022-03-21 2022-08-02 昭通亮风台信息科技有限公司 Method, device, equipment and medium for identifying fire
CN114677629A (en) 2022-03-30 2022-06-28 山东中科先进技术有限公司 Smoke and fire detection early warning method and system based on YOLOV5 network

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117671602A (en) * 2024-01-31 2024-03-08 吉林省中农阳光数据有限公司 Farmland forest smoke fire prevention detection method and device based on image recognition
CN117671602B (en) * 2024-01-31 2024-04-05 吉林省中农阳光数据有限公司 Farmland forest smoke fire prevention detection method and device based on image recognition

Also Published As

Publication number Publication date
JP7475745B1 (en) 2024-04-30

Similar Documents

Publication Publication Date Title
CN109583425B (en) Remote sensing image ship integrated recognition method based on deep learning
CN106960446B (en) Unmanned ship application-oriented water surface target detection and tracking integrated method
CN108596053A (en) A kind of vehicle checking method and system based on SSD and vehicle attitude classification
CN111723654A (en) High-altitude parabolic detection method and device based on background modeling, YOLOv3 and self-optimization
CN111461213B (en) Training method of target detection model and target rapid detection method
CN111832443B (en) Construction method and application of construction violation detection model
CN114663346A (en) Strip steel surface defect detection method based on improved YOLOv5 network
CN110765865B (en) Underwater target detection method based on improved YOLO algorithm
CN109801297B (en) Image panorama segmentation prediction optimization method based on convolution
CN113111979B (en) Model training method, image detection method and detection device
CN112052802A (en) Front vehicle behavior identification method based on machine vision
CN113888461A (en) Method, system and equipment for detecting defects of hardware parts based on deep learning
CN116503763A (en) Unmanned aerial vehicle cruising forest fire detection method based on binary cooperative feedback
CN111274964B (en) Detection method for analyzing water surface pollutants based on visual saliency of unmanned aerial vehicle
CN115147745A (en) Small target detection method based on urban unmanned aerial vehicle image
CN115100497A (en) Robot-based method, device, equipment and medium for routing inspection of abnormal objects in channel
CN112990102B (en) Improved Centernet complex environment target detection method
CN110689557A (en) Improved anti-occlusion target tracking method based on KCF
CN111414997B (en) Artificial intelligence-based method for battlefield target recognition
Saravanarajan et al. Improving semantic segmentation under hazy weather for autonomous vehicles using explainable artificial intelligence and adaptive dehazing approach
CN112417981A Complex battlefield environment target efficient identification method based on improved Faster R-CNN
CN111476314A (en) Fuzzy video detection method integrating optical flow algorithm and deep learning
CN116665097A (en) Self-adaptive target tracking method combining context awareness
CN116612450A (en) Point cloud scene-oriented differential knowledge distillation 3D target detection method
CN113139549B (en) Parameter self-adaptive panoramic segmentation method based on multitask learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination