CN109409224B - Method for detecting flame in natural scene - Google Patents

Method for detecting flame in natural scene

Info

Publication number
CN109409224B
CN109409224B (application CN201811108891.7A)
Authority
CN
China
Prior art keywords
value
flame
region
convolution
follows
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811108891.7A
Other languages
Chinese (zh)
Other versions
CN109409224A (en)
Inventor
巫义锐
何粤超
郭红鑫
李子铭
贾柠晖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hohai University HHU
Original Assignee
Hohai University HHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hohai University HHU filed Critical Hohai University HHU
Priority to CN201811108891.7A
Publication of CN109409224A
Application granted
Publication of CN109409224B
Legal status: Active
Anticipated expiration

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A40/00 Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
    • Y02A40/10 Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture
    • Y02A40/28 Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture specially adapted for farming

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method for detecting flame in a natural scene, and belongs to the field of fire protection. The invention comprises the following steps: step 1, extracting the maximum stable extremum regions of an input image; step 2, filtering redundant extremum regions through the color characteristics and the area of the maximum stable extremum regions to obtain flame candidate regions; step 3, inputting the flame candidate regions into a convolutional neural network model to extract convolution features; step 4, inputting the extracted convolution features into a trained support vector machine classifier for classification, so as to judge whether each candidate region is flame; and step 5, merging the flame regions to obtain the final flame region. The invention has strong robustness and good detection performance, and can accurately complete the flame detection task.

Description

Method for detecting flame in natural scene
Technical Field
The invention relates to a method for detecting flame in a natural scene, and belongs to the field of fire protection.
Background
Fire has always been one of the most damaging and common disasters; from residential houses to field environments, once a fire breaks out and is not extinguished in time, it can cause huge losses. Flame detection technology has therefore attracted wide attention from researchers at home and abroad. Various flame detection techniques already exist; however, owing to the diversity and complexity of application scenarios, conventional flame detection techniques (such as smoke detection and temperature detection) always have certain limitations, so flame detection based on computer vision is a new direction for fire prevention.
Disclosure of Invention
In view of the limitations on the operating environment of smoke detectors and temperature detectors, and the wide deployment of video monitoring systems, the invention provides a natural scene flame detection method based on digital image processing and deep learning that has low detection cost and strong robustness.
The invention adopts the following technical scheme for solving the technical problems:
a method of natural scene flame detection comprising the steps of:
step 1, extracting the maximum stable extremum region of an input image;
step 2, filtering redundant extremum regions through the color characteristics and the area of the maximum stable extremum region to obtain flame candidate regions;
step 3, inputting the flame candidate region into a convolutional neural network model to extract convolutional features;
step 4, inputting the extracted convolution features into a trained support vector machine classifier for classification, so as to judge whether each candidate region is flame;
and step 5, merging the flame regions to obtain the final flame region.
The specific process of the step 1 is as follows:
step 11, converting the input image into a grayscale image, denoted I_gray;
step 12, traversing the threshold values in ascending order and calculating the extremum regions of I_gray under each threshold; an extremum region Q_i is defined as follows:
where I_gray(p) and I_gray(q) denote the values of pixel points p and q in I_gray, i ∈ [0, 255] is the threshold of the extremum region, and the boundary of Q_i is the set of pixel points adjacent to the extremum region Q_i but not belonging to it;
step 13, calculating the rate of change of the extremum regions of I_gray:
the extremum region rate of change is defined as:
where Δ is a small variation of the gray threshold, Q_{i+Δ} is the extremum region obtained after the gray threshold is increased, Q_{i-Δ} is the extremum region obtained after the gray threshold is decreased, and r(i) is the rate of change of region Q_i when the threshold is i;
step 14, finding the maximum stable extremum regions of I_gray:
a region Q_i is considered a maximum stable extremum region when its rate of change r(i) is less than the threshold T.
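By way of illustration, step 1 can be realized with the MSER detector shipped in OpenCV; the sketch below is a minimal, assumption-level implementation in which the library defaults stand in for the threshold step Δ and the stability threshold T, which the patent leaves as parameters:

```python
import cv2

def extract_mser_regions(image_bgr):
    """Step 1 sketch: extract the maximum stable extremum regions of an image."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)   # step 11: grayscale image I_gray
    mser = cv2.MSER_create()                              # delta and area bounds can be tuned here
    regions, bboxes = mser.detectRegions(gray)            # steps 12-14: stable extremum regions
    return regions, bboxes
```

Each returned region is an array of pixel coordinates, which is the form assumed by the filtering sketch given after step 2.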
The specific process of the step 2 is as follows:
step 21, calculating the area S of each maximum stable extremum region, calculating the RGB value of each pixel, and judging whether the pixel is a flame pixel, wherein the calculation formula is as follows:
where R is the value of the R channel in the RGB image, G is the value of the G channel in the RGB image, B is the value of the B channel in the RGB image, r_t is the threshold for the R channel, b_t is the threshold for the B channel, and g_t is the threshold for the G channel;
step 22, filtering the redundant area by the total number n of flame pixel points and the area of the maximum stable extremum area, wherein the remaining maximum stable extremum area is the flame candidate area, and the filtering conditions are as follows:
where H is the threshold and S is the area of the maximum stable extremum region.
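The color rule and the filtering condition are given in the original as formulas that are not reproduced here; the following sketch therefore assumes a commonly used red-dominance rule for flame pixels and a ratio test n/S > H, so the threshold values and the exact inequalities are illustrative assumptions rather than the patented formulas:

```python
import numpy as np

def is_flame_pixel(r, g, b, r_t=190):
    """Assumed RGB flame rule for step 21: bright, red-dominant pixels (threshold illustrative)."""
    return (r > r_t) and (r >= g) and (g > b)

def filter_candidate_regions(regions, image_rgb, H=0.5):
    """Step 22 sketch: keep regions whose flame-pixel ratio n/S exceeds H (assumed filter form)."""
    candidates = []
    for pts in regions:                               # pts: (x, y) coordinates of one extremum region
        S = len(pts)                                  # area of the maximum stable extremum region
        n = sum(is_flame_pixel(*image_rgb[y, x]) for x, y in pts)   # total number of flame pixels
        if S > 0 and n / S > H:
            candidates.append(pts)
    return candidates
```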
The step of extracting the convolution feature in the step 3 is as follows:
step 31, preprocessing an input picture to obtain 224×224×3 images;
step 32, firstly performing twice a convolution operation with a step size of 1, a convolution kernel size of 3×3 and 64 convolution kernels on the input image, and then performing a pooling operation with a pooling size of 2×2 and a step size of 2 to obtain a 112×112×64 feature map;
step 33, for the feature map obtained in step 32, performing twice a convolution operation with a step size of 1, a convolution kernel size of 3×3 and 128 convolution kernels, and performing a pooling operation again to obtain a 56×56×128 feature map;
step 34, for the feature map obtained in step 33, performing four times a convolution operation with a step size of 1, a convolution kernel size of 3×3 and 256 convolution kernels, and performing a pooling operation again to obtain a 28×28×256 feature map;
step 35, for the feature map obtained in step 34, performing four times a convolution operation with a step size of 1, a convolution kernel size of 3×3 and 512 convolution kernels, and performing a pooling operation to obtain a 14×14×512 feature map;
step 36, for the feature map obtained in step 35, performing four times a convolution operation with a step size of 1, a convolution kernel size of 3×3 and 512 convolution kernels, and performing a pooling operation to obtain a 7×7×512 feature map;
step 37, converting the feature map obtained in step 36 into a 1×1×4096 feature vector;
step 38, converting the feature vector obtained in step 37 into a 1×1×1000 feature vector.
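The stack of steps 31-38 follows the layout of a VGG-style convolutional network; the sketch below uses the pretrained VGG16 model from torchvision as a stand-in feature extractor. The patent does not name VGG16, so this choice, the ImageNet channel means and the use of the first fully connected activation as the 4096-dimensional feature are assumptions made for illustration:

```python
import torch
import torchvision.models as models
import torchvision.transforms as T

vgg = models.vgg16(pretrained=True).eval()            # stand-in for the network of steps 32-38

preprocess = T.Compose([
    T.Resize((224, 224)),                              # step 31: scale to 224x224x3
    T.ToTensor(),                                      # normalize pixel values to [0, 1]
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[1.0, 1.0, 1.0]),  # de-averaging (illustrative means)
])

def extract_conv_features(pil_image):
    """Return a 4096-dimensional convolution feature vector for one flame candidate region."""
    x = preprocess(pil_image).unsqueeze(0)             # 1 x 3 x 224 x 224
    with torch.no_grad():
        x = vgg.features(x)                            # steps 32-36: convolution and pooling stack
        x = torch.flatten(x, 1)                        # step 37: flatten to 1 x (7*7*512)
        x = vgg.classifier[:2](x)                      # step 37: first fully connected layer + ReLU
    return x.squeeze(0).numpy()                        # 1 x 4096 feature vector
```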
The specific process of step 31 is as follows:
the first step, scaling the picture to a size of 224×224×3;
the second step, normalize the pixel value of the picture, the computational method is:
where R is the value of the R channel in the RGB image, G is the value of the G channel in the RGB image, and B is the value of the B channel in the RGB image;
thirdly, carrying out de-averaging on the pixel values of the picture, wherein the calculating method comprises the following steps:
where r_mean, b_mean and g_mean denote the mean values of the R, B and G channels of the picture, respectively.
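The normalization and de-averaging formulas appear in the original as images; a minimal numpy sketch of step 31, assuming division by 255 for the normalization and subtraction of the per-channel means for the de-averaging (both assumptions about the omitted formulas), is:

```python
import cv2
import numpy as np

def preprocess_image(image_rgb):
    """Step 31 sketch: resize, normalize and de-average an input picture."""
    img = cv2.resize(image_rgb, (224, 224)).astype(np.float32)   # first step: 224x224x3
    img /= 255.0                                                 # second step: assumed normalization
    img -= img.mean(axis=(0, 1), keepdims=True)                  # third step: subtract channel means
    return img
```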
The specific process of step 32 is as follows:
(1) Convolution operation:
Let A be a feature plane of the input feature map M_i, and let K be a 3×3 convolution kernel:
where a_00 denotes the pixel value in row 0, column 0 of feature plane A, a_01 the pixel value in row 0, column 1, a_02 the pixel value in row 0, column 2, a_10 the pixel value in row 1, column 0, a_11 the pixel value in row 1, column 1, a_12 the pixel value in row 1, column 2, a_20 the pixel value in row 2, column 0, a_21 the pixel value in row 2, column 1, and a_22 the pixel value in row 2, column 2 of feature plane A; w_00 denotes the weight in row 0, column 0 of the convolution kernel K, w_01 the weight in row 0, column 1, w_02 the weight in row 0, column 2, w_10 the weight in row 1, column 0, w_11 the weight in row 1, column 1, w_12 the weight in row 1, column 2, w_20 the weight in row 2, column 0, w_21 the weight in row 2, column 1, and w_22 the weight in row 2, column 2 of the convolution kernel K;
Under the condition of step size stride = (1, 1),
where a_ij denotes the pixel value in row i, column j of feature plane A, w_ij denotes the weight in row i, column j of the convolution kernel K, A_00 is the value in row 0, column 0 of A*K, A_01 the value in row 0, column 1, A_10 the value in row 1, column 0, and A_11 the value in row 1, column 1 of A*K; K is the convolution kernel; the step size stride = (1, 1) means that the convolution kernel K moves a distance of 1 both horizontally and vertically on feature plane A; when the convolution kernel moves to the edge of the feature plane and the feature plane does not have enough elements, the missing positions are filled with '0' to ensure that the output image obtained after the convolution operation has the same size as the input image.
Let B be the output profile M o Is characterized by:
wherein A is i For inputting feature map M o Is the ith feature plane of (a), the symbol x represents convolutionN is the number of convolution kernels; the convolution operation is performed twice;
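A plain numpy sketch of the single-plane convolution described above, with zero padding so that the output keeps the input size and stride (1, 1); for a multi-channel feature map the same operation is repeated for each input plane and the results are summed (bias terms omitted):

```python
import numpy as np

def conv2d_same(A, K):
    """Convolve feature plane A with a 3x3 kernel K, stride 1, zero padding ('same' size)."""
    kh, kw = K.shape
    ph, pw = kh // 2, kw // 2
    Ap = np.pad(A, ((ph, ph), (pw, pw)), mode='constant')        # fill missing border elements with 0
    out = np.zeros(A.shape, dtype=np.float32)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            out[i, j] = np.sum(Ap[i:i + kh, j:j + kw] * K)       # A_ij = sum of a * w over the window
    return out
```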
(2) Pooling operation:
let B be one of the feature planes in the feature map M obtained by convolution, and C be the feature plane obtained by pooling:
where b_00 denotes the pixel value in row 0, column 0 of feature plane B, b_01 the pixel value in row 0, column 1, b_02 the pixel value in row 0, column 2, b_03 the pixel value in row 0, column 3, b_10 the pixel value in row 1, column 0, b_11 the pixel value in row 1, column 1, b_12 the pixel value in row 1, column 2, b_13 the pixel value in row 1, column 3, b_20 the pixel value in row 2, column 0, b_21 the pixel value in row 2, column 1, b_22 the pixel value in row 2, column 2, b_23 the pixel value in row 2, column 3, b_30 the pixel value in row 3, column 0, b_31 the pixel value in row 3, column 1, b_32 the pixel value in row 3, column 2, and b_33 the pixel value in row 3, column 3 of feature plane B; c_00 is the maximum of b_00, b_01, b_10 and b_11; c_01 is the maximum of b_02, b_03, b_12 and b_13; c_10 is the maximum of b_20, b_21, b_30 and b_31; and c_11 is the maximum of b_22, b_23, b_32 and b_33.
After the pooling operation, the feature plane C is obtained, and its size is reduced to half of the original;
(3) After pooling, a nonlinear transformation is applied to the output result using the ReLU function; the ReLU formula is c_ij = max(0, c_ij),
where c_ij is the value in row i, column j of feature plane C.
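A numpy sketch of the 2×2 max pooling with stride 2 and of the ReLU nonlinearity applied afterwards (ReLU(x) = max(0, x)):

```python
import numpy as np

def max_pool_2x2(B):
    """2x2 max pooling with stride 2: each c_ij is the maximum of a 2x2 block of B."""
    h, w = B.shape[0] // 2 * 2, B.shape[1] // 2 * 2              # drop an odd last row/column if present
    blocks = B[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))                                # halves the size of the feature plane

def relu(C):
    """ReLU nonlinearity: negative values are set to 0."""
    return np.maximum(C, 0)
```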
The specific process of the step 37 is as follows:
1) Converting the feature map obtained in step 36 into a 1×1×(7×7×512) vector;
2) Multiplying the vector obtained in 1) by the trained weights to obtain a 1×1×4096 feature vector, wherein the calculation formula is as follows:
where a_0 is the 0th value of the feature vector, a_1 the 1st value, and a_4095 the 4095th value; w_00 is the value in row 0, column 0 of the weight matrix, w_01 the value in row 0, column 1, w_10 the value in row 1, column 0, and w_11 the value in row 1, column 1 of the weight matrix; b_0 is the 0th value of the bias vector, b_1 the 1st value, and b_4095 the 4095th value; x_0 is the 0th value of the input vector, x_1 the 1st value, and x_512 the 512th value of the input vector.
3) Nonlinear transformation is carried out on the output result by using a RELU function, and the RELU formula is as follows:
where a_i is the i-th value of the feature vector.
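Steps 37 and 38 are ordinary fully connected layers; a numpy sketch in which the weight matrix W and bias vector b stand for the trained parameters (their shapes follow the dimensions given in the text):

```python
import numpy as np

def fully_connected(x, W, b):
    """Fully connected layer with ReLU: a = max(0, W x + b).

    For step 37, x has 7*7*512 values and W has shape (4096, 7*7*512);
    for step 38, x has 4096 values and W has shape (1000, 4096).
    """
    return np.maximum(W @ x + b, 0)

# Usage sketch (feature_map, W1 and b1 are hypothetical trained arrays):
# x = feature_map.reshape(-1)          # step 37, 1): flatten the 7x7x512 feature map
# fc1 = fully_connected(x, W1, b1)     # step 37, 2)-3): 1x1x4096 feature vector
```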
The step 38 includes:
a) Multiplying the feature vector obtained in step 37 by the trained weights to obtain a 1×1×1000 feature vector, wherein the calculation formula is as follows:
where a_0 is the 0th value of the feature vector, a_1 the 1st value, and a_999 the 999th value; w_00 is the value in row 0, column 0 of the weight matrix, w_01 the value in row 0, column 1, w_10 the value in row 1, column 0, and w_11 the value in row 1, column 1 of the weight matrix; b_0 is the 0th value of the bias vector, b_1 the 1st value, and b_999 the 999th value; x_0 is the 0th value of the input vector, x_1 the 1st value, and x_999 the 999th value of the input vector;
b) Nonlinear transformation is carried out on the output result by using a RELU function, and the RELU formula is as follows:
where a_i is the i-th value of the feature vector.
The step of training the support vector machine in the step 4 includes:
step 41, taking a flame training data set, extracting convolution characteristics of all flame candidate areas through the steps 1,2 and 3, and taking the convolution characteristics as a training set;
step 42, inputting the training set into a support vector machine for training the classification problem, wherein the solving step of the learning algorithm of the support vector machine comprises the following steps:
(1) Selecting a proper parameter C, constructing and solving an optimization problem:
constraint conditions
Obtaining an optimal solution α* = (α*_1, α*_2, ..., α*_N), where α*_1 is its 1st value, α*_2 its 2nd value and α*_N its N-th value; α_i is the i-th Lagrange multiplier, α_j is the j-th Lagrange multiplier, x_i is the i-th training sample, x_j is the j-th training sample, y_i ∈ {-1, 1} is the class label corresponding to the i-th sample; y_j ∈ {-1, 1} is the class label corresponding to the j-th sample, and K(x_i, x_j) is a Gaussian kernel function, whose formula is as follows:
wherein: sigma is the standard deviation;
(2) Selecting a positive component of α*, its j-th value α*_j, and calculating
where b* is the required bias;
(3) Constructing a decision function:
where α*_i is the i-th value of the optimal solution, sign() denotes the sign function, f(x) = 1 when the term in the brackets is greater than a certain threshold, and f(x) = 0 when the term in the brackets is less than that threshold.
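A sketch of step 4 using scikit-learn's SVC with a Gaussian (RBF) kernel; the penalty parameter C and the kernel width σ are illustrative values, and the mapping gamma = 1/(2σ²) matches the Gaussian kernel formula above:

```python
from sklearn.svm import SVC

def train_flame_svm(conv_features, labels, C=1.0, sigma=10.0):
    """Steps 41-42 sketch: conv_features is an (n_samples, 4096) array of convolution
    features of flame candidate regions, labels holds +1 (flame) or -1 (non-flame)."""
    clf = SVC(C=C, kernel='rbf', gamma=1.0 / (2.0 * sigma ** 2))  # Gaussian kernel K(x_i, x_j)
    clf.fit(conv_features, labels)
    return clf

def is_flame(clf, feature_vector):
    """Decision step: returns True when the candidate region is classified as flame."""
    return clf.predict(feature_vector.reshape(1, -1))[0] == 1
```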
The specific process of the step 5 is as follows:
step 51, for the flame region set S obtained in step 4, calculating the center point c_i of every s_i ∈ S, where s_i is the i-th region in the flame region set S;
step 52, for any flame regions s_i, s_j ∈ S, where s_j is the j-th region in the flame region set S, if the Euclidean distance between their center points c_i and c_j is smaller than the threshold F, merging the two regions to obtain the final flame region.
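A sketch of the merging rule of step 5 as a simple greedy pass: compute the center point of each flame region and merge any two regions whose centers are closer than the threshold F (an illustrative value here):

```python
import numpy as np

def merge_flame_regions(regions, F=30.0):
    """Step 5 sketch: regions is a list of (x, y) point arrays classified as flame."""
    centers = [pts.mean(axis=0) for pts in regions]               # step 51: center point c_i of s_i
    used = [False] * len(regions)
    merged = []
    for i in range(len(regions)):
        if used[i]:
            continue
        group = np.array(regions[i])
        for j in range(i + 1, len(regions)):
            if not used[j] and np.linalg.norm(centers[i] - centers[j]) < F:   # step 52
                group = np.vstack([group, regions[j]])            # merge the two regions
                used[j] = True
        merged.append(group)
    return merged
```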
The beneficial effects of the invention are as follows:
(1) The invention uses MSER (the maximum stable extremum region algorithm) and RGB color space characteristics to extract flame candidate regions; these characteristics describe the candidate regions through the rate of change of the image gray values and through the RGB values. Because the color of a flame region varies within a small range and follows a definite pattern that can be expressed with an RGB model, the method has strong robustness.
(2) The invention utilizes the convolutional neural network to extract the flame region characteristics, and the convolutional neural network can extract a large number of characteristics and can well describe the flame.
(3) The invention can detect whether flame exists in the image or not, and can accurately mark the flame position.
Drawings
Fig. 1 is a detection flow chart.
Fig. 2 is a flow chart of convolutional neural network extraction of convolutional features.
Detailed Description
The invention will be described in further detail with reference to the accompanying drawings.
The invention relates to a method for detecting flame in a natural scene; its operation flow is shown in figure 1 and comprises the following steps:
step 1: inputting a flame image to be detected;
step 2: the method comprises the following steps:
First, the input image is converted into a grayscale image, denoted I_gray;
Next, the extremum regions of I_gray under each threshold are obtained by traversing the thresholds in ascending order; an extremum region Q_i is defined as follows:
where I_gray(p) and I_gray(q) denote the values of pixel points p and q in I_gray, i ∈ [0, 255] is the threshold of the extremum region, and the boundary of Q_i is the set of pixel points adjacent to the extremum region Q_i but not belonging to it;
While the threshold is varied, the rate of change of the extremum regions of I_gray is calculated; the extremum region rate of change is defined as:
where Δ is a small variation of the gray threshold, Q_{i+Δ} is the extremum region obtained after the gray threshold is increased, Q_{i-Δ} is the extremum region obtained after the gray threshold is decreased, and r(i) is the rate of change of region Q_i when the threshold is i; a region Q_i is considered a maximum stable extremum region when its rate of change r(i) is less than the threshold T;
then, calculating the area S of each maximum stable extremum region, calculating the RGB value of each pixel point, and judging whether the pixel point is a flame pixel point, wherein the calculation formula is as follows:
where R is the value of the R channel in the RGB image, G is the value of the G channel in the RGB image, B is the value of the B channel in the RGB image, r_t is the threshold for the R channel, b_t is the threshold for the B channel, and g_t is the threshold for the G channel.
Finally, filtering redundant areas through the total number n of flame pixel points and the area of the maximum stable extremum area, wherein the remaining maximum stable extremum area is the flame candidate area, and the filtering conditions are as follows:
where H is the threshold and S is the area of the maximum stable extremum region.
Step 3: the method comprises the following steps:
firstly, preprocessing an input picture to obtain 224×224×3 images, wherein the specific operation process is as follows:
(1) Scaling the picture to a size of 224×224×3;
(2) The pixel value of the picture is normalized, and the calculation method comprises the following steps:
where R is the value of the R channel in the RGB image, G is the value of the G channel in the RGB image, and B is the value of the B channel in the RGB image;
(3) The pixel values of the pictures are subjected to de-averaging, and the calculation method is as follows:
where r_mean, b_mean and g_mean denote the mean values of the R, B and G channels of the picture, respectively;
secondly, carrying out operations such as convolution, pooling and the like on the image, wherein the specific process is as follows:
(1) Firstly, a convolution operation with a step size of 1, a convolution kernel size of 3×3 and 64 convolution kernels is carried out twice on the input image, and then a pooling operation with a pooling size of 2×2 and a step size of 2 is carried out once to obtain a 112×112×64 feature map;
(2) For the obtained feature map, a convolution operation with a step size of 1, a convolution kernel size of 3×3 and 128 convolution kernels is performed twice, and a pooling operation is performed once to obtain a 56×56×128 feature map;
(3) For the obtained feature map, a convolution operation with a step size of 1, a convolution kernel size of 3×3 and 256 convolution kernels is performed four times, and a pooling operation is performed again to obtain a 28×28×256 feature map;
(4) For the obtained feature map, a convolution operation with a step size of 1, a convolution kernel size of 3×3 and 512 convolution kernels is performed four times, and a pooling operation is performed again to obtain a 14×14×512 feature map;
(5) For the obtained feature map, a convolution operation with a step size of 1, a convolution kernel size of 3×3 and 512 convolution kernels is performed four times, and a pooling operation is performed again to obtain a 7×7×512 feature map;
then, the obtained feature map is converted into feature vectors of 1×1×4096;
finally, converting the obtained feature vector into a feature vector of 1 multiplied by 1000;
the convolution operation process is as follows:
Let A be a feature plane of the input feature map M_i, and let K be a 3×3 convolution kernel:
where a_00 denotes the pixel value in row 0, column 0 of feature plane A, a_01 the pixel value in row 0, column 1, a_02 the pixel value in row 0, column 2, a_10 the pixel value in row 1, column 0, a_11 the pixel value in row 1, column 1, a_12 the pixel value in row 1, column 2, a_20 the pixel value in row 2, column 0, a_21 the pixel value in row 2, column 1, and a_22 the pixel value in row 2, column 2 of feature plane A; w_00 denotes the weight in row 0, column 0 of the convolution kernel K, w_01 the weight in row 0, column 1, w_02 the weight in row 0, column 2, w_10 the weight in row 1, column 0, w_11 the weight in row 1, column 1, w_12 the weight in row 1, column 2, w_20 the weight in row 2, column 0, w_21 the weight in row 2, column 1, and w_22 the weight in row 2, column 2 of the convolution kernel K;
Under the condition of step size stride = (1, 1),
where a_ij denotes the pixel value in row i, column j of feature plane A, w_ij denotes the weight in row i, column j of the convolution kernel K, A_00 is the value in row 0, column 0 of A*K, A_01 the value in row 0, column 1, A_10 the value in row 1, column 0, and A_11 the value in row 1, column 1 of A*K; K is the convolution kernel; the step size stride = (1, 1) means that the convolution kernel K moves a distance of 1 both horizontally and vertically on feature plane A; when the convolution kernel moves to the edge of the feature plane and the feature plane does not have enough elements, the missing positions are filled with '0' to ensure that the output image obtained after the convolution operation has the same size as the input image.
Let B be the output profile M o Is characterized by:
wherein A is i For inputting feature map M o Is the number of convolution kernels. The convolution operation is performed twice.
The pooling operation process is as follows:
let B be one of the feature planes in the feature map M obtained by convolution, and C be the feature plane obtained by pooling:
where b_00 denotes the pixel value in row 0, column 0 of feature plane B, b_01 the pixel value in row 0, column 1, b_02 the pixel value in row 0, column 2, b_03 the pixel value in row 0, column 3, b_10 the pixel value in row 1, column 0, b_11 the pixel value in row 1, column 1, b_12 the pixel value in row 1, column 2, b_13 the pixel value in row 1, column 3, b_20 the pixel value in row 2, column 0, b_21 the pixel value in row 2, column 1, b_22 the pixel value in row 2, column 2, b_23 the pixel value in row 2, column 3, b_30 the pixel value in row 3, column 0, b_31 the pixel value in row 3, column 1, b_32 the pixel value in row 3, column 2, and b_33 the pixel value in row 3, column 3 of feature plane B; c_00 is the maximum of b_00, b_01, b_10 and b_11; c_01 is the maximum of b_02, b_03, b_12 and b_13; c_10 is the maximum of b_20, b_21, b_30 and b_31; and c_11 is the maximum of b_22, b_23, b_32 and b_33.
After the pooling operation, the feature plane C is obtained, and its size is reduced to half of the original.
After pooling, a nonlinear transformation is applied to the output result using the ReLU function; the ReLU formula is c_ij = max(0, c_ij),
where c_ij is the value in row i, column j of feature plane C.
The step of converting the result of the convolution operation into a feature vector is as follows:
(1) Converting the obtained feature map into a 1×1×(7×7×512) vector;
(2) Multiplying the resulting vector by the trained weights to yield a 1×1×4096 feature vector, calculated as:
where a_0 is the 0th value of the feature vector, a_1 the 1st value, and a_4095 the 4095th value; w_00 is the value in row 0, column 0 of the weight matrix, w_01 the value in row 0, column 1, w_10 the value in row 1, column 0, and w_11 the value in row 1, column 1 of the weight matrix; b_0 is the 0th value of the bias vector, b_1 the 1st value, and b_4095 the 4095th value; x_0 is the 0th value of the input vector, x_1 the 1st value, and x_512 the 512th value of the input vector.
(3) Nonlinear transformation is carried out on the output result by using a RELU function, and the RELU formula is as follows:
where a_i is the i-th value of the feature vector.
(4) Multiplying the result obtained in the last step by the trained weights to obtain a 1×1×1000 feature vector, wherein the calculation formula is as follows:
where a_999 is the 999th value of the feature vector, b_999 is the 999th value of the bias vector, and x_999 is the 999th value of the input vector.
(5) Nonlinear transformation is carried out on the output result by using a RELU function, and the RELU formula is as follows:
where a_i is the i-th value of the feature vector.
Step 4, training the support vector machine, which comprises the following steps:
firstly, taking a flame training data set, extracting convolution characteristics of all flame candidate areas through the steps 1,2 and 3, and taking the convolution characteristics as a training set;
then, the training set is input into a support vector machine to carry out classification problem training, wherein the solving step of the learning algorithm of the support vector machine comprises the following steps:
(1) Selecting a proper parameter C, constructing and solving an optimization problem:
constraint conditions
Obtaining an optimal solution α* = (α*_1, α*_2, ..., α*_N), where α*_1 is its 1st value, α*_2 its 2nd value and α*_N its N-th value. α_i is the i-th Lagrange multiplier, α_j is the j-th Lagrange multiplier, x_i is the i-th training sample, x_j is the j-th training sample, y_i ∈ {-1, 1} is the class label corresponding to the i-th sample; y_j ∈ {-1, 1} is the class label corresponding to the j-th sample, and K(x_i, x_j) is a Gaussian kernel function, whose formula is as follows:
wherein: sigma is the standard deviation.
(2) Selecting a positive component of α*, its j-th value α*_j, and calculating
where b* is the required bias.
(3) Constructing a decision function:
where α*_i is the i-th value of the optimal solution, sign() denotes the sign function, f(x) = 1 when the term in the brackets is greater than a certain threshold, and f(x) = 0 when the term in the brackets is less than that threshold. The input feature vectors can be classified using this decision function.
Step 5: the method comprises the following steps:
First, for the flame region set S obtained in step 4, the center point c_i of every s_i ∈ S is calculated, where s_i is the i-th region in the flame region set S;
Then, for any flame regions s_i, s_j ∈ S, where s_j is the j-th region in the flame region set S, if the Euclidean distance between their center points c_i and c_j is smaller than the threshold F, the two regions are merged to obtain the final flame region.
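Putting the steps of figure 1 together, an end-to-end sketch of the detection flow; the helper names used here (extract_mser_regions, filter_candidate_regions, extract_region_features, is_flame, merge_flame_regions) refer to the assumption-level sketches given above and are hypothetical, not terms of the patent:

```python
import cv2
import numpy as np

def detect_flames(image_bgr, svm_classifier):
    """End-to-end sketch of the flow in figure 1 (steps 1-5)."""
    image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)
    regions, _ = extract_mser_regions(image_bgr)                  # step 1: MSER extraction
    candidates = filter_candidate_regions(regions, image_rgb)     # step 2: color and area filtering
    flame_regions = []
    for pts in candidates:
        x, y, w, h = cv2.boundingRect(np.asarray(pts, dtype=np.int32))
        crop = image_rgb[y:y + h, x:x + w]                        # crop the candidate region
        feat = extract_region_features(crop)                      # step 3: convolution features
        if is_flame(svm_classifier, feat):                        # step 4: SVM classification
            flame_regions.append(pts)
    return merge_flame_regions(flame_regions)                     # step 5: final flame regions
```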

Claims (6)

1. A method for detecting flame in natural scene is characterized in that: the method comprises the following steps:
step 1, extracting the maximum stable extremum region of an input image;
step 2, filtering redundant extremum regions through the color characteristics and the area of the maximum stable extremum region to obtain flame candidate regions; the specific process of the step 2 is as follows:
step 21, calculating the area S of each maximum stable extremum region, calculating the RGB value of each pixel, and judging whether the pixel is a flame pixel, wherein the calculation formula is as follows:
where R is the value of the R channel in the RGB image, G is the value of the G channel in the RGB image, B is the value of the B channel in the RGB image, r_t is the threshold for the R channel, b_t is the threshold for the B channel, and g_t is the threshold for the G channel;
step 22, filtering the redundant area by the total number n of flame pixel points and the area of the maximum stable extremum area, wherein the remaining maximum stable extremum area is the flame candidate area, and the filtering conditions are as follows:
wherein H is a threshold value, and S is the area of the maximum stable extremum region;
step 3, inputting the flame candidate region into a convolutional neural network model to extract convolutional features; the step of extracting convolution characteristics in the step 3 is as follows:
step 31, preprocessing an input picture to obtain 224×224×3 images;
step 32, firstly performing convolution operation with the step length of 1, the convolution kernel size of 3 multiplied by 3 and the convolution kernel number of 64 on an input image, and then performing pooling operation with the pooling size of 2 multiplied by 2 and the step length of 2 to obtain a 112 multiplied by 64 feature map;
step 33, for the feature map obtained in step 32, performing convolution operation with step length of 1, convolution kernel size of 3×3 and convolution kernel number of 128 twice, and performing pooling operation again to obtain a 56×56×128 feature map;
step 34, for the feature map obtained in step 33, performing a convolution operation with a step size of 1, a convolution kernel size of 3×3, and a convolution kernel number of 256, and performing a pooling operation again to obtain a feature map with a size of 28×28×256;
step 35, for the feature map obtained in step 34, performing four times a convolution operation with a step size of 1, a convolution kernel size of 3×3 and a convolution kernel number of 512, and performing a pooling operation to obtain a 14×14×512 feature map;
step 36, for the feature map obtained in step 35, performing four times a convolution operation with a step size of 1, a convolution kernel size of 3×3 and a convolution kernel number of 512, and performing a pooling operation to obtain a 7×7×512 feature map;
step 37, converting the feature map obtained in step 36 into feature vectors of 1×1×4096;
step 38, converting the feature vector obtained in step 37 into a feature vector of 1×1×1000;
step 4, inputting the extracted convolution characteristics into a trained support vector machine classifier to classify, and judging whether the convolution characteristics are flame or not;
step 5, combining the flame areas to obtain a final flame area; the specific process of the step 5 is as follows:
step 51, for the flame region set S obtained in step 4, calculating the center point c_i of every s_i ∈ S, where s_i is the i-th region in the flame region set S;
step 52, for any flame regions s_i, s_j ∈ S, where s_j is the j-th region in the flame region set S, if the Euclidean distance between their center points c_i and c_j is smaller than the threshold F, merging the two regions to obtain the final flame region.
2. A method of natural scene flame detection as defined in claim 1, wherein: the specific process of the step 1 is as follows:
step 11, converting the input image into a grayscale image, denoted I_gray;
step 12, traversing the threshold values in ascending order and calculating the extremum regions of I_gray under each threshold; an extremum region Q_i is defined as follows:
where I_gray(p) and I_gray(q) denote the values of pixel points p and q in I_gray, i ∈ [0, 255] is the threshold of the extremum region, and the boundary of Q_i is the set of pixel points adjacent to the extremum region Q_i but not belonging to it;
step 13, calculating the rate of change of the extremum regions of I_gray:
the extremum region rate of change is defined as:
where Δ is a small variation of the gray threshold, Q_{i+Δ} is the extremum region obtained after the gray threshold is increased, Q_{i-Δ} is the extremum region obtained after the gray threshold is decreased, and r(i) is the rate of change of region Q_i when the threshold is i;
step 14, finding the maximum stable extremum regions of I_gray:
a region Q_i is considered a maximum stable extremum region when its rate of change r(i) is less than the threshold T.
3. A method of natural scene flame detection as defined in claim 1, wherein: the specific process of step 31 is as follows:
the first step, scaling the picture to a size of 224×224×3;
the second step, normalize the pixel value of the picture, the computational method is:
where R is the value of the R channel in the RGB image, G is the value of the G channel in the RGB image, and B is the value of the B channel in the RGB image;
thirdly, carrying out de-averaging on the pixel values of the picture, wherein the calculating method comprises the following steps:
where r_mean, b_mean and g_mean denote the mean values of the R, B and G channels of the picture, respectively.
4. The method of natural scene flame detection as recited in claim 1, wherein: the specific process of the step 37 is as follows:
1) Converting the feature map obtained in the step 36 into a vector of 1×1× (7×7×512);
2) Multiplying the vector obtained in the step 36 by trained weights to obtain a feature vector of 1 multiplied by 4096;
3) Nonlinear transformation is carried out on the output result by using a RELU function, and the RELU formula is as follows:
wherein a_i is the i-th value of the feature vector.
5. The method of natural scene flame detection as recited in claim 1, wherein: the step 38 includes:
a) Multiplying the feature vector obtained in the step 37 by trained weights to obtain a feature vector of 1 multiplied by 1000;
b) Nonlinear transformation is carried out on the output result by using a RELU function, and the RELU formula is as follows:
wherein a_i is the i-th value of the feature vector.
6. A method of natural scene flame detection as defined in claim 1, wherein: the step of training the support vector machine in the step 4 includes:
step 41, taking a flame training data set, extracting convolution characteristics of all flame candidate areas through the steps 1,2 and 3, and taking the convolution characteristics as a training set;
step 42, inputting the training set into a support vector machine for training the classification problem, wherein the solving step of the learning algorithm of the support vector machine comprises the following steps:
(1) Selecting a parameter C, constructing and solving an optimization problem:
0 ≤ α_i ≤ C, i = 1, 2, ..., N
obtaining an optimal solution α* = (α*_1, α*_2, ..., α*_N), where α*_1 is its 1st value, α*_2 its 2nd value and α*_N its N-th value; α_i is the i-th Lagrange multiplier, α_j is the j-th Lagrange multiplier, x_i is the i-th training sample, x_j is the j-th training sample, y_i ∈ {-1, 1} is the class label corresponding to the i-th sample; y_j ∈ {-1, 1} is the class label corresponding to the j-th sample, and K(x_i, x_j) is a Gaussian kernel function, whose formula is as follows:
wherein: sigma is the standard deviation;
(2) Selecting a positive component of α*, its j-th value α*_j, and calculating
where b* is the required bias;
(3) Constructing a decision function:
where α*_i is the i-th value of the optimal solution, sign() denotes the sign function, f(x) = 1 when the term in the brackets is greater than a certain threshold, and f(x) = 0 when the term in the brackets is less than that threshold.
CN201811108891.7A 2018-09-21 2018-09-21 Method for detecting flame in natural scene Active CN109409224B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811108891.7A CN109409224B (en) 2018-09-21 2018-09-21 Method for detecting flame in natural scene

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811108891.7A CN109409224B (en) 2018-09-21 2018-09-21 Method for detecting flame in natural scene

Publications (2)

Publication Number Publication Date
CN109409224A CN109409224A (en) 2019-03-01
CN109409224B true CN109409224B (en) 2023-09-05

Family

ID=65465228

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811108891.7A Active CN109409224B (en) 2018-09-21 2018-09-21 Method for detecting flame in natural scene

Country Status (1)

Country Link
CN (1) CN109409224B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110309808B (en) * 2019-07-09 2021-03-12 北京林业大学 Self-adaptive smoke root node detection method in large-scale space
CN110378421B (en) * 2019-07-19 2021-06-25 西安科技大学 Coal mine fire identification method based on convolutional neural network

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101853512A (en) * 2010-05-13 2010-10-06 电子科技大学 Flame detection method based on video time and spatial information
CN105336085A (en) * 2015-09-02 2016-02-17 华南师范大学 Remote large-space fire monitoring alarm method based on image processing technology
CN106250845A (en) * 2016-07-28 2016-12-21 北京智芯原动科技有限公司 Flame detecting method based on convolutional neural networks and device
CN107749067A (en) * 2017-09-13 2018-03-02 华侨大学 Fire hazard smoke detecting method based on kinetic characteristic and convolutional neural networks
CN107944359A (en) * 2017-11-14 2018-04-20 中电数通科技有限公司 Flame detecting method based on video
CN108038486A (en) * 2017-12-05 2018-05-15 河海大学 A kind of character detecting method
CN108052865A (en) * 2017-07-06 2018-05-18 同济大学 A kind of flame detecting method based on convolutional neural networks and support vector machines
CN108416316A (en) * 2018-03-19 2018-08-17 中南大学 A kind of detection method and system of black smoke vehicle

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100450793B1 (en) * 2001-01-20 2004-10-01 삼성전자주식회사 Apparatus for object extraction based on the feature matching of region in the segmented images and method therefor
KR101081051B1 (en) * 2010-11-16 2011-11-09 계명대학교 산학협력단 A method for detecting fire-flame using fuzzy finite automata

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101853512A (en) * 2010-05-13 2010-10-06 电子科技大学 Flame detection method based on video time and spatial information
CN105336085A (en) * 2015-09-02 2016-02-17 华南师范大学 Remote large-space fire monitoring alarm method based on image processing technology
CN106250845A (en) * 2016-07-28 2016-12-21 北京智芯原动科技有限公司 Flame detecting method based on convolutional neural networks and device
CN108052865A (en) * 2017-07-06 2018-05-18 同济大学 A kind of flame detecting method based on convolutional neural networks and support vector machines
CN107749067A (en) * 2017-09-13 2018-03-02 华侨大学 Fire hazard smoke detecting method based on kinetic characteristic and convolutional neural networks
CN107944359A (en) * 2017-11-14 2018-04-20 中电数通科技有限公司 Flame detecting method based on video
CN108038486A (en) * 2017-12-05 2018-05-15 河海大学 A kind of character detecting method
CN108416316A (en) * 2018-03-19 2018-08-17 中南大学 A kind of detection method and system of black smoke vehicle

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Li Sun et al., "Smoke detection algorithm based on color enhancement transform and MSER detection" (《基于颜色增强变换和MSER检测的烟雾检测算法》), Transactions of Beijing Institute of Technology, 15 Oct. 2016, Vol. 36, No. 10, pp. 2-4 *

Also Published As

Publication number Publication date
CN109409224A (en) 2019-03-01

Similar Documents

Publication Publication Date Title
CN107016357B (en) Video pedestrian detection method based on time domain convolutional neural network
CN111723693B (en) Crowd counting method based on small sample learning
CN109583315B (en) Multichannel rapid human body posture recognition method for intelligent video monitoring
CN108596010B (en) Implementation method of pedestrian re-identification system
CN112861635B (en) Fire disaster and smoke real-time detection method based on deep learning
CN111738054B (en) Behavior anomaly detection method based on space-time self-encoder network and space-time CNN
TWI441096B (en) Motion detection method for comples scenes
CN110084201B (en) Human body action recognition method based on convolutional neural network of specific target tracking in monitoring scene
Ahn et al. Research of multi-object detection and tracking using machine learning based on knowledge for video surveillance system
CN110751195B (en) Fine-grained image classification method based on improved YOLOv3
CN110942471A (en) Long-term target tracking method based on space-time constraint
CN102469302A (en) Background model learning system for lighting change adaptation utilized for video surveillance
CN112149533A (en) Target detection method based on improved SSD model
CN104463869A (en) Video flame image composite recognition method
CN109409224B (en) Method for detecting flame in natural scene
CN111833353B (en) Hyperspectral target detection method based on image segmentation
CN113537226A (en) Smoke detection method based on deep learning
Zeng et al. Steel sheet defect detection based on deep learning method
CN112633179A (en) Farmer market aisle object occupying channel detection method based on video analysis
CN117197746A (en) Safety monitoring system and method based on deep learning
CN106530300A (en) Flame identification algorithm of low-rank analysis
EP3516592A1 (en) Method for object detection in digital image and video using spiking neural networks
Roh et al. Fire image classification based on convolutional neural network for smart fire detection
CN112241682B (en) End-to-end pedestrian searching method based on blocking and multi-layer information fusion
KR101748412B1 (en) Method and apparatus for detecting pedestrian using joint aggregated channel features

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant