CN114419558B - Fire video image identification method, fire video image identification system, computer equipment and storage medium - Google Patents

Fire video image identification method, fire video image identification system, computer equipment and storage medium

Info

Publication number
CN114419558B
CN114419558B (application CN202210327700.6A)
Authority
CN
China
Prior art keywords
layer
module
video image
block
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210327700.6A
Other languages
Chinese (zh)
Other versions
CN114419558A (en)
Inventor
Ke Feng (柯峰)
Fang Enquan (方恩权)
Yang Liping (杨利萍)
Zhuang Zesheng (庄泽升)
Peng Dongliang (彭东亮)
Ma Yue (马跃)
He Dongdong (何冬冬)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Guangzhou Metro Group Co Ltd
Shenzhen Launch Digital Technology Co Ltd
Original Assignee
South China University of Technology SCUT
Guangzhou Metro Group Co Ltd
Shenzhen Launch Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT, Guangzhou Metro Group Co Ltd, Shenzhen Launch Digital Technology Co Ltd filed Critical South China University of Technology SCUT
Priority to CN202210327700.6A priority Critical patent/CN114419558B/en
Priority to PCT/CN2022/084441 priority patent/WO2023184350A1/en
Publication of CN114419558A publication Critical patent/CN114419558A/en
Application granted granted Critical
Publication of CN114419558B publication Critical patent/CN114419558B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06F18/241 — Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches (pattern recognition)
    • G06F18/2415 — Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06N3/045 — Neural network architecture: combinations of networks
    • G06N3/047 — Neural network architecture: probabilistic or stochastic networks
    • G06N3/048 — Neural network architecture: activation functions
    • G06N3/08 — Neural networks: learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a fire video image identification method, a fire video image identification system, computer equipment and a storage medium, wherein the fire video image identification method comprises the following steps: acquiring a data set, wherein the data set is a video image data set of fire and non-fire; constructing a convolutional neural network; training the convolutional neural network by using a data set to obtain a fire video image recognition model; acquiring a video to be identified, and performing framing processing on the video to be identified to obtain a video image to be identified; and inputting the video image to be identified into a fire video image identification model to realize fire video image identification. The invention can reduce the number of parameters of the network model, improve the detection efficiency and accuracy of the network model, and realize the rapid identification of the fire video image, thereby discovering the fire hazard in time and ensuring the personal and property safety.

Description

Fire video image identification method, fire video image identification system, computer equipment and storage medium
Technical Field
The invention relates to a fire video image identification method, a fire video image identification system, computer equipment and a storage medium, and belongs to the field of computer vision.
Background
With the improvement of China's economic and technological level, the population continues to grow and buildings are becoming ever more numerous and dense. The continued use of electricity and fuel increases the risk of fire, and the damage caused by fire grows accordingly. Since a fire causes economic losses to society and endangers public safety, fire detection technology must be specially studied so that a fire can be identified at its initial ignition, minimizing the losses it causes and protecting people's safety.
Traditional fire detection technologies mainly include smoke detection, temperature detection, light detection and gas detection; they identify the occurrence of a fire from its physical characteristics, such as the concentration of the smoke generated, the ambient temperature, the illumination intensity of the flame, and the concentrations of the O2 consumed by combustion and of generated gases such as CO and CO2. These traditional techniques have certain limitations. First, they are restricted to enclosed environments: in a large space where the physical characteristics change only slightly, the detection efficiency of the sensor decreases, and the time for gas, particles and other physical carriers to reach the sensor grows with distance, lengthening the detection time and preventing a timely alarm. Second, they are easily affected by the environment: changes in factors such as rain, snow and wind speed alter the physical characteristics of a fire scene and thereby degrade the detection accuracy of the sensor. Third, the cost is high: sensors are expensive and prone to corrosion, aging and even damage.
With the development of the information age, fire detection technology has begun to move toward intelligence, using image processing, artificial intelligence and related techniques to detect and identify extracted flame features. Meanwhile, video surveillance has developed continuously, and most areas now enjoy full monitoring coverage. Since images can intuitively reveal the fire source, fire behavior and other conditions, video-based fire detection technology is receiving increasing attention. However, existing artificial-intelligence-based fire detection models are complex, have excessive parameter counts and detect inefficiently, which hinders rapid fire detection. Finding a fire recognition model that is structurally simple, has few parameters and detects efficiently is therefore a problem of great interest to researchers.
Disclosure of Invention
In view of the above, the invention provides a fire video image recognition method, a fire video image recognition system, a computer device and a storage medium, which construct a fire video image recognition model from a new module combining multi-scale feature information, a network residual structure and depthwise separable convolution operations.
The invention aims to provide a fire video image identification method.
The second purpose of the invention is to provide a fire video image recognition system.
It is a third object of the invention to provide a computer device.
It is a fourth object of the present invention to provide a storage medium.
The first purpose of the invention can be achieved by adopting the following technical scheme:
a fire video image identification method, the method comprising:
acquiring a data set, wherein the data set is a video image data set of fire and non-fire;
constructing a convolutional neural network, wherein the convolutional neural network comprises an input layer, one module A, three modules B, two modules C, two 1×1 convolution blocks A, four maximum pooling layers, an adaptive average pooling layer, a flatten layer, a dropout layer, a fully connected layer and a softmax classification layer;
training the convolutional neural network by using a data set to obtain a fire video image recognition model;
acquiring a video to be identified, and performing framing processing on the video to be identified to obtain a video image to be identified;
and inputting the video image to be identified into a fire video image identification model to realize fire video image identification.
Further, the three modules B are respectively a first module B, a second module B and a third module B; the two modules C are respectively a first module C and a second module C; the two 1×1 convolution blocks A are respectively a first 1×1 convolution block A and a second 1×1 convolution block A; and the four maximum pooling layers are respectively a first pooling layer, a second pooling layer, a third pooling layer and a fourth pooling layer;
the construction of the convolutional neural network specifically comprises the following steps:
sequentially connecting the input layer, module A, the first maximum pooling layer, the first module B, the first 1×1 convolution block A, the second maximum pooling layer, the first module C, the third maximum pooling layer, the second 1×1 convolution block A, the second module B, the fourth maximum pooling layer, the second module C, the third module B, the adaptive average pooling layer, the dropout layer, the flatten layer, the fully connected layer and the softmax classification layer, thereby constructing the convolutional neural network.
Further, the module A comprises an input layer, a first feature extraction layer and an output layer; the module B comprises an input layer, a second feature extraction layer and an output layer; the module C comprises an input layer, a third feature extraction layer and an output layer.
Further, the first feature extraction layer comprises a first input channel, a first output channel, a second output channel and a third output channel;
the first input channel is formed by sequentially connecting a first 3 × 3 volume block A, a second 3 × 3 volume block A and a third 3 × 3 volume block A;
the first output channel outputs a characteristic information matrix of a first 3 x 3 convolution block A;
the second output channel outputs a characteristic information matrix of a second 3 x 3 convolution block a;
the third output channel outputs a feature information matrix of a third 3 x 3 convolution block a.
Further, the second feature extraction layer comprises a second input channel, a third input channel, a fourth input channel, a fourth output channel, a fifth output channel, a sixth output channel, a seventh output channel and an eighth output channel;
the second input channel is a third 1×1 convolution block A;
the third input channel specifically comprises: a first 3×3 convolution block B and a second 3×3 convolution block B connected in sequence; the feature information matrices output by the first 3×3 convolution block B and the second 3×3 convolution block B are added, and the sum is connected to a third 3×3 convolution block B;
the fourth input channel is formed by sequentially connecting a fifth maximum pooling layer with a fourth 1×1 convolution block A;
the fourth output channel outputs the feature information matrix of the third 1×1 convolution block A;
the fifth output channel outputs the feature information matrix of the first 3×3 convolution block B;
the sixth output channel outputs the feature information matrix of the second 3×3 convolution block B;
the seventh output channel outputs the feature information matrix of the third 3×3 convolution block B;
the eighth output channel outputs the feature information matrix of the fourth 1×1 convolution block A.
Further, the third feature extraction layer includes a first input/output channel, a second input/output channel, a third input/output channel, a fourth input/output channel, and a fifth input/output channel;
the first input/output channel is a fifth 1×1 convolution block A;
the second input/output channel is formed by sequentially connecting a fourth 3×3 convolution block B and a sixth 1×1 convolution block A;
the third input/output channel is formed by sequentially connecting a fifth 3×3 convolution block B, a sixth 3×3 convolution block B and a seventh 1×1 convolution block A;
the fourth input/output channel is formed by sequentially connecting a seventh 3×3 convolution block B, an eighth 3×3 convolution block B, a ninth 3×3 convolution block B and an eighth 1×1 convolution block A;
and the fifth input/output channel is formed by sequentially connecting a sixth maximum pooling layer with a ninth 1×1 convolution block A.
Further, the convolution block B comprises a convolution layer, a batch normalization layer and a second activation layer which are connected in sequence;
the activation function adopted by the second activation layer is ReLU6: ReLU6(x) = min(max(x, 0), 6);
the convolution layers in convolution block B employ a depthwise separable convolution operation.
Further, the input of the output layer is the depth-wise stitching (concatenation) of all feature information matrices output by the corresponding feature extraction layer.
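By way of illustration, depth stitching is concatenation along the channel (depth) dimension of the feature matrices. A minimal sketch in PyTorch (the framework and the channel counts are illustrative assumptions, not part of the claimed invention):

    import torch

    # Three feature information matrices with the same spatial size but
    # different channel counts (32, 64 and 96 -- illustrative values),
    # stitched in depth into a single 192-channel matrix.
    a, b, c = (torch.randn(1, n, 28, 28) for n in (32, 64, 96))
    stitched = torch.cat([a, b, c], dim=1)
    assert stitched.shape == (1, 192, 28, 28)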
The second purpose of the invention can be achieved by adopting the following technical scheme:
a fire video image recognition system, the system comprising:
the system comprises a first acquisition unit, a second acquisition unit and a third acquisition unit, wherein the first acquisition unit is used for acquiring a data set, and the data set is a video image data set of fire and non-fire;
a construction unit for constructing a convolutional neural network, wherein the convolutional neural network comprises an input layer, one module A, three modules B, two modules C, two 1×1 convolution blocks A, four maximum pooling layers, an adaptive average pooling layer, a flatten layer, a dropout layer, a fully connected layer and a softmax classification layer;
the training unit is used for training the convolutional neural network by using the data set to obtain a fire video image recognition model;
the second acquisition unit is used for acquiring the video to be identified and performing framing processing on the video to be identified to obtain a video image to be identified;
and the identification unit is used for inputting the video image to be identified into the fire video image identification model to realize fire video image identification.
The third purpose of the invention can be achieved by adopting the following technical scheme:
a computer device comprises a processor and a memory for storing a program executable by the processor, wherein the processor executes the program stored in the memory to realize the fire video image identification method.
The fourth purpose of the invention can be achieved by adopting the following technical scheme:
a storage medium stores a program which, when executed by a processor, implements the fire video image recognition method described above.
Compared with the prior art, the invention has the following beneficial effects:
(1) The fire video image recognition model built by the invention not only reduces the parameter count of the network model but also improves its detection efficiency and accuracy, realizing rapid recognition of fire video images, so that fire hazards can be discovered in time and personal and property safety ensured.
(2) The invention performs framing processing on the collected video to obtain a video image data set and then preprocesses the video image data, effectively mitigating problems such as insufficient illumination and shadows arising during acquisition by the monitoring equipment.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention; for those skilled in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a flowchart of a fire video image recognition method according to embodiment 1 of the present invention.
Fig. 2 is a frame diagram of a fire video image recognition model according to embodiment 1 of the present invention.
Fig. 3 is a frame diagram of module a according to embodiment 1 of the present invention.
Fig. 4 is a frame diagram of module B of embodiment 1 of the present invention.
Fig. 5 is a frame diagram of module C according to embodiment 1 of the present invention.
FIG. 6 is a block diagram of convolution blocks A and B according to embodiment 1 of the present invention.
Fig. 7 is a bar chart showing the parameter number of each network model in embodiment 1 of the present invention.
Fig. 8 is a statistical graph of fire identification accuracy of each network model according to embodiment 1 of the present invention.
Fig. 9 is a flowchart of a fire video image recognition system according to embodiment 2 of the present invention.
Fig. 10 is a block diagram of a computer device according to embodiment 3 of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention; all other embodiments obtained by a person of ordinary skill in the art without creative effort based on these embodiments fall within the protection scope of the present invention.
Example 1:
as shown in fig. 1, the present embodiment provides a fire video image recognition method, which includes the following steps:
s101, acquiring a data set.
In this embodiment, videos of flames and non-flames are collected from the network, and the collected videos are then framed with the OpenCV library (in units of 12 frames), yielding labeled video image data sets of fire and non-fire.
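For concreteness, the framing step can be sketched with the OpenCV Python binding as follows. This is a minimal illustration assuming one frame is kept out of every 12 (one reading of "in units of 12 frames"); the file paths and naming scheme are hypothetical:

    import cv2
    import os

    def extract_frames(video_path, out_dir, step=12):
        # Save one frame out of every `step` frames of a video as a JPEG image.
        os.makedirs(out_dir, exist_ok=True)
        cap = cv2.VideoCapture(video_path)
        index, saved = 0, 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            if index % step == 0:
                cv2.imwrite(os.path.join(out_dir, "frame_%06d.jpg" % saved), frame)
                saved += 1
            index += 1
        cap.release()
        return saved

    # e.g. extract_frames("fire_clip.mp4", "dataset/fire")  # hypothetical paths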
Further, this embodiment divides the data set into a training set and a test set using a script, and performs data enhancement on the training set, including random rotation, mirroring and random cropping.
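A sketch of this training-set enhancement using torchvision transforms; the rotation range, flip probability and crop size are assumptions, as the embodiment does not specify them:

    from torchvision import transforms

    train_transform = transforms.Compose([
        transforms.RandomRotation(degrees=15),    # random rotation (range assumed)
        transforms.RandomHorizontalFlip(p=0.5),   # mirroring
        transforms.RandomResizedCrop(size=224),   # random cropping (size assumed)
        transforms.ToTensor(),
    ])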
And S102, constructing a convolutional neural network.
As shown in fig. 2, the convolutional neural network in this embodiment includes an input layer, one module A, three modules B, two modules C, two 1×1 convolution blocks A, four maximum pooling layers, an adaptive average pooling layer, a flatten layer, a dropout layer, a fully connected layer and a softmax classification layer. The three modules B are respectively a first module B, a second module B and a third module B; the two modules C are respectively a first module C and a second module C; the two 1×1 convolution blocks A are respectively a first 1×1 convolution block A and a second 1×1 convolution block A; and the four maximum pooling layers are respectively a first pooling layer, a second pooling layer, a third pooling layer and a fourth pooling layer.
In this embodiment, the input layer, module A, the first maximum pooling layer, the first module B, the first 1×1 convolution block A, the second maximum pooling layer, the first module C, the third maximum pooling layer, the second 1×1 convolution block A, the second module B, the fourth maximum pooling layer, the second module C, the third module B, the adaptive average pooling layer, the dropout layer, the flatten layer, the fully connected layer and the softmax classification layer are connected in sequence, thereby constructing the convolutional neural network.
In this embodiment, the convolution layers used in the first 1×1 convolution block A and the second 1×1 convolution block A have a stride of 1 and padding of 0.
Further, as shown in fig. 3, module A in this embodiment includes an input layer, a first feature extraction layer and an output layer, wherein the first feature extraction layer includes a first input channel, a first output channel, a second output channel and a third output channel. Specifically, the first input channel is formed by sequentially connecting a first 3×3 convolution block A, a second 3×3 convolution block A and a third 3×3 convolution block A; the first output channel outputs the feature information matrix of the first 3×3 convolution block A; the second output channel outputs the feature information matrix of the second 3×3 convolution block A; and the third output channel outputs the feature information matrix of the third 3×3 convolution block A.
In module A, the convolution layer of the first 3×3 convolution block A has a stride of 2 and padding of 1; the convolution layers of the second 3×3 convolution block A and the third 3×3 convolution block A have a stride of 1 and padding of 1.
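By way of illustration, module A can be sketched in PyTorch as below, with convolution block A implemented as convolution, batch normalization and ReLU6 as described later in this embodiment; the channel widths c1, c2 and c3 are placeholders, not values from Table 1:

    import torch
    import torch.nn as nn

    def conv_block_a(cin, cout, k, s=1, p=0):
        # Convolution block A: standard convolution -> batch normalization -> ReLU6
        return nn.Sequential(nn.Conv2d(cin, cout, k, s, p, bias=False),
                             nn.BatchNorm2d(cout), nn.ReLU6(inplace=True))

    class ModuleA(nn.Module):
        # Three chained 3x3 convolution blocks A; the outputs of all three
        # blocks are stitched (concatenated) along the depth dimension.
        def __init__(self, cin, c1, c2, c3):
            super().__init__()
            self.b1 = conv_block_a(cin, c1, 3, s=2, p=1)  # first 3x3 block A
            self.b2 = conv_block_a(c1, c2, 3, s=1, p=1)   # second 3x3 block A
            self.b3 = conv_block_a(c2, c3, 3, s=1, p=1)   # third 3x3 block A

        def forward(self, x):
            y1 = self.b1(x)
            y2 = self.b2(y1)
            y3 = self.b3(y2)
            return torch.cat([y1, y2, y3], dim=1)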
Further, as shown in fig. 4, module B in this embodiment includes an input layer, a second feature extraction layer and an output layer; the second feature extraction layer comprises a second input channel, a third input channel, a fourth input channel, a fourth output channel, a fifth output channel, a sixth output channel, a seventh output channel and an eighth output channel. Specifically, the second input channel is a third 1×1 convolution block A; the third input channel comprises a first 3×3 convolution block B and a second 3×3 convolution block B connected in sequence, with the feature information matrices output by the first and second 3×3 convolution blocks B added and the sum connected to a third 3×3 convolution block B; the fourth input channel is formed by sequentially connecting a fifth maximum pooling layer with a fourth 1×1 convolution block A; the fourth output channel outputs the feature information matrix of the third 1×1 convolution block A in the second input channel; the fifth output channel outputs the feature information matrix of the first 3×3 convolution block B; the sixth output channel outputs the feature information matrix of the second 3×3 convolution block B; the seventh output channel outputs the feature information matrix of the third 3×3 convolution block B; and the eighth output channel outputs the feature information matrix of the fourth 1×1 convolution block A in the fourth input channel.
In module B, the convolution layers used in the first 3×3 convolution block B, the second 3×3 convolution block B and the third 3×3 convolution block B all have a stride of 1 and padding of 1; the convolution layers used in the third 1×1 convolution block A and the fourth 1×1 convolution block A all have a stride of 1 and no padding.
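Module B can likewise be sketched as below. Here convolution block B is a channel-preserving depthwise 3×3 convolution followed by batch normalization and ReLU6, per the later description of the blocks; the 1×1 output widths ca and cb are placeholders:

    import torch
    import torch.nn as nn

    def conv_block_a(cin, cout, k, s=1, p=0):   # standard conv -> BN -> ReLU6
        return nn.Sequential(nn.Conv2d(cin, cout, k, s, p, bias=False),
                             nn.BatchNorm2d(cout), nn.ReLU6(inplace=True))

    def conv_block_b(c):                        # depthwise 3x3 conv -> BN -> ReLU6
        return nn.Sequential(nn.Conv2d(c, c, 3, 1, 1, groups=c, bias=False),
                             nn.BatchNorm2d(c), nn.ReLU6(inplace=True))

    class ModuleB(nn.Module):
        # A 1x1 branch, a residual chain of three depthwise 3x3 blocks, and a
        # max-pool + 1x1 branch; the five feature matrices are depth-stitched.
        def __init__(self, cin, ca, cb):
            super().__init__()
            self.p1 = conv_block_a(cin, ca, 1)   # third 1x1 convolution block A
            self.b1 = conv_block_b(cin)          # first 3x3 convolution block B
            self.b2 = conv_block_b(cin)          # second 3x3 convolution block B
            self.b3 = conv_block_b(cin)          # third 3x3 convolution block B
            self.pool = nn.MaxPool2d(3, 1, 1)    # fifth maximum pooling layer
            self.p2 = conv_block_a(cin, cb, 1)   # fourth 1x1 convolution block A

        def forward(self, x):
            out4 = self.p1(x)
            y1 = self.b1(x)
            y2 = self.b2(y1)
            y3 = self.b3(y1 + y2)   # the added outputs feed the third block
            out8 = self.p2(self.pool(x))
            return torch.cat([out4, y1, y2, y3, out8], dim=1)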
Further, as shown in fig. 5, module C in this embodiment includes an input layer, a third feature extraction layer and an output layer; the third feature extraction layer comprises a first input/output channel, a second input/output channel, a third input/output channel, a fourth input/output channel and a fifth input/output channel. Specifically, the first input/output channel is a fifth 1×1 convolution block A; the second input/output channel is formed by sequentially connecting a fourth 3×3 convolution block B and a sixth 1×1 convolution block A; the third input/output channel is formed by sequentially connecting a fifth 3×3 convolution block B, a sixth 3×3 convolution block B and a seventh 1×1 convolution block A; the fourth input/output channel is formed by sequentially connecting a seventh 3×3 convolution block B, an eighth 3×3 convolution block B, a ninth 3×3 convolution block B and an eighth 1×1 convolution block A; and the fifth input/output channel is formed by sequentially connecting a sixth maximum pooling layer with a ninth 1×1 convolution block A.
In module C, the convolution layers used in the fourth 3×3 convolution block B, the fifth 3×3 convolution block B, the sixth 3×3 convolution block B, the seventh 3×3 convolution block B, the eighth 3×3 convolution block B and the ninth 3×3 convolution block B all have a stride of 1 and padding of 1; the convolution layers used in the fifth 1×1 convolution block A, the sixth 1×1 convolution block A, the seventh 1×1 convolution block A, the eighth 1×1 convolution block A and the ninth 1×1 convolution block A all have a stride of 1 and no padding.
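Module C follows an Inception-style multi-branch layout and can be sketched as below, reusing the conv_block_a and conv_block_b helpers from the sketches above; the output widths c1 through c5 are again placeholders:

    import torch
    import torch.nn as nn

    # conv_block_a / conv_block_b are the helpers sketched for modules A and B.

    class ModuleC(nn.Module):
        # Five parallel input/output channels whose outputs are depth-stitched.
        def __init__(self, cin, c1, c2, c3, c4, c5):
            super().__init__()
            self.br1 = conv_block_a(cin, c1, 1)                  # fifth 1x1 A
            self.br2 = nn.Sequential(conv_block_b(cin),
                                     conv_block_a(cin, c2, 1))   # sixth 1x1 A
            self.br3 = nn.Sequential(conv_block_b(cin), conv_block_b(cin),
                                     conv_block_a(cin, c3, 1))   # seventh 1x1 A
            self.br4 = nn.Sequential(conv_block_b(cin), conv_block_b(cin),
                                     conv_block_b(cin),
                                     conv_block_a(cin, c4, 1))   # eighth 1x1 A
            self.br5 = nn.Sequential(nn.MaxPool2d(3, 1, 1),      # sixth max pool
                                     conv_block_a(cin, c5, 1))   # ninth 1x1 A

        def forward(self, x):
            branches = (self.br1, self.br2, self.br3, self.br4, self.br5)
            return torch.cat([br(x) for br in branches], dim=1)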
In this embodiment, the first activation layer is the activation layer in module B, and the second activation layer is the activation layer in convolution block A and convolution block B.
The input layers in this embodiment all receive the output of the preceding layer. The input of each output layer is the depth-wise stitching of all feature information matrices output by the corresponding feature extraction layer, specifically: in module A, the input of the output layer is the depth-wise stitching of the feature information matrices output by the three output channels; in module B, it is the depth-wise stitching of the feature information matrices output by the five output channels; module C is analogous, and the description is not repeated.
Further, as shown in fig. 6, convolution block A and convolution block B each comprise a convolution layer, a batch normalization (BN) layer and a second activation layer connected in sequence. The convolution layer in convolution block A uses a standard convolution operation, the convolution layer in convolution block B uses a depthwise separable convolution operation, and the activation function adopted by the second activation layer is in both cases ReLU6: ReLU6(x) = min(max(x, 0), 6).
The depthwise separable convolution in this embodiment is specifically as follows: each convolution kernel has a single channel, and the number of channels of the input feature matrix = the number of convolution kernels = the number of channels of the output feature matrix.
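The channel-preserving property of this depthwise convolution can be checked directly; in PyTorch such a convolution is expressed by setting groups equal to the channel count (sizes below are illustrative):

    import torch
    import torch.nn as nn

    c = 32
    # Depthwise 3x3 convolution: each kernel has a single channel, and the
    # number of kernels equals the number of input channels.
    depthwise = nn.Conv2d(c, c, kernel_size=3, stride=1, padding=1,
                          groups=c, bias=False)
    x = torch.randn(1, c, 56, 56)
    assert depthwise(x).shape == x.shape                     # channels preserved
    print(sum(p.numel() for p in depthwise.parameters()))    # 32*1*3*3 = 288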
In this embodiment, the first, second, third, fourth, fifth and sixth maximum pooling layers are all of size 3×3 with a stride of 1 and padding of 1; the dropout layer randomly deactivates 40% of the neurons; and the fully connected layer has 2 neurons.
The specific parameter conditions of the convolutional neural network in this embodiment are shown in table 1:
table 1 shows the specific parameter conditions of the convolutional neural network
(Table 1 appears as an image in the original publication; the per-layer parameter values are not reproduced here.)
Wherein: 3×3-1A, 3×3-2A and 3×3-3A denote the first, second and third 3×3 convolution blocks A in module A, respectively; 1×1-1B and 1×1-2B denote, respectively, the third 1×1 convolution block A in the fourth output channel and the fourth 1×1 convolution block A in the eighth output channel of module B; 1×1-1C, 1×1-2C, 1×1-3C, 1×1-4C and 1×1-5C denote, respectively, the fifth 1×1 convolution block A in the first input/output channel, the sixth 1×1 convolution block A in the second input/output channel, the seventh 1×1 convolution block A in the third input/output channel, the eighth 1×1 convolution block A in the fourth input/output channel and the ninth 1×1 convolution block A in the fifth input/output channel of module C; 1×1 denotes a 1×1 convolution block.
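Putting the pieces together, the overall layer order of fig. 2 can be sketched as below, reusing the ModuleA, ModuleB, ModuleC and conv_block_a sketches above. All channel widths are placeholders chosen only for internal consistency; the actual per-layer parameters are those of Table 1, which is available only as an image:

    import torch.nn as nn

    class FireRecognitionNet(nn.Module):
        def __init__(self, num_classes=2):
            super().__init__()
            self.features = nn.Sequential(
                ModuleA(3, 32, 32, 32),            # module A -> 96 channels
                nn.MaxPool2d(3, 1, 1),             # first maximum pooling layer
                ModuleB(96, 32, 32),               # first module B -> 352
                conv_block_a(352, 128, 1),         # first 1x1 convolution block A
                nn.MaxPool2d(3, 1, 1),             # second maximum pooling layer
                ModuleC(128, 32, 32, 32, 32, 32),  # first module C -> 160
                nn.MaxPool2d(3, 1, 1),             # third maximum pooling layer
                conv_block_a(160, 128, 1),         # second 1x1 convolution block A
                ModuleB(128, 32, 32),              # second module B -> 448
                nn.MaxPool2d(3, 1, 1),             # fourth maximum pooling layer
                ModuleC(448, 32, 32, 32, 32, 32),  # second module C -> 160
                ModuleB(160, 32, 32),              # third module B -> 544
            )
            self.head = nn.Sequential(
                nn.AdaptiveAvgPool2d(1),           # adaptive average pooling layer
                nn.Dropout(p=0.4),                 # dropout: 40% deactivation
                nn.Flatten(),                      # flatten layer
                nn.Linear(544, num_classes),       # fully connected layer (2 neurons)
                nn.Softmax(dim=1),                 # softmax classification layer
            )

        def forward(self, x):
            return self.head(self.features(x))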
S103, training the convolutional neural network by using the data set to obtain a fire video image recognition model.
The training set obtained in step S101 is input into the fire video image recognition model for training, and the network parameters are adjusted to obtain a pre-trained model (the trained fire video image recognition model); the test set obtained in step S101 is then input into the pre-trained model to obtain the recognition accuracy.
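A minimal training and evaluation sketch for this step, assuming the FireRecognitionNet sketch above. The optimizer, learning rate, batch size and the dummy DataLoaders are assumptions (the embodiment fixes only 300 epochs); because the sketched network ends in a softmax layer, NLLLoss is applied to its logarithm, which is equivalent to cross-entropy on logits:

    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader, TensorDataset

    model = FireRecognitionNet(num_classes=2)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    criterion = nn.NLLLoss()

    # Dummy stand-ins for the real fire / non-fire loaders (illustrative only).
    train_loader = DataLoader(TensorDataset(torch.randn(8, 3, 224, 224),
                                            torch.randint(0, 2, (8,))),
                              batch_size=4)
    test_loader = train_loader

    for epoch in range(300):                 # 300 epochs as in the experiments
        model.train()
        for images, labels in train_loader:
            optimizer.zero_grad()
            probs = model(images)            # softmax outputs
            loss = criterion(torch.log(probs + 1e-8), labels)
            loss.backward()
            optimizer.step()

    model.eval()
    correct = total = 0
    with torch.no_grad():
        for images, labels in test_loader:
            pred = model(images).argmax(dim=1)
            correct += (pred == labels).sum().item()
            total += labels.numel()
    print("test accuracy: %.4f" % (correct / total))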
A performance test of the fire video image recognition model gave the following results:
as shown in fig. 7, the fire video image recognition model parameters are much smaller than those of other classical convolutional neural network models, and the model parameters are 1.02% of those of VGG19, 23.80% of GoogleNet, and 6.68% of those of resnet 34.
As shown in fig. 8, the performance of the fire video image recognition model on the test set is far better than that of other classical convolutional neural network models. Specifically, under the same 300 training epochs, the highest fire identification accuracy of the model is 97.06%, which is 2.31% higher than that of the classical GoogLeNet and 0.85% higher than that of ResNet34.
S104, acquiring the video to be identified, and performing framing processing on the video to be identified to obtain the video image to be identified.
S105, inputting the video image to be identified into the fire video image identification model to realize fire video image identification.
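Steps S104 and S105 can be sketched as below, again assuming the model sketched above. The video path, input size, preprocessing and the convention that class index 1 means "fire" are all assumptions; the preprocessing must match whatever was used in training:

    import cv2
    import torch

    model.eval()                                   # trained model from above
    cap = cv2.VideoCapture("surveillance.mp4")     # hypothetical video path
    index = 0
    with torch.no_grad():
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            if index % 12 == 0:                    # framing: one frame per 12
                rgb = cv2.cvtColor(cv2.resize(frame, (224, 224)),
                                   cv2.COLOR_BGR2RGB)
                x = (torch.from_numpy(rgb).permute(2, 0, 1)
                          .float().div(255).unsqueeze(0))
                prob_fire = model(x)[0, 1].item()  # assumes class 1 = "fire"
                if prob_fire > 0.5:
                    print("frame %d: fire detected (p=%.2f)" % (index, prob_fire))
            index += 1
    cap.release()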
Those skilled in the art will appreciate that all or part of the steps of the methods in the above embodiments may be implemented by a program instructing the associated hardware, and the corresponding program may be stored in a computer-readable storage medium.
It should be noted that although the method operations of the above embodiments are depicted in the drawings in a particular order, this does not require or imply that these operations must be performed in that order, or that all of the illustrated operations must be performed, to achieve desirable results. Rather, the depicted steps may be executed in a different order; additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one, and/or one step may be broken down into multiple steps.
Example 2:
as shown in fig. 9, the present embodiment provides a fire video image recognition system, which includes a first acquiring unit 901, a constructing unit 902, a training unit 903, a second acquiring unit 904, and a recognition unit 905, and the specific functions of each unit are as follows:
a first acquiring unit 901, configured to acquire a data set, where the data set is a video image data set of a fire and a non-fire;
a constructing unit 902, configured to construct a convolutional neural network, wherein the convolutional neural network includes an input layer, one module A, three modules B, two modules C, two 1×1 convolution blocks A, four maximum pooling layers, an adaptive average pooling layer, a flatten layer, a dropout layer, a fully connected layer and a softmax classification layer;
a training unit 903, configured to train the convolutional neural network by using a data set to obtain a fire video image recognition model;
a second obtaining unit 904, configured to obtain a video to be identified, and perform framing processing on the video to be identified to obtain a video image to be identified;
and the identification unit 905 is used for inputting the video image to be identified into the fire video image identification model to realize fire video image identification.
For the specific implementation of each unit in this embodiment, refer to embodiment 1; it is not repeated here. It should be noted that the system provided in this embodiment is illustrated only by the above division of functional units; in practical applications, the functions may be allocated to different functional units as needed, that is, the internal structure may be divided into different functional units to complete all or part of the functions described above.
Example 3:
as shown in fig. 10, the present embodiment provides a computer apparatus including a processor 1002, a memory, an input device 1003, a display device 1004, and a network interface 1005 connected by a system bus 1001. The processor 1002 is configured to provide computing and control capabilities, the memory includes a nonvolatile storage medium 1006 and an internal memory 1007, the nonvolatile storage medium 1006 stores an operating system, a computer program, and a database, the internal memory 1007 provides an environment for the operating system and the computer program in the nonvolatile storage medium 1006 to run, and when the computer program is executed by the processor 1002, the fire video image recognition method of embodiment 1 is implemented as follows:
acquiring a data set, wherein the data set is a video image data set of fire and non-fire;
constructing a convolutional neural network, wherein the convolutional neural network comprises an input layer, one module A, three modules B, two modules C, two 1×1 convolution blocks A, four maximum pooling layers, an adaptive average pooling layer, a flatten layer, a dropout layer, a fully connected layer and a softmax classification layer;
training the convolutional neural network by using a data set to obtain a fire video image recognition model;
acquiring a video to be identified, and performing framing processing on the video to be identified to obtain a video image to be identified;
and inputting the video image to be identified into a fire video image identification model to realize fire video image identification.
Example 4:
the present embodiment provides a storage medium, which is a computer-readable storage medium, and stores a computer program, and when the computer program is executed by a processor, the fire video image recognition method of embodiment 1 is implemented as follows:
acquiring a data set, wherein the data set is a video image data set of fire and non-fire;
constructing a convolutional neural network, wherein the convolutional neural network comprises an input layer, one module A, three modules B, two modules C, two 1×1 convolution blocks A, four maximum pooling layers, an adaptive average pooling layer, a flatten layer, a dropout layer, a fully connected layer and a softmax classification layer;
training the convolutional neural network by using a data set to obtain a fire video image recognition model;
acquiring a video to be identified, and performing framing processing on the video to be identified to obtain a video image to be identified;
and inputting the video image to be identified into a fire video image identification model to realize fire video image identification.
It should be noted that the computer readable storage medium of the present embodiment may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In this embodiment, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus or device. A computer readable signal medium, by contrast, may include a propagated data signal with a computer readable program embodied therein, for example in baseband or as part of a carrier wave; such a propagated data signal may take any of a variety of forms, including but not limited to electromagnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate or transport a program for use by or in connection with an instruction execution system, apparatus or device. The computer program embodied on the computer readable storage medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer program for performing the present embodiments may be written in one or more programming languages or combinations thereof, including object-oriented programming languages such as Java, Python and C++, and conventional procedural programming languages such as C. The program may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the latter case, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
In summary, the invention frames the acquired video to obtain a video image data set and then preprocesses the video image data, effectively mitigating problems such as insufficient illumination and shadows during acquisition by the monitoring equipment. In addition, the constructed fire video image recognition model not only reduces the parameter count of the network model but also improves its detection efficiency and accuracy, realizing rapid recognition of fire video images so that fire hazards can be discovered in time and personal and property safety ensured.
The above description is only for the preferred embodiments of the present invention, but the protection scope of the present invention is not limited thereto, and any person skilled in the art can substitute or change the technical solution and the inventive concept of the present invention within the scope of the present invention.

Claims (8)

1. A fire video image recognition method, the method comprising:
acquiring a data set, wherein the data set is a video image data set of fire and non-fire;
constructing a convolutional neural network, wherein the convolutional neural network comprises an input layer, one module A, three modules B, two modules C, two 1×1 convolution blocks A, four maximum pooling layers, an adaptive average pooling layer, a flatten layer, a dropout layer, a fully connected layer and a softmax classification layer;
training the convolutional neural network by using a data set to obtain a fire video image recognition model;
acquiring a video to be identified, and performing framing processing on the video to be identified to obtain a video image to be identified;
inputting a video image to be identified into a fire video image identification model to realize fire video image identification;
the module A comprises an input layer, a first feature extraction layer and an output layer; the module B comprises an input layer, a second feature extraction layer and an output layer; the module C comprises an input layer, a third feature extraction layer and an output layer;
the first feature extraction layer comprises a first input channel, a first output channel, a second output channel and a third output channel;
the first input channel is formed by sequentially connecting a first 3×3 convolution block A, a second 3×3 convolution block A and a third 3×3 convolution block A;
the first output channel outputs the feature information matrix of the first 3×3 convolution block A;
the second output channel outputs the feature information matrix of the second 3×3 convolution block A;
the third output channel outputs the feature information matrix of the third 3×3 convolution block A.
2. The fire video image recognition method according to claim 1, wherein the three modules B are respectively a first module B, a second module B and a third module B; the two modules C are respectively a first module C and a second module C; the two 1×1 convolution blocks A are respectively a first 1×1 convolution block A and a second 1×1 convolution block A; and the four maximum pooling layers are respectively a first pooling layer, a second pooling layer, a third pooling layer and a fourth pooling layer;
the construction of the convolutional neural network is specifically as follows:
the input layer, module A, the first maximum pooling layer, the first module B, the first 1×1 convolution block A, the second maximum pooling layer, the first module C, the third maximum pooling layer, the second 1×1 convolution block A, the second module B, the fourth maximum pooling layer, the second module C, the third module B, the adaptive average pooling layer, the dropout layer, the flatten layer, the fully connected layer and the softmax classification layer are connected in sequence, thereby constructing the convolutional neural network.
3. The fire video image recognition method according to claim 1, wherein the second feature extraction layer comprises a second input channel, a third input channel, a fourth input channel, a fourth output channel, a fifth output channel, a sixth output channel, a seventh output channel and an eighth output channel;
the second input channel is a third 1×1 convolution block A;
the third input channel specifically comprises: a first 3×3 convolution block B and a second 3×3 convolution block B connected in sequence; the feature information matrices output by the first 3×3 convolution block B and the second 3×3 convolution block B are added, and the sum is connected to a third 3×3 convolution block B;
the fourth input channel is formed by sequentially connecting a fifth maximum pooling layer with a fourth 1×1 convolution block A;
the fourth output channel outputs the feature information matrix of the third 1×1 convolution block A;
the fifth output channel outputs the feature information matrix of the first 3×3 convolution block B;
the sixth output channel outputs the feature information matrix of the second 3×3 convolution block B;
the seventh output channel outputs the feature information matrix of the third 3×3 convolution block B;
the eighth output channel outputs the feature information matrix of the fourth 1×1 convolution block A.
4. The fire video image recognition method according to claim 1, wherein the third feature extraction layer comprises a first input/output channel, a second input/output channel, a third input/output channel, a fourth input/output channel, and a fifth input/output channel;
the first input/output channel is a fifth 1×1 convolution block A;
the second input/output channel is formed by sequentially connecting a fourth 3×3 convolution block B and a sixth 1×1 convolution block A;
the third input/output channel is formed by sequentially connecting a fifth 3×3 convolution block B, a sixth 3×3 convolution block B and a seventh 1×1 convolution block A;
the fourth input/output channel is formed by sequentially connecting a seventh 3×3 convolution block B, an eighth 3×3 convolution block B, a ninth 3×3 convolution block B and an eighth 1×1 convolution block A;
and the fifth input/output channel is formed by sequentially connecting a sixth maximum pooling layer with a ninth 1×1 convolution block A.
5. The fire video image recognition method according to any one of claims 3 to 4, wherein the convolution block B comprises a convolution layer, a batch normalization layer and a second activation layer connected in sequence;
the activation function adopted by the second activation layer is ReLU6: ReLU6(x) = min(max(x, 0), 6);
the convolutional layers in the convolutional block B employ a depth separable convolution operation.
6. The fire video image recognition method according to claim 1, wherein the input of the output layer is the depth-wise stitching of all feature information matrices output by the corresponding feature extraction layer.
7. A fire video image recognition system, the system comprising:
the system comprises a first acquisition unit, a second acquisition unit and a third acquisition unit, wherein the first acquisition unit is used for acquiring a data set, and the data set is a video image data set of fire and non-fire;
a construction unit for constructing a convolutional neural network, wherein the convolutional neural network comprises an input layer, one module A, three modules B, two modules C, two 1×1 convolution blocks A, four maximum pooling layers, an adaptive average pooling layer, a flatten layer, a dropout layer, a fully connected layer and a softmax classification layer;
the training unit is used for training the convolutional neural network by using a data set to obtain a fire video image recognition model;
the second acquisition unit is used for acquiring the video to be identified and performing framing processing on the video to be identified to obtain a video image to be identified;
the identification unit is used for inputting the video image to be identified into a fire video image identification model to realize fire video image identification;
the module A comprises an input layer, a first feature extraction layer and an output layer; the module B comprises an input layer, a second feature extraction layer and an output layer; the module C comprises an input layer, a third feature extraction layer and an output layer;
the first feature extraction layer comprises a first input channel, a first output channel, a second output channel and a third output channel;
the first input channel is formed by sequentially connecting a first 3×3 convolution block A, a second 3×3 convolution block A and a third 3×3 convolution block A;
the first output channel outputs the feature information matrix of the first 3×3 convolution block A;
the second output channel outputs the feature information matrix of the second 3×3 convolution block A;
the third output channel outputs the feature information matrix of the third 3×3 convolution block A.
8. A computer device comprising a processor and a memory for storing a program executable by the processor, wherein the processor, when executing the program stored in the memory, implements the fire video image recognition method according to any one of claims 1 to 6.
CN202210327700.6A 2022-03-31 2022-03-31 Fire video image identification method, fire video image identification system, computer equipment and storage medium Active CN114419558B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210327700.6A CN114419558B (en) 2022-03-31 2022-03-31 Fire video image identification method, fire video image identification system, computer equipment and storage medium
PCT/CN2022/084441 WO2023184350A1 (en) 2022-03-31 2022-03-31 Fire video image recognition method and system, computer device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210327700.6A CN114419558B (en) 2022-03-31 2022-03-31 Fire video image identification method, fire video image identification system, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN114419558A CN114419558A (en) 2022-04-29
CN114419558B (en) 2022-07-05

Family

ID: 81264231

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210327700.6A Active CN114419558B (en) 2022-03-31 2022-03-31 Fire video image identification method, fire video image identification system, computer equipment and storage medium

Country Status (2)

Country Link
CN (1) CN114419558B (en)
WO (1) WO2023184350A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117593610B (en) * 2024-01-17 2024-04-26 上海秋葵扩视仪器有限公司 Image recognition network training and deployment and recognition methods, devices, equipment and media

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107292298A (en) * 2017-08-09 2017-10-24 北方民族大学 Ox face recognition method based on convolutional neural networks and sorter model
CN109063728A (en) * 2018-06-20 2018-12-21 燕山大学 A kind of fire image deep learning mode identification method
CN109522819A (en) * 2018-10-29 2019-03-26 西安交通大学 A kind of fire image recognition methods based on deep learning
CN110059582A (en) * 2019-03-28 2019-07-26 东南大学 Driving behavior recognition methods based on multiple dimensioned attention convolutional neural networks
CN111507962A (en) * 2020-04-17 2020-08-07 无锡雪浪数制科技有限公司 Cotton sundry identification system based on depth vision
CN111553298A (en) * 2020-05-07 2020-08-18 北京天仪百康科贸有限公司 Fire disaster identification method and system based on block chain
CN112231974A (en) * 2020-09-30 2021-01-15 山东大学 TBM rock breaking seismic source seismic wave field characteristic recovery method and system based on deep learning

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6968681B2 (en) * 2016-12-21 2021-11-17 ホーチキ株式会社 Fire monitoring system
US11182611B2 (en) * 2019-10-11 2021-11-23 International Business Machines Corporation Fire detection via remote sensing and mobile sensors
CN111639571B (en) * 2020-05-20 2023-05-23 浙江工商大学 Video action recognition method based on contour convolution neural network
CN112419650A (en) * 2020-11-11 2021-02-26 国网福建省电力有限公司电力科学研究院 Fire detection method and system based on neural network and image recognition technology
CN113591591A (en) * 2021-07-05 2021-11-02 北京瑞博众成科技有限公司 Artificial intelligence field behavior recognition system

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107292298A (en) * 2017-08-09 2017-10-24 北方民族大学 Ox face recognition method based on convolutional neural networks and sorter model
CN109063728A (en) * 2018-06-20 2018-12-21 燕山大学 A kind of fire image deep learning mode identification method
CN109522819A (en) * 2018-10-29 2019-03-26 西安交通大学 A kind of fire image recognition methods based on deep learning
CN110059582A (en) * 2019-03-28 2019-07-26 东南大学 Driving behavior recognition methods based on multiple dimensioned attention convolutional neural networks
CN111507962A (en) * 2020-04-17 2020-08-07 无锡雪浪数制科技有限公司 Cotton sundry identification system based on depth vision
CN111553298A (en) * 2020-05-07 2020-08-18 北京天仪百康科贸有限公司 Fire disaster identification method and system based on block chain
CN112231974A (en) * 2020-09-30 2021-01-15 山东大学 TBM rock breaking seismic source seismic wave field characteristic recovery method and system based on deep learning

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
G Ciaburro. Sound event detection in underground parking garage using convolutional neural network. Big Data and Cognitive Computing, 2020, 4(3). *
Shi Haishan et al. Fire image recognition and application based on genetic neural networks. Computer Science, 2006, (11). *
Wu Zhe. Research on detection, recognition and tracking of waterway ships against dynamic backgrounds based on deep learning. China Master's Theses Full-text Database (Engineering Science and Technology II), 2020, (3). *
Wu Xue et al. Fire recognition with convolutional neural networks based on data augmentation. Science Technology and Engineering, 2020, (03). *

Also Published As

Publication number Publication date
WO2023184350A1 (en) 2023-10-05
CN114419558A (en) 2022-04-29

Similar Documents

Publication Publication Date Title
CN107862270B (en) Face classifier training method, face detection method and device and electronic equipment
CN109492612A (en) Fall detection method and its falling detection device based on skeleton point
WO2021104125A1 (en) Abnormal egg identification method, device and system, storage medium, and electronic device
CN110162462A (en) Test method, system and the computer equipment of face identification system based on scene
CN111931719B (en) High-altitude parabolic detection method and device
Liu et al. Visual smoke detection based on ensemble deep cnns
CN114419558B (en) Fire video image identification method, fire video image identification system, computer equipment and storage medium
CN116343301B (en) Personnel information intelligent verification system based on face recognition
CN111242868A (en) Image enhancement method based on convolutional neural network under dark vision environment
CN117040109A (en) Substation room risk early warning method based on heterogeneous data feature fusion
Zheng et al. A lightweight algorithm capable of accurately identifying forest fires from UAV remote sensing imagery
Chen et al. A novel smoke detection algorithm based on improved mixed Gaussian and YOLOv5 for textile workshop environments
CN114282258A (en) Screen capture data desensitization method and device, computer equipment and storage medium
CN114067268A (en) Method and device for detecting safety helmet and identifying identity of electric power operation site
CN112989932A (en) Improved prototype network-based less-sample forest fire smoke identification method and device
CN116543333A (en) Target recognition method, training method, device, equipment and medium of power system
CN115358952B (en) Image enhancement method, system, equipment and storage medium based on meta-learning
CN116563762A (en) Fire detection method, system, medium, equipment and terminal for oil and gas station
CN113111804B (en) Face detection method and device, electronic equipment and storage medium
CN115760616A (en) Human body point cloud repairing method and device, electronic equipment and storage medium
CN114299475A (en) Method for detecting corrosion of damper and related equipment
CN114881103A (en) Countermeasure sample detection method and device based on universal disturbance sticker
CN114882557A (en) Face recognition method and device
CN111401317A (en) Video classification method, device, equipment and storage medium
Ma et al. Smoke Detection Algorithm based on Negative Sample Mining.

Legal Events

Code — Description
PB01 — Publication
SE01 — Entry into force of request for substantive examination
GR01 — Patent grant