CN112308153B - Firework detection method and device - Google Patents


Info

Publication number
CN112308153B
CN112308153B (application CN202011207901.XA)
Authority
CN
China
Prior art keywords
layer
image
feature vector
smoke detection
smoke
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011207901.XA
Other languages
Chinese (zh)
Other versions
CN112308153A (en)
Inventor
黄泽 (Huang Ze)
张泽覃 (Zhang Zetan)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alnnovation Guangzhou Technology Co ltd
Original Assignee
Alnnovation Guangzhou Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alnnovation Guangzhou Technology Co ltd filed Critical Alnnovation Guangzhou Technology Co ltd
Priority to CN202011207901.XA priority Critical patent/CN112308153B/en
Publication of CN112308153A publication Critical patent/CN112308153A/en
Application granted granted Critical
Publication of CN112308153B publication Critical patent/CN112308153B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/24 - Classification techniques
    • G06F 18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 2201/00 - Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/07 - Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Fire-Detection Mechanisms (AREA)

Abstract

An embodiment of the application provides a smoke and fire detection method and device. The smoke and fire detection method includes: acquiring an image to be detected; and inputting the image to be detected into a pre-trained smoke and fire detection model to obtain a smoke and fire detection result for the image. The smoke and fire detection model comprises a feature extraction layer, an attention layer and a fully connected layer: the feature extraction layer extracts an original feature vector from the image to be detected, the attention layer calculates a global feature vector from the original feature vector and its attention scores, and the fully connected layer classifies the global feature vector to obtain the smoke and fire detection result. This technical scheme improves the accuracy of smoke and fire detection.

Description

Smoke and fire detection method and device
Technical Field
The application relates to the field of smoke and fire detection, in particular to a smoke and fire detection method and device.
Background
Currently, smoke and fire detection is typically performed with an object detection algorithm. Object detection algorithms fall mainly into Anchor-based and Anchor-free approaches. An algorithm that introduces Anchor hyper-parameters needs prior knowledge, such as the size of the target to be detected, to be acquired in advance in order to detect accurately; otherwise detection performance suffers.
However, in an open scene (for example, an outdoor scene), smoke and fire are not rigid targets and their size is not a fixed value. This interferes with the configuration of the Anchor hyper-parameters, hinders accurate regression, and prevents the detection speed from meeting real-time requirements.
Existing smoke and fire detection methods therefore suffer at least from low detection accuracy when facing open scenes.
Disclosure of Invention
The embodiment of the application aims to provide a smoke and fire detection method and device that solve the prior-art problem of low smoke and fire detection accuracy in open scenes.
In a first aspect, an embodiment of the application provides a smoke and fire detection method, including: acquiring an image to be detected; and inputting the image to be detected into a pre-trained smoke and fire detection model to obtain a smoke and fire detection result for the image. The smoke and fire detection model comprises a feature extraction layer, an attention layer and a fully connected layer: the feature extraction layer extracts an original feature vector from the image, the attention layer calculates a global feature vector from the original feature vector and its attention scores, and the fully connected layer classifies the global feature vector to obtain the smoke and fire detection result.
This scheme avoids the accuracy loss that object detection algorithms incur on irregular targets such as smoke, noticeably improves the detection accuracy of the model, and reduces the sensitivity of image classification to the training-data distribution. The spatial-domain attention mechanism refines the feature granularity and greatly reduces the false detection rate, effectively addressing the prior-art problem of low smoke and fire detection accuracy in open scenes.
In one possible embodiment, the image to be detected comprises an outdoor image.
In one possible embodiment, before inputting the image to be detected into the pre-trained smoke and fire detection model, the method further includes: calculating a target loss value of an initial smoke and fire detection model, where the target loss value is computed by a target loss function that includes the face-recognition loss function ArcFace loss; and adjusting the initial model with the target loss value to obtain the pre-trained model.
To strengthen the discriminability of the features, the embodiment introduces ArcFace loss from the face-recognition field into the design of the target loss function. This increases the similarity between samples of the same class and the mutual exclusivity between samples of different classes; with suitably chosen loss-function hyper-parameters it accelerates model convergence and improves model accuracy.
In one possible embodiment, the target loss function is:

$L_1 = L_2 + \alpha \cdot L_3 + \beta \cdot L_4$

where $L_1$ is the target loss value; $L_2$ is the first loss value, calculated by the cross-entropy loss function Softmax Loss; $\alpha$ is a first hyper-parameter; $L_3$ is the second loss value, calculated by the classification loss function Focal Loss; $\beta$ is a second hyper-parameter; and $L_4$ is the third loss value, calculated by ArcFace loss.
In one possible embodiment, ArcFace loss is:

$L_4 = -\log\dfrac{e^{s\cos(\theta_N + m)}}{e^{s\cos(\theta_N + m)} + \sum_{i=1,\,i\neq N}^{K} e^{s\cos\theta_i}}$

where $N$ denotes the index of the category corresponding to the smoke and fire detection result output by the initial smoke and fire detection model; $\theta_i$ is the angle between the global feature vector corresponding to the initial model and the $i$-th column vector of the parameter matrix of its fully connected layer; $s$ is a third hyper-parameter for adjusting ArcFace loss; $m$ is a fourth hyper-parameter for adjusting ArcFace loss; and $K$ is the number of columns of the parameter matrix.
In a second aspect, an embodiment of the application provides a smoke and fire detection device, comprising: an acquisition module for acquiring an image to be detected; and an input module for inputting the image to be detected into a pre-trained smoke and fire detection model to obtain a smoke and fire detection result. The smoke and fire detection model comprises a feature extraction layer, an attention layer and a fully connected layer: the feature extraction layer extracts an original feature vector from the image, the attention layer calculates a global feature vector from the original feature vector and its attention scores, and the fully connected layer classifies the global feature vector to obtain the smoke and fire detection result.
In one possible embodiment, the image to be detected comprises an outdoor image.
In one possible embodiment, the smoke and fire detection device further comprises: a calculation module for calculating a target loss value of an initial smoke and fire detection model, where the target loss value is computed by a target loss function that includes the face-recognition loss function ArcFace loss; and an adjusting module for adjusting the initial model with the target loss value to obtain the pre-trained model.
In one possible embodiment, the target loss function is:

$L_1 = L_2 + \alpha \cdot L_3 + \beta \cdot L_4$

where $L_1$ is the target loss value; $L_2$ is the first loss value, calculated by the cross-entropy loss function Softmax Loss; $\alpha$ is a first hyper-parameter; $L_3$ is the second loss value, calculated by the classification loss function Focal Loss; $\beta$ is a second hyper-parameter; and $L_4$ is the third loss value, calculated by ArcFace loss.
In one possible embodiment, ArcFace loss is:

$L_4 = -\log\dfrac{e^{s\cos(\theta_N + m)}}{e^{s\cos(\theta_N + m)} + \sum_{i=1,\,i\neq N}^{K} e^{s\cos\theta_i}}$

where $N$ denotes the index of the category corresponding to the smoke and fire detection result output by the initial smoke and fire detection model; $\theta_i$ is the angle between the global feature vector corresponding to the initial model and the $i$-th column vector of the parameter matrix of its fully connected layer; $s$ is a third hyper-parameter for adjusting ArcFace loss; $m$ is a fourth hyper-parameter for adjusting ArcFace loss; and $K$ is the number of columns of the parameter matrix.
In a third aspect, embodiments of the present application provide a storage medium having stored thereon a computer program which, when executed by a processor, performs the method of the first aspect or any alternative implementation of the first aspect.
In a fourth aspect, an embodiment of the present application provides an electronic device, including: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory in communication via the bus when the electronic device is running, the machine-readable instructions when executed by the processor performing the method of the first aspect or any alternative implementation of the first aspect.
In a fifth aspect, the application provides a computer program product which, when run on a computer, causes the computer to perform the method of the first aspect or any of the possible implementations of the first aspect.
In order to make the above objects, features and advantages of the embodiments of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
To illustrate the technical solutions of the embodiments more clearly, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings only illustrate some embodiments of the application and should not be considered limiting; a person skilled in the art can obtain other related drawings from them without inventive effort.
FIG. 1 illustrates a flow chart of a method for smoke and fire detection provided by an embodiment of the present application;
FIG. 2 illustrates a block diagram of a smoke detection model provided in an embodiment of the present application;
FIG. 3 is a block diagram of the layer structure of an attention layer according to an embodiment of the present application;
FIG. 4 shows a block diagram of a smoke and fire detection apparatus provided in an embodiment of the present application;
fig. 5 is a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings in the embodiments of the present application.
It should be noted that like reference numerals and letters denote like items in the figures; once an item is defined in one figure, it need not be defined or explained again in subsequent figures. In the description of the application, the terms "first", "second", and the like are used only to distinguish between descriptions and are not to be construed as indicating or implying relative importance.
At present, existing smoke and fire detection methods mainly use one of two algorithms. The first is an object detection algorithm: a complete detection network is trained on a large amount of labelled data and, at prediction time, localizes the occurrence of smoke and/or fire with a bounding box. Its accuracy depends mainly on the feature-expression capability of the backbone network and the selected feature size. The second is an image classification algorithm: a common convolutional neural network expresses the object features in the image as a vector, which a classifier then classifies to obtain the smoke and fire detection result. Its performance likewise depends on the backbone network and the image size; in addition, the distribution of the selected training data set strongly affects the performance of the classification model.
In general, object detection algorithms fall into Anchor-based and Anchor-free approaches. An algorithm that introduces Anchor hyper-parameters needs prior knowledge, such as the size of the target to be detected, to be acquired in advance for accurate detection; otherwise detection performance suffers. In smoke and fire detection, however, smoke and fire are not rigid targets and their size is not a fixed value, which interferes with the configuration of the Anchor hyper-parameters, hinders accurate regression, and prevents the detection speed from meeting real-time requirements. Anchor-free algorithms detect the target center and the bounding box directly and so predict faster than Anchor-based algorithms, but at some cost in accuracy.
For image classification algorithms, the distribution of the training data affects classification performance in open scenes, where data collection is correspondingly harder; one-sided data collection causes the classifier to raise serious false alarms, so image classification has clear disadvantages in smoke and fire detection tasks.
That is, for smoke and fire detection in open scenes, both object detection and image classification algorithms have shortcomings in performance and accuracy.
Based on the above, an embodiment of the application provides a smoke and fire detection method: an image to be detected is acquired and input into a pre-trained smoke and fire detection model, which outputs the smoke and fire detection result. The model comprises a feature extraction layer, an attention layer and a fully connected layer: the feature extraction layer extracts an original feature vector from the image, the attention layer calculates a global feature vector from the original feature vector and its attention scores, and the fully connected layer classifies the global feature vector to obtain the smoke and fire detection result.
This scheme avoids the accuracy loss that object detection algorithms incur on irregular targets such as smoke, noticeably improves the detection accuracy of the model, and reduces the sensitivity of image classification to the training-data distribution. The spatial-domain attention mechanism refines the feature granularity and greatly reduces the false detection rate, effectively addressing the prior-art problem of low smoke and fire detection accuracy in open scenes.
To facilitate understanding of the embodiments, some terms used in them are first explained:
The term "hyper-parameter": in machine learning, a hyper-parameter is a parameter whose value is set before the learning process begins, rather than obtained through training. Hyper-parameters usually need to be optimized; a set of optimal hyper-parameters is selected for the learner to improve learning performance and effect.
Referring to fig. 1, fig. 1 shows a flowchart of a smoke and fire detection method according to an embodiment of the application. The method shown in fig. 1 may be performed by a smoke and fire detection device, which may correspond to the device shown in fig. 4 below. The device may be any apparatus capable of performing the method, for example a personal computer, a server or a network device; the embodiment is not limited in this respect. The method of fig. 1 comprises the following steps:
step S110, an image to be detected is acquired.
It should be understood that the image type of the image to be detected may be set according to actual requirements, and the embodiment of the present application is not limited thereto.
For example, the image to be detected may be an outdoor image or an indoor image.
Step S120, inputting the image to be detected into a pre-trained smoke detection model to obtain a smoke detection result of the image to be detected.
It should be appreciated that the specific configuration of the smoke detection model may be set according to actual needs, and embodiments of the present application are not limited thereto.
For example, referring to fig. 2, fig. 2 shows a block diagram of a smoke and fire detection model according to an embodiment of the application. The model shown in fig. 2 includes a feature extraction layer, an attention layer and a fully connected layer. The feature extraction layer extracts the original feature vector of an image (for example, an image to be detected or a sample image). The attention layer obtains the attention scores of the original feature vector and multiplies the scores element-wise (multiplying the array elements position by position) with the original feature vector to obtain the global feature vector. The fully connected layer classifies the global feature vector to obtain the smoke and fire detection result.
That is, the feature extraction layer first extracts the features of the image, i.e. obtains its original feature vector. Because the extracted features show no obvious difference between the target region (for example, the region containing smoke) and the background region, a spatial-domain attention mechanism is attached after the feature extraction layer so that the two regions can be distinguished within the original feature vector. The attention layer does this mainly by calculating the attention scores of the original feature vector: if an obvious smoke or fire target appears in the image region corresponding to a location, the feature at that location is assigned a larger attention score; if the region contains only background, the score assigned is smaller.
The attention layer then performs an element-wise product of the generated attention scores (the attention score matrix) and the original feature vector of the image, i.e. of the attention layer's input with the image's attention scores, to obtain the global feature vector.
Finally, the fully connected layer classifies the global feature vector to obtain the smoke and fire detection result; that is, the fully connected layer can act as a classifier. The smoke and fire detection result may be, for example, smoke, fire, both smoke and fire, or no smoke or fire.
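The flow described above can be sketched in a few lines of numpy. The class labels, vector sizes and weights below are illustrative assumptions, not values from the application; the real model operates on feature maps rather than flat vectors:

```python
import numpy as np

# Assumed label set; the text lists smoke, fire, both, or neither.
CLASSES = ["no smoke or fire", "smoke", "fire", "smoke and fire"]

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def detect(original_feat, attn_scores, W_fc, b_fc):
    """original_feat: original feature vector from the feature extraction layer.
    attn_scores: per-element attention scores from the attention layer.
    The element-wise product yields the global feature vector, which the
    fully connected layer (W_fc, b_fc) classifies."""
    global_feat = original_feat * attn_scores      # element-wise product
    probs = softmax(W_fc @ global_feat + b_fc)     # fully connected + softmax
    return CLASSES[int(np.argmax(probs))]
```

With an identity-like weight matrix, a feature vector whose attention emphasises only the first element is classified into whichever class that element's logit favours.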
Therefore, compared with the existing image classification method, the smoke and fire detection model in the embodiment of the application is additionally provided with the attention mechanism layer, so that the accuracy and performance of an algorithm can be improved while the detection speed is ensured.
It should be understood that the layer structure of the feature extraction layer, the layer structure of the attention layer, and the layer structure of the full connection layer may all be set according to actual requirements, and the embodiment of the present application is not limited thereto.
For example, referring to fig. 3, fig. 3 is a block diagram of the layer structure of an attention layer according to an embodiment of the application. The attention layer shown in fig. 3 includes an input layer, a convolution layer, an attention sub-layer and a product layer. The input layer feeds the original feature vector of the image into the convolution layer and the product layer; the convolution layer performs convolution on the input original feature vector; the attention sub-layer calculates the attention scores of the original feature vector; and the product layer multiplies the original feature vector element-wise with the attention scores to obtain the global feature vector.
Furthermore, in consideration of the parameter count and detection speed of the model, the convolution layer may consist of two convolution sub-layers in series. The first convolution sub-layer connects to the input layer; the second connects to the first convolution sub-layer and to the attention sub-layer, and the activation function at the tail of the second convolution sub-layer may be set to softplus. In this way the parameters of the attention layer and of the feature extraction layer can be learned simultaneously, giving an end-to-end training process, and the softplus activation lets the attention scores of the image be computed more reasonably.
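A minimal sketch of such an attention layer, approximating the two convolution sub-layers as 1x1 convolutions (per-pixel linear maps); the intermediate width and the ReLU on the first sub-layer are assumptions not stated in the text:

```python
import numpy as np

def softplus(x):
    # softplus(x) = ln(1 + e^x): smooth and always positive, so every
    # location keeps a non-negative attention score
    return np.log1p(np.exp(x))

def attention_layer(feat, W1, W2):
    """feat: (H, W, C) feature map from the feature extraction layer.
    W1: (C, C_mid) first conv sub-layer as a 1x1 convolution.
    W2: (C_mid, 1) second conv sub-layer, producing one score per location."""
    hidden = np.maximum(feat @ W1, 0.0)   # first conv sub-layer (ReLU assumed)
    scores = softplus(hidden @ W2)        # softplus at the tail of the second
    return feat * scores                  # element-wise product: global features
```

Because softplus is differentiable everywhere, the attention parameters W1 and W2 can be trained jointly with the backbone, matching the end-to-end training described above.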
It should be noted that although the above is described with the smoke and fire detection model as an example, those skilled in the art will understand that the model may be either the trained smoke and fire detection model or the initial one; that is, the trained model and the initial model have the same structure.
In order to facilitate an understanding of embodiments of the present application, the following description is made by way of specific examples.
Specifically, before the image to be detected is input into the trained smoke and fire detection model, the initial model may be trained to obtain it. The training process of the initial model is as follows:
First, sample images for training the initial smoke and fire detection model may be acquired together with their corresponding sample detection results, a sample detection result being the smoke and fire detection result of its sample image.
Subsequently, each sample image may be preprocessed and the preprocessed image input into the initial model to obtain an initial smoke and fire detection result. A target loss value of the initial model may then be calculated by a target loss function that includes the face-recognition loss function ArcFace loss, and the initial model adjusted with the target loss value to obtain the pre-trained model. The target loss value may be determined from the initial detection result and the sample detection result.
It should be understood that the specific process of preprocessing may be set according to actual requirements, and embodiments of the present application are not limited thereto.
For example, the sample images may be uniformly resized to a preset size. In addition, to increase the richness of the data samples, preprocessing operations such as random saturation and brightness transformations, the addition of Gaussian noise, and colour perturbation are applied to the sample images with a certain probability before they are input into the initial smoke and fire detection model.
It should also be appreciated that the specific dimensions of the preset dimensions may be set according to actual needs, and embodiments of the present application are not limited thereto.
For example, the preset size may be 320×320.
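The resize and the probabilistic augmentations might be sketched as follows. Nearest-neighbour resizing, the probability p, and the perturbation magnitudes are illustrative assumptions; the application does not specify them:

```python
import numpy as np

PRESET_SIZE = 320  # the preset size given in the text

def resize_nn(img, size=PRESET_SIZE):
    """Nearest-neighbour resize of an (H, W, C) image to (size, size, C)."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]

def augment(img, rng, p=0.5):
    """Apply each perturbation with probability p to enrich the samples."""
    img = img.astype(np.float32)
    if rng.random() < p:                          # brightness transform
        img += rng.uniform(-20.0, 20.0)
    if rng.random() < p:                          # additive Gaussian noise
        img += rng.normal(0.0, 5.0, size=img.shape)
    if rng.random() < p:                          # simple colour perturbation
        img *= rng.uniform(0.9, 1.1, size=(1, 1, img.shape[2]))
    return np.clip(img, 0.0, 255.0)
```

Clipping back to [0, 255] after each augmentation keeps the pixel values in the valid image range.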
It should also be understood that the calculation formula corresponding to the objective loss function may be set according to actual requirements, and embodiments of the present application are not limited thereto.
For example, in the design of the target loss function, to reduce problems caused by the number and proportion of negative samples, the embodiment uses a linear combination of the cross-entropy loss function Softmax Loss and the classification loss function Focal Loss. To achieve more accurate smoke and fire detection and minimize false alarms on normal data samples, the embodiment additionally uses the face-recognition loss function ArcFace loss to optimize the model, and designs the final target loss function as:
$L_1 = L_2 + \alpha \cdot L_3 + \beta \cdot L_4$

where $L_1$ is the target loss value; $L_2$ is the first loss value, calculated by the cross-entropy loss function Softmax Loss; $\alpha$ is a first hyper-parameter; $L_3$ is the second loss value, calculated by the classification loss function Focal Loss; $\beta$ is a second hyper-parameter; and $L_4$ is the third loss value, calculated by the face-recognition loss function ArcFace loss.
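The linear combination itself is a one-liner once the three component losses are known; the hyper-parameter defaults below are placeholders, since the application does not disclose the chosen values:

```python
def target_loss(l2_softmax, l3_focal, l4_arcface, alpha=1.0, beta=1.0):
    """L1 = L2 + alpha * L3 + beta * L4, with alpha and beta as assumed
    placeholder hyper-parameter values."""
    return l2_softmax + alpha * l3_focal + beta * l4_arcface
```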
It should also be understood that the calculation formula corresponding to the cross entropy Loss function Softmax Loss may also be set according to actual requirements, and the embodiment of the application is not limited thereto.
For example, the formula corresponding to the cross-entropy loss function Softmax Loss is:

$L_2 = -\sum_{j=1}^{T} y_j \log b_j$

where $T$ is the length of the output vector with which the smoke and fire detection model represents the detection result; $y_j$ is the value at position $j$ of the one-hot class label (for example, with four output classes the labels may be 1000, 0100, 0010 and 0001; if the sample belongs to the $j$-th class, $y_j$ equals 1 and the other positions are 0); and $b_j$ is the value of the output vector at position $j$, i.e. the predicted probability of the $j$-th class.
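In code, with a one-hot label vector y and predicted probabilities b (both of length T), the cross-entropy reduces to the negative log-probability of the true class:

```python
import math

def softmax_loss(y, b, eps=1e-12):
    """L2 = -sum_j y_j * log(b_j); eps guards against log(0)."""
    return -sum(yj * math.log(max(bj, eps)) for yj, bj in zip(y, b))
```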
It should also be understood that the calculation formula corresponding to the classification Loss function Focal Loss may be set according to actual requirements, and the embodiment of the present application is not limited thereto.
For example, the calculation formula corresponding to the classification Loss function Focal Loss is as follows:
wherein c Representing a predicted value of the initial smoke detection model; c represents the true value of the initial smoke detection model; λ represents a fifth hyper-parameter for adjusting the weight of positive and negative samples, wherein negative samples refer to pyrotechnics in an image, positive samples refer to a background environment in an image, and the like; beta is a sixth super parameter for making the loss of a simple sample smaller while making the loss of a difficult sample larger.
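A sketch of this balanced form for a single binary prediction; since the original formula image is missing, this follows the standard Focal Loss formulation, and the defaults lam=0.25, beta=2.0 are the common choices from the Focal Loss literature, not values from the application:

```python
import math

def focal_loss(c_pred, c_true, lam=0.25, beta=2.0):
    """c_pred: predicted probability c'; c_true: ground truth c in {0, 1}.
    lam balances the two sample kinds; beta down-weights easy samples."""
    c_pred = min(max(c_pred, 1e-12), 1.0 - 1e-12)   # clamp away from 0 and 1
    pos = -lam * (1.0 - c_pred) ** beta * math.log(c_pred)
    neg = -(1.0 - lam) * c_pred ** beta * math.log(1.0 - c_pred)
    return c_true * pos + (1.0 - c_true) * neg
```

A confident correct prediction (c' near the true label) incurs a far smaller loss than a confident wrong one, which is exactly the hard-sample emphasis described above.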
It should also be understood that the calculation formula corresponding to the face recognition loss function ArcFace may also be set according to actual requirements, and the embodiment of the present application is not limited thereto.
For example, the calculation formula corresponding to the face recognition loss function ArcFace is as follows:

L4 = -log( e^{s·cos(θ_N + m)} / ( e^{s·cos(θ_N + m)} + Σ_{i=1, i≠N}^{K} e^{s·cos(θ_i)} ) )

wherein N represents the category corresponding to the smoke detection result output by the initial smoke detection model; θ_i is the included angle between the global feature vector corresponding to the initial smoke detection model and the i-th column vector in the parameter matrix of the fully connected layer of the initial smoke detection model; s is a third hyper-parameter for adjusting ArcFace; m is a fourth hyper-parameter for adjusting ArcFace; and K is the number of columns of the parameter matrix.
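A sketch of this ArcFace computation under the assumption that the K angles θ_i have already been obtained (the function name and hyper-parameter defaults are illustrative):

```python
import math

# Hypothetical sketch of the ArcFace loss: theta holds the K angles between
# the global feature vector and the columns of the FC parameter matrix,
# n is the ground-truth class index, s scales the logits, m is the angular margin.
def arcface_loss(theta: list[float], n: int, s: float = 8.0, m: float = 0.5) -> float:
    target = math.exp(s * math.cos(theta[n] + m))        # margin-penalized target logit
    others = sum(math.exp(s * math.cos(t))
                 for i, t in enumerate(theta) if i != n)  # remaining class logits
    return -math.log(target / (target + others))
```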
It should be noted that although the training process of the initial smoke detection model is described above, those skilled in the art will appreciate that once the initial smoke detection model has been trained, the trained smoke detection model may be used directly, without retraining before each use.
In addition, the image to be detected may be preprocessed before being input into the pre-trained smoke detection model, and the preprocessed image to be detected may be input into the pre-trained smoke detection model. Wherein the preprocessing may include adjusting the size of the image to be detected to a preset size. Subsequently, the pre-trained smoke detection model may output smoke detection results.
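As a dependency-free illustration of the preprocessing step (resizing the image to a preset size), the following sketch performs nearest-neighbour resampling on a nested-list "image"; a real pipeline would typically use an image library such as OpenCV or Pillow instead:

```python
# Hypothetical sketch: resize a H x W image (nested lists of pixel values)
# to a preset out_h x out_w size using nearest-neighbour sampling.
def resize_nearest(image: list[list[int]], out_h: int, out_w: int) -> list[list[int]]:
    in_h, in_w = len(image), len(image[0])
    return [[image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]
```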
Therefore, the embodiment of the application avoids the precision loss introduced by target detection algorithms on irregular targets such as smoke and fire, significantly improves the detection precision of the smoke detection model, and weakens the sensitivity of the image classification algorithm to the data distribution. By introducing an attention mechanism in the spatial domain, the granularity of the features is further refined and the false detection rate is greatly reduced, which effectively addresses the low smoke detection precision of prior-art approaches in open scenes.
In addition, in order to strengthen the discriminability of the features, the embodiment of the application introduces the ArcFace loss from the face recognition field into the design of the target loss function of the initial smoke detection model, which increases the similarity between samples of the same class and the mutual exclusivity between samples of different classes. With appropriately chosen hyper-parameters for the loss function, the convergence rate of the model is accelerated and the accuracy of the model is improved.
It should be understood that the above-described smoke detection method is merely exemplary, and that a person skilled in the art may make various modifications, adaptations or modifications according to the above-described method, and that the modifications are also within the scope of the present application.
Referring to fig. 4, fig. 4 is a block diagram illustrating a smoke detection device 400 according to an embodiment of the present application. It should be understood that the smoke detection device 400 corresponds to the above method embodiment and is capable of performing the steps of that embodiment; for the specific functions of the smoke detection device 400, reference may be made to the description above, and a detailed description is omitted here to avoid repetition. The smoke detection device 400 includes at least one software functional module that can be stored in memory in the form of software or firmware, or built into the Operating System (OS) of the smoke detection device 400. Specifically, the smoke detection device 400 includes:
an acquisition module 410, configured to acquire an image to be detected;
the input module 420 is configured to input an image to be detected into a pre-trained smoke detection model, and obtain a smoke detection result of the image to be detected; the smoke and fire detection model comprises a feature extraction layer, an attention layer and a full connection layer, wherein the feature extraction layer is used for extracting original feature vectors of an image to be detected, the attention layer is used for calculating global feature vectors according to attention scores of the original feature vectors and the original feature vectors, and the full connection layer is used for classifying the global feature vectors to obtain smoke and fire detection results.
In one possible embodiment, the image to be detected comprises an outdoor image.
In one possible embodiment, the smoke detection device 400 further comprises: a calculation module (not shown) for calculating a target loss value of the initial smoke detection model, wherein the target loss value is calculated by a target loss function including the face recognition loss function ArcFace loss; and an adjusting module (not shown) for adjusting the initial smoke detection model by using the target loss value to obtain the pre-trained smoke detection model.
In one possible embodiment, the objective loss function is:
L1 = L2 + α·L3 + β·L4
wherein L1 is the target loss value; L2 is the first Loss value calculated by the cross entropy Loss function Softmax Loss; α is a first hyper-parameter; L3 is the second Loss value calculated by the classification Loss function Focal Loss; β is a second hyper-parameter; and L4 is the third loss value calculated by the ArcFace loss.
In one possible embodiment, the ArcFace loss is:

L4 = -log( e^{s·cos(θ_N + m)} / ( e^{s·cos(θ_N + m)} + Σ_{i=1, i≠N}^{K} e^{s·cos(θ_i)} ) )

wherein N represents the category corresponding to the smoke detection result output by the initial smoke detection model; θ_i is the included angle between the global feature vector corresponding to the initial smoke detection model and the i-th column vector in the parameter matrix of the fully connected layer of the initial smoke detection model; s is a third hyper-parameter for adjusting ArcFace; m is a fourth hyper-parameter for adjusting ArcFace; and K is the number of columns of the parameter matrix.
It will be clear to those skilled in the art that, for convenience and brevity of description, reference may be made to the corresponding procedure in the foregoing method for the specific working procedure of the apparatus described above, and this will not be repeated here.
Referring to fig. 5, fig. 5 is a block diagram illustrating a structure of an electronic device 500 according to an embodiment of the application. The electronic device 500 may include a processor 510, a communication interface 520, a memory 530, and at least one communication bus 540, wherein the communication bus 540 is used to enable direct connection communication between these components, and the communication interface 520 in the embodiment of the present application is used for signaling or data communication with other devices. The processor 510 may be an integrated circuit chip with signal processing capabilities. The processor 510 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), etc.; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components. The methods, steps, and logical blocks disclosed in the embodiments of the present application may be implemented or executed by the processor. A general-purpose processor may be a microprocessor, or the processor 510 may be any conventional processor or the like.
The memory 530 may be, but is not limited to, Random Access Memory (RAM), Read-Only Memory (ROM), Programmable Read-Only Memory (PROM), Erasable Programmable Read-Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), etc. The memory 530 stores computer readable instructions which, when executed by the processor 510, enable the electronic device 500 to perform the steps of the method embodiments described above.
The electronic device 500 may further include a memory controller, an input-output unit, an audio unit, a display unit.
The memory 530, the memory controller, the processor 510, the peripheral interface, the input/output unit, the audio unit, and the display unit are electrically connected directly or indirectly to each other, so as to realize data transmission or interaction. For example, the elements may be electrically coupled to each other via one or more communication buses 540. The processor 510 is configured to execute executable modules stored in the memory 530. And, the electronic device 500 is configured to perform the following method: acquiring an image to be detected; inputting the image to be detected into a pre-trained smoke detection model to obtain a smoke detection result of the image to be detected; the smoke detection model comprises a feature extraction layer, an attention layer and a full connection layer, wherein the feature extraction layer is used for extracting original feature vectors of the image to be detected, the attention layer is used for calculating global feature vectors according to attention scores of the original feature vectors and the original feature vectors, and the full connection layer is used for classifying the global feature vectors to obtain smoke detection results.
The input-output unit is used for providing the user with input data to realize the interaction between the user and the server (or the local terminal). The input/output unit may be, but is not limited to, a mouse, a keyboard, and the like.
The audio unit provides an audio interface to the user, which may include one or more microphones, one or more speakers, and audio circuitry.
The display unit provides an interactive interface (e.g. a user operation interface) between the electronic device and the user, or is used to display image data for user reference. In this embodiment, the display unit may be a liquid crystal display or a touch display. In the case of a touch display, it may be a capacitive or resistive touch screen supporting single-point and multi-point touch operations, meaning that the touch display can sense touch operations generated simultaneously at one or more positions on the display and pass the sensed touch operations to the processor for calculation and processing.
It is to be understood that the configuration shown in fig. 5 is illustrative only, and that the electronic device 500 may also include more or fewer components than shown in fig. 5, or have a different configuration than shown in fig. 5. The components shown in fig. 5 may be implemented in hardware, software, or a combination thereof.
The application also provides a storage medium having stored thereon a computer program which, when executed by a processor, performs the method according to the method embodiment.
The application also provides a computer program product which, when run on a computer, causes the computer to perform the method according to the method embodiments.
It will be clear to those skilled in the art that, for convenience and brevity of description, reference may be made to the corresponding procedure in the foregoing method for the specific working procedure of the system described above, and this will not be repeated here.
It should be noted that, in the present specification, each embodiment is described in a progressive manner, and each embodiment is mainly described as different from other embodiments, and identical and similar parts between the embodiments are all enough to be referred to each other. For the apparatus class embodiments, the description is relatively simple as it is substantially similar to the method embodiments, and reference is made to the description of the method embodiments for relevant points.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus embodiments described above are merely illustrative, for example, of the flowcharts and block diagrams in the figures that illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form a single part, or each module may exist alone, or two or more modules may be integrated to form a single part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk. It is noted that relational terms such as first and second are used solely to distinguish one entity or action from another and do not necessarily require or imply any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above description is only of the preferred embodiments of the present application and is not intended to limit the present application, but various modifications and variations can be made to the present application by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application. It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
The foregoing is merely illustrative of the present application, and the present application is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A method of smoke detection comprising:
acquiring an image to be detected;
inputting the image to be detected into a pre-trained smoke detection model to obtain a smoke detection result of the image to be detected;
the smoke detection model comprises a feature extraction layer, an attention layer and a full connection layer, wherein the feature extraction layer is used for extracting an original feature vector of the image to be detected and for obtaining a target area and a background area in the original feature vector according to the difference between the target area and the background area, the attention layer is used for calculating a global feature vector according to the original feature vector and the attention score of the original feature vector, and the full connection layer is used for classifying the global feature vector to obtain a smoke detection result;
the attention layer is used for acquiring the attention score of the original feature vector of the image, and multiplying the original feature vector of the image by the attention score of the original feature vector bit by bit to obtain the global feature vector;
the layer structure of the attention layer comprises an input layer, a convolution layer, an attention sub-layer and a product layer;
the input layer is used for inputting the original feature vectors of the image into the convolution layer and the product layer respectively; the convolution layer is used for carrying out convolution calculation on an input original feature vector, the convolution layer at least comprises two serially connected convolution sublayers, the first convolution sublayer is connected with the input layer, the second convolution sublayer is connected with the first convolution sublayer and the attention sublayer, and an activation function at the tail of the second convolution sublayer is set as a softplus activation function;
the attention sub-layer is used for calculating the attention score of the original feature vector of the image;
the product layer is used for carrying out product operation on the original feature vector and the attention score so as to obtain a global feature vector.
2. The smoke detection method of claim 1, wherein the image to be detected comprises an outdoor image.
3. The smoke detection method according to claim 1, wherein before said inputting the image to be detected into a pre-trained smoke detection model, obtaining a smoke detection result of the image to be detected, the smoke detection method further comprises:
calculating a target loss value of an initial smoke detection model, wherein the target loss value is calculated through a target loss function comprising a face recognition loss function ArcFace loss;
and adjusting the initial smoke and fire detection model by using the target loss value to obtain the pre-trained detection model.
4. A smoke detection method according to claim 3, wherein the target loss function is:
L1 = L2 + α·L3 + β·L4
wherein L1 is the target loss value; L2 is the first Loss value calculated by the cross entropy Loss function Softmax Loss; α is a first hyper-parameter; L3 is the second Loss value calculated by the classification Loss function Focal Loss; β is a second hyper-parameter; and L4 is the third loss value calculated by the ArcFace loss.
5. The smoke detection method of claim 4, wherein said ArcFace loss is:

L4 = -log( e^{s·cos(θ_N + m)} / ( e^{s·cos(θ_N + m)} + Σ_{i=1, i≠N}^{K} e^{s·cos(θ_i)} ) )

wherein N represents the category corresponding to the smoke detection result output by the initial smoke detection model; θ_i is the included angle between the global feature vector corresponding to the initial smoke detection model and the i-th column vector in the parameter matrix of the fully connected layer of the initial smoke detection model; s is a third hyper-parameter for adjusting the ArcFace; m is a fourth hyper-parameter for adjusting the ArcFace; and K is the number of columns of the parameter matrix.
6. A smoke and fire detection apparatus comprising:
the acquisition module is used for acquiring the image to be detected;
the input module is used for inputting the image to be detected into a pre-trained smoke detection model to obtain a smoke detection result of the image to be detected;
the smoke detection model comprises a feature extraction layer, an attention layer and a full connection layer, wherein the feature extraction layer is used for extracting an original feature vector of the image to be detected and for obtaining a target area and a background area in the original feature vector according to the difference between the target area and the background area; the attention layer is used for calculating a global feature vector according to the original feature vector and the attention score of the original feature vector, and the full connection layer is used for classifying the global feature vector to obtain the smoke detection result;

the attention layer is used for acquiring the attention score of the original feature vector of the image, and multiplying the original feature vector of the image by the attention score of the original feature vector bit by bit to obtain the global feature vector;
the layer structure of the attention layer comprises an input layer, a convolution layer, an attention sub-layer and a product layer;
the input layer is used for inputting the original feature vectors of the image into the convolution layer and the product layer respectively; the convolution layer is used for carrying out convolution calculation on an input original feature vector, the convolution layer at least comprises two serially connected convolution sublayers, the first convolution sublayer is connected with the input layer, the second convolution sublayer is connected with the first convolution sublayer and the attention sublayer, and an activation function at the tail of the second convolution sublayer is set as a softplus activation function;
the attention sub-layer is used for calculating the attention score of the original feature vector of the image;
the product layer is used for carrying out product operation on the original feature vector and the attention score so as to obtain a global feature vector.
7. The smoke detection device of claim 6, wherein said image to be detected comprises an outdoor image.
8. The smoke-detecting device of claim 6, wherein the smoke-detecting device further comprises:
the calculation module is used for calculating a target loss value of the initial smoke detection model, wherein the target loss value is obtained by calculating a target loss function comprising a face recognition loss function ArcFace;
and the adjusting module is used for adjusting the initial smoke and fire detection model by utilizing the target loss value to obtain the pre-trained detection model.
9. The smoke detection device of claim 8, wherein said target loss function is:
L1 = L2 + α·L3 + β·L4
wherein L1 is the target loss value; L2 is the first Loss value calculated by the cross entropy Loss function Softmax Loss; α is a first hyper-parameter; L3 is the second Loss value calculated by the classification Loss function Focal Loss; β is a second hyper-parameter; and L4 is the third loss value calculated by the ArcFace loss.
10. The smoke detection device of claim 9, wherein said ArcFace loss is:

L4 = -log( e^{s·cos(θ_N + m)} / ( e^{s·cos(θ_N + m)} + Σ_{i=1, i≠N}^{K} e^{s·cos(θ_i)} ) )

wherein N represents the category corresponding to the smoke detection result output by the initial smoke detection model; θ_i is the included angle between the global feature vector corresponding to the initial smoke detection model and the i-th column vector in the parameter matrix of the fully connected layer of the initial smoke detection model; s is a third hyper-parameter for adjusting the ArcFace; m is a fourth hyper-parameter for adjusting the ArcFace; and K is the number of columns of the parameter matrix.
CN202011207901.XA 2020-11-02 2020-11-02 Firework detection method and device Active CN112308153B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011207901.XA CN112308153B (en) 2020-11-02 2020-11-02 Firework detection method and device


Publications (2)

Publication Number Publication Date
CN112308153A CN112308153A (en) 2021-02-02
CN112308153B true CN112308153B (en) 2023-11-24

Family

ID=74332438

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011207901.XA Active CN112308153B (en) 2020-11-02 2020-11-02 Firework detection method and device

Country Status (1)

Country Link
CN (1) CN112308153B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115424027B (en) * 2022-08-24 2023-05-23 厦门国际银行股份有限公司 Image similarity comparison method, device and equipment for image foreground person

Citations (7)

Publication number Priority date Publication date Assignee Title
WO2018212710A1 (en) * 2017-05-19 2018-11-22 National University Of Singapore Predictive analysis methods and systems
CN110135406A (en) * 2019-07-09 2019-08-16 北京旷视科技有限公司 Image-recognizing method, device, computer equipment and storage medium
CN111242222A (en) * 2020-01-14 2020-06-05 北京迈格威科技有限公司 Training method of classification model, image processing method and device
CN111414969A (en) * 2020-03-26 2020-07-14 西安交通大学 Smoke detection method in foggy environment
CN111625667A (en) * 2020-05-18 2020-09-04 北京工商大学 Three-dimensional model cross-domain retrieval method and system based on complex background image
CN111695478A (en) * 2020-06-04 2020-09-22 济南信通达电气科技有限公司 Target detection method and device
CN111860162A (en) * 2020-06-17 2020-10-30 上海交通大学 Video crowd counting system and method


Non-Patent Citations (1)

Title
Landmark recognition with incremental angular-domain loss and multi-feature fusion; Mao Xueyu et al.; Journal of Image and Graphics (No. 08); pp. 1567-1577 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant