CN116977909B - Deep learning fire intensity recognition method and system based on multi-modal data - Google Patents

Deep learning fire intensity recognition method and system based on multi-modal data

Info

Publication number
CN116977909B
Authority
CN
China
Prior art keywords
data
image
fire
deep learning
images
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311226491.7A
Other languages
Chinese (zh)
Other versions
CN116977909A (en)
Inventor
施朦
李汉博
章志超
张文科
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South Central Minzu University
Original Assignee
South Central University for Nationalities
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South Central University for Nationalities filed Critical South Central University for Nationalities
Priority to CN202311226491.7A priority Critical patent/CN116977909B/en
Publication of CN116977909A publication Critical patent/CN116977909A/en
Application granted granted Critical
Publication of CN116977909B publication Critical patent/CN116977909B/en
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/72Data preparation, e.g. statistical preprocessing of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/809Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of classification results, e.g. where the classifiers operate on the same input data
    • G06V10/811Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of classification results, e.g. where the classifiers operate on the same input data the classifiers operating on different input data, e.g. multi-modal recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the technical field of fire intensity recognition and discloses a deep learning fire intensity recognition method and system based on multi-modal data. The method comprises the following steps: collecting fire scene data; preprocessing the collected fire scene data, including sensor data, video recordings, and images, and converting fire scene data of various forms into a format that can be input into a deep learning model; extracting multi-modal data features; reorganizing the multi-modal features using a graph structure; and inputting the features into a deep learning model and obtaining a predicted value of the heat release rate through forward propagation, thereby completing fire intensity recognition. The invention uses deep learning to extract valuable information from the fire scene and related data so as to judge fire intensity more accurately and quickly and to improve the accuracy of fire cause analysis, thereby providing stronger support for fire prevention and control.

Description

Deep learning fire intensity recognition method and system based on multi-modal data
Technical Field
The invention relates to the technical field of fire intensity recognition, and in particular to a deep learning fire intensity recognition method and system based on multi-modal data.
Background
Fire investigation is an emerging interdisciplinary field with wide coverage; it mainly studies fire cause investigation, fire loss verification, fire responsibility identification and handling, fire investigation questioning techniques, identification of fire trace physical evidence, and related topics. At present, fire cause investigation and fire loss verification rely mainly on manual inspection and on the working experience of investigators. To address this, the invention recognizes fire scene traces through deep learning and inversely predicts the heat release rate during the fire, providing investigators with a more accurate quantitative standard.
A deep learning model can be understood simply as a neural network with many hidden layers; it can reveal nonlinear relations between inputs and outputs and is a powerful tool for data recognition, widely applied to tasks such as image and speech recognition. A deep learning model can be divided into an input layer, an output layer, and hidden layers. The input layer receives sample data, which may be images, audio, text, and so on. The output layer produces the target data required by the problem. The hidden layers sit between the input and output layers, contain a large number of parameters, and are continuously optimized by learning from the target data, eventually converting input data into accurate target outputs. The more hidden layers and parameters, the higher the model's complexity and the more complex the learning tasks it can accomplish.
A modality refers to the form in which information exists or its source; data composed of two or more modalities is called multi-modal data. The multi-modal data most common in artificial intelligence are images, speech, and text. At a fire scene, different data acquisition methods naturally produce multi-modal fire data.
The heat release rate is generally used to characterize the combustion intensity of a fire; in practice the heat release rate per unit area, expressed in kW/m², is commonly used. When performing fire hazard analysis, it is necessary to determine how large the heat release rate of a fire is and how it changes. The heat release rate can generally be calculated from the mass burning rate of the combustible, but combustion in a fire is usually incomplete, so the accuracy of this method is low. A method for efficiently and accurately calculating the heat release rate of a fire is therefore needed.
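For reference, the conventional estimate mentioned above usually takes the form (a general fire-engineering relation, not quoted from this patent):

Q = χ · m · ΔH_c

where Q is the heat release rate (kW), m is the mass burning rate of the combustible (kg/s), ΔH_c is the heat of combustion (kJ/kg), and χ < 1 is a combustion efficiency factor accounting for the incomplete combustion noted above. The difficulty of estimating χ and m after the fact is precisely what motivates the data-driven approach below.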
After a fire breaks out, firefighting and investigation/evidence collection should proceed simultaneously. In practice, however, current fire work focuses on extinguishing the fire and ensuring personnel safety; only when a severe fire causes serious damage to life and property is the investigation of its cause and responsibility emphasized. With the continuous development of society and technology, new electronic products, processes, and materials keep appearing, so fire causes are increasingly complex and concealed, placing higher demands on detection equipment. The result is low efficiency and high cost.
At present, intelligent technologies are widely applied in the fire field, such as fire early-warning systems, fire location recognition, and intelligent fire extinguishing devices. These technologies play an important role before and during a fire, effectively inhibiting fire spread and reducing damage. However, in the post-fire investigation phase there is little application of intelligent technology, especially for judging fire intensity, where traditional investigation methods still dominate. This is not only inefficient; human factors often make it difficult to determine fire intensity accurately, affecting the accuracy of fire cause analysis and the validity of investigation results. Developing and applying intelligent technologies such as machine learning and deep learning to improve the accuracy and efficiency of fire intensity judgment is therefore an urgent problem in the fire investigation field.
Disclosure of Invention
To solve the problems in the prior art, the invention provides a deep learning fire intensity recognition method and system based on multi-modal data, which use deep learning to extract valuable information from the fire scene and related data so as to judge fire intensity more accurately and rapidly. In addition, this deep-learning-based fire intensity judgment provides fire investigators with additional analysis and judgment criteria and improves the accuracy of fire cause analysis, thereby providing stronger support for fire prevention and control and solving the problems identified in the background art.
To achieve the above purpose, the invention provides the following technical solution: a deep learning fire intensity recognition method based on multi-modal data, comprising the following steps:
S1, acquiring fire scene data;
S2, preprocessing the collected fire scene data, including preprocessing sensor data, video recordings, and images, and converting fire scene data of various forms into a format that can be input into a deep learning model;
S3, extracting multi-modal data features;
S4, reorganizing the multi-modal data features by adopting a graph structure;
S5, inputting the features into a deep learning model, and obtaining a predicted value of the heat release rate through forward propagation calculation of the deep learning model to complete fire intensity recognition.
Preferably, in step S1, fire scene data including but not limited to smoke trace information, glass trace information, metal discoloration trace information, and building information is collected using cameras, sensors, and environmental monitoring equipment; the collected data forms include but are not limited to images, video recordings, and building geometry parameters.
Preferably, in step S2, the sensor data preprocessing includes:
S211, data cleaning: dividing the data area by adopting a clustering algorithm, then performing standard deviation statistics on the divided sub-areas; for each data point, the ratio of its deviation from the sub-area mean to the standard deviation is calculated, and data exceeding a certain threshold are considered outliers;
for outliers, the following two processing methods are adopted: 1) acquiring the data a second time with the sensor to correct the outlier; 2) correcting the data by interpolation or data smoothing;
S212, data standardization: the scale of the data is normalized and the data range is adjusted to [0, 1].
Preferably, in step S2, the image data preprocessing includes:
S221, normalization: first convert the image into a grayscale image, converting the red R, green G, and blue B values of each pixel into a luminance value Y, and normalize Y to [0, 1] or [−1, 1] so that the numerical ranges of different images are consistent;
S222, cutting: removing unnecessary parts of the image, including but not limited to noise at the image edges;
S223, scaling: adjusting the size of the image and scaling images of different resolutions to the same size to facilitate model processing.
The video recording preprocessing comprises the following steps:
S231, video segmentation: dividing a continuous video into a number of meaningful sub-segments, each of which typically contains a particular scene, event, or action;
S232, frame sampling: judging the image quality of each frame in a sub-segment using the Brenner gradient function and extracting the frame with the highest Brenner gradient value, so as to reduce the amount of data processing;
S233, passing the extracted frames into the image data preprocessing operation.
Preferably, in step S3, extracting the multi-modal data feature specifically includes:
S31, extracting sensor data features: extracting regional features of the data using sliding windows to obtain the data features of multiple regions X_s = {x_1, …, x_n};
S32, extracting image features, specifically comprising:
S321, after the image is input, a convolution operation is performed in the convolution layer, expressed by the following formula:
O(i, j) = Σ_{m=0}^{H−1} Σ_{n=0}^{W−1} I(i + m, j + n) · K(m, n)

wherein O represents the convolution output feature map, I represents the input image, K represents the convolution kernel, and H and W represent the size of the convolution kernel;
S322, the feature map obtained after convolution is pooled in a pooling layer to reduce data redundancy, by the following formula:
P(i, j) = p({ O(i·M + m, j·N + n) | 0 ≤ m < M, 0 ≤ n < N })

wherein P represents the pooled output feature map, O represents the convolution output feature map, p represents the pooling method, and M and N represent the size of the pooling window;
S323, the output feature map P obtained after the last pooling layer is flattened into a one-dimensional image feature input vector X'_p.
Preferably, in step S4, the reorganizing the multi-modal information feature by using the graph structure specifically includes:
S41, sensor data feature graph structure reorganization: define the sensor data feature graph structure as G_s(V_s, E_s), wherein the set of sensor data features is the set of vertices V_s in the graph structure; relations exist between vertices of adjacent regions, and E_s is the set of relations in the graph structure. Let any two adjacent vertices be v_si = X_si and v_sj = X_sj; the data feature relation e_sij between them is computed;
the sensor data features are then combined according to the computed relations e_sij to obtain the merged sensor data feature value X_t;
S42, image feature graph structure reorganization: let any two images in the set V_p be v_pi and v_pj, and calculate the Euclidean distance d of the feature relation between the images:

d = √( Σ_{x=1}^{A} Σ_{y=1}^{Z} ( v_pi(x, y) − v_pj(x, y) )² )

wherein A and Z represent the size of the image; the association degree e_pij derived from d is compared with a preset image association threshold r: when e_pij is greater than r, it is recorded into the set E_p, otherwise it is discarded;
after the relation of every pair of images in the set has been calculated, the graph structure of the input images G_p(V_p, E_p) is obtained;
the image features are then combined according to the calculated relations: let the feature values of any two images v_pi and v_pj be X'_pi and X'_pj respectively, with association degree e_pij; the merged image feature value X_p is computed from these quantities;
S43, the sensor data feature X_t, the image feature input vector X_p, and other quantized feature vectors X_q are merged to obtain the final fully connected layer input vector X, X = X_t ∪ X_p ∪ X_q.
Preferably, in step S5, the activation function ReLU is added after the convolution layers, pooling layers, and hidden layers of the deep learning model to increase the model's nonlinear recognition capability; the activation function is:

f(x) = max(0, x)

In the fully connected layer of the deep learning model, the input vector X yields the final prediction result through the following calculation:

C = XT + b

wherein C represents the output vector, i.e., the deep learning model's prediction of the fire intensity, T represents the weight matrix, and b represents the bias term.
In another aspect, to achieve the above purpose, the invention further provides the following technical solution: a deep learning fire intensity recognition system based on multi-modal data, the system comprising:
the data acquisition module acquires fire scene data;
the data preprocessing module is used for preprocessing the collected fire scene data, including preprocessing sensor data, video recordings and images, and converting the fire scene data in various data forms into a format which can be input into a deep learning model;
the feature extraction module, which extracts multi-modal data features;
the graph structure reorganization module, which reorganizes the multi-modal data features by adopting a graph structure;
and the fire intensity recognition module, which inputs the features into the deep learning model and obtains a predicted value of the heat release rate through forward propagation calculation of the deep learning model, completing fire intensity recognition.
The beneficial effects of the invention are as follows: the method is a more comprehensive and effective artificial-intelligence-driven approach to studying the relation between fire traces and heat release rate; it takes data quality, feature engineering, and model selection into account and can therefore provide more accurate fire intensity predictions. The specific technical effects are:
1) Improved prediction accuracy: the deep learning model has strong feature extraction and data fitting capabilities and can automatically learn discriminative features from complex fire scene data. This helps improve the accuracy of fire intensity prediction and reduce the false positive rate.
2) Reduced manual intervention: compared with traditional fire intensity judgment methods, the deep learning model performs feature extraction and model training automatically, reducing the need for manual intervention. This lowers the time and labor costs of fire investigation.
3) Versatility and adaptability: because the deep learning model learns data features automatically, the method has good universality and adaptability. The model can be applied to fire cases of different scenes and types, providing a more comprehensive and reliable reference for fire investigation.
4) Advancing fire investigation technology: the deep learning model provides a new research direction and technical means for fire intensity recognition, helping to promote technical innovation and progress in the field of fire investigation.
Drawings
FIG. 1 is a schematic flow chart of the steps of the method of the present invention;
FIG. 2 is a schematic diagram of a common fire trace image;
FIG. 3 is a schematic diagram of a deep learning model execution process;
FIG. 4 is a schematic diagram of a system module structure according to the present invention;
In the figure: 110, data acquisition module; 120, data preprocessing module; 130, feature extraction module; 140, graph structure reorganization module; 150, fire intensity recognition module.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The intensity of a fire, i.e., its heat release rate, is difficult to identify in fire investigations. Current techniques focus mainly on detecting combustion residues and determine fire intensity by analyzing the form and composition of the residues. This process is cumbersome and time consuming, so in many small fire investigations it is simply skipped. The result is a large amount of incomplete fire case data, which affects the accuracy of fire cause analysis and the validity of investigation results.
To solve this problem, a fast and effective fire intensity judging method is proposed. Referring to FIGS. 1-3, the invention provides the following technical scheme: a deep learning fire intensity recognition method based on multi-modal data, comprising the following steps:
S1, acquiring fire scene data.
Determine the data acquisition sources, such as cameras, sensors, and environmental monitoring equipment, and use them to collect fire scene data including but not limited to smoke trace information, glass trace information, metal discoloration trace information, and building information, as shown in FIG. 2. The collected data forms include but are not limited to images, video recordings, and building geometry parameters.
Images should be captured from multiple angles of the fire scene, covering the overall scene, trace details, and residue details. Images must remain clear; the subsequent image preprocessing step is performed only after image quality has been judged.
The image quality may be judged using the Brenner gradient function, defined as:

F(f) = Σ_x Σ_y | f(x + 2, y) − f(x, y) |²

wherein F(f) represents the Brenner gradient value and f(x, y) represents the gray value of image f at pixel coordinates (x, y).
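For illustration only, a minimal NumPy sketch of this focus measure; the function name and the assumption that images arrive as 2-D grayscale arrays are ours, not the patent's:

    import numpy as np

    def brenner_gradient(gray):
        """Brenner focus measure: sum of squared differences between
        pixels two positions apart along x; higher means sharper."""
        g = gray.astype(np.float64)
        diff = g[:, 2:] - g[:, :-2]   # f(x+2, y) - f(x, y)
        return float(np.sum(diff ** 2))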
S2, preprocessing the collected fire scene data, including preprocessing sensor data, video recordings and images, and converting the fire scene data in various data forms into a format which can be input into a deep learning model.
Sensor data preprocessing: at a fire scene, sensors can detect information such as combustion residue concentration and soot deposition concentration. The sensor data preprocessing includes:
S211, data cleaning: check the data for outliers, missing values, noise, and the like. Based on the practical observation that residue concentration values change little within a given area, first divide the data area using a K-means clustering algorithm, then perform standard deviation statistics on the resulting sub-areas; for each data point, the ratio of its deviation from the sub-area mean to the standard deviation is calculated, and data exceeding a certain threshold are considered outliers;
for outliers, the following two processing methods are adopted: 1) acquiring the data a second time with the sensor to correct the outlier; 2) correcting the data by interpolation or data smoothing;
S212, data standardization: data standardization is a linear transformation generally used to rescale data; in multi-modal data it makes different data types comparable. The data range is scaled to [0, 1] by the following formula:

x_norm = (x − x_min) / (x_max − x_min)

wherein x represents the raw feature value, x_min represents the minimum of the feature values, x_max represents the maximum of the feature values, and x_norm represents the normalized feature data.
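A minimal sketch of steps S211-S212, assuming scikit-learn's KMeans for the area division; the z-score threshold, the use of the region mean as a stand-in for interpolation or smoothing, and the function name are our assumptions:

    import numpy as np
    from sklearn.cluster import KMeans

    def clean_and_normalize(coords, values, n_regions=4, z_thresh=3.0):
        """S211: cluster readings by location and repair per-region
        outliers; S212: min-max normalize the result to [0, 1]."""
        labels = KMeans(n_clusters=n_regions, n_init=10).fit_predict(coords)
        v = values.astype(np.float64).copy()
        for reg in range(n_regions):
            m = labels == reg
            mu, sd = v[m].mean(), v[m].std()
            if sd > 0:
                v[m & (np.abs(v - mu) / sd > z_thresh)] = mu  # smoothing stand-in
            # in practice the sensor could also re-acquire these points
        vmin, vmax = v.min(), v.max()
        return (v - vmin) / (vmax - vmin + 1e-12)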
The image data preprocessing includes:
S221, normalization: first convert the image into a grayscale image, converting the red R, green G, and blue B values of each pixel into a luminance value Y, and normalize Y to [0, 1] or [−1, 1] so that the numerical ranges of different images are consistent.
The grayscale conversion adopts a Gamma correction method, where R, G, and B respectively represent the red, green, and blue brightness values of a pixel.
S222, cutting: removing unnecessary parts of the image, including but not limited to removing noise at edges of the image;
s223, scaling: the size of the image is adjusted, and the images with different resolutions are scaled to the same size, so that model processing is facilitated. The present embodiment scales the image to 256×256 pixels.
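A sketch of steps S221-S223 in Python, assuming OpenCV for resizing; the Gamma exponent of 2.2 and the channel weights are common defaults substituted for the patent's unreproduced Gamma correction formula, and the fixed border crop stands in for the edge-noise removal of S222:

    import cv2
    import numpy as np

    def preprocess_image(bgr, size=(256, 256), crop=8):
        """S221: gamma-corrected grayscale in [0, 1];
        S222: trim noisy borders; S223: rescale to a fixed size."""
        rgb = bgr[..., ::-1].astype(np.float64) / 255.0
        r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
        y = (0.2973 * r**2.2 + 0.6274 * g**2.2 + 0.0753 * b**2.2) ** (1 / 2.2)
        y = y[crop:-crop, crop:-crop]                              # S222: edge crop
        return cv2.resize(y, size, interpolation=cv2.INTER_AREA)  # S223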
The video recording preprocessing comprises the following steps:
S231, video segmentation: a continuous video is segmented into meaningful sub-segments, each typically containing a specific scene, event, or action.
In this embodiment, the video is divided by frame-difference calculation. Assume any two adjacent frames in the video are f_k and f_{k+1}, and calculate the Euclidean distance D of the corresponding pixels of the two frames:

D = √( Σ_{x=1}^{A} Σ_{y=1}^{Z} ( f_k(x, y) − f_{k+1}(x, y) )² )

wherein A and Z represent the size of the image. When D is greater than 50, the two frames are considered to belong to different scenes and the video is divided at that frame; the divided video sub-segments are stored for further processing.
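A minimal sketch of this scene-cut detection; we use the per-pixel root-mean-square difference so that the example threshold of 50 gray levels is independent of resolution, which is our assumption rather than the patent's exact criterion:

    import numpy as np

    def scene_cuts(gray_frames, thresh=50.0):
        """Return the indices k at which frame k starts a new scene,
        judged from the difference between adjacent frames."""
        cuts = []
        for k in range(len(gray_frames) - 1):
            a = gray_frames[k].astype(np.float64)
            b = gray_frames[k + 1].astype(np.float64)
            if np.sqrt(np.mean((a - b) ** 2)) > thresh:
                cuts.append(k + 1)
        return cuts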
S232, frame sampling: using the Brenner gradient function to judge the image quality of each frame in the sub-segment, and extracting the frame with the highest value of the Brenner gradient function so as to reduce the calculated amount of data processing;
s233, transferring the extracted frames into an image data preprocessing operation.
S3, extracting multi-mode data features.
Further, extracting the multi-modal data features specifically includes:
S31, extracting sensor data features:
sliding windows are used to capture local variations of the data, and statistical features within each window, such as mean, variance, etc., are calculated.
Traditional methods usually use a sliding window to extract time-series information, but post-fire scene data are typically regional rather than temporal; the invention therefore uses the sliding window to extract regional features of the data, obtaining the data features of multiple regions X_s = {x_1, …, x_n}.
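A minimal sketch of this regional feature extraction, assuming the sensor readings lie on a 2-D grid; the window size, stride, and the choice of (mean, variance) statistics follow the text above, but the particular values are ours:

    import numpy as np

    def region_features(grid, win=4, stride=4):
        """Slide a win x win window over a 2-D sensor grid and collect
        (mean, variance) per region, i.e. X_s = {x_1, ..., x_n}."""
        feats = []
        H, W = grid.shape
        for i in range(0, H - win + 1, stride):
            for j in range(0, W - win + 1, stride):
                patch = grid[i:i + win, j:j + win]
                feats.append((patch.mean(), patch.var()))
        return np.asarray(feats)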
S32, extracting image features, specifically comprising:
s321, performing convolution operation on a convolution layer after the image is input, wherein the formula is expressed as follows:
O(i, j) = Σ_{m=0}^{H−1} Σ_{n=0}^{W−1} I(i + m, j + n) · K(m, n)

wherein O represents the convolution output feature map, I represents the input image, K represents the convolution kernel, and H and W represent the size of the convolution kernel.
Assuming that a convolution kernel of size 7 x 7 is used for the operation, the formula is written as:
O(i, j) = Σ_{m=0}^{6} Σ_{n=0}^{6} I(i + m, j + n) · K(m, n)

wherein O represents the convolution output feature map, I represents the input image, and K represents the convolution kernel.
S322, carrying out pooling operation on the feature map obtained after convolution in a pooling layer to reduce redundancy of data, wherein the formula is as follows:
assuming a pooling window of size 3 x 3 is employed, the formula is written as:
P(i, j) = p({ O(3i + m, 3j + n) | 0 ≤ m, n < 3 })

wherein P represents the pooled output feature map, O represents the convolution output feature map, p represents the pooling method, and M and N represent the size of the pooling window. Apart from the first layer being a convolution layer and the last being a pooling layer, convolution and pooling layers may be stacked repeatedly in between.
S323, outputting a characteristic diagram obtained after the last pooling layerFlattening it into a one-dimensional image feature input vector +.>
Since multiple regional features are extracted from the sensor data, and multiple image features are obtained when multiple images are input, the regional or image features need to be reorganized to strengthen the correlations among them and reduce the computational load. The invention reorganizes features using a graph structure, generally defined as G(V, E), where V represents the set of vertices in the graph structure G and E represents the set of relations in the graph structure G.
S4, reorganizing the multi-mode information features by adopting a graph structure.
Further, the reorganizing the multi-mode information features by using the graph structure specifically includes:
S41, sensor data feature graph structure reorganization: define the sensor data feature graph structure as G_s(V_s, E_s), wherein the set of sensor data features is the set of vertices V_s in the graph structure; relations exist between vertices of adjacent regions, and E_s is the set of relations in the graph structure. Let any two adjacent vertices be v_si = X_si and v_sj = X_sj; the data feature relation e_sij between them is computed;
the sensor data features are then combined according to the computed relations e_sij to obtain the merged sensor data feature value X_t.
S42, image feature graph structure reorganization: let any two images in the set V_p be v_pi and v_pj, and calculate the Euclidean distance d of the feature relation between the images:

d = √( Σ_{x=1}^{A} Σ_{y=1}^{Z} ( v_pi(x, y) − v_pj(x, y) )² )

wherein A and Z represent the size of the image; the association degree e_pij derived from d is compared with a preset image association threshold r: when e_pij is greater than r, it is recorded into the set E_p, otherwise it is discarded;
after the relation of every pair of images in the set has been calculated, the graph structure of the input images G_p(V_p, E_p) is obtained;
the image features are then combined according to the calculated relations: let the feature values of any two images v_pi and v_pj be X'_pi and X'_pj respectively, with association degree e_pij; the merged image feature value X_p is computed from these quantities.
S43, the sensor data feature X_t, the image feature input vector X_p, and other quantized feature vectors X_q are merged to obtain the final fully connected layer input vector X, X = X_t ∪ X_p ∪ X_q. X_q may include, for example, building size information and weather environment information vectors.
S5, inputting the information features into a deep learning model, and obtaining a predicted value of the heat release rate through forward propagation calculation of the deep learning model to finish fire intensity recognition, wherein the predicted value is shown in FIG. 3.
The activation function increases the model's nonlinear recognition capability and helps prevent gradient explosion and gradient vanishing during training; it can optionally be added after each layer preceding the fully connected output layer.
Further, in step S5, the activation function ReLU is added after the convolution layers, pooling layers, and hidden layers of the deep learning model to increase the model's nonlinear recognition capability; the activation function is:

f(x) = max(0, x)

In the fully connected layer of the deep learning model, the input vector X yields the final prediction result through the following calculation:

C = XT + b

wherein C represents the output vector, i.e., the deep learning model's prediction of the fire intensity, T represents the weight matrix, and b represents the bias term.
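A minimal sketch of this output stage; the shapes are our assumptions (X a d-dimensional merged feature vector, T a d×1 weight matrix learned during training), and relu is shown because the text above places it after the preceding layers:

    import numpy as np

    def relu(x):
        """f(x) = max(0, x), applied elementwise."""
        return np.maximum(0.0, x)

    def predict_hrr(X, T, b):
        """Fully connected output layer C = XT + b: maps the merged
        feature vector X to the predicted heat release rate."""
        return X @ T + b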
The invention utilizes the deep learning technology to extract valuable information from the fire scene and related data so as to judge the fire intensity more accurately and more quickly. In addition, the fire intensity judging method based on deep learning can provide more analysis and judging basis for fire investigators, and improve the accuracy of fire cause analysis, thereby providing more powerful support for fire prevention and control and prevention.
Based on the same inventive concept as the above method embodiment, the present application embodiment further provides a deep learning fire intensity recognition system based on multi-modal data, which can implement the functions provided by the above method embodiment, as shown in fig. 4, and the system includes:
a data acquisition module 110 for acquiring fire scene data;
the data preprocessing module 120 is used for preprocessing the collected fire scene data, including preprocessing sensor data, video recording and images, and converting the fire scene data in various data forms into a format which can be input into a deep learning model;
the feature extraction module 130 extracts multi-modal data features;
the graph structure reorganizing module 140 reorganizes the multi-modal information features by adopting a graph structure;
the fire intensity recognition module 150 inputs the information features into a deep learning model, and obtains a predicted value of the heat release rate through forward propagation calculation of the deep learning model, thereby completing fire intensity recognition.
Although the present invention has been described with reference to the foregoing embodiments, it will be apparent to those skilled in the art that modifications may be made to the embodiments described, or equivalents may be substituted for elements thereof, and any modifications, equivalents, improvements and changes may be made without departing from the spirit and principles of the present invention.

Claims (6)

1. A deep learning fire intensity recognition method based on multi-modal data, characterized by comprising the following steps:
S1, acquiring fire scene data;
S2, preprocessing the collected fire scene data, including preprocessing sensor data, video recordings, and images, and converting fire scene data of various forms into a format that can be input into a deep learning model;
S3, extracting multi-modal data features;
S4, reorganizing the multi-modal data features by adopting a graph structure;
S5, inputting the multi-modal data features into a deep learning model, and obtaining a predicted value of the heat release rate through forward propagation calculation of the deep learning model to complete fire intensity recognition;
in step S2, the image data preprocessing includes:
S221, normalization: first converting the image into a grayscale image, converting the red R, green G, and blue B values of each pixel in the image into a luminance value Y, and normalizing the luminance value Y to [0, 1] or [−1, 1] so that the numerical ranges of different images are consistent;
S222, cutting: removing unnecessary parts of the image, including but not limited to noise at the image edges;
S223, scaling: adjusting the size of the image and scaling images of different resolutions to the same size to facilitate model processing;
the video recording preprocessing comprises the following steps:
S231, video segmentation: dividing a continuous video into a number of meaningful sub-segments, each of which typically contains a particular scene, event, or action;
S232, frame sampling: judging the image quality of each frame in a sub-segment using the Brenner gradient function and extracting the frame with the highest Brenner gradient value, so as to reduce the amount of data processing;
S233, passing the extracted frames into the image data preprocessing operation;
in step S4, the reorganizing the multi-modal information feature by using the graph structure specifically includes:
S41, sensor data feature graph structure reorganization: defining the sensor data feature graph structure as G_s(V_s, E_s), wherein the set of sensor data features is the set V_s of vertices in the graph structure, relations exist between vertices of adjacent regions, and E_s is the set of relations in the graph structure; letting any two adjacent vertices be v_si = X_si and v_sj = X_sj, the data feature relation e_sij between them is computed;
combining the sensor data features according to the calculated data feature relations to obtain the merged sensor data feature value X_t;
S42, image feature graph structure reorganization: letting any two images in the set V_p be v_pi and v_pj, calculating the Euclidean distance d of the feature relation between the images:

d = √( Σ_{x=1}^{A} Σ_{y=1}^{Z} ( v_pi(x, y) − v_pj(x, y) )² )

wherein A and Z represent the size of the image; comparing the association degree e_pij derived from d with a preset image association threshold r: when e_pij is greater than r, recording it into the set E_p, otherwise discarding it;
after the relation of every pair of images in the set has been calculated, obtaining the graph structure of the input images G_p(V_p, E_p);
combining the image features according to the calculated image feature relations: letting the feature values of any two images v_pi and v_pj be X'_pi and X'_pj respectively, with association degree e_pij, the merged image feature value X_p is computed;
S43, merging the sensor data feature X_t, the image feature input vector X_p, and other quantized feature vectors X_q to obtain the final fully connected layer input vector X, X = X_t ∪ X_p ∪ X_q.
2. The multi-modal data-based deep learning fire intensity recognition method of claim 1, wherein: in step S1, fire scene data including but not limited to smoke trace information, glass trace information, metal discoloration trace information, and building information is collected using cameras, sensors, and environmental monitoring equipment; the collected data forms include but are not limited to images, video recordings, and building geometry parameters.
3. The multi-modal data-based deep learning fire intensity recognition method of claim 1, wherein: in step S2, the sensor data preprocessing includes:
S211, data cleaning: dividing the data area by adopting a clustering algorithm, then performing standard deviation statistics on the divided sub-areas; for each data point, the ratio of its deviation from the sub-area mean to the standard deviation is calculated, and data exceeding a certain threshold are considered outliers;
for outliers, the following two processing methods are adopted: 1) acquiring the data a second time with the sensor to correct the outlier; 2) correcting the data by interpolation or data smoothing;
S212, data standardization: the scale of the data is normalized and the data range is adjusted to [0, 1].
4. The multi-modal data-based deep learning fire intensity recognition method of claim 1, wherein: in step S3, extracting the multi-modal data features specifically includes:
S31, extracting sensor data features: extracting regional features of the data using a sliding window to obtain the data features of multiple regions X_s = {x_1, …, x_n};
S32, extracting image features, specifically comprising:
S321, performing a convolution operation in the convolution layer after the image is input, the formula being expressed as follows:
O(i, j) = Σ_{m=0}^{H−1} Σ_{n=0}^{W−1} I(i + m, j + n) · K(m, n)

wherein O represents the convolution output feature map, I represents the input image, K represents the convolution kernel, and H and W represent the size of the convolution kernel;
S322, performing a pooling operation on the feature map obtained after convolution in a pooling layer to reduce data redundancy, the formula being as follows:
P(i, j) = p({ O(i·M + m, j·N + n) | 0 ≤ m < M, 0 ≤ n < N })

wherein P represents the pooled output feature map, O represents the convolution output feature map, p represents the pooling method, and M and N represent the size of the pooling window;
S323, flattening the output feature map P obtained after the last pooling layer into a one-dimensional image feature input vector X'_p.
5. The multi-modal data-based deep learning fire intensity recognition method of claim 1, wherein: in step S5, an activation function ReLU is added after the convolutional layer, the pooling layer, and the hidden layer of the deep learning model to increase the nonlinear recognition capability of the model, where the activation function is specifically expressed as follows:
f(x)=max(0,x)
in the fully connected layer of the deep learning model, the input vector X is calculated to obtain a final prediction result through the following formula:
C=XT+b
wherein C represents the output vector, i.e., the deep learning model's prediction of the fire intensity, T represents the weight matrix, and b represents the bias term.
6. A deep learning fire intensity recognition system based on multi-modal data, characterized in that the system comprises:
the data acquisition module (110) is used for acquiring fire scene data;
the data preprocessing module (120) is used for preprocessing the collected fire scene data, including preprocessing sensor data, video recordings and images, and converting the fire scene data in various data forms into a format which can be input into a deep learning model;
the image data preprocessing includes:
S221, normalization: first converting the image into a grayscale image, converting the red R, green G, and blue B values of each pixel in the image into a luminance value Y, and normalizing the luminance value Y to [0, 1] or [−1, 1] so that the numerical ranges of different images are consistent;
S222, cutting: removing unnecessary parts of the image, including but not limited to noise at the image edges;
S223, scaling: adjusting the size of the image and scaling images of different resolutions to the same size to facilitate model processing;
the video recording preprocessing comprises the following steps:
S231, video segmentation: dividing a continuous video into a number of meaningful sub-segments, each of which typically contains a particular scene, event, or action;
S232, frame sampling: judging the image quality of each frame in a sub-segment using the Brenner gradient function and extracting the frame with the highest Brenner gradient value, so as to reduce the amount of data processing;
S233, passing the extracted frames into the image data preprocessing operation;
a feature extraction module (130) for extracting multi-modal data features;
the diagram structure reorganization module (140) reorganizes the multi-mode data characteristics by adopting a diagram structure; the method specifically comprises the following steps:
S41, sensor data feature graph structure reorganization: defining the sensor data feature graph structure as G_s(V_s, E_s), wherein the set of sensor data features is the set V_s of vertices in the graph structure, relations exist between vertices of adjacent regions, and E_s is the set of relations in the graph structure; letting any two adjacent vertices be v_si = X_si and v_sj = X_sj, the data feature relation e_sij between them is computed;
combining the sensor data features according to the calculated data feature relations to obtain the merged sensor data feature value X_t;
S42, image feature graph structure reorganization: letting any two images in the set V_p be v_pi and v_pj, calculating the Euclidean distance d of the feature relation between the images:

d = √( Σ_{x=1}^{A} Σ_{y=1}^{Z} ( v_pi(x, y) − v_pj(x, y) )² )

wherein A and Z represent the size of the image; comparing the association degree e_pij derived from d with a preset image association threshold r: when e_pij is greater than r, recording it into the set E_p, otherwise discarding it;
after the relation of every pair of images in the set has been calculated, obtaining the graph structure of the input images G_p(V_p, E_p);
combining the image features according to the calculated image feature relations: letting the feature values of any two images v_pi and v_pj be X'_pi and X'_pj respectively, with association degree e_pij, the merged image feature value X_p is computed;
S43, merging the sensor data feature X_t, the image feature input vector X_p, and other quantized feature vectors X_q to obtain the final fully connected layer input vector X, X = X_t ∪ X_p ∪ X_q;
and the fire intensity recognition module (150), which inputs the multi-modal data features into the deep learning model and obtains a predicted value of the heat release rate through forward propagation calculation of the deep learning model, completing fire intensity recognition.
CN202311226491.7A 2023-09-22 2023-09-22 Deep learning fire intensity recognition method and system based on multi-modal data Active CN116977909B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311226491.7A CN116977909B (en) 2023-09-22 2023-09-22 Deep learning fire intensity recognition method and system based on multi-modal data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311226491.7A CN116977909B (en) 2023-09-22 2023-09-22 Deep learning fire intensity recognition method and system based on multi-modal data

Publications (2)

Publication Number Publication Date
CN116977909A CN116977909A (en) 2023-10-31
CN116977909B (en) 2023-12-19

Family

ID=88473314

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311226491.7A Active CN116977909B (en) 2023-09-22 2023-09-22 Deep learning fire intensity recognition method and system based on multi-modal data

Country Status (1)

Country Link
CN (1) CN116977909B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117789185B (en) * 2024-02-28 2024-05-10 浙江驿公里智能科技有限公司 Automobile oil hole gesture recognition system and method based on deep learning
CN118172882A (en) * 2024-03-21 2024-06-11 中国矿业大学 Tunnel fire monitoring system based on deep learning

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014205231A1 (en) * 2013-06-19 2014-12-24 The Regents Of The University Of Michigan Deep learning framework for generic object detection
CN111625994A (en) * 2020-05-25 2020-09-04 齐鲁工业大学 Multi-source information fusion fire prediction method based on dynamic integrated neural network
CN113642475A (en) * 2021-08-17 2021-11-12 中国气象局上海台风研究所(上海市气象科学研究所) Atlantic hurricane intensity estimation method based on convolutional neural network model
CN114821157A (en) * 2022-04-01 2022-07-29 山东大学 Multi-modal image classification method based on hybrid model network

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7786877B2 (en) * 2008-06-20 2010-08-31 Billy Hou Multi-wavelength video image fire detecting system
US20210319324A1 (en) * 2021-06-25 2021-10-14 Intel Corporation Technology for memory-efficient and parameter-efficient graph neural networks

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014205231A1 (en) * 2013-06-19 2014-12-24 The Regents Of The University Of Michigan Deep learning framework for generic object detection
CN111625994A (en) * 2020-05-25 2020-09-04 齐鲁工业大学 Multi-source information fusion fire prediction method based on dynamic integrated neural network
CN113642475A (en) * 2021-08-17 2021-11-12 中国气象局上海台风研究所(上海市气象科学研究所) Atlantic hurricane intensity estimation method based on convolutional neural network model
CN114821157A (en) * 2022-04-01 2022-07-29 山东大学 Multi-modal image classification method based on hybrid model network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Predicting real-time fire heat release rate by flame images and deep learning; Wang Zilong et al.; Proceedings of the Combustion Institute, Vol. 39; 2-4 *
Flame image detection method based on deep learning and max-relevance min-redundancy; Li Zirui; Wang Huiqin; Hu Yan; Lu Ying; Laser & Optoelectronics Progress (No. 10); 160-170 *

Also Published As

Publication number Publication date
CN116977909A (en) 2023-10-31

Similar Documents

Publication Publication Date Title
CN116977909B (en) Deep learning fire intensity recognition method and system based on multi-modal data
CN110232380B (en) Fire night scene restoration method based on Mask R-CNN neural network
CN110084165B (en) Intelligent identification and early warning method for abnormal events in open scene of power field based on edge calculation
CN112200043B (en) Intelligent danger source identification system and method for outdoor construction site
CN111754498B (en) Conveyor belt carrier roller detection method based on YOLOv3
WO2019101220A1 (en) Deep learning network and average drift-based automatic vessel tracking method and system
CN112183313B (en) SlowFast-based power operation field action identification method
CN106991668B (en) Evaluation method for pictures shot by skynet camera
CN114241548A (en) Small target detection algorithm based on improved YOLOv5
CN103517042A (en) Nursing home old man dangerous act monitoring method
CN101179713A (en) Method of detecting single moving target under complex background
CN110096945B (en) Indoor monitoring video key frame real-time extraction method based on machine learning
CN115995056A (en) Automatic bridge disease identification method based on deep learning
CN116259002A (en) Human body dangerous behavior analysis method based on video
CN112488213A (en) Fire picture classification method based on multi-scale feature learning network
CN114926778A (en) Safety helmet and personnel identity recognition system under production environment
CN117576632B (en) Multi-mode AI large model-based power grid monitoring fire early warning system and method
CN115410114A (en) Urban rail flood prevention early warning method and system based on multiple characteristics
CN111783751A (en) Rifle ball linkage and BIM-based breeding house piglet abnormity early warning method
CN117690086A (en) Flood prevention gate identification and control method and system based on 5G and AI technology
CN117745945A (en) AI-based boiler furnace water wall tube wall image acquisition and processing system
CN116798117A (en) Video understanding-based method for identifying abnormal actions under mine
CN111091586A (en) Rapid smoke dynamic shielding area detection and positioning method and application thereof
CN111222477A (en) Vision-based method and device for detecting two hands leaving steering wheel
CN115049600A (en) Intelligent identification system and method for small sample pipeline defects

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant