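WO2022067668A1 - Fire detection method, system, terminal and storage medium based on video image target detection - Google Patents

Fire detection method, system, terminal and storage medium based on video image target detection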

Info

Publication number
WO2022067668A1
WO2022067668A1 PCT/CN2020/119413 CN2020119413W WO2022067668A1 WO 2022067668 A1 WO2022067668 A1 WO 2022067668A1 CN 2020119413 W CN2020119413 W CN 2020119413W WO 2022067668 A1 WO2022067668 A1 WO 2022067668A1
Authority
WO
WIPO (PCT)
Prior art keywords
model
image
fire
lfnet
feature extraction
Prior art date
Application number
PCT/CN2020/119413
Other languages
English (en)
French (fr)
Inventor
胡金星
王传胜
Original Assignee
中国科学院深圳先进技术研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院深圳先进技术研究院
Priority to PCT/CN2020/119413
Publication of WO2022067668A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology

Definitions

  • The present application belongs to the technical field of fire detection, and in particular relates to a fire detection method, system, terminal and storage medium based on video image target detection.
  • Fire detection plays a vital role in security monitoring.
  • The traditional fire detection method is based on image priors: it detects fire from the color and shape of the image.
  • However, the robustness and error rate of color and motion features are often limited by preset parameters, so such methods cannot be applied in complex environments, and their localization accuracy is easily affected by the region.
  • In recent years, convolutional neural networks (CNNs) have been applied to fire detection, but existing deep-learning-based methods still have shortcomings.
  • Methods based on deep learning require a large number of remote sensing images as training data. Because real remote sensing images are scarce, training the model is very challenging.
  • Their anti-interference ability is weak, and they are easily affected by harsh monitoring environments such as haze and dust.
  • the present application provides a fire detection method, system, terminal and storage medium based on video image target detection, aiming to solve one of the above technical problems in the prior art at least to a certain extent.
  • A fire detection method based on video image target detection, comprising:
  • converting original natural images into haze images and sand-dust images by using a data augmentation algorithm based on an atmospheric scattering model, and generating a data set for training the model;
  • constructing a convolutional neural network model LFNet, inputting the data set into the LFNet model for iterative training, and obtaining optimal model parameters;
  • the convolutional neural network model LFNet includes a skeleton feature extraction model, a main feature extraction model and a variable-scale feature fusion model;
  • the skeleton feature extraction model extracts the main features of the input image through convolutions of three different scales; the main feature extraction model performs further feature extraction on the main features to generate three sets of feature maps;
  • the variable-scale feature fusion model performs adaptive fusion on the three sets of feature maps and outputs detection results;
  • inputting the fire image to be detected into the trained LFNet model, which outputs the fire location area and the fire type of that image.
  • The technical solution adopted in the embodiment of the present application further includes: before the original natural image is converted into the haze image and the sand-dust image by the data augmentation algorithm based on the atmospheric scattering model, the method includes:
  • obtaining original natural images; the original natural images include non-alarm images without a fire alarm area and real fire alarm images.
  • The technical solution adopted in the embodiment of the present application further includes: converting the original natural image into a haze image by the data augmentation algorithm based on the atmospheric scattering model includes:
  • the atmospheric scattering model uses at least two transmission rates to simulate haze images of different concentrations; the haze image imaging formula is:
  • $I(x) = J(x)\,t(x) + \alpha\,(1 - t(x))$
  • where $I(x)$ is the simulated haze image, $J(x)$ is the input haze-free image, $\alpha$ is the atmospheric light value, and $t(x)$ is the scene transmission rate.
  • The technical solution adopted in the embodiment of the present application further includes: converting the original natural image into a sand-dust image by the data augmentation algorithm based on the atmospheric scattering model includes:
  • the atmospheric scattering model uses a fixed transmittance and atmospheric light value, combined with three colors, to simulate sand-dust images of different concentrations; the sand-dust image simulation formula is:
  • $D(x) = J(x)\,t(x) + \alpha\,(C(x)\,(1 - t(x)))$
  • where $D(x)$ is the simulated sand-dust image, $J(x)$ is the input haze-free image, and $C(x)$ is the color value.
  • The technical solution adopted in the embodiment of the present application further includes: inputting the data set into the LFNet model for iterative training includes:
  • the skeleton feature extraction model uses convolutions of scales $3*3$, $5*5$ and $7*7$ to extract features of the input image, obtaining feature maps of sizes $13*13$, $26*26$ and $52*52$ respectively;
  • the main feature extraction model performs further feature extraction on the main features, generating three sets of feature maps of sizes $52*52$, $26*26$ and $13*13$;
  • the variable-scale feature fusion model maps the three sets of feature maps to different convolution kernels and strides for convolution, concatenates all convolution outputs of the same size to obtain three sets of feature maps, and applies a channel-based attention mechanism to them to obtain feature maps of sizes $13*13$, $26*26$ and $52*52$, which are used to detect small, medium and large objects, respectively.
  • Inputting the data set into the LFNet model for iterative training further includes:
  • selecting the mean square error and the cross entropy respectively as loss functions for model optimization.
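  • The loss function is specifically:
  • statistics are collected on the brightness, dark channel value and R channel data of patches in fire regions, and these statistics are treated as a combustion histogram prior (CHP), written as the CHP formula (an equation image in the original publication, not reproduced in this text), in which R() represents the R channel of the image and SCP(x) is the difference between the image brightness and the dark channel:
  • $SCP(x) = \|v(x) - DCP(x)\|$
  • where $v(x)$ is the brightness of the image and $DCP(x)$ is the value of the dark channel of the image;
  • $L_{CHP} = \|CHP(I) - CHP(R)\|_2$
  • where CHP represents the combustion histogram prior, and CHP(I) and CHP(R) represent the CHP values of the area selected by the target detection algorithm and of the annotated area, respectively;
  • the loss function is a weighted summation of three different loss functions:
  • $L = \beta L_{CE} + \gamma L_{MSE} + \delta L_{CHP}$
  • where $L$ is the final loss function, $L_{CE}$ is the cross-entropy loss function, $L_{MSE}$ is the mean square error loss function, and $L_{CHP}$ is the combustion histogram prior loss.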
  • A fire detection system based on video image target detection, comprising:
  • a data set building module, used to convert original natural images into haze images and sand-dust images by using the data augmentation algorithm based on the atmospheric scattering model, and to generate a data set for training the model;
  • an LFNet model training module, used to construct a convolutional neural network model LFNet and input the data set into the LFNet model for iterative training to obtain optimal model parameters;
  • the convolutional neural network model LFNet includes a skeleton feature extraction model, a main feature extraction model and a variable-scale feature fusion model;
  • the skeleton feature extraction model extracts the main features of the input image through convolutions of three different scales;
  • the main feature extraction model performs further feature extraction on the main features to generate three sets of feature maps;
  • the variable-scale feature fusion model performs adaptive fusion on the three sets of feature maps and outputs detection results;
  • the detection results include the fire location area of the fire image and the fire type.
  • a terminal includes a processor and a memory coupled to the processor, wherein,
  • the memory stores program instructions for implementing the video image target detection-based fire detection method
  • the processor is configured to execute the program instructions stored in the memory to control fire detection based on video image object detection.
  • a storage medium storing program instructions executable by a processor, where the program instructions are used to execute the fire detection method based on video image target detection.
  • Compared with the prior art, the beneficial effects of the embodiments of the present application are as follows: the fire detection method, system, terminal and storage medium based on video image target detection convert the original images into images affected by different degrees of haze or sand-dust by using a data augmentation algorithm based on an atmospheric scattering model, generate a data set for training the model, and build a convolutional neural network model LFNet suitable for fire and smoke detection in uncertain environments. This improves the robustness of the model under abnormal weather such as sand-dust and haze, so that the model obtains better detection results.
  • In addition, because the LFNet model in the embodiment of the present application is small, the computation cost is reduced and the LFNet model can be applied to resource-constrained devices.
  • FIG. 1 is a flowchart of a fire detection method based on video image target detection according to an embodiment of the present application
  • FIG. 2 is a schematic diagram of the simulation effect of haze and sand dust images based on an atmospheric scattering model according to an embodiment of the present application
  • FIG. 3 is a frame diagram of a convolutional neural network model according to an embodiment of the present application.
  • FIG. 4 is a structural diagram of a variable-scale feature fusion model according to an embodiment of the present application.
  • FIG. 5 is a structural diagram of a channel-based attention mechanism according to an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a fire detection system based on video image target detection according to an embodiment of the application
  • FIG. 7 is a schematic structural diagram of a terminal according to an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of a storage medium according to an embodiment of the present application.
  • FIG. 1 is a flowchart of a fire detection method based on video image target detection according to an embodiment of the present application.
  • The fire detection method based on video image target detection according to the embodiment of the present application includes the following steps:
  • S10: Obtain original natural images.
  • The acquired original natural images include 293 non-alarm images without fire alarm areas and 5073 real fire alarm images.
  • Non-alarm images improve the robustness of the training algorithm to non-alarm targets and reduce the error rate of the detector.
  • Real fire alarm images improve the detection ability of the target detection model.
  • S20: Use the data augmentation algorithm based on the atmospheric scattering model to convert the original natural images into new synthetic images affected by different types and degrees of abnormal weather, and generate a data set for training the model.
  • The present invention considers the influence of abnormal weather on the fire detection algorithm. It simulates haze images and sand-dust images of different degrees through a data augmentation method based on an atmospheric scattering model, thereby converting the original natural images into new synthetic images affected by different degrees of haze or sand-dust weather, and builds a large-scale benchmark data set for training and testing fire detection models, so as to improve the robustness of the target detection model under abnormal weather such as sand-dust and haze.
  • FIG. 2 is a schematic diagram of the simulation effect of haze and sand-dust images based on the atmospheric scattering model according to the embodiment of the present application, wherein (a) is the original image; (b), (c) and (d) are haze images synthesized by the atmospheric scattering model with different transmission rates; and (e), (f) and (g) are sand-dust images simulated with three different colors using a fixed transmittance and atmospheric light value.
  • The imaging formula of the haze image is:
  • $I(x) = J(x)\,t(x) + \alpha\,(1 - t(x))$
  • where $I(x)$ is the simulated haze image, $J(x)$ is the input haze-free image, $\alpha$ is the atmospheric light value, and $t(x)$ is the scene transmission rate, which describes the portion of light in the view that is not scattered and reaches the camera sensor.
  • To simulate haze of different concentrations, the atmospheric light value $\alpha$ is set to 0.8 in the embodiment of the present application, and the transmittance is set to 0.8, 0.6 and 0.4, respectively.
  • Based on prior statistics, the embodiment of the present application selects three colors suitable for simulating sand-dust images and simulates with each of them; the sand-dust image simulation formula is:
  • $D(x) = J(x)\,t(x) + \alpha\,(C(x)\,(1 - t(x)))$
  • where $D(x)$ is the simulated sand-dust image, $J(x)$ is the input haze-free image, and $C(x)$ is the selected color value.
  • S30: Construct the convolutional neural network model LFNet.
  • LFNet is composed of common convolutional layers, bottleneck building blocks, parametric rectified linear units, group normalization and so on, and includes a skeleton feature extraction model, a main feature extraction model and a variable-scale feature fusion model.
  • The functions of each model are as follows:
  • Skeleton feature extraction model: used to extract the main features of the input image. To extract richer image features, convolutions with scales of $3*3$, $5*5$ and $7*7$ are first applied to the input image, which enlarges the receptive field. After the three convolutions of different scales, feature maps with sizes of $13*13$, $26*26$ and $52*52$ are obtained, respectively. Multi-scale convolution extracts feature information from neighborhoods of different sizes around each pixel, which is particularly important for fire images.
  • Main feature extraction model: used for further feature extraction on the main features extracted by the skeleton feature extraction model. It generates three sets of feature maps with sizes of $52*52$, $26*26$ and $13*13$. Each smaller feature map is extracted from the larger feature map of the layer above it, and each convolution block performs extraction with one convolutional layer and five residual layers.
  • Variable-scale feature fusion model: used to concatenate the features extracted by the main feature extraction model by variable-scale feature fusion (VSFF), then extract features by convolution and fuse them adaptively.
  • The structure of the variable-scale feature fusion model is shown in FIG. 4.
  • To fuse the feature maps extracted by convolutions of different scales, the three sets of feature maps are fused, and the $13*13$ and $26*26$ features are extended to $52*52$.
  • The three inputs are feature maps with sizes of $13*13$, $26*26$ and $52*52$. Each is mapped to different convolution kernels and strides for convolution, so that it is upsampled or downsampled to the other two sizes.
  • Finally, all convolution outputs of the same size are concatenated to obtain three sets of feature maps. Because the concatenated feature maps contain richer image features, they make the model's localization more accurate.
  • The embodiment of the present application applies a channel-based attention mechanism to the three sets of feature maps extracted by the VSFF.
  • The channel-based attention mechanism can be viewed as a process of weighting feature maps according to their importance. For example, in a set of $24×13×13$ convolution outputs, the channel-based attention mechanism determines which feature maps in the set have a more significant impact on the prediction results and then increases the weight of that part. With the help of the attention mechanism, three fusions are performed to obtain feature maps with sizes of $13*13$, $26*26$ and $52*52$, which are used to detect small, medium and large objects, respectively.
  • The detailed structure of the channel-based attention mechanism is shown in FIG. 5.
  • The LFNet model of the embodiment of the present application is very small (22.5M), yet it takes a leading position in both quantitative and qualitative evaluation. This reduces the computational cost and makes LFNet suitable for resource-constrained devices.
  • S40: Input the data set into the LFNet model for iterative training to obtain optimal model parameters.
  • During training, the LFNet model has two tasks: one is to accurately locate the alarm area in the image; the other is to classify the disaster type in the alarm area.
  • To help the model accomplish both tasks, the embodiment of the present application selects the mean square error (MSE) and the cross entropy (CE) respectively as loss functions to guide network optimization.
  • The loss function is also based on a large number of statistics on different fire images or videos, which helps LFNet detect fire areas effectively.
  • The embodiments of the present application regard these statistics as a combustion histogram prior (CHP) and write them as the CHP formula (an equation image in the original publication, not reproduced in this text), in which R() represents the R channel of the image and SCP(x) is the difference between the image brightness and the dark channel, which can also be written as:
  • $SCP(x) = \|v(x) - DCP(x)\|$
  • where $v(x)$ is the brightness of the image and $DCP(x)$ is the value of the dark channel of the image.
  • $L_{CHP} = \|CHP(I) - CHP(R)\|_2$
  • where CHP represents the combustion histogram prior, and CHP(I) and CHP(R) represent the CHP values of the area selected by the target detection algorithm and the area marked in the ground truth, respectively.
  • The final loss function is the weighted summation of three different loss functions: the cross-entropy loss function, the mean square error loss function and the combustion histogram prior loss function.
  • The formula is:
  • $L = \beta L_{CE} + \gamma L_{MSE} + \delta L_{CHP}$
  • where $L$ is the final loss function, $L_{CE}$ is the cross-entropy loss function, $L_{MSE}$ is the mean square error loss function, $L_{CHP}$ is the combustion histogram prior loss, and $\beta$, $\gamma$ and $\delta$ are set to 0.25, 0.25 and 0.5, respectively.
  • S50: Input the fire image to be detected into the trained LFNet model, and output the fire location area and fire type of the fire image to be detected through the LFNet model.
  • FIG. 6 is a schematic structural diagram of a fire detection system based on video image target detection according to an embodiment of the present application.
  • the fire detection system 40 based on video image target detection according to the embodiment of the present application includes:
  • Data set building module 41: used to convert the original natural images into haze images and sand-dust images by using a data augmentation algorithm based on the atmospheric scattering model, and to generate a data set for training the model;
  • LFNet model training module 42: used to construct a convolutional neural network model LFNet and input the data set into the LFNet model for iterative training to obtain optimal model parameters;
  • the convolutional neural network model LFNet includes a skeleton feature extraction model, a main feature extraction model and a variable-scale feature fusion model;
  • the skeleton feature extraction model extracts the main features of the input image through convolutions of three different scales;
  • the main feature extraction model performs further feature extraction on the main features, generating three sets of feature maps;
  • the variable-scale feature fusion model performs adaptive fusion on the three sets of feature maps and outputs detection results;
  • Model optimization module 43: used to select the mean square error and the cross entropy respectively as loss functions for model optimization.
  • FIG. 7 is a schematic structural diagram of a terminal according to an embodiment of the present application.
  • the terminal 50 includes a processor 51 and a memory 52 coupled to the processor 51 .
  • the memory 52 stores program instructions for implementing the above-mentioned fire detection method based on video image object detection.
  • the processor 51 is configured to execute program instructions stored in the memory 52 to control fire detection based on video image object detection.
  • The processor 51 may also be referred to as a CPU (Central Processing Unit).
  • The processor 51 may be an integrated circuit chip with signal processing capability.
  • The processor 51 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • FIG. 8 is a schematic structural diagram of a storage medium according to an embodiment of the present application.
  • The storage medium of this embodiment of the present application stores a program file 61 capable of implementing all the above methods. The program file 61 may be stored in the storage medium in the form of a software product and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods of the various embodiments of the present invention.
  • The aforementioned storage medium includes a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk and other media that can store program code, or terminal devices such as computers, servers, mobile phones and tablets.
  • The fire detection method, system, terminal and storage medium based on video image target detection convert the original images into images affected by different degrees of haze or sand-dust by using a data augmentation algorithm based on an atmospheric scattering model, generate a data set for training the model, and construct a convolutional neural network model LFNet suitable for fire and smoke detection in uncertain environments. This improves the robustness of the model under abnormal weather such as sand-dust and haze and enables the model to obtain better detection results.
  • In addition, because the LFNet model in the embodiment of the present application is small, the computation cost is reduced and the LFNet model can be applied to resource-constrained devices.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Fire-Detection Mechanisms (AREA)
  • Image Analysis (AREA)

Abstract

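A fire detection method, system, terminal and storage medium based on video image target detection. The method includes: converting original natural images into haze images and sand-dust images by using a data augmentation algorithm based on an atmospheric scattering model, and generating a data set for training the model; constructing a convolutional neural network model LFNet (S30), and inputting the data set into the LFNet model for iterative training to obtain optimal model parameters (S40). The convolutional neural network model LFNet includes a skeleton feature extraction model, a main feature extraction model and a variable-scale feature fusion model. The skeleton feature extraction model extracts the main features of the input image through convolutions of three different scales; the main feature extraction model performs further feature extraction on the main features to generate three sets of feature maps; the variable-scale feature fusion model adaptively fuses the three sets of feature maps and outputs the detection result. The method can improve the robustness of the model under abnormal weather such as sand-dust and haze, so that the model obtains better detection results.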

Description

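Fire detection method, system, terminal and storage medium based on video image target detection
Technical Field
The present application belongs to the technical field of fire detection, and in particular relates to a fire detection method, system, terminal and storage medium based on video image target detection.
Background
Fire detection plays a vital role in security monitoring. At present, traditional fire detection methods are based on image priors: they detect fire from the color and shape of the image. However, because the robustness and error rate of color and motion features are often affected by preset parameters, these methods cannot be applied in complex environments, and their localization accuracy is easily affected by the region.
Monitoring is tedious and time-consuming work, especially in an uncertain surveillance environment, where it involves great uncertainty in time, space and even scale. Sensor-based detectors have limited performance in terms of error rate and sensing range, so they cannot detect distant or small fires. In recent years, with the rapid development of deep learning, convolutional neural networks (CNNs) have been applied to fire detection. However, existing deep-learning-based fire detection methods still have the following shortcomings:
1. Methods based on deep learning require a large number of remote sensing images as training data. Because real remote sensing images are scarce, training the model is very challenging.
2. Deep-learning-based fire detection models are too large to be used on resource-constrained devices.
3. The complexity of existing algorithms is too high for real-time detection.
4. Their anti-interference ability is weak, and they are easily affected by harsh monitoring environments such as haze and dust.
5. Most fire detection algorithms focus on a single environment, so they show a high error rate in uncertain environments.
In summary, existing fire detection methods leave considerable room for improvement in algorithm complexity, range of application scenarios, model size and other respects.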
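Summary of the Invention
The present application provides a fire detection method, system, terminal and storage medium based on video image target detection, aiming to solve, at least to a certain extent, one of the above technical problems in the prior art.
To solve the above problems, the present application provides the following technical solutions:
A fire detection method based on video image target detection, comprising:
converting original natural images into haze images and sand-dust images by using a data augmentation algorithm based on an atmospheric scattering model, and generating a data set for training the model;
constructing a convolutional neural network model LFNet, and inputting the data set into the LFNet model for iterative training to obtain optimal model parameters; the convolutional neural network model LFNet includes a skeleton feature extraction model, a main feature extraction model and a variable-scale feature fusion model; the skeleton feature extraction model extracts the main features of the input image through convolutions of three different scales; the main feature extraction model performs further feature extraction on the main features to generate three sets of feature maps; the variable-scale feature fusion model performs adaptive fusion on the three sets of feature maps and outputs detection results;
inputting the fire image to be detected into the trained LFNet model, and outputting the fire location area and the fire type of the fire image to be detected through the LFNet model.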
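The technical solution adopted in the embodiment of the present application further includes: before the original natural image is converted into the haze image and the sand-dust image by the data augmentation algorithm based on the atmospheric scattering model, the method includes:
obtaining original natural images; the original natural images include non-alarm images without a fire alarm area and real fire alarm images.
The technical solution adopted in the embodiment of the present application further includes: converting the original natural image into a haze image by the data augmentation algorithm based on the atmospheric scattering model includes:
the atmospheric scattering model uses at least two transmission rates to simulate haze images of different concentrations; the haze image imaging formula is:
$I(x) = J(x)\,t(x) + \alpha\,(1 - t(x))$
In the above formula, $I(x)$ is the simulated haze image, $J(x)$ is the input haze-free image, $\alpha$ is the atmospheric light value, and $t(x)$ is the scene transmission rate.
The technical solution adopted in the embodiment of the present application further includes: converting the original natural image into a sand-dust image by the data augmentation algorithm based on the atmospheric scattering model includes:
the atmospheric scattering model uses a fixed transmittance and atmospheric light value, combined with three colors, to simulate sand-dust images of different concentrations; the sand-dust image simulation formula is:
$D(x) = J(x)\,t(x) + \alpha\,(C(x)\,(1 - t(x)))$
In the above formula, $D(x)$ is the simulated sand-dust image, $J(x)$ is the input haze-free image, and $C(x)$ is the color value.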
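The technical solution adopted in the embodiment of the present application further includes: inputting the data set into the LFNet model for iterative training includes:
the skeleton feature extraction model uses convolutions of scales $3*3$, $5*5$ and $7*7$ to extract features of the input image, obtaining feature maps of sizes $13*13$, $26*26$ and $52*52$ respectively; the main feature extraction model performs further feature extraction on the main features, generating three sets of feature maps of sizes $52*52$, $26*26$ and $13*13$; the variable-scale feature fusion model maps the three sets of feature maps to different convolution kernels and strides for convolution, concatenates all convolution outputs of the same size to obtain three sets of feature maps, and applies a channel-based attention mechanism to them to obtain feature maps of sizes $13*13$, $26*26$ and $52*52$, which are used to detect small, medium and large objects, respectively.
The technical solution adopted in the embodiment of the present application further includes: inputting the data set into the LFNet model for iterative training further includes:
selecting the mean square error and the cross entropy respectively as loss functions for model optimization.
The technical solution adopted in the embodiment of the present application further includes: the loss function is specifically:
statistics are collected on the brightness, dark channel value and R channel data of patches in fire regions, and these statistics are treated as a combustion histogram prior, written as the CHP formula:
Figure PCTCN2020119413-appb-000001
In the above formula, R() represents the R channel of the image, and SCP(x) is the difference between the image brightness and the dark channel;
$SCP(x) = \|v(x) - DCP(x)\|$
In the above formula, $v(x)$ is the brightness of the image, and $DCP(x)$ is the value of the dark channel of the image;
$L_{CHP} = \|CHP(I) - CHP(R)\|_2$
In the above formula, CHP represents the combustion histogram prior, and CHP(I) and CHP(R) represent the CHP values of the area selected by the target detection algorithm and of the annotated area, respectively;
the loss function is a weighted summation of three different loss functions:
$L = \beta L_{CE} + \gamma L_{MSE} + \delta L_{CHP}$
In the above formula, $L$ is the final loss function, $L_{CE}$ is the cross-entropy loss function, $L_{MSE}$ is the mean square error loss function, and $L_{CHP}$ is the combustion histogram prior loss.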
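Another technical solution adopted by the embodiment of the present application is: a fire detection system based on video image target detection, comprising:
a data set building module, used to convert original natural images into haze images and sand-dust images by using a data augmentation algorithm based on an atmospheric scattering model, and to generate a data set for training the model;
an LFNet model training module, used to construct a convolutional neural network model LFNet and input the data set into the LFNet model for iterative training to obtain optimal model parameters; the convolutional neural network model LFNet includes a skeleton feature extraction model, a main feature extraction model and a variable-scale feature fusion model; the skeleton feature extraction model extracts the main features of the input image through convolutions of three different scales; the main feature extraction model performs further feature extraction on the main features to generate three sets of feature maps; the variable-scale feature fusion model performs adaptive fusion on the three sets of feature maps and outputs detection results; the detection results include the fire location area of the fire image and the fire type.
Yet another technical solution adopted by the embodiment of the present application is: a terminal, which includes a processor and a memory coupled to the processor, wherein
the memory stores program instructions for implementing the fire detection method based on video image target detection;
the processor is configured to execute the program instructions stored in the memory to control fire detection based on video image target detection.
Yet another technical solution adopted by the embodiment of the present application is: a storage medium storing program instructions executable by a processor, where the program instructions are used to execute the fire detection method based on video image target detection.
Compared with the prior art, the beneficial effects of the embodiments of the present application are as follows: the fire detection method, system, terminal and storage medium based on video image target detection convert the original images into images affected by different degrees of haze or sand-dust by using a data augmentation algorithm based on an atmospheric scattering model, generate a data set for training the model, and build a convolutional neural network model LFNet suitable for fire and smoke detection in uncertain environments. This improves the robustness of the model under abnormal weather such as sand-dust and haze, so that the model obtains better detection results. In addition, because the LFNet model in the embodiment of the present application is small, the computation cost is reduced and the LFNet model can be applied to resource-constrained devices.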
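Brief Description of the Drawings
FIG. 1 is a flowchart of a fire detection method based on video image target detection according to an embodiment of the present application;
FIG. 2 is a schematic diagram of the simulation effect of haze and sand-dust images based on an atmospheric scattering model according to an embodiment of the present application;
FIG. 3 is a framework diagram of a convolutional neural network model according to an embodiment of the present application;
FIG. 4 is a structural diagram of a variable-scale feature fusion model according to an embodiment of the present application;
FIG. 5 is a structural diagram of a channel-based attention mechanism according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a fire detection system based on video image target detection according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a terminal according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of a storage medium according to an embodiment of the present application.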
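Detailed Description
To make the purpose, technical solutions and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application and are not intended to limit it.
Refer to FIG. 1, which is a flowchart of a fire detection method based on video image target detection according to an embodiment of the present application. The fire detection method based on video image target detection according to the embodiment of the present application includes the following steps:
S10: Obtain original natural images;
In this step, the acquired original natural images include 293 non-alarm images without fire alarm areas and 5073 real fire alarm images. Non-alarm images improve the robustness of the training algorithm to non-alarm targets and reduce the error rate of the detector. Real fire alarm images improve the detection ability of the target detection model.
S20: Use the data augmentation algorithm based on the atmospheric scattering model to convert the original natural images into new synthetic images affected by different types and degrees of abnormal weather, and generate a data set for training the model;
In this step, existing intelligent monitoring algorithms usually ignore the effect of abnormal weather such as haze or sand-dust on performance, so monitoring algorithms are not robust under uncertain weather conditions. To address this shortcoming, the present invention considers the influence of abnormal weather on the fire detection algorithm. It simulates haze images and sand-dust images of different degrees through a data augmentation method based on an atmospheric scattering model, thereby converting the original natural images into new synthetic images affected by different degrees of haze or sand-dust weather, and builds a large-scale benchmark data set for training and testing fire detection models, so as to improve the robustness of the target detection model under abnormal weather such as sand-dust and haze.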
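Further, refer to FIG. 2, which is a schematic diagram of the simulation effect of haze and sand-dust images based on the atmospheric scattering model according to the embodiment of the present application, wherein (a) is the original image; (b), (c) and (d) are haze images synthesized by the atmospheric scattering model with different transmission rates; and (e), (f) and (g) are sand-dust images simulated with three different colors using a fixed transmittance and atmospheric light value. The imaging formula of the haze image is:
$I(x) = J(x)\,t(x) + \alpha\,(1 - t(x))$      (1)
In Formula (1), $I(x)$ is the simulated haze image, $J(x)$ is the input haze-free image, $\alpha$ is the atmospheric light value, and $t(x)$ is the scene transmission rate, which describes the portion of light in the view that is not scattered and reaches the camera sensor. To simulate haze of different concentrations, the embodiment of the present application sets the atmospheric light value $\alpha$ to 0.8 and the transmittance to 0.8, 0.6 and 0.4, respectively.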
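As a minimal sketch of Formula (1), the snippet below applies the haze model to a tiny RGB image stored as nested lists with channel values in [0, 1]. The function name `simulate_haze` and the nested-list image representation are illustrative assumptions and do not come from the application; the atmospheric light value 0.8 and the transmission rates 0.8, 0.6 and 0.4 are the values given in the embodiment.

```python
def simulate_haze(image, t, alpha=0.8):
    """Apply I(x) = J(x) * t + alpha * (1 - t) to every channel of every pixel.

    image: list of rows, each row a list of pixels, each pixel a tuple of
           channel values in [0, 1] (hypothetical representation).
    t:     scene transmission rate, taken as constant over the whole image.
    alpha: atmospheric light value.
    """
    return [[tuple(c * t + alpha * (1.0 - t) for c in pixel) for pixel in row]
            for row in image]


# One-row toy image: an orange pixel and a dark gray pixel.
clear = [[(1.0, 0.5, 0.0), (0.2, 0.2, 0.2)]]

# Three haze concentrations, one per transmission rate used in the embodiment.
hazy_by_t = {t: simulate_haze(clear, t) for t in (0.8, 0.6, 0.4)}
```

A lower transmission rate pulls every pixel closer to the atmospheric light value, which is why `t = 0.4` gives the densest haze of the three.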
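Since depth information does not play a major role in the image dust-removal task, the transmission is assumed not to change with image depth. Based on prior statistics, the embodiment of the present application selects three colors suitable for simulating sand-dust images and simulates with each of them; the sand-dust image simulation formula is:
$D(x) = J(x)\,t(x) + \alpha\,(C(x)\,(1 - t(x)))$     (2)
In Formula (2), $D(x)$ is the simulated sand-dust image, $J(x)$ is the input haze-free image, and $C(x)$ is the selected color value.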
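Formula (2) can be sketched the same way. The application does not list the three colors or the fixed transmittance it uses, so the color `SAND` and the default `t=0.6` below are hypothetical placeholders chosen only for illustration, as is the function name `simulate_dust`; the coefficient in front of the color term is read here as the atmospheric light value 0.8.

```python
def simulate_dust(image, color, t=0.6, alpha=0.8):
    """Apply D(x) = J(x) * t + alpha * (C * (1 - t)) per channel.

    image: list of rows of RGB tuples with values in [0, 1].
    color: RGB tuple C tinting the airlight term (hypothetical value).
    t:     fixed transmittance (hypothetical value, not from the application).
    alpha: atmospheric light value.
    """
    return [[tuple(j * t + alpha * (c * (1.0 - t)) for j, c in zip(pixel, color))
             for pixel in row]
            for row in image]


SAND = (0.8, 0.6, 0.3)  # hypothetical yellowish dust color

dusty = simulate_dust([[(0.5, 0.5, 0.5)]], SAND)
```

Compared with the haze model, the only change is that the airlight term is multiplied by a color, so a neutral gray pixel is shifted toward the dust color instead of toward white.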
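S30: Construct the convolutional neural network model LFNet;
In the embodiment of the present application, the framework of the convolutional neural network model is shown in FIG. 3. LFNet is composed of common convolutional layers, bottleneck building blocks, parametric rectified linear units, group normalization and so on, and includes a skeleton feature extraction model, a main feature extraction model and a variable-scale feature fusion model. The functions of each model are as follows:
Skeleton feature extraction model: used to extract the main features of the input image. To extract richer image features, convolutions with scales of $3*3$, $5*5$ and $7*7$ are first applied to the input image, which enlarges the receptive field. After the three convolutions of different scales, feature maps with sizes of $13*13$, $26*26$ and $52*52$ are obtained, respectively. Multi-scale convolution extracts feature information from neighborhoods of different sizes around each pixel, which is particularly important for fire images.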
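The application does not state the input resolution or the strides that lead to the $13*13$, $26*26$ and $52*52$ maps. As a worked arithmetic sketch only, the snippet below assumes a $416*416$ input and treats each branch as a single strided convolution with padding `kernel // 2`; under those assumptions the standard output-size formula reproduces the three sizes. The input size, the strides and the one-convolution-per-branch simplification are all assumptions.

```python
def conv_output_size(n, kernel, stride, padding):
    """Spatial output size of a convolution: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel) // stride + 1


INPUT_SIZE = 416  # assumed input resolution, not given in the application

# (kernel size, assumed total stride) for the 3*3, 5*5 and 7*7 branches.
BRANCHES = [(3, 32), (5, 16), (7, 8)]

sizes = [conv_output_size(INPUT_SIZE, k, s, k // 2) for k, s in BRANCHES]
```

With these assumed strides, the total downsampling factors are 32, 16 and 8, so the three branches land on 13, 26 and 52 cells per side.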
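Main feature extraction model: used for further feature extraction on the main features extracted by the skeleton feature extraction model. It generates three sets of feature maps with sizes of $52*52$, $26*26$ and $13*13$. Each smaller feature map is extracted from the larger feature map of the layer above it, and each convolution block performs extraction with one convolutional layer and five residual layers.
Variable-scale feature fusion model: used to concatenate the features extracted by the main feature extraction model by variable-scale feature fusion (VSFF), then extract features by convolution and fuse them adaptively. The structure of the variable-scale feature fusion model is shown in FIG. 4. To fuse the feature maps extracted by convolutions of different scales, the three sets of feature maps are fused, and the $13*13$ and $26*26$ features are extended to $52*52$. The three inputs are feature maps with sizes of $13*13$, $26*26$ and $52*52$. Each is mapped to different convolution kernels and strides for convolution, so that it is upsampled or downsampled to the other two sizes. Finally, all convolution outputs of the same size are concatenated to obtain three sets of feature maps. Because the concatenated feature maps contain richer image features, they make the model's localization more accurate.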
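The data flow of the fusion step (bring every input to each of the three sizes, then concatenate along the channel axis) can be sketched without a deep-learning framework. The snippet below swaps the learned convolutions with different kernels and strides for plain nearest-neighbour resizing, so it only illustrates the shapes involved, not the actual VSFF layers shown in FIG. 4. The helper names and the two-channel toy inputs are hypothetical.

```python
def resize_nearest(fmap, size):
    """Resize a channels-first feature map (list of square 2-D lists) to size x size.

    Nearest-neighbour sampling stands in for the learned up/downsampling
    convolutions of the application.
    """
    src = len(fmap[0])
    return [[[ch[r * src // size][c * src // size] for c in range(size)]
             for r in range(size)]
            for ch in fmap]


def vsff_concat(maps, target):
    """Resize every input map to `target` and concatenate along the channel axis."""
    fused = []
    for fmap in maps:
        fused.extend(resize_nearest(fmap, target))
    return fused


def constant_map(channels, size, value):
    """Toy feature map filled with one value, used only for the demonstration."""
    return [[[value] * size for _ in range(size)] for _ in range(channels)]


inputs = [constant_map(2, 13, 1.0), constant_map(2, 26, 2.0), constant_map(2, 52, 3.0)]

# One fused set per output scale; each has 2 + 2 + 2 = 6 channels.
fused = {size: vsff_concat(inputs, size) for size in (13, 26, 52)}
```

Each fused set keeps information from all three scales, which matches the stated motivation that the concatenated maps carry richer image features than any single scale.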
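Further, the embodiment of the present application applies a channel-based attention mechanism to the three sets of feature maps extracted by the VSFF. The channel-based attention mechanism can be viewed as a process of weighting feature maps according to their importance. For example, in a set of $24×13×13$ convolution outputs, the channel-based attention mechanism determines which feature maps in the set have a more significant impact on the prediction results and then increases the weight of that part. With the help of the attention mechanism, three fusions are performed to obtain feature maps with sizes of $13*13$, $26*26$ and $52*52$, which are used to detect small, medium and large objects, respectively. The detailed structure of the channel-based attention mechanism is shown in FIG. 5.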
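The idea of weighting channels by importance can be illustrated with a deliberately simplified gate: average each channel, pass the average through a sigmoid, and scale the channel by the result. This is a squeeze-and-excitation style simplification chosen for illustration; the actual structure is the one in FIG. 5, which is not reproduced in this text, and the `scale` and `bias` parameters below are hypothetical stand-ins for learned weights.

```python
import math


def channel_attention(fmap, scale=1.0, bias=0.0):
    """Weight each channel of a channels-first feature map by a sigmoid gate.

    fmap:  list of channels, each a 2-D list of floats.
    scale, bias: hypothetical stand-ins for learned parameters.
    Returns (weighted feature map, per-channel weights).
    """
    # Squeeze: one summary value per channel (global average pooling).
    means = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]
    # Gate: map each summary to a weight in (0, 1).
    weights = [1.0 / (1.0 + math.exp(-(scale * m + bias))) for m in means]
    # Re-weight: scale every value in a channel by that channel's weight.
    weighted = [[[v * w for v in row] for row in ch] for ch, w in zip(fmap, weights)]
    return weighted, weights


quiet = [[0.0] * 13 for _ in range(13)]   # channel with no activation
active = [[2.0] * 13 for _ in range(13)]  # strongly activated channel
weighted, weights = channel_attention([quiet, active])
```

In this toy case the strongly activated channel receives the larger weight, which mirrors the description of increasing the weight of the feature maps that matter more for the prediction.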
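Based on the above structure, the LFNet model of the embodiment of the present application is very small (22.5M), yet it takes a leading position in both quantitative and qualitative evaluation. This reduces the computational cost and makes LFNet suitable for resource-constrained devices.
S40: Input the data set into the LFNet model for iterative training to obtain optimal model parameters;
In this step, during training the LFNet model has two tasks: one is to accurately locate the alarm area in the image; the other is to classify the disaster type in the alarm area. To help the model accomplish both tasks, the embodiment of the present application selects the mean square error (MSE) and the cross entropy (CE) respectively as loss functions to guide network optimization. The loss function is also based on a large number of statistics on different fire images or videos, which helps LFNet detect fire areas effectively.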
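Specifically, extensive experiments on a variety of fire images show that in smoke regions the absolute value of the difference between brightness and dark channel value is higher than in other regions, and that the R channel in fire regions is higher than in non-fire regions. That is, the brightness, dark channel value and R channel of a patch vary across fire hazard regions; smoke concentration increases with the absolute difference between brightness and dark channel; and the visual characteristics of fire are closely related to the pixel values of the R channel. Based on these characteristics, the embodiment of the present application treats these statistics as a combustion histogram prior (CHP) and writes them as the CHP formula:
Figure PCTCN2020119413-appb-000002
In Formula (3), R() represents the R channel of the image, and SCP(x) is the difference between the image brightness and the dark channel, which can also be written as:
$SCP(x) = \|v(x) - DCP(x)\|$    (4)
In Formula (4), $v(x)$ is the brightness of the image, and $DCP(x)$ is the value of the dark channel of the image.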
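Formula (4) can be sketched for a single pixel as follows. The application does not define how brightness or the dark channel is computed, so two common conventions are assumed here: brightness is the maximum of the RGB channels (the V channel of HSV), and the dark channel is the minimum channel value over a small square neighbourhood, as in the widely used dark channel prior. Both choices, the `radius` parameter and the function names are assumptions.

```python
def brightness(pixel):
    """Assumed brightness v(x): the HSV value channel, i.e. max(R, G, B)."""
    return max(pixel)


def dark_channel(image, row, col, radius=1):
    """Assumed DCP(x): minimum channel value over a (2*radius+1)^2 neighbourhood."""
    height, width = len(image), len(image[0])
    return min(
        min(image[r][c])
        for r in range(max(0, row - radius), min(height, row + radius + 1))
        for c in range(max(0, col - radius), min(width, col + radius + 1))
    )


def scp(image, row, col, radius=1):
    """SCP(x) = |v(x) - DCP(x)| for the pixel at (row, col)."""
    return abs(brightness(image[row][col]) - dark_channel(image, row, col, radius))


# Toy row: a bright gray (smoke-like) pixel next to a dark pixel.
toy = [[(0.7, 0.7, 0.7), (0.1, 0.1, 0.1)]]
smoke_score = scp(toy, 0, 0)
```

A large SCP value means the pixel is bright while its neighbourhood still contains dark values, which is the pattern the description associates with smoke regions.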
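$L_{CHP} = \|CHP(I) - CHP(R)\|_2$     (5)
In Formula (5), CHP represents the combustion histogram prior, and CHP(I) and CHP(R) represent the CHP values of the area selected by the target detection algorithm and of the area marked in the ground truth, respectively.
The final loss function is the weighted summation of three different loss functions: the cross-entropy loss function, the mean square error loss function and the combustion histogram prior loss function. The formula is:
$L = \beta L_{CE} + \gamma L_{MSE} + \delta L_{CHP}$    (6)
In Formula (6), $L$ is the final loss function, $L_{CE}$ is the cross-entropy loss function, $L_{MSE}$ is the mean square error loss function, $L_{CHP}$ is the combustion histogram prior loss, and $\beta$, $\gamma$ and $\delta$ are set to 0.25, 0.25 and 0.5, respectively.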
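The weighted sum in Formula (6) is easy to check numerically. In the sketch below the CHP values are represented as plain lists of numbers and the CHP loss is taken as the Euclidean norm of their difference; the trailing "2" in Formula (5) could also denote a squared norm, so that reading is an assumption, as are the function names. The weights 0.25, 0.25 and 0.5 are the ones stated in the embodiment.

```python
import math


def chp_loss(chp_pred, chp_true):
    """Euclidean distance between two CHP vectors (assumed reading of Formula (5))."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(chp_pred, chp_true)))


def total_loss(l_ce, l_mse, l_chp, beta=0.25, gamma=0.25, delta=0.5):
    """Weighted sum of the three losses, following Formula (6)."""
    return beta * l_ce + gamma * l_mse + delta * l_chp


# Toy numbers: the CHP vectors differ by (3, -4), so the CHP loss is 5.
l_chp = chp_loss([3.0, 0.0], [0.0, 4.0])
loss = total_loss(l_ce=0.8, l_mse=0.4, l_chp=l_chp)
```

Because the CHP term carries half of the total weight, a detection whose region statistics differ from the ground-truth region is penalized more heavily than an equally large classification or localization error.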
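S50: Input the fire image to be detected into the trained LFNet model, and output the fire location area and fire type of the fire image to be detected through the LFNet model.
Refer to FIG. 6, which is a schematic structural diagram of a fire detection system based on video image target detection according to an embodiment of the present application. The fire detection system 40 based on video image target detection according to the embodiment of the present application includes:
Data set building module 41: used to convert the original natural images into haze images and sand-dust images by using a data augmentation algorithm based on the atmospheric scattering model, and to generate a data set for training the model;
LFNet model training module 42: used to construct a convolutional neural network model LFNet and input the data set into the LFNet model for iterative training to obtain optimal model parameters; the convolutional neural network model LFNet includes a skeleton feature extraction model, a main feature extraction model and a variable-scale feature fusion model; the skeleton feature extraction model extracts the main features of the input image through convolutions of three different scales; the main feature extraction model performs further feature extraction on the main features, generating three sets of feature maps; the variable-scale feature fusion model performs adaptive fusion on the three sets of feature maps and outputs detection results;
Model optimization module 43: used to select the mean square error and the cross entropy respectively as loss functions for model optimization.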
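Refer to FIG. 7, which is a schematic structural diagram of a terminal according to an embodiment of the present application. The terminal 50 includes a processor 51 and a memory 52 coupled to the processor 51.
The memory 52 stores program instructions for implementing the above fire detection method based on video image target detection.
The processor 51 is configured to execute the program instructions stored in the memory 52 to control fire detection based on video image target detection.
The processor 51 may also be referred to as a CPU (Central Processing Unit). The processor 51 may be an integrated circuit chip with signal processing capability. The processor 51 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
Refer to FIG. 8, which is a schematic structural diagram of a storage medium according to an embodiment of the present application. The storage medium of the embodiment of the present application stores a program file 61 capable of implementing all the above methods. The program file 61 may be stored in the storage medium in the form of a software product and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods of the various embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk and other media that can store program code, or terminal devices such as computers, servers, mobile phones and tablets.
The fire detection method, system, terminal and storage medium based on video image target detection according to the embodiments of the present application convert the original images into images affected by different degrees of haze or sand-dust by using a data augmentation algorithm based on an atmospheric scattering model, generate a data set for training the model, and construct a convolutional neural network model LFNet suitable for fire and smoke detection in uncertain environments. This improves the robustness of the model under abnormal weather such as sand-dust and haze and enables the model to obtain better detection results. In addition, because the LFNet model in the embodiment of the present application is small, the computation cost is reduced and the LFNet model can be applied to resource-constrained devices.
The above description of the disclosed embodiments enables those skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present application. Therefore, the present application is not to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.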

Claims (10)

  1. 一种基于视频图像目标检测的火灾检测方法,其特征在于,包括:
    采用基于大气散射模型的数据增强算法将原始自然图像转换为灰霾图像及沙尘图像,生成用于训练模型的数据集;
    构建卷积神经网络模型LFNet,将所述数据集输入LFNet模型进行迭代训练,得到最优模型参数;所述卷积神经网络模型LFNet包括骨架特征提取模型、主要特征提取模型和变尺度特征融合模型;所述骨架特征提取模型通过三个不同尺度的卷积提取输入图像的主要特征;所述主要特征提取模型用于对所述主要特征进行进一步的特征提取,生成三组特征图;所述变尺度特征融合模型对所述三组特征图进行自适应融合,输出检测结果;
    将待检测火灾图像输入训练好的LFNet模型,通过LFNet模型输出待检测火灾图像的火灾定位区域以及火灾类型。
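  2. The fire detection method based on video image target detection according to claim 1, characterized in that, before the converting of original natural images into haze images and sand-dust images using the data augmentation algorithm based on the atmospheric scattering model, the method comprises:
    acquiring original natural images; the original natural images include non-alarm images without fire alarm regions and real fire alarm images.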
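  3. The fire detection method based on video image target detection according to claim 1 or 2, characterized in that the converting of original natural images into haze images using the data augmentation algorithm based on the atmospheric scattering model comprises:
    the atmospheric scattering model uses at least two transmittance values to simulate haze images of different densities; the haze image formation formula is:
    $I(x) = J(x)\,t(x) + \alpha\,(1 - t(x))$
    In the above formula, $I(x)$ is the simulated haze image, $J(x)$ is the input haze-free image, $\alpha$ is the atmospheric light value, and $t(x)$ is the scene transmittance.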
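  4. The fire detection method based on video image target detection according to claim 3, characterized in that the converting of original natural images into sand-dust images using the data augmentation algorithm based on the atmospheric scattering model comprises:
    the atmospheric scattering model uses a fixed transmittance and atmospheric light value, combined with three colors, to simulate sand-dust images of different densities; the sand-dust image simulation formula is:
    $D(x) = J(x)\,t(x) + \alpha\,(C(x) \cdot (1 - t(x)))$
    In the above formula, $D(x)$ is the simulated sand-dust image, $J(x)$ is the input haze-free image, and $C(x)$ is the color value.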
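  5. The fire detection method based on video image target detection according to claim 1, characterized in that the inputting of the dataset into the LFNet model for iterative training comprises:
    the backbone feature extraction model uses convolutions of scales $3 \times 3$, $5 \times 5$ and $7 \times 7$, respectively, to extract features of the input image, obtaining feature maps of sizes $13 \times 13$, $26 \times 26$ and $52 \times 52$, respectively; the main feature extraction model performs further feature extraction on the main features to generate three groups of feature maps of sizes $52 \times 52$, $26 \times 26$ and $13 \times 13$, respectively; the variable-scale feature fusion model convolves the three groups of feature maps with different convolution kernels and strides and concatenates all convolution outputs of the same size to obtain three groups of feature mappings, then applies a channel-based attention mechanism to the three groups of feature mappings to obtain feature maps of sizes $13 \times 13$, $26 \times 26$ and $52 \times 52$, used to detect small, medium and large objects, respectively.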
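  6. The fire detection method based on video image target detection according to claim 5, characterized in that the inputting of the dataset into the LFNet model for iterative training further comprises:
    selecting the mean squared error and the cross-entropy, respectively, as loss functions for model optimization.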
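  7. The fire detection method based on video image target detection according to claim 6, characterized in that the loss function is specifically:
    computing statistics of the brightness, dark channel values and R-channel data of patches of the fire region, treating these statistics as the combustion histogram prior, and writing them as the CHP formula:
    Figure PCTCN2020119413-appb-100001
    In the above formula, $R()$ denotes the R channel of the image, and $SCP(x)$ is the difference between the image brightness and the dark channel;
    $SCP(x) = \|v(x) - DCP(x)\|$
    In the above formula, $v(x)$ is the brightness of the image, and $DCP(x)$ is the dark channel value of the image;
    $L_{CHP} = \|CHP(I) - CHP(R)\|_2$
    In the above formula, CHP denotes the combustion histogram prior, and $CHP(I)$ and $CHP(R)$ denote the CHP values of the region selected by the target detection algorithm and of the annotated region, respectively;
    the loss function is a weighted sum of three different loss functions:
    $L = \beta L_{CE} + \gamma L_{MSE} + \delta L_{CHP}$
    In the above formula, $L$ is the final loss function, $L_{CE}$ is the cross-entropy loss, $L_{MSE}$ is the mean squared error loss, and $L_{CHP}$ is the combustion histogram prior loss.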
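  8. A fire detection system based on video image target detection, characterized by comprising:
    a dataset construction module: configured to convert original natural images into haze images and sand-dust images using a data augmentation algorithm based on the atmospheric scattering model, generating a dataset for training a model;
    an LFNet model training module: configured to construct a convolutional neural network model LFNet and to input the dataset into the LFNet model for iterative training to obtain optimal model parameters; the convolutional neural network model LFNet includes a backbone feature extraction model, a main feature extraction model and a variable-scale feature fusion model; the backbone feature extraction model extracts the main features of the input image through convolutions at three different scales; the main feature extraction model performs further feature extraction on the main features to generate three groups of feature maps; the variable-scale feature fusion model adaptively fuses the three groups of feature maps and outputs a detection result; the detection result includes the fire localization region and the fire type of the fire image.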
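  9. A terminal, characterized in that the terminal comprises a processor and a memory coupled to the processor, wherein
    the memory stores program instructions for implementing the fire detection method based on video image target detection according to any one of claims 1 to 7;
    the processor is configured to execute the program instructions stored in the memory to control fire detection based on video image target detection.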
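  10. A storage medium, characterized in that it stores program instructions executable by a processor, the program instructions being used to execute the fire detection method based on video image target detection according to any one of claims 1 to 7.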
PCT/CN2020/119413 2020-09-30 2020-09-30 Fire detection method, system, terminal and storage medium based on video image target detection WO2022067668A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/119413 WO2022067668A1 (zh) 2020-09-30 2020-09-30 Fire detection method, system, terminal and storage medium based on video image target detection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/119413 WO2022067668A1 (zh) 2020-09-30 2020-09-30 Fire detection method, system, terminal and storage medium based on video image target detection

Publications (1)

Publication Number Publication Date
WO2022067668A1 (zh) 2022-04-07

Family

ID=80949324

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/119413 WO2022067668A1 (zh) 2020-09-30 2020-09-30 Fire detection method, system, terminal and storage medium based on video image target detection

Country Status (1)

Country Link
WO (1) WO2022067668A1 (zh)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
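CN114882430A (zh) * 2022-04-29 2022-08-09 东南大学 A lightweight early fire detection method based on Transformer
CN115171006A (zh) * 2022-06-15 2022-10-11 武汉纺织大学 Deep-learning-based detection method for automatically identifying persons entering electric power danger zones
CN116958774A (zh) * 2023-09-21 2023-10-27 北京航空航天大学合肥创新研究院 A target detection method based on adaptive spatial feature fusion
CN116977826A (zh) * 2023-08-14 2023-10-31 北京航空航天大学 A reconfigurable neural network target detection system and method under an edge computing architecture
CN117132752A (zh) * 2023-10-24 2023-11-28 硕橙(厦门)科技有限公司 Sand-dust image enhancement method, apparatus, device and medium based on multi-dimensional weighting
CN117197658A (zh) * 2023-08-08 2023-12-08 北京科技大学 Building fire multi-target detection method and system based on multi-scenario generated images
CN117409341A (zh) * 2023-12-15 2024-01-16 深圳市光明顶技术有限公司 Image analysis method and system based on unmanned aerial vehicle illumination
CN117935166A (zh) * 2024-01-31 2024-04-26 中煤科工集团重庆研究院有限公司 An intelligent fire monitoring method and system for coal mine goaf areas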

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
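CN109345477A (zh) * 2018-09-26 2019-02-15 四川长虹电器股份有限公司 A fast image dehazing system based on a deep convolutional neural network
CN110135266A (zh) * 2019-04-17 2019-08-16 浙江理工大学 A deep-learning-based dual-camera electrical fire prevention and control method and system
EP3561788A1 (en) * 2016-12-21 2019-10-30 Hochiki Corporation Fire monitoring system
CN111179202A (zh) * 2019-12-31 2020-05-19 内蒙古工业大学 A single-image dehazing enhancement method and system based on generative adversarial networks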

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20955693

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20955693

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 14/12/2023)