CN114898181A

CN114898181A - Method and device for identifying hidden dangers and violations of regulations for explosive-related videos

Info

Publication number: CN114898181A
Application number: CN202210552989.1A
Authority: CN
Inventors: 李万林; 徐勇; 何鑫
Original assignee: Sichuan Jingchuang Guoxin Technology Co ltd
Current assignee: Sichuan Jingchuang Guoxin Technology Co ltd
Priority date: 2022-05-19
Filing date: 2022-05-19
Publication date: 2022-08-12

Abstract

The invention discloses a method and a device for identifying hidden dangers and violations in explosion-related videos, relating to the field of blasting construction. The method comprises: S1, constructing a video authenticity detection model, a video target recognition model and an identity recognition model; S2, constructing an annotation database and a face information database; S3, training and optimizing the video target recognition model and the identity recognition model; S4, acquiring the explosion-related video; S5, performing authenticity analysis on the explosion-related video and identifying information about the blasting operation and about the blasting operators; S6, analyzing the hidden dangers and violations present in the explosion-related video. The video authenticity detection model, the optimized video target recognition model and the optimized identity recognition model analyze the explosion-related video to determine its authenticity, the operations performed during blasting and the identity of the operators carrying out the blasting. These results are then used to judge whether the workflow contains safety hazards and to standardize the operators' work.

Description

Method and device for identifying hidden dangers and violations of regulations for explosive-related videos

Technical field

The invention relates to the field of blasting construction, and in particular to a method and device for identifying hidden dangers and violations in explosion-related videos.

Background

In the safety supervision of petroleum exploration, ensuring the work quality, procedural compliance and personal safety of blasters and construction workers during civil blasting operations is a technical problem in this field.

In most existing practice, construction personnel carry an information collection terminal, perform the operation according to the standard procedure and record the whole process. The video is submitted to the information collection center the same day, where staff manually judge whether the construction process is compliant, for example by checking for missing safety helmets, missing anti-static clothing, and whether the blaster matches the licence held. Reviewers must audit a large number of videos every day, and visual fatigue leads to missed checks and missed reports. Human judgment is also subjective, so the review cannot strictly follow the established standards. In addition, the traditional approach consumes a great deal of manpower, material and financial resources.

Summary of the invention

The purpose of the invention is to provide a method and device for identifying hidden dangers and violations in explosion-related videos that solve the above problems.

The invention achieves this purpose through the following technical solution:

A method for identifying hidden dangers and violations in explosion-related videos, comprising:

S1. Construct a video authenticity detection model, a video target recognition model and an identity recognition model, where the video target recognition model is a YOLOv5 detection model and the identity recognition model is a face recognition model;

S2. Construct an annotation database and a face information database: annotate the training data set and export the annotation files to form the annotation database; the face information database stores information about licensed civil blasting operators;

S3. Train and optimize the video target recognition model using the annotation database, and train and optimize the identity recognition model using the face information database;

S4. Acquire the explosion-related video to be analyzed;

S5. Import the explosion-related video to be analyzed into the video authenticity detection model, the optimized video target recognition model and the optimized identity recognition model, to analyze the authenticity of the video, identify information about the blasting operation, and identify the blasting operators;

S6. Based on the authenticity analysis, the identified blasting operation information and the identified operator information, analyze the hidden dangers and violations present in the explosion-related video.

A device for identifying hidden dangers and violations in explosion-related videos, comprising an underlying server, the underlying server comprising:

a memory, used to store a computer program;

a processor, used to execute the computer program; when executing the computer program, the processor implements the steps of the above method for identifying hidden dangers and violations in explosion-related videos.

The beneficial effects of the invention are as follows: the video authenticity detection model, the optimized video target recognition model and the optimized identity recognition model analyze the explosion-related video to determine its authenticity, the operations performed during blasting, and the identity of the operators carrying out the blasting. These results are used to judge whether the workflow contains safety hazards and to standardize staff operations. This achieves efficient operation, accurate judgment and intelligent analysis, makes the control of safety issues more efficient, and saves a great deal of manpower, material and financial resources.

Description of drawings

Fig. 1 is a schematic diagram of the video target recognition model of the invention;

Fig. 2 is a flow chart of analyzing the clarity of an explosion-related video in the invention;

Fig. 3 is the recognition flow chart of the identity recognition model of the invention;

Fig. 4 is the detection flow chart of the video authenticity detection model of the invention;

Fig. 5 is the RNN text recognition framework of the invention;

Fig. 6 is the single-layer bidirectional RNN network of the invention.

Detailed description

To make the purpose, technical solutions and advantages of the embodiments of the invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some, but not all, of the embodiments of the invention. The components of the embodiments generally described and illustrated in the drawings may be arranged and designed in a variety of different configurations.

Therefore, the following detailed description of the embodiments provided in the drawings is not intended to limit the scope of the claimed invention, but merely represents selected embodiments. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the invention.

It should be noted that like reference numerals and letters denote like items in the following drawings; once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings.

In the description of the invention, it should be understood that orientations or positional relationships indicated by terms such as "upper", "lower", "inner", "outer", "left" and "right" are based on the orientations or positional relationships shown in the drawings, or on those in which the product of the invention is customarily placed in use, or on those customarily understood by a person skilled in the art. They are used only to facilitate and simplify the description, and do not indicate or imply that the device or element referred to must have a particular orientation or be constructed and operated in a particular orientation; they therefore cannot be construed as limiting the invention.

In addition, the terms "first", "second" and the like are used only to distinguish descriptions and cannot be construed as indicating or implying relative importance.

In the description of the invention, it should also be noted that, unless otherwise expressly specified and limited, terms such as "arranged" and "connected" are to be understood broadly. For example, a connection may be fixed, detachable or integral; it may be mechanical or electrical; it may be direct, indirect through an intermediate medium, or an internal communication between two elements. A person of ordinary skill in the art can understand the specific meaning of these terms in the invention according to the specific situation.

Specific embodiments of the invention are described in detail below with reference to the accompanying drawings.

S1. Construct a video authenticity detection model, a video target recognition model and an identity recognition model, where the video target recognition model is a YOLOv5 detection model and the identity recognition model is a face recognition model.

S2. Construct an annotation database and a face information database: annotate the training data set and export the annotation files to form the annotation database; the face information database stores information about licensed civil blasting operators.

S3. Train and optimize the video target recognition model using the annotation database, and train and optimize the identity recognition model using the face information database.

S4. Acquire the explosion-related video to be analyzed.

S0. Analyze and evaluate the clarity of the images in the explosion-related video to be analyzed, and screen out the unclear images; as shown in Fig. 2, this specifically comprises:

S01. Apply the Fourier transform to each image of the explosion-related video to be analyzed, converting it to the frequency domain:

$$F(k,l)=\sum_{i=0}^{N-1}\sum_{j=0}^{N-1} f(i,j)\,e^{-\iota 2\pi\left(\frac{ki}{N}+\frac{lj}{N}\right)}$$

where ι is the imaginary unit, N is the sequence length, f(i,j) is the image represented as a matrix of size N×N with i = 0, 1, 2, ..., N-1 and j = 0, 1, 2, ..., N-1, and F(k,l) is the Fourier transform of f(i,j);

S02. Remove the low-frequency signals below a preset frequency;

S03. Convert the image from the frequency domain back to the spatial domain using the inverse fast Fourier transform:

$$f(a,b)=\frac{1}{N^{2}}\sum_{k=0}^{N-1}\sum_{l=0}^{N-1} F(k,l)\,e^{\iota 2\pi\left(\frac{ka}{N}+\frac{lb}{N}\right)}$$

In the above formula, f(a,b) is a matrix of size N×N with a = 0, 1, 2, ..., N-1 and b = 0, 1, 2, ..., N-1, and f(a,b) is the inverse Fourier transform of F(k,l).

The forward transform is separable and can be computed one axis at a time:

$$F(k,l)=\sum_{b=0}^{N-1} P(k,b)\,e^{-\iota 2\pi\frac{lb}{N}},\qquad P(k,b)=\sum_{a=0}^{N-1} f(a,b)\,e^{-\iota 2\pi\frac{ka}{N}}$$

where F(k,l) is the frequency-domain pixel value, P(k,b) is the intermediate frequency-domain form obtained by transforming along one axis, N is the sequence length, and b = 0, 1, 2, ..., N-1.

S04. Compute the mean amplitude in the spatial domain. An image whose mean amplitude is greater than the preset threshold is a clear image; otherwise it is a blurred image, and blurred images are removed.
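As a non-limiting illustration, steps S01 to S04 may be sketched as follows using NumPy. The cutoff of 8 frequency bins, the threshold of 5.0, the square shape of the removed low-frequency region and the names `high_frequency_mean` and `is_sharp` are assumptions made for this example; the invention only specifies "a preset frequency" and "a preset threshold".

```python
import numpy as np


def high_frequency_mean(gray, cutoff=8):
    """Mean magnitude of a grayscale frame after removing low frequencies."""
    gray = np.asarray(gray, dtype=float)
    rows, cols = gray.shape
    # S01: forward 2-D FFT, with the zero-frequency term shifted to the centre.
    spectrum = np.fft.fftshift(np.fft.fft2(gray))
    # S02: zero out a square block of low frequencies around the centre
    # (the block size is an assumed stand-in for the "preset frequency").
    cy, cx = rows // 2, cols // 2
    spectrum[cy - cutoff:cy + cutoff + 1, cx - cutoff:cx + cutoff + 1] = 0
    # S03: inverse FFT back to the spatial domain.
    restored = np.fft.ifft2(np.fft.ifftshift(spectrum))
    # S04: mean magnitude of the remaining high-frequency content.
    return float(np.mean(np.abs(restored)))


def is_sharp(gray, cutoff=8, threshold=5.0):
    """True if the frame counts as clear under the assumed threshold."""
    return high_frequency_mean(gray, cutoff) > threshold


# Synthetic frames: random texture (rich in detail) and a flat grey frame
# (no detail at all, the extreme case of a blurred image).
rng = np.random.default_rng(0)
sharp_frame = rng.integers(0, 256, size=(64, 64)).astype(float)
flat_frame = np.full((64, 64), 128.0)
```

In practice the cutoff and threshold would have to be tuned on real field footage; frames for which `is_sharp` returns `False` would be dropped before step S5.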

S5. Import the explosion-related video to be analyzed into the video authenticity detection model, the optimized video target recognition model and the optimized identity recognition model, to analyze the authenticity of the video, identify information about the blasting operation, and identify the blasting operators.

The video authenticity detection model, the optimized video target recognition model and the optimized identity recognition model analyze the explosion-related video to determine its authenticity, the operations performed during blasting, and the identity of the operators carrying out the blasting. These results are used to judge whether the workflow contains safety hazards and to standardize staff operations. This achieves efficient operation, accurate judgment and intelligent analysis, makes the control of safety issues more efficient, and saves a great deal of manpower, material and financial resources;

Recording conditions in the field are complex, so video quality is uneven. Therefore, before the explosion-related video is analyzed, the clarity of its images is first evaluated so that blurred videos can be quickly screened out of a large volume of footage, improving re-inspection efficiency. Blur reduces image clarity and seriously degrades image quality, making image analysis, processing and reception difficult or even impossible; an effective blur evaluation method must therefore be used to control the use of blurred images and so improve the overall performance of the system.

The identification device further comprises a collection device, which collects the blasting video and information about the blasting personnel and is communicatively connected to the underlying server.

The identification device further comprises a cloud server and a remote terminal; the signal end of the cloud server is connected to the signal end of the remote terminal and to the signal end of the underlying server respectively.

The video target recognition model is a YOLOv5 detection model. As shown in Fig. 1, target detection uses the one-stage detection algorithm YOLOv5. It is built on the PyTorch framework, which makes it convenient to train on a custom data set and easier to put into production. The environment is easy to configure, model training is fast, and batch inference produces real-time results. Inference can be run directly on single images, batches of images, videos and even webcam input. Finally, YOLOv5s has a very high detection speed, so evaluation results for a large number of operation videos can be obtained in a short time.

YOLOv5 is a single-stage target detection algorithm that adds several improvements, giving a substantial gain in both speed and accuracy. The main improvements are as follows:

Input: several improvements are made in the model training stage, mainly Mosaic data augmentation, adaptive anchor box calculation and adaptive image scaling;

Backbone network: new ideas from other detection algorithms are incorporated, mainly the Focus structure and the CSP structure;

Neck network: a target detection network usually inserts some layers between the backbone network and the final Head output layer; YOLOv5 adds the FPN+PAN structure;

Head output layer: the anchor box mechanism of the output layer is the same as in YOLOv4; the main improvements are the loss function GIOU_Loss used in training and DIOU_nms for filtering predicted boxes.
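As a non-limiting illustration of the GIOU_Loss named above, the generalized IoU of two axis-aligned boxes and the corresponding loss may be computed as sketched below. The box format `(x1, y1, x2, y2)` and the function names are assumptions for this example; the sketch shows the standard GIoU definition and is not code taken from the invention or from the YOLOv5 repository.

```python
def giou(box_a, box_b):
    """Generalized IoU of two boxes given as (x1, y1, x2, y2), in [-1, 1]."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    # Union area and plain IoU.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union
    # Area of the smallest box enclosing both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    enclose = enclose_w * enclose_h
    # GIoU penalizes the empty part of the enclosing box.
    return iou - (enclose - union) / enclose


def giou_loss(predicted, target):
    """Loss that is 0 for a perfect match and grows as the boxes separate."""
    return 1.0 - giou(predicted, target)
```

Unlike plain IoU, this loss still changes when the predicted and target boxes do not overlap at all, so the regression keeps receiving a useful signal in that case.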

The identity recognition model is a face recognition model. The face recognition process, shown in Fig. 3, is as follows:

Face detection: a face detection model (a target detection model or another detection model) finds the positions of all faces in the image (the results are detection boxes) and crops out the face regions;

Face alignment: a face alignment model corrects each face based on its facial key points, aligning them with the key points of a standard face;

Face encoding: a deep learning model encodes the face image and extracts the face features;

Identity recognition: the face features are compared with the data in the face database to determine the identity of the face.
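As a non-limiting illustration of the final identity recognition step, a face feature vector may be compared with the enrolled vectors in the face database by Euclidean distance, as sketched below. The three-dimensional vectors, the identifiers `blaster_001` and `blaster_002`, the distance limit of 0.8 and the function names are hypothetical values chosen for this example only.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(embedding, face_db, max_distance=0.8):
    """Return the ID of the nearest enrolled face, or None if none is close."""
    best_id, best_dist = None, float("inf")
    for person_id, enrolled in face_db.items():
        dist = euclidean(embedding, enrolled)
        if dist < best_dist:
            best_id, best_dist = person_id, dist
    # Reject the match when even the nearest enrolled face is too far away.
    return best_id if best_dist <= max_distance else None


# Hypothetical face database: one enrolled feature vector per licensed blaster.
face_db = {
    "blaster_001": [0.6, 0.8, 0.0],
    "blaster_002": [0.0, 0.6, 0.8],
}
```

A result of `None` would indicate that the person in the video does not match any licensed operator in the database.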

FaceNet is a general-purpose face recognition system: it uses a deep convolutional neural network (CNN) to learn a mapping from images to a Euclidean space. Distance in that space corresponds directly to image similarity: different images of the same person lie close together, while images of different people lie far apart. The mapping can therefore be used for face verification, recognition and clustering.

FaceNet uses a deep neural network to extract features and uses triplet_loss to measure the distance error between samples during training. Before training or during online learning, "hard" examples are continually constructed for the network: for each sample, the image of the same person that looks least like it, and the image of another person that looks most like it. Through stochastic gradient descent, the distances among all samples of the same person are continually reduced while the distances to other people are enlarged as much as possible, until an optimum is reached. This embedding learning further trains the output layer of the original feature extraction network and thereby improves the feature representation.
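As a non-limiting illustration, the triplet loss and the selection of hard examples described above may be sketched as follows. The margin of 0.2, the use of squared Euclidean distance and the function names are assumptions for this example and are not specified in the invention.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Zero once the negative is farther than the positive by the margin."""
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


def hardest_triplet(anchor, positives, negatives):
    """Pick the same-person sample farthest from the anchor and the
    other-person sample closest to it (the "hard" examples)."""
    hard_positive = max(positives, key=lambda p: squared_distance(anchor, p))
    hard_negative = min(negatives, key=lambda n: squared_distance(anchor, n))
    return hard_positive, hard_negative
```

Minimizing this loss by gradient descent pulls images of the same person together and pushes images of different people apart, which is the behavior the paragraph above describes.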

For the whole FaceNet structure, a very deep network, Inception ResNet-v2, is used for feature extraction; its model structure consists of 3 Inception modules with residual connections and 1 Inception v4 module.

The overall framework of the model is basically the same as that of other classical deep learning methods. The front feature extraction part is also CNN-based, here the deep network Inception-v4, followed by a feature normalization layer that makes the L2 norm of the feature f(x) of each image x satisfy

$$\|f(x)\|_{2}=1$$

that is, all image features are mapped onto a hypersphere, which avoids differences caused by the imaging conditions of the samples. Finally, triplet_loss is used as the loss, with stochastic gradient descent (SGD) for backpropagation. The Inception model also includes residual connections, which is one of the notable points of this method and speeds up training convergence.

The video authenticity detection model is an OCR text recognition model. As shown in Fig. 4, it performs two main tasks:

Video authenticity determination: OCR text recognition reads the time watermark in the lower-right corner of the video frame and checks whether it matches the date of the operation, to prevent spliced videos.

Night construction identification: from the video shot on the day of the operation, the watermark timestamp is extracted to determine whether staff worked at night, so that this practice can be eliminated and safety control strengthened.
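As a non-limiting illustration, once the OCR model has read the watermark text, the two checks above may be sketched as follows. The watermark format `%Y-%m-%d %H:%M:%S`, the daytime window from 06:00 to 19:00 and the function name `check_watermark` are assumptions for this example; the invention does not specify them.

```python
from datetime import date, datetime, time


def check_watermark(text, work_date,
                    day_start=time(6, 0), day_end=time(19, 0),
                    fmt="%Y-%m-%d %H:%M:%S"):
    """Compare OCR-read watermark text with the scheduled operation date
    and flag recordings made outside the assumed daytime window."""
    stamp = datetime.strptime(text.strip(), fmt)
    return {
        # False suggests the video was not shot on the operation date.
        "date_matches": stamp.date() == work_date,
        # True when the time of day falls outside [day_start, day_end).
        "night_work": not (day_start <= stamp.time() < day_end),
    }
```

For example, `check_watermark("2022-05-18 22:05:10", date(2022, 5, 19))` reports both a date mismatch and night work. Text that OCR has misread and that does not fit the format raises `ValueError`, which a caller could treat as a frame needing manual review.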

The video authenticity detection model is based on the main framework of an RNN text recognition algorithm (CRNN), as shown in Fig. 5, in which:

(1) the window size of the max pooling is 1*2, which keeps the extracted features long in the horizontal direction and helps recognize longer text;

(2) CNN+RNN is difficult to train, so BatchNorm is added to help the model converge.

Advantages:

(1) it can be trained end to end;

(2) no character segmentation or horizontal scaling is required; the input only needs to be scaled to a fixed height in the vertical direction, and sequences of any length can be recognized;

(3) both lexicon-based and lexicon-free models can be trained;

(4) training is fast and the model is small.

The whole CRNN network can be divided into three parts:

Convolutional Layers: the convolutional layers are an ordinary CNN network that extracts the convolutional feature maps of the input image, that is, converts the image into a convolutional feature matrix;

Recurrent Layers: the recurrent layers are a deep bidirectional LSTM network that continues to extract text sequence features on top of the convolutional features. A deep RNN network here means an RNN network with more than two layers; the structure of a single-layer bidirectional RNN network is shown in Fig. 6.

CRNN uses the second, stacked form of deep bidirectional structure.

Transcription Layers: softmax is applied to the RNN output, and the result is output as characters.
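As a non-limiting illustration of the transcription step, the per-frame RNN outputs may be turned into a character string as sketched below. The source only states that softmax is applied and characters are output; collapsing repeated labels and dropping a blank symbol (as in CTC-style greedy decoding) is an assumption added here, as are the alphabet and the function names.

```python
import math


def softmax(scores):
    """Numerically stable softmax over one frame of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_decode(frame_scores, alphabet, blank=0):
    """Take the most probable label per frame, merge consecutive repeats
    and drop the blank label (assumed CTC-style convention)."""
    best = []
    for row in frame_scores:
        probs = softmax(row)
        best.append(max(range(len(probs)), key=probs.__getitem__))
    chars = []
    previous = blank
    for index in best:
        if index != blank and index != previous:
            chars.append(alphabet[index])
        previous = index
    return "".join(chars)


# Hypothetical alphabet for a time watermark; index 0 is the blank symbol.
alphabet = "-0123456789:"
indices = [2, 2, 0, 3, 11, 11, 1, 6]
frames = [[5.0 if k == i else 0.0 for k in range(len(alphabet))] for i in indices]
print(greedy_decode(frames, alphabet))  # prints 12:05
```

Under this convention a blank frame between two identical labels is what allows a doubled character such as "00" to be read.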

The invention addresses safety management problems in geophysical exploration, such as blasting personnel working without a licence, not wearing safety helmets or anti-static clothing, and loading charges and blasting at night. Model-based detection determines whether the workflow contains safety hazards and standardizes staff operations;
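As a non-limiting illustration, the outputs of the three models may be combined into a list of violations as sketched below. The record layout, the detector label names `no_helmet` and `no_antistatic_clothing`, and the function name `find_violations` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class VideoAnalysis:
    watermark_date_ok: bool      # result of the authenticity check
    night_work: bool             # derived from the watermark time
    detected_labels: Set[str]    # labels reported by the target detector
    operator_id: Optional[str]   # identity from face recognition, if any


def find_violations(result: VideoAnalysis, licensed_ids: Set[str]) -> List[str]:
    """Translate model outputs into human-readable violation messages."""
    violations = []
    if not result.watermark_date_ok:
        violations.append("watermark date does not match the operation date")
    if result.night_work:
        violations.append("blasting at night")
    if "no_helmet" in result.detected_labels:
        violations.append("safety helmet not worn")
    if "no_antistatic_clothing" in result.detected_labels:
        violations.append("anti-static clothing not worn")
    # An unrecognized face (None) is treated the same as an unlicensed ID.
    if result.operator_id not in licensed_ids:
        violations.append("operator is not a licensed blaster")
    return violations
```

An empty list means that no hidden danger or violation was found for the video under these assumed rules.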

Through the integration of the underlying algorithms with the front end, efficient operation, accurate judgment and intelligent analysis are achieved, making the control of safety issues more efficient and saving a great deal of manpower, material and financial resources.

The invention combines target detection, OCR text recognition, face recognition, video blur detection and distributed information processing to evaluate whether the operations in explosion-related videos are compliant. The underlying algorithms process the video, and the front-end platform generates the analysis results, saves them to the back end and transmits them to the supervisory department or the collection center.

The technical solutions of the invention are not limited to the specific embodiments above; all technical variations made according to the technical solutions of the invention fall within its protection scope.

Claims

1. A method for identifying hidden dangers and violations in explosion-related videos, characterized by comprising the following steps:

S1, constructing a video authenticity detection model, a video target recognition model and an identity recognition model, wherein the video target recognition model is a YOLOv5 detection model and the identity recognition model is a face recognition model;

S2, constructing an annotation database and a face information database, wherein the training data set is annotated and the annotation files are exported to form the annotation database, and the face information database stores information about licensed civil blasting operators;

S3, training and optimizing the video target recognition model using the annotation database, and training and optimizing the identity recognition model using the face information database;

S4, acquiring the explosion-related video to be analyzed;

S5, importing the explosion-related video to be analyzed into the video authenticity detection model, the optimized video target recognition model and the optimized identity recognition model, and performing authenticity analysis on the explosion-related video, identifying information about the blasting operation and identifying information about the blasting operators;

and S6, analyzing the hidden dangers and violations present in the explosion-related video according to the authenticity analysis, the identified information about the blasting operation and the identified information about the blasting operators.

2. The method for identifying hidden dangers and violations in explosion-related videos as recited in claim 1, wherein a step S0 is further included between S4 and S5: analyzing and evaluating the clarity of the images in the explosion-related video to be analyzed, and screening out unclear images.

3. The method for identifying hidden dangers and violations in explosion-related videos as recited in claim 1, wherein S0 comprises:

S01, applying the Fourier transform to the images of the explosion-related video to be analyzed, converting them to the frequency domain;

S02, removing low-frequency signals below a preset frequency;

S03, converting the image from the frequency domain to the spatial domain using the inverse fast Fourier transform;

and S04, calculating the amplitude mean value in the spatial domain, judging from the amplitude mean value whether the image is a blurred image, and removing blurred images.

4. The method for identifying hidden dangers and violations in explosion-related videos as recited in claim 1, wherein in S04, an image whose amplitude mean value is greater than the preset threshold is a clear image, and otherwise it is a blurred image.

5. A device for identifying hidden dangers and violations in explosion-related videos, characterized by comprising an underlying server, the underlying server comprising:

a memory, used to store a computer program;

a processor, used to execute the computer program, wherein the processor, when executing the computer program, implements the steps of the method for identifying hidden dangers and violations in explosion-related videos according to any one of claims 1-4.

6. The device for identifying hidden dangers and violations in explosion-related videos as recited in claim 5, further comprising a collection device for collecting the blasting video and information about the blasting personnel, wherein the collection device is communicatively connected to the underlying server.

7. The device for identifying hidden dangers and violations in explosion-related videos as recited in claim 5, further comprising a cloud server and a remote terminal, wherein the signal end of the cloud server is connected to the signal end of the remote terminal and to the signal end of the underlying server respectively.