WO2020253308A1 - Safety monitoring and early-warning method for human-machine interaction behavior of underground mine belt transport personnel - Google Patents

Safety monitoring and early-warning method for human-machine interaction behavior of underground mine belt transport personnel Download PDF

Info

Publication number
WO2020253308A1
WO2020253308A1 (PCT/CN2020/082006, CN2020082006W)
Authority
WO
WIPO (PCT)
Prior art keywords
human
belt
human body
key points
person
Prior art date
Application number
PCT/CN2020/082006
Other languages
English (en)
French (fr)
Inventor
孙彦景
董锴文
程小舟
云霄
侯晓峰
王博文
王斌
徐宏力
陈晓晶
Original Assignee
中国矿业大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国矿业大学 filed Critical 中国矿业大学
Priority to CA3094424A priority Critical patent/CA3094424C/en
Publication of WO2020253308A1 publication Critical patent/WO2020253308A1/zh

Links

Images

Classifications

    • E FIXED CONSTRUCTIONS
    • E21 EARTH DRILLING; MINING
    • E21F SAFETY DEVICES, TRANSPORT, FILLING-UP, RESCUE, VENTILATION, OR DRAINING IN OR OF MINES OR TUNNELS
    • E21F17/00 Methods or devices for use in mines or tunnels, not covered elsewhere
    • E21F17/18 Special adaptations of signalling or alarm devices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G PHYSICS
    • G08 SIGNALLING
    • G08B SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B21/00 Alarms responsive to a single specified undesired or abnormal condition and not otherwise provided for
    • G08B21/02 Alarms for ensuring the safety of persons

Definitions

  • The invention belongs to the field of underground operation monitoring, and particularly relates to a method for safety monitoring of the behavior of underground belt transport personnel.
  • Existing video-surveillance-based early-warning systems for the safety behavior of coal mine employees mainly rely on analyzing and recognizing personnel actions to warn of dangerous behavior.
  • Yang Chaoyu et al. proposed in 2016 a safety behavior monitoring method based on feature extraction and SVM classification.
  • Zhang Liya proposed in 2017 a method for monitoring dangerous underground areas based on moving-target detection, which locates underground personnel with bounding boxes to monitor personnel safety behavior.
  • Zhu Aichun et al. proposed in 2018 a posture recognition method for underground mine personnel based on an hourglass network with hard-sample mining and generative adversarial training; addressing the limitations of bounding boxes, it locates and identifies underground personnel through detection of human body key points, improving the accuracy and robustness of underground personnel safety recognition.
  • the present invention proposes a safety monitoring and early warning method for the human-machine interaction behavior of underground belt transportation personnel.
  • the technical solution of the present invention is:
  • The safety monitoring and early-warning method for human-machine interaction behavior of underground belt transport personnel comprises the following steps:
  • In step (3), each video frame is taken as input and deep features are extracted from it to obtain a feature map F; the feature map F is fed into step 1 of two convolutional neural networks.
  • In step 1, the first convolutional neural network predicts a set of key-point confidence maps S^1 = ρ^1(F), where ρ^1 denotes its inference process at step 1; the second network predicts a set of part affinity fields L^1 = φ^1(F), where φ^1 denotes the inference process of the second network at step 1.
  • The step-1 predictions of the two convolutional neural networks are concatenated with the original feature map F and fed into the subsequent steps to obtain more accurate predictions.
  • The subsequent steps are expressed by the formulas S^t = ρ^t(F, S^{t-1}, L^{t-1}) and L^t = φ^t(F, S^{t-1}, L^{t-1}), where S^t and L^t are the confidence maps and part affinity fields obtained at step t, and ρ^t and φ^t are the inference processes of the two convolutional neural networks at step t.
  • A mean-square-error loss function is applied after each step of the two convolutional neural networks; the losses at step t are f_S^t = Σ_j Σ_p W(p)·||S_j^t(p) - S_j*(p)||² and f_L^t = Σ_c Σ_p W(p)·||L_c^t(p) - L_c*(p)||².
  • The ground-truth confidence at point p for key point j of person k is S*_{j,k}(p) = exp(-||p - x_{j,k}||²/σ²), where x_{j,k} is the true coordinate of the j-th body key point of the k-th person in the labeled training sample, and σ is a constant controlling the spread of the confidence Gaussian.
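The ground-truth confidence map just described can be sketched in a few lines of NumPy (a minimal illustration; the function name, array shapes, and the default σ are assumptions for demonstration, not values taken from the patent):

```python
import numpy as np

def confidence_map(keypoints, height, width, sigma=7.0):
    """Ground-truth confidence map for one key-point type.

    keypoints: list of true (x, y) coordinates x_{j,k}, one per person k.
    Implements S*_{j,k}(p) = exp(-||p - x_{j,k}||^2 / sigma^2),
    aggregated over people with a per-pixel maximum.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    s_map = np.zeros((height, width))
    for (x, y) in keypoints:
        d2 = (xs - x) ** 2 + (ys - y) ** 2       # squared distance to x_{j,k}
        s_map = np.maximum(s_map, np.exp(-d2 / sigma ** 2))
    return s_map
```

Each annotated joint contributes a Gaussian peak of height 1 at its true location, and the max over people keeps nearby peaks from summing above 1.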
  • A person contains 9 body key points, which represent the person's nose, chest, right shoulder, right hand, left shoulder, left hand, hip, right foot and left foot.
  • In step (5), the front-view and top-view projections of the belt danger area are determined from the belt danger area obtained in step (2); for each person in the video, the minimum distance d_T between their body key points and the top-view projection of the belt danger area, the minimum distance d_F between the body key points and the front-view projection, and the height h of the body key point are computed; if d_T and d_F are both less than or equal to the safety distance threshold d and h is less than the height of the front-view projection of the belt danger area, the person's human-machine interaction behavior is judged unsafe and a warning is issued.
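This position test can be sketched as follows (function and variable names are illustrative assumptions; the 1.5 m default for the ROI height is the figure given elsewhere in this document):

```python
def is_unsafe(d_T, d_F, h, d_safe, roi_height=1.5):
    """Unsafe human-machine interaction if a key point is within the safety
    distance of BOTH the top-view and front-view ROI projections and below
    the top of the belt danger area."""
    return d_T <= d_safe and d_F <= d_safe and h < roi_height

def person_unsafe(keypoint_measures, d_safe, roi_height=1.5):
    # keypoint_measures: iterable of (d_T, d_F, h) per body key point;
    # a warning is raised if any key point triggers the test.
    return any(is_unsafe(dT, dF, h, d_safe, roi_height)
               for dT, dF, h in keypoint_measures)
```

Requiring both projected distances to fall inside the threshold is what lets a miner standing on a platform above the belt (large h) pass the test even when the top-view distance is small.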
  • A deep neural network classifier is constructed to classify the detected body key-point information by action; the key-point position information in each frame is combined into one sample corresponding to one action category; the classifier is trained with a large number of labeled key-point action samples so that it can recognize human actions in a single frame, and the safety distance threshold d_i corresponding to each action is determined from the classifier's recognition result, where the subscript i denotes the i-th action class.
  • A continuous multi-frame probability judgment model is added on top of single-frame action recognition: taking M consecutive pictures as the judgment unit, the single-frame action classifier returns the classification results for the actions in the M pictures, the counts of the different results are recorded, and the share of each result in the total is computed; the result with the largest share is the action classification of the M pictures.
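The multi-frame judgment reduces to a majority vote over the window; a minimal sketch (the function name is an illustrative assumption):

```python
from collections import Counter

def window_action(frame_labels):
    """Majority vote over M consecutive single-frame classifications:
    the label with the largest share of the M results is returned."""
    counts = Counter(frame_labels)
    label, _ = counts.most_common(1)[0]
    return label
```

Smoothing over M frames suppresses single-frame misclassifications, since real actions in surveillance video persist across consecutive frames.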
  • The invention locates the belt position in the video based on the camera calibration imaging principle and delineates a three-dimensional ROI from the belt position and size; it adopts a "bottom-up" key-point extraction method that performs multi-person key-point detection on belt transport personnel by detecting first and clustering second, ensuring detection accuracy while improving detection efficiency; the human body key points and the ROI region are each projected twice, once frontally and once from overhead.
  • The human body and the belt are projected on the two projection planes to estimate their position relationship; a deep neural network classifies the key-point information in single frames by behavior and returns each person's action label, combining action recognition with position judgment so that actions with different safety factors are judged at different scales.
  • Unsafe behaviors in human-machine interaction are screened out and warned of, eliminating major safety hazards to the belt transport system caused by abnormal contact between personnel and the belt area.
  • Figure 1 is an overall flow chart of the present invention
  • Figure 2 is a three-view projection of the belt danger area
  • Figure 3 is a schematic diagram of camera calibration
  • Figure 4 is a schematic diagram of the key point prediction network structure
  • Figure 5 is the coordinate-confidence curve diagram of key points
  • Figure 6 is a schematic diagram of belt coordinate transformation and projection
  • Figure 7 is a simplified schematic diagram of the key points of the human body
  • Figure 8 is a schematic diagram of the projection method to assess unsafe actions
  • Figure 9 is a schematic diagram of the action classification of underground personnel
  • Figure 10 is a schematic diagram of the safe position judgment when falling
  • Figure 11 is a schematic diagram of a safe position judgment when squatting
  • Figure 12 is a schematic diagram of a safe position judgment when smoking
  • Figure 13 is a schematic diagram of a specific implementation process of the present invention.
  • The flow of the proposed safety monitoring and early-warning method for human-machine interaction behavior of underground belt transport personnel is shown in Figure 1.
  • For the real-time video stream collected by the surveillance camera, the camera calibration principle is used to model the belt position as a three-dimensional ROI (region of interest); a "bottom-up" method then detects the key points of belt transport personnel, a DNN classifies actions from the key-point information, and finally the ROI and key points are projected in the front-view and top-view directions, where the position relationship between the key points and the ROI region is evaluated against the safety distance thresholds of the different actions to judge and warn of dangerous actions.
  • delineating the unsafe region of interest is the basic task of the detection stage.
  • The invention recognizes dangerous actions by evaluating the position relationship between the human body key points and the delineated belt ROI. If the belt ROI were delineated in 2D in the traditional way, the false alarm rate would inevitably rise, because a 2D ROI cannot evaluate the vertical position relationship between the person and the belt. For example, a miner working normally on a platform higher than the belt would very likely be assessed as behaving unsafely under a 2D ROI.
  • The invention therefore proposes building a 3D ROI model from the belt position: the belt size in the video is estimated from the camera calibration imaging principle, and a 3D ROI region is delineated accordingly; the three views of this region are shown in Figure 2.
  • The image coordinate system is a coordinate system in units of pixels with its origin at the upper left; the position of each pixel is expressed in pixels, so this coordinate system is called the image pixel coordinate system (u, v), where u and v denote the column and row of the pixel in the digital image.
  • The known quantities are: the camera height H; the distance O_3M along the y axis between the world point corresponding to the pixel-coordinate center and the camera; the image coordinates (u_center, v_center) of the pixel-coordinate center point O_1; and the measurement point P, the projection onto the world-coordinate Y axis of the point to be measured Q, with pixel coordinates P_1(0, v).
  • The length of an actual pixel is x_pix, its width is y_pix, and O_1O_2 is the camera focal length f.
  • The calibration diagram is shown in Figure 3.
  • The Y coordinate is computed as tan α = H / O_3M, tan γ = (v - v_center) · y_pix / f, β = α - γ, Y = O_3P = H / tan β, where γ is the angle formed by O_1O_2 and P_1P, and α is the angle between the camera and the horizontal plane, represented by the acute angle formed by O_1O_2 and the Y axis.
  • The X coordinate is computed as X = PQ = (u - u_center) · x_pix · O_2P / f, with O_2P = H / sin β.
  • Its function is to cluster the predicted key points by person and by limb to obtain a complete set of human body key-point information.
  • The predictions from one step of the two branches are concatenated with the original feature map and fed into the subsequent steps to obtain more accurate predictions.
  • The subsequent inference steps can be expressed as S^t = ρ^t(F, S^{t-1}, L^{t-1}) and L^t = φ^t(F, S^{t-1}, L^{t-1}).
  • An L_2 loss (also called mean square error) is applied, with a spatial weighting value used to handle datasets that do not annotate all key points.
  • The "bottom-up" key-point detection method finally outputs the coordinate information of each key point.
  • The monocular-vision method above can compute the x_w, y_w world-coordinate components corresponding to a pixel in the video, which suffices for the top-view projection. But for the human body key points, if the z-axis component of each point cannot be computed, they cannot be projected in the front-view direction; and since a target pixel in monocular vision carries no depth information reflecting the 3D relationship, the conversion from the image coordinate system to the world coordinate system cannot be completed.
  • Each key-point model is shown in the figure below.
  • Each key point is assigned a custom height component z_w, which is combined with the known x_w, y_w components into the complete world coordinates (x_w, y_w, z_w) of the key point.
  • the projection of the belt dangerous area ROI in the front and top directions is shown in Figure 6.
  • After the human body key-point model is simplified, if the minimum distances d_T and d_F between a key point and the danger area ROI in the front-view and top-view projection directions are both less than or equal to the safety distance threshold d, and h in the front view is less than the height of the belt danger area (1.5 m), the system assesses the action as unsafe and issues a warning.
  • Recognition of specific action types is added on top of the position-relationship-based dangerous action assessment, and different safety distance thresholds are set according to the degree of danger of each action.
  • The key-point information collected above can be classified by behavior.
  • The key-point position information in each frame is combined into one sample, corresponding to one action category.
  • A continuous multi-frame probability judgment model is added, with five consecutive pictures as the judgment unit.
  • A single-frame action classifier returns the classification results for the actions in the five pictures, the counts of the different results are recorded, and the share of each result in the total is computed; the result with the largest share is the action classification of the five pictures.
  • The flow of the deep-neural-network-based behavior classifier for underground personnel is shown in Figure 9.
  • Figures 10-12 are schematic diagrams of the safety judgments corresponding to the three actions.
  • Figure 13 shows a specific implementation process of the present invention.
  • (a) is the detection image of the belt danger area and the human body key points;
  • (b) is the top view of the human body key points and the belt danger area;
  • (c) is the front view of the human body key points and the belt danger area.

Abstract

A safety monitoring and early-warning method for the human-machine interaction behavior of underground mine belt transport personnel. The belt position in the video is located based on the camera calibration principle, and a three-dimensional ROI is delineated from the belt position and size; a "bottom-up" key-point extraction method performs multi-person key-point detection on belt transport personnel by detecting first and clustering second, ensuring detection accuracy while improving detection efficiency; the human body key points and the ROI region are each projected twice, the position relationship between the human body and the belt is estimated on the two projection planes, and unsafe human-machine interaction behaviors are screened out and warned of, so as to eliminate major safety hazards to the belt transport system caused by abnormal contact between personnel and the belt area.

Description

Safety monitoring and early-warning method for human-machine interaction behavior of underground mine belt transport personnel. Technical field
The invention belongs to the field of underground mine operation monitoring, and particularly relates to a method for safety monitoring of the behavior of underground belt transport personnel.
Background art
China's coal production industry has long been a world leader, but coal mining is a high-risk industry and has carried major production safety hazards for many years. The belt conveyor is currently the most common transport system in underground coal mines, and its safe operation directly affects the safety level of coal production. At present, safety management of belt transport systems mostly relies on manual monitoring, which suffers from short duration, narrow coverage, and high cost. Developing a video-surveillance-based safety early-warning system for belt conveyors and their associated workers is therefore of great significance for raising the production safety level of belt transport systems.
Existing video-surveillance-based early-warning systems for the safety behavior of coal mine employees mainly warn of dangerous behavior by analyzing and recognizing personnel actions alone, such as the safety behavior monitoring method based on feature extraction and SVM classification proposed by Yang Chaoyu et al. in 2016, and the moving-target-detection-based method for monitoring dangerous underground areas proposed by Zhang Liya in 2017, which locates underground personnel with bounding boxes to monitor personnel safety behavior; the posture recognition method for underground coal mine personnel proposed by Zhu Aichun et al. in 2018, based on an hourglass network with hard-sample mining and generative adversarial training, addresses the limitations of bounding boxes by locating and identifying underground personnel through human key-point detection, improving the accuracy and robustness of underground personnel safety recognition. The above methods evaluate and recognize unsafe behaviors well in the absence of man-machine interaction (that is, interaction between people and equipment), but most underground safety accidents occur during unsafe human-machine interaction; achieving safety warnings only through action recognition or position judgment, without recognizing the interaction between person and machine, is far from sufficient. Moreover, existing algorithm models (such as the hard-sample mining with generative adversarial training used by Zhu Aichun et al.) have complex structures, run slowly, and their detection time grows linearly with the number of people detected, so they lack good application prospects.
Summary of the invention
To solve the technical problems mentioned in the background above, the invention proposes a safety monitoring and early-warning method for the human-machine interaction behavior of underground mine belt transport personnel.
To achieve the above technical objective, the technical solution of the invention is:
A safety monitoring and early-warning method for the human-machine interaction behavior of underground mine belt transport personnel, comprising the following steps:
(1) collecting a real-time underground video stream through a surveillance camera;
(2) estimating the size of the belt in the video using the camera calibration principle, and delineating from it a three-dimensional ROI region, namely the belt danger area;
(3) detecting the human body key points of everyone in the video, measuring the degree of association between key points through part affinity fields, and clustering the key points belonging to each individual using a bipartite-graph matching optimization, thereby detecting the body key points of every person in the video;
(4) determining the x- and y-axis components of the detected key points in the world coordinate system, and assigning each key point a custom height component z; the 3 components combine into the complete world coordinates of the key point;
(5) judging from the relative position of the belt danger area and each person's body key points whether the human-machine interaction behavior is safe, and hence whether a warning is needed.
Further, in step (3), each video frame is taken as input and deep features are extracted from it to obtain a feature map F. The feature map F is fed into step 1 of two convolutional neural networks: in step 1, the first network predicts a set of key-point confidence maps S^1 = ρ^1(F), where ρ^1 denotes the inference process of that network at step 1; the second network predicts a set of part affinity fields L^1 = φ^1(F), where φ^1 denotes the inference process of the second network at step 1; its role is to cluster the predicted key points by person and by limb so as to obtain a complete set of human body key-point information. The step-1 predictions of the two networks are then each concatenated with the original feature map F and fed into the subsequent steps to obtain more accurate predictions. The subsequent steps are expressed by:
S^t = ρ^t(F, S^{t-1}, L^{t-1}),
L^t = φ^t(F, S^{t-1}, L^{t-1}),
where S^t and L^t are the confidence maps and part affinity fields obtained at step t, and ρ^t and φ^t are the inference processes of the two convolutional neural networks at step t.
Further, a mean-square-error loss function is applied after each step of the two convolutional neural networks; the loss functions of the two networks at step t are:
f_S^t = Σ_j Σ_p W(p) · ||S_j^t(p) - S_j*(p)||²,
f_L^t = Σ_c Σ_p W(p) · ||L_c^t(p) - L_c*(p)||²,
where f_S^t and f_L^t are the loss functions of the two networks at step t; p is the coordinate of an arbitrary point in the image to be detected; W(p) is a Boolean value that is 0 when the annotation is absent from the training dataset and 1 otherwise; S_j^t(p) denotes the confidence map of the j-th body key point at point p at step t, and S_j*(p) the true confidence map; L_c^t(p) denotes the part affinity field at point p at step t, and L_c*(p) the true part affinity field.
The true reference for the key-point confidence at any point p in the image is defined as:
S*_{j,k}(p) = exp(-||p - x_{j,k}||² / σ²),
where x_{j,k} is the true coordinate of the j-th body key point of the k-th person in the labeled training sample, and σ is a constant controlling the spread of the confidence Gaussian;
taking the maximum then gives the confidence reference of the j-th body key point of the k-th person, S_j*(p) = max_k S*_{j,k}(p).
Further, a person contains 9 body key points, which represent the person's nose, chest, right shoulder, right hand, left shoulder, left hand, hip, right foot and left foot.
Further, in step (5), the front-view projection and top-view projection of the belt danger area are determined from the belt danger area obtained in step (2); for each person in the video, the minimum distance d_T between their body key points and the top-view projection of the belt danger area, the minimum distance d_F between the body key points and the front-view projection of the belt danger area, and the height h of the body key point are computed; if d_T and d_F are both less than or equal to the safety distance threshold d, and h is less than the height of the front-view projection of the belt danger area, the person's human-machine interaction behavior is judged unsafe and a warning is issued.
Further, in step (5), a deep neural network classifier is constructed and used to classify the detected body key-point information by action; the key-point position information in each frame is combined into one sample corresponding to one action category; the classifier is trained with a large number of labeled key-point action samples so that it can recognize human actions in a single frame, and the safety distance threshold d_i corresponding to each action is determined from the classifier's recognition result, where the subscript i denotes the i-th action class.
Further, since personnel actions in surveillance video are continuous, a continuous multi-frame probability judgment model is added on top of single-frame action recognition: taking M consecutive pictures as the judgment unit, the single-frame action classifier returns the classification results for the actions in the M pictures, the counts of the different results are recorded, and finally the share of each result in the total is computed; the result with the largest share is the action classification of the M pictures.
Further, the action classification results comprise 3 classes: falling, squatting and smoking; a different safety factor γ_i is assigned to each of the 3 classes, and the respective safety distance thresholds are computed as d_i = γ_i · d, where i = 1, 2, 3; whether the person's human-machine interaction behavior under that action is safe is judged against the safety distance threshold.
Beneficial effects of adopting the above technical solution:
The invention locates the belt position in the video based on the camera calibration imaging principle and delineates a three-dimensional ROI from the belt position and size; it adopts a "bottom-up" key-point extraction method that performs multi-person key-point detection on belt transport personnel by detecting first and clustering second, ensuring detection accuracy while improving detection efficiency; the body key points and the ROI region are each projected twice, once frontally and once from overhead, and the position relationship between the human body and the belt is estimated on the two projection planes; a deep neural network classifies the key-point information in single frames by behavior and returns each person's action label, combining action recognition with position judgment so that actions with different safety factors are judged at different scales. The invention screens out unsafe behaviors in human-machine interaction and issues warnings, eliminating major safety hazards to the belt transport system caused by abnormal contact between personnel and the belt area.
Brief description of the drawings
Figure 1 is the overall flow chart of the invention;
Figure 2 is the three-view projection of the belt danger area;
Figure 3 is a schematic diagram of camera calibration;
Figure 4 is a schematic diagram of the key-point prediction network structure;
Figure 5 is the coordinate vs. confidence curve of a key point;
Figure 6 is a schematic diagram of belt coordinate transformation and projection;
Figure 7 is a simplified schematic of the human body key points;
Figure 8 is a schematic of assessing unsafe actions by projection;
Figure 9 is a schematic of action classification for underground personnel;
Figure 10 is a schematic of the safe-position judgment when falling;
Figure 11 is a schematic of the safe-position judgment when squatting;
Figure 12 is a schematic of the safe-position judgment when smoking;
Figure 13 is a schematic of a specific implementation process of the invention.
Detailed description of the embodiments
The technical solution of the invention is described in detail below with reference to the drawings.
The flow of the proposed safety monitoring and early-warning method for the human-machine interaction behavior of underground mine belt transport personnel is shown in Figure 1. For the real-time video stream collected by the surveillance camera, the belt position is modeled as a three-dimensional ROI (region of interest) using the camera calibration principle; key points of belt transport personnel are then detected with a "bottom-up" method and a DNN classifies actions from the key-point information; finally, the ROI and the key points are projected in the front-view and top-view directions, and the position relationship between the key points and the ROI region is evaluated against the safety distance thresholds of the different actions to judge and warn of dangerous actions.
1. Modeling the belt danger area
In belt safety warning recognition, delineating the belt's unsafe region of interest (ROI) is the basic task of the detection stage. The invention recognizes dangerous actions by evaluating the position relationship between human body key points and the delineated belt ROI. If the belt ROI were delineated in 2D in the traditional way, the false alarm rate would inevitably rise, because a 2D ROI cannot evaluate the vertical position relationship between a person and the belt. For example, a miner working normally on a platform higher than the belt would very likely be assessed as behaving unsafely under a 2D ROI. To solve this problem, the invention builds a 3D ROI model from the belt position: the belt size in the video is estimated from the camera calibration imaging principle, and a 3D ROI region is delineated from it; the three views of this region are shown in Figure 2.
2. Camera-calibrated belt sizing
(i) Belt measurement principle: with the intrinsic parameters of the monocular camera and the image-coordinate-system coordinates of points in the monocular image known, the relationship between the image coordinate system and the world coordinate system is established, so that the belt and the positions of surrounding workers can be modeled in three dimensions.
(ii) Image coordinate system: a coordinate system in units of pixels with its origin at the upper left; the position of each pixel is expressed in pixels, so this coordinate system is called the image pixel coordinate system (u, v), where u and v denote the column and row of the pixel in the digital image.
(iii) World coordinate system: a user-defined three-dimensional coordinate system describing the positions of objects and the camera in 3D space, denoted X, Y, Z.
As the figure shows, the upper left is the image coordinate system UO_1P, with the camera coordinate system at origin O_2 and the world coordinate system XO_3Y. The known quantities are:
the camera height H; the distance O_3M along the y axis between the world point corresponding to the pixel-coordinate center and the camera; the image coordinates (u_center, v_center) of the pixel-coordinate center point O_1; the measurement point P, which is the projection of the point to be measured Q onto the world-coordinate Y axis, with pixel coordinates P_1(0, v); the length x_pix and width y_pix of an actual pixel; and the camera focal length f = O_1O_2. The calibration diagram is shown in Figure 3.
The Y coordinate is computed as:
tan α = H / O_3M,
tan γ = (v - v_center) · y_pix / f,
β = α - γ,
Y = O_3P = H / tan β,
where γ is the angle formed by O_1O_2 and P_1P, and α is the angle between the camera and the horizontal plane, represented by the acute angle formed by O_1O_2 and the Y axis. Once the angle β is obtained, the vertical-direction coordinate Y = O_3P follows from the properties of right triangles.
The X coordinate is computed as:
O_2P = H / sin β,
X = PQ = (u - u_center) · x_pix · O_2P / f,
which gives the horizontal-direction coordinate X = PQ; the true coordinates of point Q are therefore (X, Y).
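The pixel-to-ground mapping above can be sketched as follows. This is a Python illustration of the reconstructed monocular geometry; the function name, argument conventions, and the sign convention for γ are assumptions for demonstration, not taken verbatim from the patent:

```python
import math

def pixel_to_ground(u, v, u_c, v_c, f, x_pix, y_pix, H, alpha):
    """Map an image pixel (u, v) to ground-plane coordinates (X, Y) for a
    camera at height H tilted alpha radians below the horizontal.

    gamma: angle between the optical axis and the ray through pixel row v;
    beta = alpha - gamma: the ray's depression angle below horizontal;
    Y = H / tan(beta); X follows from similar triangles along the ray,
    whose length from the camera to P is H / sin(beta).
    """
    gamma = math.atan((v_c - v) * y_pix / f)  # rows above center look farther away
    beta = alpha - gamma
    Y = H / math.tan(beta)
    X = (u - u_c) * x_pix * (H / math.sin(beta)) / f
    return X, Y
```

At the image center (u = u_c, v = v_c) this degenerates to γ = 0, so Y = H / tan α, which is exactly the known distance O_3M, a useful sanity check on the geometry.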
3. Underground human key-point detection
Most traditional key-point detection algorithms are "top-down": all people are first detected in the image to be processed, and then each person's key points are detected separately, which makes high-speed detection difficult in crowded scenes. The invention instead adopts a "bottom-up" structure: the key points of everyone in the video are detected first, then the key points belonging to each individual are clustered by bipartite-graph matching optimization, finally yielding the body key points of every person in the video. Detection speed does not drop as the number of people grows, enabling real-time multi-person key-point detection. The key-point detection structure is shown in Figure 4.
A color RGB image is taken as input and deep features are extracted with VGG19, giving the feature map in Figure 4. The feature map is then fed into step 1 of two convolutional neural network (CNN) branches: in this step, the branch-1 network predicts a set of key-point confidence maps S^1 = ρ^1(F), where ρ^1 denotes the inference process of branch 1 at step 1; the branch-2 network predicts a set of "part affinity fields" L^1 = φ^1(F), where φ^1 denotes the inference process of branch 2 at step 1; its role is to cluster the predicted key points by person and by limb to obtain a complete set of body key-point information. The predictions of both branches from the previous step are then concatenated with the original feature map and fed into the following step to obtain more accurate predictions. The subsequent inference steps can be expressed as:
S^t = ρ^t(F, S^{t-1}, L^{t-1}),
L^t = φ^t(F, S^{t-1}, L^{t-1}),
where ρ^t and φ^t denote the inference processes of the two CNN branches at step t.
To guide the network to iteratively predict the key-point confidence maps and "part affinity fields", an L_2 loss (also called mean square error) is applied after every step of each branch to measure the error between predicted and true values. A spatial weighting is used to handle datasets that do not annotate everyone's key points. The loss function of each CNN branch at step t can be written as:
f_S^t = Σ_j Σ_p W(p) · ||S_j^t(p) - S_j*(p)||²,
f_L^t = Σ_c Σ_p W(p) · ||L_c^t(p) - L_c*(p)||²,
where S_j*(p) is the true key-point confidence map and L_c*(p) is the true "part affinity field"; W is a Boolean value that is 0 when the annotation is absent from the training dataset and 1 otherwise, mainly to keep the detection network from being penalized at true key points that lack annotation.
The true reference for the key-point confidence at any point p in the image is defined as:
S*_{j,k}(p) = exp(-||p - x_{j,k}||² / σ²),
where p is the coordinate of an arbitrary point in the image to be detected, k indexes the k-th person in the image, x_{j,k} is the true coordinate of the j-th key point of the k-th person in the labeled training sample, and σ is a constant controlling the spread of the confidence Gaussian. Figure 5 shows the coordinate vs. confidence curve of the key point corresponding to each k, j.
Taking the maximum over the curves above gives the confidence reference of the j-th key point on the k-th person, S_j*(p) = max_k S*_{j,k}(p).
4. Front-view and top-view projection of key-point coordinates and the ROI region
The "bottom-up" key-point detection finally outputs the coordinate information of each key point. The monocular-vision method above can compute the x_w and y_w world-coordinate components corresponding to a pixel in the video, which suffices for the top-view projection. But for the human body key points, if the z-axis component of each point cannot be computed, they cannot be projected in the front-view direction; and since a target pixel in monocular vision carries no depth information reflecting the 3D relationship, the conversion from the image coordinate system to the world coordinate system cannot be completed. To solve this problem, the invention simplifies the human key-point model: with the x_w, y_w components of each key point known in world coordinates, each key point in the model shown below is assigned a custom height component z_w, which is combined with the known x_w, y_w components into the complete world coordinates (x_w, y_w, z_w) of the key point. Correspondingly, the projections of the belt danger area ROI in the front-view and top-view directions are shown in Figure 6.
To reduce system runtime, the human key-point model is simplified. In Figure 7, (a) is the key-point model predicted by the original system, with 25 key points in total; some key points of the original model are omitted, keeping those numbered 0, 2, 5, 4, 8, 7, 22 and 19, which simplifies it to the model shown in (b) of Figure 7.
In this model, the z_w component of point 0 is set to 1.6 m; points 1, 2 and 5 are all set to 1.3 m; points 4, 8 and 7 to 1 m; and points 22 and 19, being in the same plane as the belt, to 0 m. The projection effect is shown in Figure 8, where (a) is the top-view projection and (b) the front-view projection.
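The fixed height assignment and the two projections can be sketched as follows. The mapping from key-point indices to body-part names is an assumption drawn from the 9-key-point list given earlier in this document; the heights are the values stated above:

```python
# Simplified 9-key-point model with fixed height components z_w (metres):
# nose 1.6; chest and shoulders 1.3; hands and hip 1.0; feet 0.0 (belt plane).
KEYPOINT_HEIGHTS = {
    "nose": 1.6,
    "chest": 1.3, "right_shoulder": 1.3, "left_shoulder": 1.3,
    "right_hand": 1.0, "left_hand": 1.0, "hip": 1.0,
    "right_foot": 0.0, "left_foot": 0.0,
}

def project(points3d):
    """Top-view (x_w, y_w) and front-view (x_w, z_w) projections of a list
    of world-coordinate points (x_w, y_w, z_w)."""
    top = [(x, y) for x, y, _ in points3d]
    front = [(x, z) for x, _, z in points3d]
    return top, front
```

With the custom z_w attached, both projections reduce to dropping one coordinate, which is what allows the vertical person-belt relationship to be evaluated despite the monocular camera.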
After the human key-point model is simplified, if the minimum distances d_T and d_F between a key point and the danger area ROI in the front-view and top-view projection directions are both less than or equal to the safety distance threshold d, and h in the front view is less than the height of the belt danger area (1.5 m), the system assesses the action as unsafe and issues a warning.
5. Method for recognizing dangerous behavior of underground personnel
Because the position-based unsafe behavior assessment above cannot determine the specific type of dangerous action, such as a person falling beside the equipment, leaning on it or sitting on it, and these behaviors carry great safety hazards, recognizing the specific actions of underground belt transport personnel is an urgent problem.
The invention adds recognition of specific action types on top of the position-relationship-based dangerous action assessment, and sets different safety distance thresholds according to the degree of danger of each action.
By building a simple deep neural network classifier, the key-point information collected above can be classified by behavior: the key-point position information in each frame is combined into one sample corresponding to one action category. The classifier is trained with a large number of labeled key-point action samples so that it can recognize human actions in a single frame. In addition, since personnel actions in surveillance video are continuous and successive frames are strongly correlated, a continuous multi-frame probability judgment model is added on top of single-frame recognition: taking five consecutive frames as the judgment unit, the single-frame action classifier returns the classification results for the actions in these five frames, the counts of the different results are recorded, and the share of each result in the total is computed; the result with the largest share is the action classification of the five frames. The flow of the deep-neural-network-based behavior classifier for underground personnel is shown in Figure 9.
The unsafe actions to be recognized include falling, squatting and smoking, all of which affect the safety of belt transport personnel to different degrees; different safety factors are therefore set for the three actions: falling γ_1 = 2.0, squatting γ_2 = 1.5, smoking γ_3 = 1.3. Computing d_i = γ_i · d (i = 1, 2, 3) gives the safety distance threshold corresponding to each action. Combining behavior recognition with position assessment allows actions with different safety factors to be warned of within their corresponding safety distances, realizing early warning of dangerous actions and greatly improving the reliability of the safety early-warning system. Figures 10-12 show the safety-judgment schematics for the three actions in turn.
When the horizontal distances d_Ti and d_Fi between one of the three dangerous actions and the belt are less than the respective safety threshold d_i, and the vertical height h_i above the horizontal plane of the key point horizontally closest to the belt is less than the height of the belt ROI region, the system judges the state as unsafe behavior and raises an alarm.
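The per-action thresholds and the combined decision can be sketched as follows (the English action labels and function names are illustrative assumptions; the safety factors and the 1.5 m ROI height are the values stated above):

```python
SAFETY_FACTORS = {"fall": 2.0, "squat": 1.5, "smoke": 1.3}  # gamma_i per action

def action_threshold(action, d_base):
    """d_i = gamma_i * d: more dangerous actions get a larger warning radius."""
    return SAFETY_FACTORS[action] * d_base

def warn(action, d_T, d_F, h, d_base, roi_height=1.5):
    """Combine action recognition with the projection-based position test."""
    d_i = action_threshold(action, d_base)
    return d_T <= d_i and d_F <= d_i and h < roi_height
```

Scaling the base threshold by the action's danger level means a fall triggers an alarm from farther away than smoking, which is the early-warning behavior the description aims for.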
Figure 13 shows a specific implementation process of the invention: in Figure 13, (a) is the detection image of the belt danger area and the human body key points, (b) is the top view of the human body key points and the belt danger area, and (c) is the front view of the human body key points and the belt danger area.
The embodiments merely illustrate the technical idea of the invention and do not limit its scope of protection; any modification made on the basis of the technical solution in accordance with the technical idea proposed by the invention falls within the scope of protection of the invention.

Claims (8)

  1. A safety monitoring and early-warning method for the human-machine interaction behavior of underground mine belt transport personnel, characterized by comprising the following steps:
    (1) collecting a real-time underground video stream through a surveillance camera;
    (2) estimating the size of the belt in the video using the camera calibration principle, and delineating from it a three-dimensional ROI region, namely the belt danger area;
    (3) detecting the human body key points of everyone in the video, measuring the degree of association between key points through part affinity fields, and clustering the key points belonging to each individual using a bipartite-graph matching optimization, thereby detecting the body key points of every person in the video;
    (4) determining the x- and y-axis components of the detected key points in the world coordinate system, and assigning each key point a custom height component z; the 3 components combine into the complete world coordinates of the key point;
    (5) judging from the relative position of the belt danger area and each person's body key points whether the human-machine interaction behavior is safe, and hence whether a warning is needed.
  2. The safety monitoring and early-warning method for human-machine interaction behavior of underground mine belt transport personnel according to claim 1, characterized in that in step (3), each video frame is taken as input and deep features are extracted from it to obtain a feature map F; the feature map F is fed into step 1 of two convolutional neural networks: in step 1, the first network predicts a set of key-point confidence maps S^1 = ρ^1(F), where ρ^1 denotes the inference process of that network at step 1; the second network predicts a set of part affinity fields L^1 = φ^1(F), where φ^1 denotes the inference process of the second network at step 1, whose role is to cluster the predicted key points by person and by limb to obtain a complete set of body key-point information; the step-1 predictions of the two networks are then each concatenated with the original feature map F and fed into the subsequent steps to obtain more accurate predictions, the subsequent steps being expressed by:
    S^t = ρ^t(F, S^{t-1}, L^{t-1}),
    L^t = φ^t(F, S^{t-1}, L^{t-1}),
    where S^t and L^t are the confidence maps and part affinity fields obtained at step t, and ρ^t and φ^t are the inference processes of the two convolutional neural networks at step t.
  3. The safety monitoring and early-warning method for human-machine interaction behavior of underground mine belt transport personnel according to claim 2, characterized in that a mean-square-error loss function is applied after each step of the two convolutional neural networks, the loss functions of the two networks at step t being:
    f_S^t = Σ_j Σ_p W(p) · ||S_j^t(p) - S_j*(p)||²,
    f_L^t = Σ_c Σ_p W(p) · ||L_c^t(p) - L_c*(p)||²,
    where f_S^t and f_L^t are the loss functions of the two convolutional neural networks at step t; p is the coordinate of an arbitrary point in the image to be detected; W(p) is a Boolean value that is 0 when the annotation is absent from the training dataset and 1 otherwise; S_j^t(p) denotes the confidence map of the j-th body key point at point p at step t, and S_j*(p) the true confidence map; L_c^t(p) denotes the part affinity field at point p at step t, and L_c*(p) the true part affinity field;
    the true reference for the key-point confidence at any point p in the image is defined as:
    S*_{j,k}(p) = exp(-||p - x_{j,k}||² / σ²),
    where x_{j,k} is the true coordinate of the j-th body key point of the k-th person in the labeled training sample, and σ is a constant controlling the spread of the confidence Gaussian;
    taking the maximum then gives the confidence reference of the j-th body key point of the k-th person, S_j*(p) = max_k S*_{j,k}(p).
  4. The safety monitoring and early-warning method for human-machine interaction behavior of underground mine belt transport personnel according to claim 1, characterized in that a person contains 9 body key points, which represent the person's nose, chest, right shoulder, right hand, left shoulder, left hand, hip, right foot and left foot.
  5. The safety monitoring and early-warning method for human-machine interaction behavior of underground mine belt transport personnel according to claim 1, characterized in that in step (5), the front-view projection and top-view projection of the belt danger area are determined from the belt danger area obtained in step (2); for each person in the video, the minimum distance d_T between their body key points and the top-view projection of the belt danger area, the minimum distance d_F between the body key points and the front-view projection of the belt danger area, and the height h of the body key point are computed; if d_T and d_F are both less than or equal to the safety distance threshold d, and h is less than the height of the front-view projection of the belt danger area, the person's human-machine interaction behavior is judged unsafe and a warning is issued.
  6. The safety monitoring and early-warning method for human-machine interaction behavior of underground mine belt transport personnel according to claim 5, characterized in that in step (5), a deep neural network classifier is constructed and used to classify the detected body key-point information by action; the key-point position information in each frame is combined into one sample corresponding to one action category; the classifier is trained with a large number of labeled key-point action samples so that it can recognize human actions in a single frame, and the safety distance threshold d_i corresponding to each action is determined from the classifier's recognition result, where the subscript i denotes the i-th action class.
  7. The safety monitoring and early-warning method for human-machine interaction behavior of underground mine belt transport personnel according to claim 6, characterized in that, since personnel actions in surveillance video are continuous, a continuous multi-frame probability judgment model is added on top of single-frame action recognition: taking M consecutive pictures as the judgment unit, the single-frame action classifier returns the classification results for the actions in the M pictures, the counts of the different results are recorded, and finally the share of each result in the total is computed; the result with the largest share is the action classification of the M pictures.
  8. The safety monitoring and early-warning method for human-machine interaction behavior of underground mine belt transport personnel according to claim 6, characterized in that the action classification results comprise 3 classes: falling, squatting and smoking; a different safety factor γ_i is assigned to each of the 3 classes, and the respective safety distance thresholds are computed as d_i = γ_i · d, where i = 1, 2, 3; whether the person's human-machine interaction behavior under that action is safe is judged against the safety distance threshold.
PCT/CN2020/082006 2019-06-21 2020-03-30 Safety monitoring and early-warning method for human-machine interaction behavior of underground mine belt transport personnel WO2020253308A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CA3094424A CA3094424C (en) 2019-06-21 2020-03-30 Safety monitoring and early-warning method for man-machine interaction behavior of underground conveyor belt operator

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910540349.7 2019-06-21
CN201910540349.7A CN110425005B (zh) 2019-06-21 2019-06-21 Safety monitoring and early-warning method for human-machine interaction behavior of underground mine belt transport personnel

Publications (1)

Publication Number Publication Date
WO2020253308A1 true WO2020253308A1 (zh) 2020-12-24

Family

ID=68408462

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/082006 WO2020253308A1 (zh) 2019-06-21 2020-03-30 Safety monitoring and early-warning method for human-machine interaction behavior of underground mine belt transport personnel

Country Status (2)

Country Link
CN (1) CN110425005B (zh)
WO (1) WO2020253308A1 (zh)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110425005B (zh) * 2019-06-21 2020-06-30 中国矿业大学 Safety monitoring and early-warning method for human-machine interaction behavior of underground mine belt transport personnel
CN111126193A (zh) * 2019-12-10 2020-05-08 枣庄矿业(集团)有限责任公司蒋庄煤矿 Artificial-intelligence recognition system for unsafe behavior in underground coal mines based on deep learning
CN111310595B (zh) * 2020-01-20 2023-08-25 北京百度网讯科技有限公司 Method and apparatus for generating information
CN111325119B (zh) * 2020-02-09 2023-10-20 华瑞新智科技(北京)有限公司 Video monitoring method and system for safe production
CN111223261B (zh) * 2020-04-23 2020-10-27 佛山海格利德机器人智能设备有限公司 Composite intelligent production security system and security method thereof
CN111611971B (zh) * 2020-06-01 2023-06-30 城云科技(中国)有限公司 Behavior detection method and system based on a convolutional neural network
CN111832526A (zh) * 2020-07-23 2020-10-27 浙江蓝卓工业互联网信息技术有限公司 Behavior detection method and device
CN112347916B (zh) * 2020-11-05 2023-11-17 安徽继远软件有限公司 Safety monitoring method and device for electric power field operation based on video image analysis
CN112488005B (zh) * 2020-12-04 2022-10-14 临沂市新商网络技术有限公司 On-duty monitoring method and system based on human skeleton recognition and multi-angle conversion
CN113657309A (zh) * 2021-08-20 2021-11-16 山东鲁软数字科技有限公司 Adocf-based method for detecting violations of crossing safety fences
CN113610072B (zh) * 2021-10-11 2022-01-25 精英数智科技股份有限公司 Computer-vision-based method and system for recognizing personnel crossing a belt
CN114937230B (zh) * 2022-07-21 2022-10-04 海门市三德体育用品有限公司 Computer-vision-based method and system for assessing the risk of fitness movements
CN115131935A (zh) * 2022-08-30 2022-09-30 山东千颐科技有限公司 Alarm system for preventing entry into dangerous areas
CN115797874A (zh) * 2023-02-07 2023-03-14 常州海图信息科技股份有限公司 AI-based method, system, equipment and medium for supervising personnel riding a belt

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130050491A1 (en) * 2011-08-26 2013-02-28 Industrial Technology Research Institute Warning method and system for detecting lane-changing condition of rear-approaching vehicles
CN107506740A (zh) * 2017-09-04 2017-12-22 北京航空航天大学 Human behavior recognition method based on a three-dimensional convolutional neural network and a transfer learning model
CN109376673A (zh) * 2018-10-31 2019-02-22 南京工业大学 Method for recognizing unsafe behavior of underground coal mine personnel based on human posture estimation
CN110425005A (zh) * 2019-06-21 2019-11-08 中国矿业大学 Safety monitoring and early-warning method for human-machine interaction behavior of underground mine belt transport personnel

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU1716199A (en) * 1997-12-09 1999-06-28 Government Of The United States Of America, As Represented By The Secretary Of The Department Of Health And Human Services, The Remote monitoring safety system
CN2791257Y (zh) * 2005-02-03 2006-06-28 北京中矿安全技术有限公司 Safety explosion-proof anti-climbing system for mine belt conveyors
AU2009100016A4 (en) * 2009-01-12 2009-02-19 Beveridge, Todd M. Underground safety lifeline system
CN102761987A (zh) * 2012-06-21 2012-10-31 镇江中煤电子有限公司 Monitoring system for the mine belt conveyor transport process using wireless sensors
CN103986913B (zh) * 2014-05-26 2017-08-11 中国矿业大学 Dynamic machine-following video switching and monitoring system for a fully mechanized coal face
AU2017203411A1 (en) * 2016-06-01 2017-12-21 Strata Products Worldwide, Llc Method and apparatus for identifying when an idividual is in proximity to an object
CN207177958U (zh) * 2017-03-21 2018-04-03 中国矿业大学(北京) Personnel injury early-warning system for underground coal mines
CN207297100U (zh) * 2017-09-30 2018-05-01 北京瑞赛长城航空测控技术有限公司 Personnel safety monitoring system for dangerous areas of coal mines
CN107939445B (zh) * 2017-11-01 2020-04-03 太原理工大学 Integrated ultrasonic and infrared early-warning device for human proximity in dangerous underground areas
CN108564022A (zh) * 2018-04-10 2018-09-21 深圳市唯特视科技有限公司 Multi-person posture detection method based on a localization-classification-regression network
CN208316750U (zh) * 2018-05-18 2019-01-01 中国神华能源股份有限公司 Comprehensive coal mine information monitoring and publishing system

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130050491A1 (en) * 2011-08-26 2013-02-28 Industrial Technology Research Institute Warning method and system for detecting lane-changing condition of rear-approaching vehicles
CN107506740A (zh) * 2017-09-04 2017-12-22 北京航空航天大学 Human behavior recognition method based on a three-dimensional convolutional neural network and a transfer learning model
CN109376673A (zh) * 2018-10-31 2019-02-22 南京工业大学 Method for recognizing unsafe behavior of underground coal mine personnel based on human posture estimation
CN110425005A (zh) * 2019-06-21 2019-11-08 中国矿业大学 Safety monitoring and early-warning method for human-machine interaction behavior of underground mine belt transport personnel

Also Published As

Publication number Publication date
CN110425005A (zh) 2019-11-08
CN110425005B (zh) 2020-06-30

Similar Documents

Publication Publication Date Title
WO2020253308A1 (zh) Safety monitoring and early-warning method for human-machine interaction behavior of underground mine belt transport personnel
CN110502965B (zh) Construction safety helmet wearing monitoring method based on computer-vision human posture estimation
CA3094424C (en) Safety monitoring and early-warning method for man-machine interaction behavior of underground conveyor belt operator
CN109492581B (zh) 一种基于tp-stg框架的人体动作识别方法
CN109522793B (zh) 基于机器视觉的多人异常行为检测与识别方法
CN109670441B (zh) 一种实现安全帽穿戴识别的方法、系统、终端以及计算机可读存储介质
CN104217419B (zh) 人体检测装置及方法与人体计数装置及方法
CN102521565B (zh) 低分辨率视频的服装识别方法及系统
CN109819208A (zh) 一种基于人工智能动态监控的密集人群安防监控管理方法
CN103390164B (zh) 基于深度图像的对象检测方法及其实现装置
CN103310444B (zh) 一种基于头顶摄像头的监控行人计数的方法
CN105303191A (zh) 一种前视监视场景下的行人计数方法和装置
CN109255298A (zh) 一种动态背景中的安全帽检测方法与系统
CN106128022A (zh) 一种智慧金睛识别暴力动作报警方法和装置
CN106846297A (zh) 基于激光雷达的行人流量检测系统及方法
Hermina et al. A Novel Approach to Detect Social Distancing Among People in College Campus
CN106845361B (zh) 一种行人头部识别方法及系统
Liu et al. Metro passenger flow statistics based on yolov3
CN112382068B (zh) 基于bim与dnn的车站候车线跨越检测系统
Xiao et al. Facial mask detection system based on YOLOv4 algorithm
Peng et al. Helmet wearing recognition of construction workers using convolutional neural network
CN107240111A (zh) 边沿连通分割客流统计方法
CN113076825A (zh) 一种变电站工作人员爬高安全监测方法
Ding et al. An Intelligent System for Detecting Abnormal Behavior in Students Based on the Human Skeleton and Deep Learning
Tao Statistical calculation of dense crowd flow antiobscuring method considering video continuity

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20827073

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20827073

Country of ref document: EP

Kind code of ref document: A1