WO2023279531A1 - Method for counting withdrawn drill rods in drilling videos based on human body posture recognition - Google Patents

Method for counting withdrawn drill rods in drilling videos based on human body posture recognition

Info

Publication number
WO2023279531A1
WO2023279531A1 · PCT/CN2021/118738
Authority
WO
WIPO (PCT)
Prior art keywords
human body
drilling
coordinates
video
drill pipe
Prior art date
Application number
PCT/CN2021/118738
Other languages
English (en)
French (fr)
Inventor
姚超修
吴航海
胡亚磊
谢浩
武福生
蒋泽
蒋志龙
陈佩佩
王琪
郝东波
徐晓华
胡金成
曹宁宁
Original Assignee
天地(常州)自动化股份有限公司
中煤科工集团常州研究院有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 天地(常州)自动化股份有限公司, 中煤科工集团常州研究院有限公司 filed Critical 天地(常州)自动化股份有限公司
Publication of WO2023279531A1 publication Critical patent/WO2023279531A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30Computing systems specially adapted for manufacturing

Definitions

  • the present invention relates to the technical field of intelligent image recognition, in particular to a method for counting withdrawn drill rods in drilling videos based on human body posture recognition.
  • intelligent image recognition technology uses the digital images collected by mine cameras and analyzes them with an intelligent algorithm embedded in the camera or an algorithm on a back-end server, perceiving the video content and then, according to preset rules, judging and identifying the corresponding targets and raising the corresponding alarms. Because intelligent video recognition uses contactless detection, it offers a wide detection range and low detection cost, and it can work 24 hours a day, greatly improving work efficiency.
  • the technical problem to be solved by the present invention is: to overcome the deficiencies of the prior art by providing a method for counting withdrawn drill rods in drilling videos based on human body posture recognition, which detects the continuous action of workers removing drill pipes through posture recognition and automatically counts the pipes taken out, thereby improving the accuracy of drill-pipe counting by intelligent video analysis.
  • the technical solution adopted by the present invention to solve this problem is a method for counting withdrawn drill rods in drilling videos based on human posture recognition, with the following specific steps:
  • Step 1 Data collection: use the front-end mine-use intrinsically safe camera to record rod withdrawal at the drilling face, thereby collecting video data;
  • Step 2 Data preprocessing and label making: the video data is transmitted to the ground server through the ring network, and the ground server analyzes and processes the video data;
  • Step 3 Train the AlphaPose model with human keypoint detection together with the drill-pipe, drilling-rig and human-body recognition models;
  • Step 4 Detect the drill pipe, frame each detected pipe, and record the relevant parameters of the target frame; at the same time detect persons, detect human skeleton keypoints for each detected person, and record the coordinates of the human keypoints;
  • Step 5 The back-end server algorithm judges which drill pipes were actually taken by jointly detecting whether the worker grasps a pipe and whether a carrying motion occurs: check whether the hand keypoint coordinates overlap the framed drill-pipe region. When the hand keypoint coordinates do not overlap the framed region, repeat step 4; when the arm keypoint coordinates overlap it, judge from the motion trajectory of the whole-body keypoints whether a pipe-carrying action exists. When the trajectory shows no carrying action, the pipe count remains unchanged; when it shows a carrying action, add 1 to the count of pipes already taken.
  • the relevant parameters of the target frame include the center point position of the frame, the length of the frame and the height of the frame.
  • the coordinates of the key points of the human body include the coordinates of the human head, shoulders, hands, knees and feet.
  • in step 2 the labelImg tool is used to label the collected image data, with every image of the same class labeled with its corresponding category.
  • a method for counting rods withdrawn by drilling video based on human posture recognition of the present invention has the following advantages:
  • the number of drill pipes can be directly counted by analyzing video recordings, avoiding long-term and high-intensity manual counting;
  • the number of withdrawn drill pipes can be counted precisely, with a very high accuracy rate;
  • Fig. 1 is the algorithm flowchart of the present invention
  • Fig. 2 is a schematic diagram of the principle of the present invention.
  • Fig. 3 is a schematic diagram of the algorithm effect of the present invention.
  • Fig. 4 is the second schematic diagram of the algorithm effect of the present invention.
  • Fig. 5 is the third schematic diagram of the algorithm effect of the present invention.
  • the drilling-video rod-withdrawal counting method based on human posture recognition of the present invention comprises a front-end mine-use intrinsically safe camera, a ring network and a back-end server, where the front-end camera provides functions such as autofocus, strong-light suppression and fill light.
  • the front-end mining intrinsically safe camera is at least 2 million pixels, and the protection level is at least IP54.
  • the algorithm of the back-end server adopts the AlphaPose model with human keypoint detection together with the drill-pipe, drilling-rig and human-body recognition models to detect drill pipes and human skeleton keypoints simultaneously, and then judges from the body's motion trajectory whether a pipe is being carried, so as to count accurately.
  • the front-end mining intrinsically safe camera is installed on the underground drilling face and is used to record the video when drilling and withdrawing the rod.
  • the data collected by the front-end mining intrinsically safe camera is transmitted to the ground through the ring network, and the data is analyzed by the algorithm through the back-end server to detect the number of drill pipes obtained by the workers in the video, so as to achieve the function of automatic counting.
  • the specific principle of the drilling-video rod-withdrawal counting method based on human body posture recognition is as follows: first the downhole camera captures the rod-withdrawal video, then the industrial ring network transmits the data, and finally the back-end algorithm completes the intelligent count.
  • Step 1 Data collection: use the front-end mine-use intrinsically safe camera to record rod withdrawal at the drilling face, thereby collecting video data.
  • Step 2 Data preprocessing and label making: the video data is transmitted to the ground server through the ring network, and the ground server analyzes and processes the video data.
  • the labelImg tool is used to label the collected picture data, with every picture of the same class labeled with its corresponding category.
  • the equipment in the mine is labeled "machine"
  • persons are labeled "person"
  • the target drill pipe is labeled "object", etc.
  • the human keypoint labels follow the Keypoint evaluation of the MS COCO dataset.
  • Step 3 Train the AlphaPose model with human body key point detection function and the drill pipe, drilling rig, and human body recognition models.
  • Step 4 detect the drill pipe, frame each detected pipe, and record the relevant parameters of the target frame, which include the position of the frame's center point, the frame's length and the frame's height; at the same time detect persons, detect human skeleton keypoints for each detected person, and record the coordinates of the human keypoints.
  • the coordinates of the key points of the human body include the coordinates of the human head, the human shoulders, the human hands, the human knees and the human feet.
  • Step 5 The back-end server algorithm judges which drill pipes were actually taken by jointly detecting whether the worker has grasped a pipe and whether a carrying motion occurs, so as to avoid missed and false detections: check whether the hand keypoint coordinates overlap the framed drill-pipe region. When they do not overlap, repeat step 4; when the arm keypoint coordinates overlap the framed region, judge from the motion trajectory of the whole-body keypoints whether a pipe-carrying action exists; when no carrying action is found, the pipe count remains unchanged; when one is found, add 1 to the count of pipes already taken.
  • the bounding-box coordinates of the detected drill pipe are output as (X1, Y1, X2, Y2), where (X1, Y1) is the upper-left corner of the object frame and (X2, Y2) the lower-right corner.
  • the frame's center point, length and height can then be calculated.
  • the coordinates of the human keypoints are likewise a set of (x, y) position coordinates, so subsequent detection reduces to logical tests between coordinates; the overlap referred to here means that the IoU between the coordinates, or between bounding boxes, exceeds a threshold.
  • STN (Spatial Transformer Network): for an irregular human image input, the STN operation yields an accurate human frame. It takes candidate regions as input and outputs high-quality candidate regions, i.e. it anchors a bounding frame on the human image data in the video stream. Since the people in the video stream are constantly moving, the decoded human images are irregular in shape; the STN operates on the image data, allowing the neural network to learn how to perform a spatial transformation on the input image and so enhancing the geometric invariance of the model.
  • STN is a 2D affine transformation, defined as:

    $$\begin{pmatrix} x_i^{s} \\ y_i^{s} \end{pmatrix} = \begin{bmatrix} \theta_1 & \theta_2 & \theta_3 \end{bmatrix} \begin{pmatrix} x_i^{t} \\ y_i^{t} \\ 1 \end{pmatrix} \tag{1}$$

  • where i denotes the i-th coordinate point in the image data; the superscript s marks the new (transformed) coordinates and t the original ones; (x_i^t, y_i^t) are the abscissa and ordinate of a pixel in the original person image data before transformation, the 1 being the default third coordinate component of a pixel before the 2D affine transformation; θ_1, θ_2 and θ_3 are the transformation parameters.
  • SPPE (Single Person Pose Estimation): the full name of SPPE is single person pose estimator.
  • SDTN (Spatial De-Transformer Network): maps the estimated pose back to the original image coordinates.
  • SDTN is defined as:

    $$\begin{pmatrix} x_i^{t} \\ y_i^{t} \end{pmatrix} = \begin{bmatrix} \gamma_1 & \gamma_2 & \gamma_3 \end{bmatrix} \begin{pmatrix} x_i^{s} \\ y_i^{s} \\ 1 \end{pmatrix} \tag{2}$$

  • where γ_1, γ_2 and γ_3 are transformation parameters, related to θ_1, θ_2 and θ_3 by:

    $$[\gamma_1\ \gamma_2] = [\theta_1\ \theta_2]^{-1} \tag{3}$$

    $$\gamma_3 = -1 \times [\gamma_1\ \gamma_2]\,\theta_3 \tag{4}$$

  • Pose-NMS (parametric pose non-maximum suppression): eliminates redundant estimated poses.
  • let the i-th pose consist of m joint points, where i and m are both positive integers greater than or equal to 1; the i-th pose is defined as the set

    $$P_i = \{\, (k_i^{j}, c_i^{j}) : j = 1, \dots, m \,\}$$

  • where k is the location, denoting a joint anchor point, and c is the score, denoting the pose confidence of the current anchor point.
  • Elimination process: the pose with the highest score is used as the benchmark, and poses close to the benchmark pose are repeatedly eliminated until a single pose remains.
  • Elimination criterion: the criterion used to repeatedly eliminate the remaining poses is:

    $$f(P_i, P_j \mid \Lambda, \eta) = \mathbb{1}\left[ d(P_i, P_j \mid \Lambda, \lambda) \le \eta \right] \tag{5}$$

  • where f denotes the elimination criterion: when the output is 1 the current pose P_i is deleted, otherwise it is retained. P_i and P_j denote different poses; Λ is the parameter set of the pose distance metric; η is the threshold; d is the pose distance metric; λ is the weight balancing pose distance and spatial distance; f(·) denotes the elimination criterion as a whole and d(·) the pose distance metric as a whole, which combines pose distance and spatial distance. If d(·) is not greater than η, f(·) outputs 1, indicating that P_i must be eliminated because it is too similar to the benchmark pose P_j. The metric is defined as:

    $$d(P_i, P_j \mid \Lambda) = K_{\mathrm{sim}}(P_i, P_j \mid \sigma_1) + \lambda\, H_{\mathrm{sim}}(P_i, P_j \mid \sigma_2) \tag{6}$$

  • where K_sim is the soft matching function, i.e. the similarity between different features, σ_1 and σ_2 are the learning rules, i.e. the initialization of the gradient, and Λ = {σ_1, σ_2, λ}.
  • the pose distance is used to eliminate poses that are too close to and too similar to other poses. Let B_i be the bbox of P_i, representing the position information of the frame selected for pose P_i. The soft matching formula (the score similarity between different features) is defined as:

    $$K_{\mathrm{sim}}(P_i, P_j \mid \sigma_1) = \begin{cases} \sum_{n} \tanh\dfrac{c_i^{n}}{\sigma_1} \cdot \tanh\dfrac{c_j^{n}}{\sigma_1}, & \text{if } k_j^{n} \text{ is within } \mathcal{B}(k_i^{n}) \\ 0, & \text{otherwise} \end{cases} \tag{7}$$
  • the purpose of the present invention is to intelligently analyze the drilling rod-withdrawal video at the back end, detect the continuous action of the worker removing a drill pipe through human body posture recognition, and automatically count the drill pipes taken out, instead of recognizing only the features of the 2-3 video frames in which the worker's hand takes the pipe, thereby improving the accuracy of drill-pipe counting by intelligent video analysis.
  • the advantages of the present invention are: (1) the number of drill pipes can be counted directly by analyzing video recordings, avoiding long, high-intensity manual counting; (2) by detecting whether the worker grasps a pipe and whether a carrying motion follows, the number of withdrawn pipes is counted precisely, with a very high accuracy rate; (3) it suits retrofit plans for the general-purpose cameras already installed at mine drilling faces: only intelligent analysis of the recordings on the back-end server is required, so the retrofit cost is low and the construction steps are simple.


Abstract

The invention discloses a method for counting withdrawn drill rods in drilling videos based on human body posture recognition. A front-end mine-use intrinsically safe camera captures video of rod withdrawal at the drilling face, yielding video data; the video data is transmitted over a ring network to a ground server, which analyzes and processes it; an AlphaPose model with human keypoint detection and recognition models for the drill pipe, drilling rig and human body are trained; drill pipes are detected and framed, and the relevant parameters of the target frame are recorded; persons are detected, human skeleton keypoints are detected for each person, and the keypoint coordinates are recorded; the back-end server algorithm judges which drill pipes were actually taken by jointly detecting whether a worker grasps a pipe and whether a carrying motion occurs. By recognizing the continuous action of a worker removing a drill pipe through human posture recognition, the method automatically counts the pipes removed, improving the accuracy of drill-pipe counting by intelligent video analysis.

Description

Method for counting withdrawn drill rods in drilling videos based on human body posture recognition
Technical field
The present invention relates to the technical field of intelligent image recognition, in particular to a method for counting withdrawn drill rods in drilling videos based on human body posture recognition.
Background art
With the spread of underground video surveillance, intelligent image recognition is used ever more widely in coal mines. Intelligent image recognition technology takes the digital images captured by mine cameras and analyzes them with an intelligent algorithm embedded in the camera or an algorithm on a back-end server, perceiving the video content and then, according to preset rules, judging and identifying the corresponding targets and raising the corresponding alarms. Because intelligent video recognition uses contactless detection, it offers a wide detection range and low detection cost, and it can work 24 hours a day, greatly improving work efficiency.
At most coal-mine sites, however, rod withdrawal is still counted by staff on the surface replaying recordings. The manual push-button pipe-counting procedure provided for workers at underground drilling faces is cumbersome and ineffective, and cannot count reliably. Each recording is typically 1-2 hours long, the underground working environment is harsh and dimly lit, and staff must stay focused throughout the replay; after long stretches of continuous work, fatigue easily causes missed or false detections.
Some intelligent video analysis systems do count drill pipes automatically, but their results are unsatisfactory. The main reason is that these methods capture only the few frames around the moment a worker takes a pipe and use a neural network to extract the instantaneous features of the grasp: as soon as a worker's hand is seen touching the end drill pipe, the count is incremented. In actual work, however, workers often merely steady a pipe with a hand, or pipes appear misaligned and overlapping, without any pipe actually being removed; counting in such cases produces false detections.
Summary of the invention
The technical problem to be solved by the present invention is: to overcome the deficiencies of the prior art by providing a method for counting withdrawn drill rods in drilling videos based on human body posture recognition, which detects the continuous action of a worker removing a drill pipe through posture recognition and automatically counts the pipes taken out, thereby improving the accuracy of drill-pipe counting by intelligent video analysis.
The technical solution adopted by the present invention to solve this problem is a method for counting withdrawn drill rods in drilling videos based on human body posture recognition, with the following specific steps:
Step 1, data acquisition: use the front-end mine-use intrinsically safe camera to record rod withdrawal at the drilling face, thereby collecting video data;
Step 2, data preprocessing and labeling: the video data is transmitted over the ring network to the ground server, which analyzes and processes it;
Step 3, train the AlphaPose model with human keypoint detection together with the drill-pipe, drilling-rig and human-body recognition models;
Step 4, detect drill pipes, frame each detected pipe and record the relevant parameters of the target frame; at the same time detect persons, detect human skeleton keypoints for each detected person, and record the coordinates of the human keypoints;
Step 5, the back-end server algorithm judges which drill pipes were actually taken by jointly detecting whether the worker grasps a pipe and whether a carrying motion occurs: check whether the hand keypoint coordinates overlap the framed drill-pipe region. When they do not overlap, repeat step 4; when the arm keypoint coordinates overlap the framed region, judge from the motion trajectory of the whole-body keypoints whether a pipe-carrying action exists; when no carrying action is found, the pipe count remains unchanged; when one is found, add 1 to the count of pipes already taken.
More specifically, in the above technical solution, in step 4 the relevant parameters of the target frame include the position of the frame's center point, the frame's length and the frame's height.
More specifically, in the above technical solution, in step 4 the coordinates of the human keypoints include the coordinates of the head, shoulders, hands, knees and feet.
More specifically, in the above technical solution, in step 2 the labelImg tool is used to label the collected image data, with every image of the same class labeled with its corresponding category.
The beneficial effects of the present invention are that the method for counting withdrawn drill rods in drilling videos based on human body posture recognition has the following advantages:
1. The number of drill pipes can be counted directly by analyzing video recordings, avoiding long, high-intensity manual counting;
2. By detecting whether the worker grasps a pipe and analyzing whether the body's motion trajectory shows a carrying action, the number of withdrawn pipes is counted precisely, with a very high accuracy rate;
3. The method suits retrofit plans for the general-purpose cameras already installed at mine drilling faces: only intelligent analysis of the recordings on a back-end server is required, so the retrofit cost is low and the construction steps are simple.
Brief description of the drawings
To explain the embodiments of the present invention or the prior art more clearly, the drawings needed for their description are briefly introduced below. The drawings described below are obviously only some of the embodiments recorded in the present application; those of ordinary skill in the art can derive further drawings from them without inventive effort.
Fig. 1 is the algorithm flowchart of the present invention;
Fig. 2 is a schematic diagram of the principle of the present invention;
Fig. 3 is a first schematic diagram of the algorithm's effect;
Fig. 4 is a second schematic diagram of the algorithm's effect;
Fig. 5 is a third schematic diagram of the algorithm's effect.
Detailed description of the embodiments
To make the purpose, technical solution and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the invention; all other embodiments obtained by those of ordinary skill in the art without inventive effort fall within the scope of protection of the invention.
As shown in Figs. 1-5, the method for counting withdrawn drill rods in drilling videos based on human body posture recognition of the present invention comprises a front-end mine-use intrinsically safe camera, a ring network and a back-end server. The front-end camera provides functions such as autofocus, strong-light suppression and fill light, has at least 2 megapixels, and has a protection rating of at least IP54. The back-end server's algorithm adopts the AlphaPose model with human keypoint detection together with the drill-pipe, drilling-rig and human-body recognition models to detect drill pipes and human skeleton keypoints simultaneously, then judges from the body's motion trajectory whether a pipe is being carried, so as to count accurately. The front-end camera is installed at the underground drilling face and records the rod-withdrawal video. The data it collects is sent to the surface over the ring network and analyzed algorithmically by the back-end server, which detects the number of pipes the workers take in the video, achieving automatic counting.
As shown in Fig. 2, the specific principle of the method is as follows: first the downhole camera captures the rod-withdrawal video, then the industrial ring network transmits the data, and finally the back-end algorithm completes the intelligent count.
As shown in Fig. 1, the specific steps of the method are as follows:
Step 1, data acquisition: use the front-end mine-use intrinsically safe camera to record rod withdrawal at the drilling face, thereby collecting video data.
Step 2, data preprocessing and labeling: the video data is transmitted over the ring network to the ground server, which analyzes and processes it. In the experiments the labelImg tool labels the collected images, every image of the same class receiving the label of its category: equipment in the mine is labeled "machine", persons are labeled "person", the target drill pipe is labeled "object", and so on. The human keypoint labels follow the Keypoint evaluation of the MS COCO dataset.
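For reference, the MS COCO Keypoint format mentioned above defines 17 keypoints per person. The mapping of the patent's head/shoulder/hand/knee/foot points onto COCO names sketched below is my reading of the format, not something the source states explicitly:

```python
# The 17 keypoint categories of the MS COCO Keypoint annotation format.
# In COCO terms, the patent's head points map onto nose/eyes/ears, hands
# onto wrists, and feet onto ankles (an assumed mapping, for illustration).
COCO_KEYPOINTS = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]

# Indices of the wrist keypoints, the closest COCO equivalent of the
# "hand keypoints" whose overlap with the drill-pipe frame is tested in step 5.
WRIST_IDS = [COCO_KEYPOINTS.index("left_wrist"),
             COCO_KEYPOINTS.index("right_wrist")]
```

With these indices, the step-5 overlap test only needs to read two rows of each detected person's keypoint array.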
Step 3, train the AlphaPose model with human keypoint detection together with the drill-pipe, drilling-rig and human-body recognition models.
Step 4, detect drill pipes, frame each detected pipe and record the relevant parameters of the target frame, namely the position of the frame's center point, the frame's length and the frame's height; at the same time detect persons, detect human skeleton keypoints for each detected person, and record the coordinates of the human keypoints, namely the coordinates of the head, shoulders, hands, knees and feet.
Step 5, the back-end server algorithm judges which drill pipes were actually taken by jointly detecting whether the worker grasps a pipe and whether a carrying motion occurs, avoiding missed and false detections: check whether the hand keypoint coordinates overlap the framed drill-pipe region. When they do not overlap, repeat step 4; when the arm keypoint coordinates overlap the framed region, judge from the motion trajectory of the whole-body keypoints whether a pipe-carrying action exists; when no carrying action is found, the pipe count remains unchanged; when one is found, add 1 to the count of pipes already taken.
For example, once the algorithm detects a drill pipe it outputs the pipe's bounding-box coordinates (X1, Y1, X2, Y2), where (X1, Y1) is the upper-left corner of the object frame and (X2, Y2) the lower-right corner; the frame's center point, length and height then follow by calculation. Likewise, each human keypoint is an (x, y) position coordinate, so subsequent detection reduces to logical tests between coordinates. The overlap referred to here means that the IoU between the coordinates, or between bounding boxes, exceeds a threshold.
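The geometric tests described above can be sketched as follows. This is a minimal illustration with hypothetical function names; the patent does not specify its IoU threshold or the exact overlap rule:

```python
def frame_center_and_size(box):
    """Center point, length (width) and height of an (X1, Y1, X2, Y2) frame."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2), x2 - x1, y2 - y1

def iou(a, b):
    """Intersection-over-union of two (X1, Y1, X2, Y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def hand_touches_pipe(hand_xy, pipe_box):
    """Overlap test for a hand keypoint (x, y) against the drill-pipe frame."""
    x, y = hand_xy
    return pipe_box[0] <= x <= pipe_box[2] and pipe_box[1] <= y <= pipe_box[3]
```

A point-in-box test is used for keypoints and IoU for frame pairs, matching the two kinds of overlap the paragraph mentions.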
The details of the back-end server algorithm break down into the following parts:
(1) STN (Spatial Transformer Network): for an irregular human image input, the STN operation yields an accurate human frame. It takes candidate regions as input and outputs high-quality candidate regions, i.e. it anchors a bounding frame on the human image data in the video stream. Since the people in a video stream are constantly moving, the decoded human images are irregular in shape; the present invention therefore applies an STN to the image data, allowing the neural network to learn how to perform a spatial transformation on the input image and so enhancing the geometric invariance of the model.
The STN is a 2D affine transformation, defined as:

$$\begin{pmatrix} x_i^{s} \\ y_i^{s} \end{pmatrix} = \begin{bmatrix} \theta_1 & \theta_2 & \theta_3 \end{bmatrix} \begin{pmatrix} x_i^{t} \\ y_i^{t} \\ 1 \end{pmatrix} \tag{1}$$

where i denotes the i-th coordinate point in the image data; the superscript s marks the new (transformed) coordinates and t the original ones; (x_i^s, y_i^s) are the transformed abscissa and ordinate in the person image data; (x_i^t, y_i^t) are the abscissa and ordinate of a pixel in the original person image data before transformation, the 1 being the default third coordinate component of a pixel before the 2D affine transformation; and θ_1, θ_2 and θ_3 are the transformation parameters.
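Equation (1) can be exercised numerically. The parameter values below are illustrative only, not taken from the patent:

```python
import numpy as np

# [theta1 theta2 theta3] packed as a 2x3 matrix; this example combines a
# uniform scale of 2 with a translation of (5, -3).
theta = np.array([[2.0, 0.0,  5.0],
                  [0.0, 2.0, -3.0]])

def stn_transform(theta, x_t, y_t):
    """Apply (x_s, y_s)^T = [theta1 theta2 theta3] (x_t, y_t, 1)^T."""
    return theta @ np.array([x_t, y_t, 1.0])

x_s, y_s = stn_transform(theta, 2.0, 4.0)
print(x_s, y_s)  # -> 9.0 5.0
```

The appended 1 is what lets a single matrix product express both the linear part (θ_1, θ_2) and the translation (θ_3).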
(2) SPPE (single person pose estimation): the full name of SPPE is single person pose estimator.
(3) SDTN (spatial de-transformer network): maps the estimated pose back to the original image coordinates.
SDTN is defined as:

$$\begin{pmatrix} x_i^{t} \\ y_i^{t} \end{pmatrix} = \begin{bmatrix} \gamma_1 & \gamma_2 & \gamma_3 \end{bmatrix} \begin{pmatrix} x_i^{s} \\ y_i^{s} \\ 1 \end{pmatrix} \tag{2}$$

where γ_1, γ_2 and γ_3 are transformation parameters, related to θ_1, θ_2 and θ_3 by:

$$[\gamma_1\ \gamma_2] = [\theta_1\ \theta_2]^{-1} \tag{3}$$

$$\gamma_3 = -1 \times [\gamma_1\ \gamma_2]\,\theta_3 \tag{4}$$
(4) Pose-NMS: eliminates redundant estimated poses. The full name of Pose-NMS is parametric pose non-maximum suppression, understood here as the elimination of redundant pose estimates.
Definition: let the i-th pose consist of m joint points, where i and m are both positive integers greater than or equal to 1; the i-th pose is defined as the set

$$P_i = \{\, (k_i^{j}, c_i^{j}) : j = 1, \dots, m \,\}$$

where k is the location, denoting a joint anchor point, and c is the score, denoting the pose confidence of the current anchor point.
Elimination process: the pose with the highest score is taken as the benchmark, and poses close to the benchmark pose are repeatedly eliminated until a single pose remains. Elimination criterion: the criterion used to repeatedly eliminate the remaining poses is:

$$f(P_i, P_j \mid \Lambda, \eta) = \mathbb{1}\left[ d(P_i, P_j \mid \Lambda, \lambda) \le \eta \right] \tag{5}$$

where f denotes the elimination criterion: when the output is 1 the current pose P_i is deleted, otherwise it is retained. P_i and P_j denote different poses; Λ is the parameter set of the pose distance metric; η is the threshold; d is the pose distance metric; λ is the weight balancing pose distance and spatial distance; f(·) denotes the elimination criterion as a whole and d(·) the pose distance metric as a whole, which combines pose distance and spatial distance. If d(·) is not greater than η, f(·) outputs 1, indicating that P_i must be eliminated because it is too similar to the benchmark pose P_j. The metric is defined as:

$$d(P_i, P_j \mid \Lambda) = K_{\mathrm{sim}}(P_i, P_j \mid \sigma_1) + \lambda\, H_{\mathrm{sim}}(P_i, P_j \mid \sigma_2) \tag{6}$$

where K_sim is the soft matching function, i.e. the similarity between different features, σ_1 and σ_2 are the learning rules, i.e. the initialization of the gradient, and Λ = {σ_1, σ_2, λ}.
The pose distance is used to eliminate poses that are too close to and too similar to other poses. Let B_i be the bbox of P_i, representing the position information of the frame selected for pose P_i. The soft matching formula (the score similarity between different features) is defined as:

$$K_{\mathrm{sim}}(P_i, P_j \mid \sigma_1) = \begin{cases} \sum_{n} \tanh\dfrac{c_i^{n}}{\sigma_1} \cdot \tanh\dfrac{c_j^{n}}{\sigma_1}, & \text{if } k_j^{n} \text{ is within } \mathcal{B}(k_i^{n}) \\ 0, & \text{otherwise} \end{cases} \tag{7}$$

where i and j index different poses and n runs over the joint points; "is within" means that when a joint point of pose P_j falls inside the box B(k_i^n) it is considered for elimination, and "otherwise" that it is not; B(k_i^n) is a box concentric with the corresponding joint coordinates of pose P_i, each of whose coordinate dimensions (length and width) is 1/10 of the original frame B_i.
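The greedy elimination loop around equation (5) can be sketched as follows. The real criterion combines score soft-matching and spatial terms per equations (6)-(7); the mean joint distance used below is a simplified stand-in, not the patent's exact metric, and the function name and default threshold are assumptions:

```python
def pose_nms(poses, eta=10.0):
    """Greedy parametric pose NMS sketch.

    Each pose is a (score, joints) pair, with joints a list of (x, y)
    keypoints. The highest-score pose is kept as the benchmark, and every
    remaining pose whose distance to it is <= eta is eliminated; the loop
    repeats on what is left, mirroring the elimination process above.
    """
    def mean_joint_dist(a, b):
        # Simplified stand-in for d(.) of equation (6).
        return sum(((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5
                   for (ax, ay), (bx, by) in zip(a, b)) / len(a)

    remaining = sorted(poses, key=lambda p: p[0], reverse=True)
    kept = []
    while remaining:
        benchmark = remaining.pop(0)  # highest remaining score
        kept.append(benchmark)
        remaining = [p for p in remaining
                     if mean_joint_dist(p[1], benchmark[1]) > eta]
    return kept
```

Two near-duplicate detections of the same worker thus collapse to the single best-scored pose, while a second worker elsewhere in the frame survives.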
Once the specific human keypoints have been located, their motion trajectories determine whether a carrying action exists.
The purpose of the present invention is to intelligently analyze the drilling rod-withdrawal video at the back end, detect the continuous action of the worker removing a drill pipe through human body posture recognition, and automatically count the drill pipes taken out, instead of recognizing only the features of the 2-3 video frames in which the worker's hand takes the pipe, thereby improving the accuracy of drill-pipe counting by intelligent video analysis. The advantages of the present invention are: (1) the number of drill pipes can be counted directly by analyzing video recordings, avoiding long, high-intensity manual counting; (2) by detecting whether the worker grasps a pipe and analyzing whether the body's motion trajectory shows a carrying action, the number of withdrawn pipes is counted precisely, with a very high accuracy rate; (3) it suits retrofit plans for the general-purpose cameras already installed at mine drilling faces: only intelligent analysis of the recordings on the back-end server is required, so the retrofit cost is low and the construction steps are simple.
The above is only a preferred embodiment of the present invention, and the scope of protection is not limited to it; any equivalent substitution or modification made by a person familiar with this technical field within the technical scope disclosed by the invention, according to its technical solution and inventive concept, shall be covered by the scope of protection of the invention.

Claims (4)

  1. A method for counting withdrawn drill rods in drilling videos based on human body posture recognition, characterized by the following specific steps:
    Step 1, data acquisition: use a front-end mine-use intrinsically safe camera to record rod withdrawal at the drilling face, thereby collecting video data;
    Step 2, data preprocessing and labeling: the video data is transmitted over a ring network to a ground server, which analyzes and processes it;
    Step 3, train an AlphaPose model with human keypoint detection together with drill-pipe, drilling-rig and human-body recognition models;
    Step 4, detect drill pipes, frame each detected pipe and record the relevant parameters of the target frame; at the same time detect persons, detect human skeleton keypoints for each detected person, and record the coordinates of the human keypoints;
    Step 5, the back-end server algorithm judges which drill pipes were actually taken by jointly detecting whether the worker grasps a pipe and whether a carrying motion occurs: check whether the hand keypoint coordinates overlap the framed drill-pipe region; when they do not overlap, repeat step 4; when the arm keypoint coordinates overlap the framed region, judge from the motion trajectory of the whole-body keypoints whether a pipe-carrying action exists; when no carrying action is found, the pipe count remains unchanged; when one is found, add 1 to the count of pipes already taken.
  2. The method for counting withdrawn drill rods in drilling videos based on human body posture recognition according to claim 1, characterized in that in step 4 the relevant parameters of the target frame include the position of the frame's center point, the frame's length and the frame's height.
  3. The method for counting withdrawn drill rods in drilling videos based on human body posture recognition according to claim 1, characterized in that in step 4 the coordinates of the human keypoints include the coordinates of the head, shoulders, hands, knees and feet.
  4. The method for counting withdrawn drill rods in drilling videos based on human body posture recognition according to claim 1, characterized in that in step 2 the labelImg tool is used to label the collected image data, with every image of the same class labeled with its corresponding category.
PCT/CN2021/118738 2021-07-05 2021-09-16 Method for counting withdrawn drill rods in drilling videos based on human body posture recognition WO2023279531A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110755483.6 2021-07-05
CN202110755483.6A CN113591590B (zh) 2021-07-05 2021-07-05 Method for counting withdrawn drill rods in drilling videos based on human body posture recognition

Publications (1)

Publication Number Publication Date
WO2023279531A1 true WO2023279531A1 (zh) 2023-01-12

Family

ID=78245846

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/118738 WO2023279531A1 (zh) 2021-07-05 2021-09-16 Method for counting withdrawn drill rods in drilling videos based on human body posture recognition

Country Status (2)

Country Link
CN (1) CN113591590B (zh)
WO (1) WO2023279531A1 (zh)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090303055A1 (en) * 2008-06-05 2009-12-10 Hawkeye Systems, Inc. Above-water monitoring of swimming pools
CN110147743A (zh) * 2019-05-08 2019-08-20 中国石油大学(华东) 一种复杂场景下的实时在线行人分析与计数系统及方法
CN110725711A (zh) * 2019-10-29 2020-01-24 南京北路自动化系统有限责任公司 一种基于视频的打钻系统及辅助验钻方法
CN112116633A (zh) * 2020-09-25 2020-12-22 深圳爱莫科技有限公司 一种矿井打钻计数方法
CN112412440A (zh) * 2020-10-23 2021-02-26 中海油能源发展股份有限公司 一种钻进时期早期井涌检测方法
CN112528960A (zh) * 2020-12-29 2021-03-19 之江实验室 一种基于人体姿态估计和图像分类的吸烟行为检测方法

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016077544A1 (en) * 2014-11-12 2016-05-19 Covar Applied Technologies, Inc. System and method for locating, measuring, counting, and aiding in the handling of drill pipes
CN111814601A (zh) * 2020-06-23 2020-10-23 国网上海市电力公司 一种将目标检测与人体姿态估计相结合的视频分析方法
CN112560741A (zh) * 2020-12-23 2021-03-26 中国石油大学(华东) 一种基于人体关键点的安全穿戴检测方法
CN112580609B (zh) * 2021-01-26 2022-03-15 南京北路智控科技股份有限公司 一种煤矿钻杆计数方法

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090303055A1 (en) * 2008-06-05 2009-12-10 Hawkeye Systems, Inc. Above-water monitoring of swimming pools
CN110147743A (zh) * 2019-05-08 2019-08-20 中国石油大学(华东) 一种复杂场景下的实时在线行人分析与计数系统及方法
CN110725711A (zh) * 2019-10-29 2020-01-24 南京北路自动化系统有限责任公司 一种基于视频的打钻系统及辅助验钻方法
CN112116633A (zh) * 2020-09-25 2020-12-22 深圳爱莫科技有限公司 一种矿井打钻计数方法
CN112412440A (zh) * 2020-10-23 2021-02-26 中海油能源发展股份有限公司 一种钻进时期早期井涌检测方法
CN112528960A (zh) * 2020-12-29 2021-03-19 之江实验室 一种基于人体姿态估计和图像分类的吸烟行为检测方法

Also Published As

Publication number Publication date
CN113591590B (zh) 2024-02-23
CN113591590A (zh) 2021-11-02

Similar Documents

Publication Publication Date Title
CN108564596B (zh) 一种高尔夫挥杆视频的智能比对分析系统及方法
WO2020253308A1 (zh) 矿井下皮带运输人员人机交互行为安全监控与预警方法
CN104517102B (zh) 学生课堂注意力检测方法及系统
CN106845357A (zh) 一种基于多通道网络的视频人脸检测和识别方法
CN102800126A (zh) 基于多模态融合的实时人体三维姿态恢复的方法
CN112270310A (zh) 一种基于深度学习的跨摄像头行人多目标跟踪方法和装置
CN112149512A (zh) 一种基于两阶段深度学习的安全帽佩戴识别方法
CN106682573B (zh) 一种单摄像头的行人跟踪方法
CN109758756B (zh) 基于3d相机的体操视频分析方法及系统
CN113139437B (zh) 一种基于YOLOv3算法的安全帽佩戴检查方法
CN110991315A (zh) 一种基于深度学习的安全帽佩戴状态实时检测方法
CN102509109B (zh) 一种唐卡图像与非唐卡图像的区分方法
CN104700088A (zh) 一种基于单目视觉移动拍摄下的手势轨迹识别方法
CN113920326A (zh) 基于人体骨骼关键点检测的摔倒行为识别方法
CN111105443A (zh) 一种基于特征关联的视频群体人物运动轨迹跟踪方法
CN113597614A (zh) 图像处理方法和装置、电子设备及存储介质
CN115115672A (zh) 基于目标检测和特征点速度约束的动态视觉slam方法
WO2023279531A1 (zh) 一种基于人体姿态识别的打钻视频退杆计数方法
CN114170686A (zh) 一种基于人体关键点的屈肘行为检测方法
CN109544632A (zh) 一种基于层次主题模型的语义slam对象关联方法
CN114639168B (zh) 一种用于跑步姿态识别的方法和系统
CN116311082A (zh) 基于关键部位与图像匹配的穿戴检测方法及系统
RU2802411C1 (ru) Способ подсчета извлечения штанг на видеозаписях бурения на основе распознавания жестов человеческого тела
CN113537019A (zh) 基于关键点识别变电站人员安全帽佩戴的检测方法
CN113076825A (zh) 一种变电站工作人员爬高安全监测方法

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE