CN116469037A - Computer vision-based method and system for early warning of collision accidents between an excavator and a person

Info

Publication number
CN116469037A
CN116469037A CN202310437824.4A CN202310437824A CN116469037A CN 116469037 A CN116469037 A CN 116469037A CN 202310437824 A CN202310437824 A CN 202310437824A CN 116469037 A CN116469037 A CN 116469037A
Authority
CN
China
Prior art keywords
excavator
coordinate system
human body
model
early warning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310437824.4A
Other languages
Chinese (zh)
Inventor
徐峰
田泽卉
张志鹏
胡昊
陶钰
黄鹤
馬文迪
刘啸宇
梅心语
戴磊
胡喆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University
Priority to CN202310437824.4A
Publication of CN116469037A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/09 Supervised learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/776 Validation; Performance evaluation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/23 Recognition of whole body movements, e.g. for sport training
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Psychiatry (AREA)
  • Human Computer Interaction (AREA)
  • Social Psychology (AREA)
  • Component Parts Of Construction Machinery (AREA)

Abstract

The invention relates to a computer vision-based method and system for early warning of collision accidents between an excavator and a person. First, 3D skeletons of workers and excavators are recognized from a construction site video stream using computer vision algorithms such as object detection, 2D key point detection and 3D pose estimation. Second, a 3D dynamic danger zone around the working excavator is delimited from the 3D motion pose of the excavator's 3D skeleton. Finally, the method judges whether any key point of a worker's 3D skeleton has intruded into that zone and, if so, issues an early warning of a possible excavator-person collision. Compared with the prior art, constructing the 3D dynamic danger zone around the working excavator by computer vision improves the accuracy and effectiveness of automatic collision early warning, minimizes misjudgment, and supports intelligent safety management of construction sites.

Description

Computer vision-based method and system for early warning of collision accidents between an excavator and a person
Technical Field
The invention relates to the technical fields of computer vision and construction safety, and in particular to a computer vision-based method and system for early warning of collision accidents between an excavator and a person.
Background
Construction is a labor-intensive industry. Schedule requirements and the current state of technology mean that workers and heavy machinery often have to work at the same time and in the same place, so some degree of human-machine interaction is unavoidable. Workers' skills and training vary, construction sites are noisy, machine operators have visual blind spots, and fatigue reduces workers' attention, so collision injuries occur easily when workers and heavy machinery operate together. Among construction machines, the excavator has an arm with many degrees of freedom and interacts frequently with workers during operation, so the rate of excavator-person collision accidents is high. Establishing an early warning system for collisions between workers and excavators therefore has significant practical value.
In recent years, with advances in computer hardware and software and in computing power, computer vision and deep learning have matured in fields such as face recognition and autonomous driving, and in some specific tasks such as object detection they can reach recognition accuracy approaching or even exceeding that of humans. In civil engineering construction safety management, computer vision-based detection of safety helmet and protective clothing wearing is already mature. Computer vision is increasingly applied to detecting unsafe worker behavior and unsafe states of mechanical equipment, which reflects its advantage in construction safety supervision: it does not interfere with front-line production on site.
In the prior art, the traditional approach to excavator-person collision warning relies on on-site inspection by managers. It cannot cover all time periods or all areas, is easily influenced by human subjective factors, and is inefficient and ineffective.
In addition, methods for preventing collisions between heavy machinery and people on construction sites fall into two main categories: sensor-based methods and vision-based methods. The excavator is a heavy construction machine with a flexible, high-degree-of-freedom arm; during operation the machine as a whole moves little, while the posture of the articulated arm varies widely. Collision prevention and detection for excavators therefore tends to focus on locating the bucket and recognizing the arm posture rather than on locating the machine as a whole, and existing methods suffer from poor detection performance.
Disclosure of Invention
The invention aims to overcome the above defects of the prior art by providing a computer vision-based method and system for early warning of collision accidents between an excavator and a person. Using video captured by construction site monitoring facilities and computer vision algorithms such as image processing and deep learning, the invention warns of excavator-person collisions without interfering with front-line production on site, thereby reducing hidden hazards, preventing accidents, and improving site safety management.
The object of the invention is achieved by the following technical solution:
A computer vision-based method for early warning of collision accidents between an excavator and a person comprises the following steps:
acquiring a construction site video stream;
inputting the construction site video stream into an accident early warning model and determining the dynamic danger zone of the excavator;
judging whether a person is present in the dynamic danger zone of the excavator and, if so, issuing an early warning;
wherein constructing the accident early warning model comprises the following steps:
S1, constructing an excavator object detection model and a human body object detection model; acquiring an excavator construction site image data set, and training and validating both detection models;
S2, constructing a deep learning-based excavator recognition model that extracts the excavator's 2D key points and recovers their 3D coordinates;
S3, constructing a deep learning-based human body recognition model that extracts the human body's 2D key points and recovers their 3D coordinates;
S4, quantitatively extracting operating state indicators of the excavator from the recovered 3D key point coordinates, and delimiting the 3D dynamic danger zone of the working excavator;
S5, unifying the human body coordinate system and the excavator coordinate system using a PnP-based method, and recognizing and monitoring intrusions of personnel into the 3D dynamic danger zone of the working excavator.
Further, in step S1, the ACID construction image data set is used as the base data set and is expanded with the Mosaic and CutMix data augmentation techniques;
a YOLOX convolutional neural network is trained on the excavator construction site image data set to obtain the excavator object detection model and the human body object detection model.
Further, in step S2, constructing the deep learning-based excavator recognition model comprises the following steps:
training an Hourglass neural network on an open-source data set with data augmentation to obtain an excavator 2D key point detection Hourglass model that recognizes 6 key points on the excavator arm and body;
modeling the excavator in modeling software, rigging its motions during construction operation, and writing a program that outputs, throughout the motion, the 2D coordinates of the 6 key points in the image coordinate system and their 3D coordinates in the world coordinate system, thereby generating a data set for converting excavator 2D key point coordinates into 3D coordinates;
training a Transformer model on this data set to obtain a Transformer model that converts excavator 2D key point coordinates into 3D coordinates.
Further, in step S3, constructing the deep learning-based human body recognition model comprises the following steps:
training a Stacked Hourglass neural network on the MSCOCO data set to obtain a human body 2D key point detection model;
training a VideoPose3D neural network on the Human3.6M data set to obtain a human body 3D skeleton recognition model.
Further, in step S4, delimiting the 3D dynamic danger zone of the working excavator comprises the following steps:
obtaining the 3D dynamic skeleton of the operating excavator from the construction site video stream with the YOLOX-Hourglass-Transformer model, and post-processing the 3D dynamic skeleton;
applying a Savitzky-Golay (Savgol) filter, which fits a polynomial to consecutive signal points within a sliding window, to filter outliers from the 3D skeleton recognition results of the operating excavator;
decomposing the motion of the working excavator into rotation of the whole machine and extension of the arm, and expressing the motion state with a local coordinate system and a global coordinate system;
converting, via the global and local coordinate systems, the position parameters of the skeleton key points that describe the motion pose of the working excavator into 4 angle parameters θ₁, θ₂, θ₃ and θ₄, where θ₄ expresses the rotation of the whole excavator in the global coordinate system and θ₁, θ₂ and θ₃ express the extension of the arm in the x-y plane of the local coordinate system;
calculating the angular velocity of each of the 4 angle parameters at the current moment:
defining the average worker reaction time Δt and taking the 3D region that the excavator arm may sweep within Δt as the 3D dynamic danger zone of the working excavator, that is, calculating the motion trend of the excavator based on Δt:
obtaining the 3D region formed by the changes in the 4 angle parameter values as the 3D dynamic danger zone of the working excavator.
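The angular velocity and Δt-based motion trend described in these steps might be sketched as follows. The formulas themselves are not reproduced in this text, so the finite-difference velocity, the constant-velocity extrapolation and every name here (`angular_velocity`, `swept_interval`, `danger_intervals`, `frame_dt`) are assumptions made for illustration only.

```python
from typing import List, Tuple


def angular_velocity(prev: List[float], curr: List[float], frame_dt: float) -> List[float]:
    """Finite-difference angular velocity (rad/s) of each angle parameter.

    Assumption: the velocity is estimated from two consecutive frames.
    """
    if frame_dt <= 0:
        raise ValueError("frame_dt must be positive")
    return [(c - p) / frame_dt for p, c in zip(prev, curr)]


def swept_interval(theta: float, omega: float, reaction_time: float) -> Tuple[float, float]:
    """Angle range covered within the reaction time, assuming constant velocity."""
    end = theta + omega * reaction_time
    return (min(theta, end), max(theta, end))


def danger_intervals(
    prev: List[float], curr: List[float], frame_dt: float, reaction_time: float
) -> List[Tuple[float, float]]:
    """One (low, high) angle interval per parameter theta_1 .. theta_4."""
    omegas = angular_velocity(prev, curr, frame_dt)
    return [swept_interval(t, w, reaction_time) for t, w in zip(curr, omegas)]
```

Under these assumptions, the four intervals together describe the range of poses the arm could reach before a worker reacts; the swept 3D region would then be generated from the skeleton geometry over those ranges.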
Further, the local coordinate system of the excavator takes the vertical plane containing the arm as its x-y plane, and the Y axis of the excavator's global coordinate system coincides with the y axis of the local coordinate system; the conversion between coordinates (x, y, z)^T in the local coordinate system and (X, Y, Z)^T in the global coordinate system is:
where boom denotes the middle hinge key point of the excavator arm and cab denotes the key point connecting the excavator cab to the arm.
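Because the two frames share the Y axis, the conversion can be pictured as a rotation about that axis by the whole-machine rotation angle θ₄. The sketch below assumes a shared origin and a right-handed rotation convention; the source's own formula and sign convention are not reproduced in this text, so `local_to_global` and `global_to_local` are hypothetical helpers rather than the patent's equation.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def local_to_global(p_local: Vec3, theta4: float) -> Vec3:
    """Rotate a local-frame point about the shared Y axis by theta4 (radians).

    Assumption: both frames have the same origin and the rotation is right-handed.
    """
    x, y, z = p_local
    c, s = math.cos(theta4), math.sin(theta4)
    return (c * x + s * z, y, -s * x + c * z)


def global_to_local(p_global: Vec3, theta4: float) -> Vec3:
    """Inverse conversion: rotate by -theta4 about the same axis."""
    return local_to_global(p_global, -theta4)
```

With this convention the y component is never changed, which matches the statement that the Y axes of the two frames coincide.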
Further, θ₁, θ₂ and θ₃ are calculated as follows:
where right_bucket_end denotes the key point at the right end of the excavator bucket and left_bucket_end denotes the key point at the left end.
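One plausible way to derive three arm angles from key points projected onto the local x-y plane is shown below. This is an assumption, not the source's formula (which is not reproduced in this text): each angle is taken as the inclination of one arm segment, and the bucket tip is taken as the midpoint of the two bucket end key points. The helper names are hypothetical.

```python
import math
from typing import Tuple

Point2 = Tuple[float, float]


def midpoint(a: Point2, b: Point2) -> Point2:
    """Midpoint of two 2D points, used here for the two bucket end key points."""
    return ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)


def segment_angle(start: Point2, end: Point2) -> float:
    """Inclination of the segment start->end relative to the +x axis, in radians."""
    return math.atan2(end[1] - start[1], end[0] - start[0])


def arm_angles(
    cab_boom: Point2,
    boom_arm: Point2,
    arm_bucket: Point2,
    left_bucket_end: Point2,
    right_bucket_end: Point2,
) -> Tuple[float, float, float]:
    """Inclinations of the boom, arm and bucket segments in the local x-y plane."""
    bucket_tip = midpoint(left_bucket_end, right_bucket_end)
    return (
        segment_angle(cab_boom, boom_arm),
        segment_angle(boom_arm, arm_bucket),
        segment_angle(arm_bucket, bucket_tip),
    )
```

Using `atan2` rather than an arccosine keeps the sign of each angle, so raising and lowering a segment are distinguishable.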
Further, in step S5, the PnP-based method for unifying the human body coordinate system and the excavator coordinate system comprises the following steps:
if the homogeneous 3D coordinates of a skeleton key point in the world coordinate system are P_w = (X_w, Y_w, Z_w, 1)^T and the homogeneous 2D coordinates of the same key point in the image pixel coordinate system are p = (u, v, 1)^T, the transformation between the 3D and 2D coordinates of the key point is:
where K is the camera intrinsic matrix,
T is the transformation matrix from the world coordinate system to the camera coordinate system and is the parameter to be optimized,
where R is the rotation matrix and t is the translation vector;
calling the solvePnP function of the OpenCV computer vision library to solve for T, denoting the resulting optimized transformation matrix as T*, and unifying the human body coordinate system and the excavator coordinate system to the camera coordinate system by:
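The relation that `solvePnP` inverts is the standard pinhole camera model. A sketch in standard notation follows; the scale factor $s$ and the intrinsic parameters $f_x, f_y, c_x, c_y$ are conventional symbols introduced here and are not taken from the source, whose own equations are not reproduced in this text.

```latex
s\,p = K\,[\,R \mid t\,]\,P_w, \qquad
K = \begin{pmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{pmatrix}, \qquad
T = \begin{pmatrix} R & t \\ 0^{T} & 1 \end{pmatrix}, \qquad
P_c = T^{*} P_w
```

Under this model, applying the optimized $T^{*}$ of each object to its homogeneous key point coordinates $P_w$ would give camera-frame coordinates $P_c$, which is presumably how both skeletons end up in one frame.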
Further, in step S5, recognizing and monitoring intrusions of personnel into the 3D dynamic danger zone of the working excavator comprises the following steps:
transforming the coordinates of the human body and the excavator from the camera coordinate system into the excavator coordinate system, using the inverse of the transformation into the camera coordinate system, so that the human body coordinates are expressed in the excavator coordinate system;
obtaining the projections of the 3D dynamic danger zone of the working excavator onto the X-Z plane of the global coordinate system and onto the x-y plane of the local coordinate system;
obtaining the projections of the key points of the human body 3D skeleton onto the X-Z plane of the global coordinate system and onto the x-y plane of the local coordinate system;
if a key point of the human body 3D skeleton overlaps the projections of the 3D dynamic danger zone on both the X-Z plane of the global coordinate system and the x-y plane of the local coordinate system, a potential collision between the excavator and the worker is considered to exist and an early warning is issued.
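The two-projection overlap test in these steps can be illustrated with a small sketch. It assumes, purely for illustration, that the top-view (X-Z) projection of the danger zone is a circular sector around the excavator origin and that the side-view (x-y) projection is a simple polygon; the source does not specify these shapes, and all function names are hypothetical.

```python
import math
from typing import List, Tuple

Point2 = Tuple[float, float]


def in_sector(p_xz: Point2, theta_lo: float, theta_hi: float, radius: float) -> bool:
    """Top view: is the point inside a sector centred on the origin?

    Assumption: the sector does not wrap across +/- pi.
    """
    x, z = p_xz
    if math.hypot(x, z) > radius:
        return False
    return theta_lo <= math.atan2(z, x) <= theta_hi


def in_polygon(p: Point2, polygon: List[Point2]) -> bool:
    """Side view: ray-casting point-in-polygon test."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through p can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def intrudes(
    p_xz: Point2,
    p_xy: Point2,
    theta_lo: float,
    theta_hi: float,
    radius: float,
    side_polygon: List[Point2],
) -> bool:
    """A key point counts as intruding only if both projections overlap the zone."""
    return in_sector(p_xz, theta_lo, theta_hi, radius) and in_polygon(p_xy, side_polygon)
```

Requiring overlap in both planes avoids warning for a worker who is within the slewing sector in top view but below or beyond the arm's reach in side view.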
A computer vision-based system for early warning of collision accidents between an excavator and a person comprises a data acquisition module, a model training module and an early warning module;
the data acquisition module acquires a construction site video stream;
the model training module trains the accident early warning model;
the early warning module inputs the construction site video stream into the accident early warning model, determines the dynamic danger zone of the excavator, judges whether a person is present in that zone and, if so, issues an early warning;
the model training module constructs the accident early warning model through the following steps:
S1, constructing an excavator object detection model and a human body object detection model; acquiring an excavator construction site image data set, and training and validating both detection models;
S2, constructing a deep learning-based excavator recognition model that extracts the excavator's 2D key points and recovers their 3D coordinates;
S3, constructing a deep learning-based human body recognition model that extracts the human body's 2D key points and recovers their 3D coordinates;
S4, quantitatively extracting operating state indicators of the excavator from the recovered 3D key point coordinates, and delimiting the 3D dynamic danger zone of the working excavator;
S5, unifying the human body coordinate system and the excavator coordinate system using a PnP-based method, and recognizing and monitoring intrusions of personnel into the 3D dynamic danger zone of the working excavator.
Compared with the prior art, the invention has the following beneficial effects:
1. The invention transfers and improves existing computer vision algorithms for human body 3D skeleton recognition to build 2D key point recognition and 3D skeleton recognition models for the excavator arm, improving the accuracy of 3D pose estimation of the working excavator from a monocular camera video stream.
2. From the 3D coordinates and motion state of the excavator skeleton key points, the invention builds a model that recognizes and delimits the 3D dynamic danger zone around the working excavator, enabling fine-grained management of that zone. It also creates a model that detects worker intrusion into the zone and, on that basis, establishes an early warning framework for excavator-person collisions, supporting fine-grained and intelligent safety management of construction sites.
Drawings
Fig. 1 is a schematic flow chart of the invention.
Fig. 2 is a schematic diagram of the excavator skeleton model in an embodiment of the invention.
Fig. 3 is a schematic diagram of the human body skeleton model in an embodiment of the invention.
Fig. 4 is a schematic diagram of the global and local coordinate systems of the excavator in an embodiment of the invention, where (a) is a three-dimensional view and (b) is a top view.
Fig. 5 is a schematic diagram of the skeleton pose angle parameters of the working excavator in an embodiment of the invention, where (c) shows extension of the arm and (d) shows rotation of the whole machine.
Fig. 6 is a schematic diagram of judging whether a worker's skeleton key point lies within the 3D dynamic danger zone of the working excavator in an embodiment of the invention, where (e) shows the x-y plane of the local coordinate system and (f) shows the X-Z plane of the global coordinate system.
Detailed Description
The invention is described in detail below with reference to the drawings and specific embodiments. This embodiment is implemented on the basis of the technical solution of the invention and gives a detailed implementation and a specific operating procedure, but the scope of protection of the invention is not limited to the following embodiments.
To address the defects of the prior art, the invention provides a computer vision-based early warning system for collision accidents between an excavator and a person. First, 3D skeletons are recognized from the construction site video stream using computer vision algorithms such as object detection, 2D key point detection and 3D pose estimation. Second, a 3D dynamic danger zone around the working excavator is delimited from the 3D motion pose of the excavator's 3D skeleton. Finally, the system judges whether any key point of a worker's 3D skeleton has intruded into that zone and, if so, issues an early warning of a possible excavator-person collision. By constructing the 3D dynamic danger zone around the working excavator with computer vision, the invention improves the accuracy and effectiveness of automatic collision early warning without intruding on front-line production on site, minimizes misjudgment, and supports intelligent safety management of the construction site.
As shown in Fig. 1, the computer vision-based method for early warning of collision accidents between an excavator and a person comprises the following steps:
acquiring a construction site video stream;
inputting the construction site video stream into an accident early warning model and determining the dynamic danger zone of the excavator;
judging whether a person is present in the dynamic danger zone of the excavator and, if so, issuing an early warning;
wherein constructing the accident early warning model comprises the following steps:
S1, constructing an excavator object detection model and a human body object detection model; acquiring an excavator construction site image data set, and training and validating both detection models;
S2, constructing a deep learning-based excavator recognition model that extracts the excavator's 2D key points and recovers their 3D coordinates;
S3, constructing a deep learning-based human body recognition model that extracts the human body's 2D key points and recovers their 3D coordinates;
S4, quantitatively extracting operating state indicators of the excavator from the recovered 3D key point coordinates, and delimiting the 3D dynamic danger zone of the working excavator;
S5, unifying the human body coordinate system and the excavator coordinate system using a PnP-based method, and recognizing and monitoring intrusions of personnel into the 3D dynamic danger zone of the working excavator.
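The flow of Fig. 1 can be pictured as a per-frame loop that chains the trained models. The sketch below is only an architectural outline under stated assumptions: each stage is passed in as a callable, and the names `warning_stream`, `excavator_skeleton_3d`, `worker_skeletons_3d`, `danger_zone` and `intrudes` are hypothetical placeholders, not components named in the source.

```python
from typing import Any, Callable, Iterable, Iterator, List, Optional


def warning_stream(
    frames: Iterable[Any],
    excavator_skeleton_3d: Callable[[Any], Optional[Any]],
    worker_skeletons_3d: Callable[[Any], List[Any]],
    danger_zone: Callable[[Any], Any],
    intrudes: Callable[[Any, Any], bool],
) -> Iterator[bool]:
    """Yield one warning flag per video frame.

    Assumption: skeletons are already expressed in a common coordinate system.
    """
    for frame in frames:
        excavator = excavator_skeleton_3d(frame)  # detection + 2D key points + 3D lifting
        if excavator is None:
            yield False  # no excavator in view, nothing to warn about
            continue
        zone = danger_zone(excavator)  # may keep state across frames to estimate velocity
        workers = worker_skeletons_3d(frame)
        yield any(intrudes(worker, zone) for worker in workers)
```

Passing the stages as callables keeps the loop independent of any particular detector or pose model, so each model can be retrained or swapped without touching the warning logic.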
Specifically, the invention is implemented as follows:
In the computer vision-based early warning system, possible excavator-person collisions are detected from a video stream captured by a monocular camera. The excavator and worker targets in the video must therefore be detected first. Like other supervised deep learning algorithms, the object detection algorithm is trained on a data set and updates its parameters by backpropagation. The data set and the neural network model are the two most important components of this process.
For excavator object detection, this embodiment uses the open-source ACID construction image data set as the base data set and expands it with data augmentation techniques such as Mosaic and CutMix to form training and validation sets that meet the training requirements of the deep learning model.
A YOLOX convolutional neural network is trained on the expanded ACID data set to obtain a construction site excavator detection model with high accuracy and good generalization.
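As an illustration of the CutMix idea mentioned above, the sketch below pastes a rectangular patch from one image into another. It is a minimal, image-only version under stated assumptions: images are plain nested lists of grayscale values, the patch box is supplied by the caller rather than sampled randomly, and the bounding-box bookkeeping that an object detection data set would also need is not shown. The function name `cutmix` is hypothetical.

```python
import copy
from typing import List, Tuple

Image = List[List[int]]  # grayscale image as rows of pixel values


def cutmix(base: Image, donor: Image, box: Tuple[int, int, int, int]) -> Tuple[Image, float]:
    """Paste donor[top:bottom, left:right] into a copy of base.

    Returns the mixed image and the fraction of pixels that still come from base.
    """
    top, left, bottom, right = box
    height, width = len(base), len(base[0])
    if len(donor) != height or len(donor[0]) != width:
        raise ValueError("images must have the same size")
    if not (0 <= top <= bottom <= height and 0 <= left <= right <= width):
        raise ValueError("box lies outside the image")
    mixed = copy.deepcopy(base)  # leave the input image untouched
    for row in range(top, bottom):
        mixed[row][left:right] = donor[row][left:right]
    patch_area = (bottom - top) * (right - left)
    return mixed, 1.0 - patch_area / (height * width)
```

For detection training, annotations of donor objects inside the patch would be added and base annotations covered by the patch would be clipped or dropped.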
After the excavator target in the construction site video stream has been detected, 2D key point detection and 3D skeleton recognition are performed on the excavator.
For excavator 2D key point detection, an Hourglass neural network is trained on an open-source data set with data augmentation, yielding an excavator 2D key point detection Hourglass model that recognizes 6 key points on the excavator arm and body.
For excavator 3D skeleton recognition, training data are difficult to acquire. The excavator is therefore modeled in C4D, its motions during construction operation are rigged, and a program written with the Python scripting interface provided by C4D directly outputs the 2D coordinates of the 6 key points in the image coordinate system and their 3D coordinates in the world coordinate system throughout the motion.
This generation method can produce a large data set for the task of converting excavator 2D key point coordinates into 3D coordinates. A Transformer model is trained on this data set to obtain the Transformer model that converts excavator 2D key point coordinates into 3D coordinates. The 6 excavator key points are defined in Fig. 2 and Table 1.
Table 1 Excavator key point definitions
No. | Variable name | Description
(1) | Body_end | End point of the excavator body
(2) | Cab_boom | Connection point of the cab and the mechanical arm
(3) | Boom_arm | Middle hinge point of the mechanical arm
(4) | Arm_bucket | Connection point of the mechanical arm and the bucket
(5) | Left_bucket_end | Left end point of the bucket
(6) | Right_bucket_end | Right end point of the bucket
Human body target detection, 2D key point detection and 3D skeleton recognition follow an approach similar to that used for the excavator. This embodiment adopts a YOLOX-based human body target detection model, a Stacked Hourglass-based human body 2D key point detection model and a VideoPose3D-based human body 3D skeleton recognition model, trained on the MS COCO dataset and the Human3.6M dataset. The 17 human body key points are defined in Fig. 3 and Table 2.
Table 2 Human body key point definitions
No. | Variable name | Description
(1) | root | Pelvis
(2) | right_hip | Right hip
(3) | right_knee | Right knee
(4) | right_foot | Right foot
(5) | left_hip | Left hip
(6) | left_knee | Left knee
(7) | left_foot | Left foot
(8) | spine | Spine
(9) | thorax | Thorax
(10) | neck_base | Neck base
(11) | head | Head
(12) | left_shoulder | Left shoulder
(13) | left_elbow | Left elbow
(14) | left_wrist | Left wrist
(15) | right_shoulder | Right shoulder
(16) | right_elbow | Right elbow
(17) | right_wrist | Right wrist
After the 3D skeleton of the excavator is obtained, the surrounding 3D dynamic dangerous area during the operation of the excavator needs to be defined next.
First, the 3D dynamic skeleton of the running excavator, i.e. the 3D coordinate sequence data of the skeleton key points in the working state, is obtained from the construction site video stream through the YOLOX-Hourglass-Transformer model and then post-processed. In this embodiment, a Savitzky-Golay (Savgol) filter performs polynomial fitting on the consecutive signal points within a sliding window to smooth the data and filter out some of the abnormal points in the 3D skeleton recognition result, which facilitates the subsequent extraction of the excavator's running state.
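The smoothing step can be illustrated with a minimal pure-Python sketch. It assumes a 5-point window and a quadratic fit, for which the Savitzky-Golay weights are the well-known set (-3, 12, 17, 12, -3)/35. The window length, polynomial order, edge handling and the function names `savgol5` and `smooth_keypoint_track` are illustrative assumptions and are not specified in this text. In practice a library routine such as `scipy.signal.savgol_filter` would normally be used instead.

```python
# 5-point quadratic Savitzky-Golay smoothing weights (assumed window/order).
SG5_WEIGHTS = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def savgol5(series):
    """Smooth a 1D sequence with a 5-point quadratic Savitzky-Golay filter.

    The two samples at each end are copied unchanged, which is a simplifying
    assumption; library implementations offer several edge-handling modes.
    """
    values = [float(v) for v in series]
    n = len(values)
    if n < 5:
        return values
    smoothed = values[:]
    for i in range(2, n - 2):
        window = values[i - 2:i + 3]
        smoothed[i] = sum(w * v for w, v in zip(SG5_WEIGHTS, window))
    return smoothed


def smooth_keypoint_track(track):
    """Smooth the (x, y, z) track of one key point, one axis at a time."""
    axes = list(zip(*track))                      # per-axis sequences
    smoothed_axes = [savgol5(axis) for axis in axes]
    return [tuple(p) for p in zip(*smoothed_axes)]
```

Because the fit is polynomial, a track that already follows a low-order curve passes through unchanged, while an isolated spike is pulled back toward its neighbours.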
After the 3D coordinate sequence data of the skeleton key points in the working state of the excavator are smoothed with the Savgol filter, the motion state parameters of the excavator are calculated. The excavator skeleton has 6 key points, which give 18 degrees of freedom if each is analyzed independently. However, the articulated structure of the excavator imposes a number of constraints: according to how its parts are connected and move, the motion of the working excavator can be decomposed into overall rotation and mechanical arm extension, and it is better expressed using a local coordinate system together with a global coordinate system.
As shown in Fig. 4, X-Y-Z is the global coordinate system and x-y-z is the local coordinate system. The local coordinate system of the excavator takes the vertical plane containing the mechanical arm as its x-y plane, and the Y axis of the global coordinate system coincides with the y axis of the local coordinate system. Coordinates (x, y, z)^T in the local coordinate system and (X, Y, Z)^T in the global coordinate system are converted into each other using the rotation angle θ obtained from formula (1) and the matrix transformation shown in formula (2).
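Formulas (1) and (2) are not reproduced in this text, so the following is only a sketch of one consistent reading. It assumes that θ is the horizontal-plane angle of the segment from the cab-arm connection point to the arm's middle hinge point, and that the conversion is a rotation about the shared vertical axis. The sign convention and the function names are assumptions.

```python
import math


def rotation_angle(cab, boom):
    """Angle of the arm's vertical plane about the shared Y/y axis.

    Assumed form: theta = atan2(Z_boom - Z_cab, X_boom - X_cab), computed
    from global coordinates of the two key points.
    """
    return math.atan2(boom[2] - cab[2], boom[0] - cab[0])


def local_to_global(point, theta):
    """Rotate a local (x, y, z) point into the global X-Y-Z frame."""
    x, y, z = point
    c, s = math.cos(theta), math.sin(theta)
    return (x * c - z * s, y, x * s + z * c)


def global_to_local(point, theta):
    """Inverse rotation: express a global (X, Y, Z) point in the local frame."""
    X, Y, Z = point
    c, s = math.cos(theta), math.sin(theta)
    return (X * c + Z * s, Y, -X * s + Z * c)
```

Under this convention, any key point on the mechanical arm has a local z coordinate of zero, which is what makes the x-y plane sufficient for describing arm extension.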
Using the global and local coordinate systems, the 18 position parameters of the 6 skeleton key points that originally describe the motion posture of the working excavator can be reduced to the 4 angle parameters θ₁, θ₂, θ₃ and θ₄ shown in Fig. 5. Because the length between each pair of adjacent key points on the excavator's mechanical arm remains unchanged, these 4 angle parameters suffice to describe the skeleton posture and motion of the mechanical arm in the working state. Here θ₄ expresses the overall rotation of the excavator in the global coordinate system and equals the rotation angle θ between the local and global coordinate systems in Fig. 4, while θ₁, θ₂ and θ₃ express the extension of the mechanical arm in the x-y plane of the local coordinate system and are calculated by formulas (3), (4) and (5).
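Formulas (3) to (5) and Fig. 5 are not available in this text, so the exact definition of each angle cannot be confirmed. The sketch below shows one plausible parameterization, purely as an assumption: θ₁ is the elevation of the first arm segment above the local x axis, θ₂ is the interior angle at the middle hinge, and θ₃ is the interior angle at the arm-bucket joint, with the bucket tip taken as the midpoint of the left and right bucket end points. All function names are hypothetical.

```python
import math


def _angle_between(u, v):
    """Unsigned angle in radians between two non-zero 2D vectors."""
    dot = u[0] * v[0] + u[1] * v[1]
    norm = math.hypot(u[0], u[1]) * math.hypot(v[0], v[1])
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def arm_angles(cab_boom, boom_arm, arm_bucket, left_bucket_end, right_bucket_end):
    """Return (theta1, theta2, theta3) from local-frame key points.

    Only the local x and y coordinates are used, since the arm is assumed to
    lie in the local x-y plane.
    """
    a, b, c = cab_boom[:2], boom_arm[:2], arm_bucket[:2]
    # Assumed bucket tip: midpoint of the two bucket end points.
    d = ((left_bucket_end[0] + right_bucket_end[0]) / 2,
         (left_bucket_end[1] + right_bucket_end[1]) / 2)
    theta1 = math.atan2(b[1] - a[1], b[0] - a[0])
    theta2 = _angle_between((a[0] - b[0], a[1] - b[1]), (c[0] - b[0], c[1] - b[1]))
    theta3 = _angle_between((b[0] - c[0], b[1] - c[1]), (d[0] - c[0], d[1] - c[1]))
    return theta1, theta2, theta3
```

Together with θ₄ and the fixed segment lengths, three such angles are enough to rebuild the arm's key point positions, which is the reduction from 18 position parameters to 4 angles described above.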
Reducing the 18 position parameters of the skeleton posture in the working state of the excavator to 4 angle parameters yields parameters that are mutually independent, which simplifies the subsequent description of the excavator's motion state.
The angular velocities of the 4 angle parameters at the current moment are obtained from formula (6), and the motion trend of the excavator is then obtained from the first-order Taylor formula shown in formula (7). With reference to the prior art, this embodiment takes the 3D region that the excavator's mechanical arm may sweep within the average worker reaction time of 2.5 s as the 3D dynamic dangerous area around the working excavator; that is, Δt = 2.5 s is substituted into formula (7), and the 3D region spanned by the resulting changes in the 4 angle parameter values is the dangerous area.
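The prediction step can be sketched as follows. Since formulas (6) and (7) are not reproduced here, the sketch assumes that the angular velocity is a backward finite difference between two consecutive frames and that the first-order Taylor step has the form θ(t + Δt) ≈ θ(t) + ω·Δt. The function names and the frame interval argument are illustrative.

```python
REACTION_TIME_S = 2.5  # average worker reaction time used in this embodiment


def angular_velocity(prev_angles, curr_angles, frame_dt):
    """Backward-difference estimate of the angular velocity of each angle."""
    return [(c - p) / frame_dt for p, c in zip(prev_angles, curr_angles)]


def predict_angles(curr_angles, omegas, horizon=REACTION_TIME_S):
    """First-order Taylor extrapolation of each angle over the horizon."""
    return [c + w * horizon for c, w in zip(curr_angles, omegas)]


def swept_ranges(curr_angles, omegas, horizon=REACTION_TIME_S):
    """Interval each angle may cover between now and the prediction horizon."""
    predicted = predict_angles(curr_angles, omegas, horizon)
    return [(min(c, p), max(c, p)) for c, p in zip(curr_angles, predicted)]
```

The four angle intervals returned by `swept_ranges` describe, in angle space, the set of arm postures reachable within the reaction time; mapping those postures back to 3D positions gives the dangerous area.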
After the 3D dynamic dangerous area around the working state of the excavator and the 3D framework of the worker are obtained, the coordinate systems of the worker and the excavator are unified by using a PnP algorithm, and then whether the worker invades the 3D dynamic dangerous area around the working state of the excavator can be judged.
The PnP (Perspective-n-Point) algorithm estimates the position and orientation of the camera from 3D-2D point pairs. Specifically, when n point pairs with one-to-one correspondence between 3D space and the 2D image are known, the pose of the camera in 3D space can be estimated, and the coordinates of the points in the camera coordinate system can then be obtained through a coordinate system transformation. Because the excavator and the person are photographed at the same time by the same camera with known intrinsic parameters, the excavator and the worker can be unified into a coordinate system based on the camera position by the PnP method.
Let the homogeneous form of the 3D coordinates of a skeleton key point in the world coordinate system be P_w = (X_w, Y_w, Z_w, 1)^T and the homogeneous form of its 2D coordinates in the image pixel coordinate system be p = (u, v, 1)^T. The conversion relation between the 3D and 2D coordinates of the skeleton key point is then given by formula (8).
In formula (8), K is the camera intrinsic matrix, which is known in practical problems;
T is the transformation matrix from the world coordinate system to the camera coordinate system, and is the optimization parameter to be solved;
within T, R is the rotation matrix and t is the translation vector.
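Formula (8) itself did not survive in this text. For orientation only, the standard pinhole projection relation that matches the symbols defined above can be written as follows. The scale factor s (the depth of the point in the camera frame) and the block layout of T are assumptions taken from the usual PnP formulation, not from the original formula.

```latex
s\,p = K \begin{bmatrix} R & t \end{bmatrix} P_w ,
\qquad
T = \begin{bmatrix} R & t \\ \mathbf{0}^{T} & 1 \end{bmatrix},
\qquad
p = (u, v, 1)^{T},\quad P_w = (X_w, Y_w, Z_w, 1)^{T}
```

Here R is 3×3, t is 3×1, and K is the 3×3 intrinsic matrix, so the product on the right-hand side is a 3×1 vector whose third component equals s.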
To solve the PnP optimization problem, this embodiment calls the solvePnP function in the OpenCV computer vision library and records the resulting optimized transformation matrix as T*. Once T* is obtained, the worker and excavator coordinates can be unified into the camera coordinate system by formula (9).
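Formula (9) is not reproduced in this text. Assuming it has the usual form P_c = T* · P_w, applying the solved transform can be sketched in pure Python as below. The helper names are hypothetical. In an OpenCV-based pipeline, R and t would typically come from `cv2.solvePnP` (which returns a rotation vector and a translation vector) followed by `cv2.Rodrigues` to convert the rotation vector into a matrix.

```python
def make_transform(R, t):
    """Assemble a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector."""
    return [list(R[i]) + [t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def apply_transform(T, point):
    """Apply a 4x4 homogeneous transform to a 3D point."""
    h = (point[0], point[1], point[2], 1.0)
    return tuple(sum(T[i][j] * h[j] for j in range(4)) for i in range(3))


def invert_rigid(T):
    """Invert a rigid transform: the rotation becomes R^T, the translation -R^T t."""
    Rt = [[T[j][i] for j in range(3)] for i in range(3)]
    t_inv = [-sum(Rt[i][j] * T[j][3] for j in range(3)) for i in range(3)]
    return make_transform(Rt, t_inv)


def to_camera_frame(T_star, keypoints):
    """Map a list of world-frame key points into the camera frame."""
    return [apply_transform(T_star, p) for p in keypoints]
```

Running `to_camera_frame` once with the excavator's transform and once with the worker's transform places both skeletons in the same camera-based coordinate system, and `invert_rigid` gives the transform needed to go back from the camera frame to a model frame.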
To judge whether the key points of the worker's 3D skeleton intrude into the 3D dynamic dangerous area around the working excavator, the following steps are taken. First, the coordinates of the worker and the excavator are transformed from the camera coordinate system to the excavator coordinate system through the inverse of the above transformation, so that the worker coordinate system is merged into the excavator coordinate system. Second, the projections of the 3D dynamic dangerous area onto the X-Z plane of the global coordinate system and onto the x-y plane of the local coordinate system are obtained, as shown by the shaded portions in Fig. 6. Third, the projections of the worker's 3D skeleton key points onto the same two planes are obtained, as shown by points P₁ and P₂ in Fig. 6. Finally, if the projections of the worker's 3D skeleton key points onto the X-Z plane of the global coordinate system and onto the x-y plane of the local coordinate system both fall within the corresponding projections of the dangerous area, as with point P₁ in Fig. 6, a potential collision between the excavator and the worker is considered to exist and an early warning is issued.
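The two-projection test can be sketched as below. The sketch assumes that each projected dangerous area is available as a simple polygon (a list of 2D vertices), that a key point counts as intruding only when it lies inside both projections, and that a warning is raised as soon as any single key point intrudes. These representation choices and the function names are assumptions; the text does not fix them.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if the 2D point lies inside the polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Consider only edges that straddle the horizontal line through y.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def keypoint_intrudes(global_pt, local_pt, zone_xz, zone_xy):
    """Check one key point against both projections of the dangerous area.

    global_pt is (X, Y, Z) in the global frame; local_pt is (x, y, z) in the
    excavator's local frame.
    """
    in_plan_view = point_in_polygon((global_pt[0], global_pt[2]), zone_xz)
    in_side_view = point_in_polygon((local_pt[0], local_pt[1]), zone_xy)
    return in_plan_view and in_side_view


def should_warn(global_pts, local_pts, zone_xz, zone_xy):
    """Raise a warning if any worker key point intrudes (assumed policy)."""
    return any(keypoint_intrudes(g, l, zone_xz, zone_xy)
               for g, l in zip(global_pts, local_pts))
```

Requiring both projections to contain the point avoids false alarms for a worker who stands under the swing path in plan view but is well below or above the arm's sweep in the vertical plane.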
The foregoing describes in detail preferred embodiments of the present invention. It should be understood that numerous modifications and variations can be made in accordance with the concepts of the invention by one of ordinary skill in the art without undue burden. Therefore, all technical solutions which can be obtained by logic analysis, reasoning or limited experiments based on the prior art by the person skilled in the art according to the inventive concept shall be within the scope of protection defined by the claims.

Claims (10)

1. A computer vision-based method for early warning of collision accidents between an excavator and a person, characterized by comprising the following steps:
acquiring a construction site video stream;
inputting the construction site video stream into an accident early warning model, and determining the dynamic dangerous area of the excavator;
judging whether a person is present in the dynamic dangerous area of the excavator, and if so, issuing an early warning;
wherein the construction of the accident early warning model comprises the following steps:
S1, constructing an excavator target detection model and a human body target detection model; acquiring an excavator construction site image dataset, and training and validating the excavator target detection model and the human body target detection model;
S2, constructing a deep learning-based excavator recognition model, extracting the 2D key points of the excavator and recovering the 3D coordinates of the key points;
S3, constructing a deep learning-based human body recognition model, extracting the 2D key points of the human body and recovering the 3D coordinates of the key points;
S4, quantitatively extracting running state indicators of the excavator from the recovered 3D coordinates of the excavator key points, and defining the 3D dynamic dangerous area in the working state of the excavator;
and S5, unifying the human body coordinate system and the excavator coordinate system using a PnP-based coordinate system unification method, and identifying and monitoring intrusion by persons into the 3D dynamic dangerous area of the working excavator.
2. The computer vision-based method for early warning of collision accidents between an excavator and a person according to claim 1, wherein in step S1, the ACID construction image dataset is adopted as the base dataset, and the base dataset is expanded using the Mosaic and CutMix data augmentation techniques;
the YOLOX convolutional neural network is trained on the excavator construction site image dataset to obtain the excavator target detection model and the human body target detection model.
3. The computer vision-based method for early warning of collision accidents between an excavator and a person according to claim 1, wherein in step S2, constructing the deep learning-based excavator recognition model comprises the following steps:
training an Hourglass neural network using an open-source dataset and a data augmentation method to obtain an excavator 2D key point detection Hourglass model capable of identifying 6 key points on the excavator's mechanical arm and body;
modeling the excavator using modeling software, binding the motions of the excavator during construction operation, and writing a program that outputs the 2D coordinates of the 6 key points in the image coordinate system and their 3D coordinates in the world coordinate system during the excavator's motion, thereby generating a dataset for converting excavator 2D key point coordinates into 3D coordinates;
and training a Transformer model on the dataset for converting excavator 2D key point coordinates into 3D coordinates, to obtain a Transformer model that converts excavator 2D key point coordinates into 3D coordinates.
4. The computer vision-based method for early warning of collision accidents between an excavator and a person according to claim 1, wherein in step S3, constructing the deep learning-based human body recognition model comprises the following steps:
training a Stacked Hourglass neural network using the MS COCO dataset to obtain a human body 2D key point detection model;
and training a VideoPose3D neural network using the Human3.6M dataset to obtain a human body 3D skeleton recognition model.
5. The computer vision-based method for early warning of collision accidents between an excavator and a person according to claim 1, wherein in step S4, defining the 3D dynamic dangerous area in the working state of the excavator comprises the following steps:
obtaining the 3D dynamic skeleton of the running excavator from the construction site video stream through the YOLOX-Hourglass-Transformer model, and post-processing the 3D dynamic skeleton;
using a Savgol filter to perform polynomial fitting on the consecutive signal points within a sliding window, and filtering out abnormal points in the 3D skeleton recognition result of the running excavator;
decomposing the motion of the working excavator into overall rotation and mechanical arm extension, and expressing the motion state of the excavator using a local coordinate system and a global coordinate system;
converting the position parameters of the skeleton key points that describe the motion posture of the working excavator into 4 angle parameters θ₁, θ₂, θ₃ and θ₄ through the global and local coordinate systems, wherein θ₄ expresses the overall rotation of the excavator in the global coordinate system, and θ₁, θ₂ and θ₃ express the extension of the excavator's mechanical arm in the x-y plane of the local coordinate system;
calculating the angular velocities of the 4 angle parameters of the excavator at the current moment:
defining the average worker reaction time Δt, and taking the 3D region that the excavator's mechanical arm may sweep within the average worker reaction time as the 3D dynamic dangerous area in the working state of the excavator, that is, calculating the motion trend of the excavator based on Δt:
and obtaining the 3D region formed by the changes in the 4 angle parameter values as the 3D dynamic dangerous area in the working state of the excavator.
6. The computer vision-based method for early warning of collision accidents between an excavator and a person according to claim 5, wherein the local coordinate system of the excavator takes the vertical plane containing the mechanical arm as its x-y plane, the Y axis of the global coordinate system of the excavator coincides with the y axis of the local coordinate system, and the coordinate conversion between (x, y, z)^T in the local coordinate system and (X, Y, Z)^T in the global coordinate system is:
wherein boom denotes the middle hinge key point of the excavator's mechanical arm, and cab denotes the key point connecting the excavator cab and the mechanical arm.
7. The computer vision-based method for early warning of collision accidents between an excavator and a person according to claim 5, wherein θ₁, θ₂ and θ₃ are calculated as follows:
wherein right_bucket_end denotes the key point at the right end of the excavator bucket, and left_bucket_end denotes the key point at the left end of the excavator bucket.
8. The computer vision-based method for early warning of collision accidents between an excavator and a person according to claim 1, wherein in step S5, unifying the human body coordinate system and the excavator coordinate system using the PnP-based coordinate system unification method comprises the following steps:
if the homogeneous form of the 3D coordinates of a skeleton key point in the world coordinate system is P_w = (X_w, Y_w, Z_w, 1)^T and the homogeneous form of its 2D coordinates in the image pixel coordinate system is p = (u, v, 1)^T, the conversion relation between the 3D and 2D coordinates of the skeleton key point is:
wherein K is the camera intrinsic matrix,
T is the transformation matrix from the world coordinate system to the camera coordinate system, and is the optimization parameter to be solved,
and within T, R is the rotation matrix and t is the translation vector;
calling the solvePnP function in the OpenCV computer vision library to solve for T, and recording the resulting optimized transformation matrix as T*; the human body coordinate system and the excavator coordinate system are unified into the camera coordinate system by:
9. The computer vision-based method for early warning of collision accidents between an excavator and a person according to claim 1, wherein in step S5, identifying and monitoring intrusion by persons into the 3D dynamic dangerous area of the working excavator comprises the following steps:
transforming the coordinates of the human body and the excavator from the camera coordinate system to the excavator coordinate system through the inverse of the transformation that unifies the human body coordinate system and the excavator coordinate system into the camera coordinate system, so that the human body coordinate system is merged into the excavator coordinate system;
obtaining the projections of the 3D dynamic dangerous area around the working excavator onto the X-Z plane of the global coordinate system and onto the x-y plane of the local coordinate system;
obtaining the projections of the key points of the human body 3D skeleton onto the X-Z plane of the global coordinate system and onto the x-y plane of the local coordinate system;
and if the projections of the key points of the human body 3D skeleton overlap the projections of the 3D dynamic dangerous area around the working excavator on both the X-Z plane of the global coordinate system and the x-y plane of the local coordinate system, considering that a potential collision between the excavator and the worker exists, and issuing an early warning accordingly.
10. A computer vision-based system for early warning of collision accidents between an excavator and a person, characterized by comprising a data acquisition module, a model training module and an early warning module;
the data acquisition module is used for acquiring a construction site video stream;
the model training module is used for training an accident early warning model;
the early warning module is used for inputting the construction site video stream into the accident early warning model and determining the dynamic dangerous area of the excavator, and for judging whether a person is present in the dynamic dangerous area of the excavator and, if so, issuing an early warning;
the accident early warning model is constructed by the model training module through the following steps:
S1, constructing an excavator target detection model and a human body target detection model; acquiring an excavator construction site image dataset, and training and validating the excavator target detection model and the human body target detection model;
S2, constructing a deep learning-based excavator recognition model, extracting the 2D key points of the excavator and recovering the 3D coordinates of the key points;
S3, constructing a deep learning-based human body recognition model, extracting the 2D key points of the human body and recovering the 3D coordinates of the key points;
S4, quantitatively extracting running state indicators of the excavator from the recovered 3D coordinates of the excavator key points, and defining the 3D dynamic dangerous area in the working state of the excavator;
and S5, unifying the human body coordinate system and the excavator coordinate system using a PnP-based coordinate system unification method, and identifying and monitoring intrusion by persons into the 3D dynamic dangerous area of the working excavator.
CN202310437824.4A 2023-04-23 2023-04-23 Computer vision-based method and system for early warning collision accident between excavator and person Pending CN116469037A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310437824.4A CN116469037A (en) 2023-04-23 2023-04-23 Computer vision-based method and system for early warning collision accident between excavator and person

Publications (1)

Publication Number Publication Date
CN116469037A true CN116469037A (en) 2023-07-21

Family

ID=87173171

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310437824.4A Pending CN116469037A (en) 2023-04-23 2023-04-23 Computer vision-based method and system for early warning collision accident between excavator and person

Country Status (1)

Country Link
CN (1) CN116469037A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117294556A (en) * 2023-09-11 2023-12-26 长江生态环保集团有限公司 AI (automatic identification) recognition edge computing intelligent gateway and method for safety management and standardized construction of intelligent construction site
CN118170146A (en) * 2024-05-09 2024-06-11 山东科技大学 Excavator running control method based on extended artificial potential field

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination