CN112036324A - Human body posture judgment method and system for complex multi-person scene
Human body posture judgment method and system for complex multi-person scene
- Publication number
- CN112036324A (application CN202010905107.6A)
- Authority
- CN
- China
- Prior art keywords
- key point
- point set
- posture
- included angle
- human body
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/103—Static body considered as a whole, e.g. static pedestrian or occupant recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention discloses a human body posture judgment method and system for complex multi-person scenes. The method first divides the detected human body key points into several sets, such as an upper-body key point set and a lower-body key point set; it then extracts statistical features of the geometric distribution of the key points in each set, such as the minimum circumscribed rectangle and the convex hull; finally, it judges the human body posture from the computed statistical features, such as the rotation angle of the minimum circumscribed rectangle and the horizontal included angle of the convex hull. The advantage of the invention is that the human body posture can be judged quickly and accurately even when the positions of key points are inaccurate or some key points are missing, providing accurate information for the detection of abnormal behaviors such as falls and collisions.
Description
Technical Field
The invention belongs to the technical field of pattern recognition and artificial intelligence, and in particular relates to a robust human body posture judgment method and system for complex multi-person scenes.
Background
Video monitoring equipment is widely deployed in public places such as kindergartens, nursing homes, stations and shopping malls. These devices generate a huge amount of monitoring data during daily operation, and monitoring the video scenes manually is clearly impractical. Producing a semantic description of the human body postures in a monitored scene, that is, judging the different postures, helps people quickly understand the state of individuals and the events occurring in the scene, and is of great significance for detecting accidents and emergencies in real time.
The traditional manual monitoring mode is not only inefficient but also struggles to respond quickly to emergencies. With the development of artificial intelligence and pattern recognition, a number of human posture recognition methods have appeared. They can be roughly divided into two categories, according to whether the human body is treated as an articulated "link" model: methods based on video feature extraction and methods based on human body key point extraction. These two types of methods are discussed in more detail below.
Among methods based on video feature extraction, the document "Action Recognition by Dense Trajectories" (IEEE Conference on Computer Vision and Pattern Recognition) proposes to obtain human motion trajectories by tracking densely sampled points with an optical flow field, and then to recognize the human posture from these trajectories. This method must track the sampling points in real time to obtain the motion trajectory, and the tracking easily fails in scenes where pedestrians cross each other. The document "Human posture recognition based on projection histogram and Support Vector Machine" fits the human contour with an ellipse, builds histograms along the directions of the ellipse's major and minor axes to describe the body shape, and finally recognizes the posture with a support vector machine. When the bounding box cannot be extracted accurately, for example when several people occlude one another, the extracted body shape changes drastically and the recognition performance drops sharply. The document "A bio-inspired event-based size and position invariant human posture recognition algorithm" proposes a posture recognition method based on a simplified line-segment Hausdorff distance: it takes two consecutive frames of a video sequence as input, obtains the moving object by differencing the two frames, and decomposes the object's contour into vector line segments. This simplifies computation and improves efficiency, but body parts that do not move between the two frames cannot be detected, which lowers the recognition accuracy. The document "Posture recognition invariant to background, contour, body size, and camera distance using morphological geometry" recognizes the posture from the length and width of the extracted body contour, avoiding the influence of details such as clothing and background; because length and width can only roughly describe the posture, its recognition accuracy is low. In general, methods based on video feature extraction require video input before the posture can be judged: either the video is shot while tracking the target, or the target's motion is extracted by a tracking algorithm, so these methods are hard to apply to complex multi-person scenes in which target tracking is more difficult.
Among methods based on human key point extraction, the document "Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields" proposes a model that simultaneously predicts body-part locations and the associations between parts in order to extract human key points; the document "A Coarse-Fine Network for Keypoint Localization" proposes a coarse-to-fine multi-level supervised network, CFN (Coarse-Fine Network), for extracting human key points. These methods do not need to track the target, so they are suitable for posture recognition in multi-person scenes. The document "Neural Network Approach for 2-Dimension Person Pose Estimation With Encoded Mask and Keypoint Detection" extracts key points from an image segmentation mask with a deep convolutional neural network, learns the interconnections among the key points, and extracts human key points by combining image segmentation with a bottom-up strategy. Most methods based on human key point extraction only extract the feature points and do not go on to determine the human posture. In practical applications, however, obtaining the body's feature points alone is often not enough: an explicit posture determination helps people quickly understand the state of individuals in the scene and the events that occur, and is also an important basis for an intelligent monitoring system to further analyze and judge events automatically. Therefore, some methods that determine the posture from key point information have appeared. The document "Human posture classification using skeleton information" determines the posture from the distances between human key points and the included angles of the lines connecting them. In practice, however, extracted key points often contain errors, and key points may be missing because of occlusion, which can cause the performance of existing posture determination methods to drop sharply or even fail entirely.
Disclosure of Invention
The present invention provides a human body posture determination method and system for complex multi-person scenes, so that the state of each individual in the scene and the events that occur can be learned quickly.
The technical solution for realizing the purpose of the invention is as follows: a human body posture determination method for a complex multi-person scene, the method comprising the steps of:
step 1, detecting key points of a human body, and dividing a key point set;
step 2, extracting the statistical characteristics of the geometric distribution of the human body key points in each set;
step 3, judging the posture of the human body according to the statistical characteristics.
Further, step 1, detecting key points of the human body and dividing a key point set, wherein the specific process comprises:
step 1-1, constructing a sample training set, wherein the set comprises a plurality of human body images marked with human body key points;
step 1-2, training a deep convolutional neural network by using the sample training set;
step 1-3, detecting key points in a human body image to be detected by using a trained deep convolution neural network;
step 1-4, dividing a key point set, specifically:
removing the key points that do not belong to the human trunk (including those of the hands); the remaining key points form the whole-body key point set;
dividing the human body key points into an upper-body key point set, a lower-body key point set, a left-arm key point set and a right-arm key point set;
and further dividing the lower-body key points into a left-leg key point set, a right-leg key point set, a left-thigh key point set and a right-thigh key point set.
Further, step 2, extracting the statistical characteristics of the geometric distribution of the human body key points in each set, wherein the specific process comprises:
step 2-1-1, calculating the convex hull of each key point set;
step 2-1-2, finding the two key points that are farthest apart among the key points forming the convex hull;
step 2-1-3, calculating the included angle formed by rotating the line connecting these two key points clockwise to the horizontal direction, and taking this angle as the horizontal angle of the convex hull.
Further, the step 3 of determining the posture of the human body according to the statistical characteristics specifically includes: judging the postures of the arms, the legs and the whole body according to the horizontal angle of the convex hull of each key point set:
Assume the horizontal included angle of the convex hull of the whole-body key point set is α0; that of the upper-body key point set is α1; that of the lower-body key point set is α2; that of the left-arm key point set is α3; that of the right-arm key point set is α4; that of the left-leg key point set is α5; that of the right-leg key point set is α6; that of the left-thigh key point set is α7; and that of the right-thigh key point set is α8.
If |tan α3| ∈ (a3, b3) or |tan α4| ∈ (a3, b3), the arm-lifting posture is determined; if |tan α5| ∈ (a5, b5) or |tan α6| ∈ (a5, b5), the kicking posture is determined; if |tan α0| ∈ (a0, b0), |tan α1| ∈ (a1, b1) and |tan α2| ∈ (a2, b2), the standing posture is determined; if |tan α7| ∈ [a7, b7] or |tan α8| ∈ [a7, b7], the squatting posture is determined; if |tan α0| ∈ [c0, d0], |tan α1| ∈ [c1, d1] and |tan α2| ∈ [c2, d2], the lying posture is determined.
Further, step 2, extracting the statistical characteristics of the geometric distribution of the human body key points in each set, wherein the specific process comprises:
step 2-2-1, calculating the minimum circumscribed rectangle of each key point set;
step 2-2-2, calculating the included angles formed by rotating the long side and the short side of the minimum circumscribed rectangle clockwise to the horizontal direction, respectively;
step 2-2-3, taking the smaller of the two included angles as the rotation angle of the minimum circumscribed rectangle.
Further, the step 3 of determining the posture of the human body according to the statistical characteristics specifically includes: judging the postures of the arms, the legs and the whole body according to the rotation angle of the minimum circumscribed rectangle of each key point set:
Assume the horizontal included angle of the minimum circumscribed rectangle of the whole-body key point set is β0; that of the upper-body key point set is β1; that of the lower-body key point set is β2; that of the left-arm key point set is β3; that of the right-arm key point set is β4; that of the left-leg key point set is β5; that of the right-leg key point set is β6; that of the left-thigh key point set is β7; and that of the right-thigh key point set is β8.
If β3 or β4 falls within the arm-lifting interval, the arm-lifting posture is determined; if β5 or β6 falls within the kicking interval, the kicking posture is determined; if β0, β1 and β2 all fall within the standing interval, the standing posture is determined; if β1 ∈ [λ1, γ1], the stooping posture is determined; if β7 or β8 falls within the squatting interval, the squatting posture is determined; if β0 ∈ [θ0, Θ0], β1 ∈ [θ1, Θ1] and β2 ∈ [θ2, Θ2], the lying posture is determined.
A body pose determination system for a complex multi-person scene, the system comprising:
the key point dividing module is used for detecting key points of a human body and dividing a key point set;
the statistical feature extraction module is used for extracting the statistical features of the geometric distribution of the human body key points in each set;
and the posture judgment module is used for judging the human body posture according to the statistical characteristics.
Compared with the prior art, the invention has the following notable advantages: 1) by adopting the idea of dividing the human body key points into sets, the method avoids the target-extraction problem of methods based on video feature extraction and is therefore applicable to posture judgment in multi-person scenes; 2) compared with existing methods based on human body key point extraction, the method converts the posture judgment problem into statistical relationships over key point sets, so it is insensitive to the absence or displacement of some key points; even when key points are inaccurate because part of the body is occluded, a correct posture judgment can still be obtained from the statistical features of the sets, which greatly improves the robustness of posture judgment; 3) the judgment is fast, allowing real-time posture judgment within the monitoring range and helping people quickly understand the state of individuals and the events occurring in the current scene.
The present invention is described in further detail below with reference to the attached drawing figures.
Drawings
FIG. 1 is a flow diagram of a method for human pose determination for a complex multi-person scene in one embodiment.
FIG. 2 is a diagram illustrating the partitioning of a set of human key points according to an embodiment.
FIG. 3 is a diagram illustrating the calculation of horizontal angles of convex hulls of a set of keypoints, according to an embodiment.
FIG. 4 is a diagram illustrating the rotation angle of the minimum bounding rectangle of the set of key points according to one embodiment.
FIG. 5 is a diagram of a determination of a basic pose of an individual person in one embodiment; wherein, the key points of the human body are marked by round points; the judgment result of the posture is marked on the head of the person, and the pictures (a) to (c) are respectively a standing posture, a squatting posture and a stooping posture.
Fig. 6 is a diagram illustrating the determination result of a single person in a complex posture in an embodiment, in which key points of a human body are marked with dots, the determination result of the posture is marked on the head of the person, and the diagrams (a) to (c) are respectively a standing arm-raising posture, a standing leg-kicking posture and a standing arm-raising leg-kicking posture.
FIG. 7 is a diagram illustrating the determination result of a multi-person complex pose in an embodiment, wherein key points of a human body are marked with dots, and the determination result of the pose is marked on the head of the person.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
It should be noted that descriptions such as "first" and "second" in the embodiments of the present invention are for descriptive purposes only and are not to be construed as indicating or implying relative importance or the number of the indicated technical features; a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the various embodiments may be combined with each other, provided that such a combination can be realized by a person skilled in the art; when technical solutions are contradictory or cannot be realized, the combination should be considered not to exist and falls outside the protection scope of the present invention.
In one embodiment, in conjunction with fig. 1, there is provided a human body posture determination method for a complex multi-person scene, the method comprising the steps of:
step 1, detecting key points of a human body, and dividing a key point set;
step 2, extracting the statistical characteristics of the geometric distribution of the human body key points in each set;
step 3, judging the posture of the human body according to the statistical characteristics.
Further, in one embodiment, the step 1 of detecting key points of a human body and dividing a key point set includes:
step 1-1, constructing a sample training set, wherein the set comprises a plurality of human body images marked with human body key points;
step 1-2, training a deep convolutional neural network by using the sample training set;
step 1-3, detecting key points in a human body image to be detected by using a trained deep convolution neural network;
step 1-4, with reference to fig. 2, dividing a key point set, specifically:
removing the key points that do not belong to the human trunk (including those of the hands); the remaining key points form the whole-body key point set;
dividing the human body key points into an upper-body key point set, a lower-body key point set, a left-arm key point set and a right-arm key point set;
and further dividing the lower-body key points into a left-leg key point set, a right-leg key point set, a left-thigh key point set and a right-thigh key point set.
Here, human key points are extracted using the method in the document "Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields".
Here, the human trunk key point set includes ten key points: the left eye, right eye, mouth, neck, left hip joint, right hip joint, left knee joint, right knee joint, left ankle joint and right ankle joint;
the upper-body key point set includes six key points: the left eye, right eye, mouth, neck, left hip joint and right hip joint;
the lower-body key point set includes six key points: the left hip joint, right hip joint, left knee joint, right knee joint, left ankle joint and right ankle joint;
the left-arm key point set includes three key points: the left wrist joint, left elbow joint and left shoulder joint;
the right-arm key point set includes three key points: the right wrist joint, right elbow joint and right shoulder joint;
the left-leg key point set includes three key points: the left ankle joint, left knee joint and left hip joint;
the right-leg key point set includes three key points: the right ankle joint, right knee joint and right hip joint;
the left-thigh key point set includes two key points: the left knee joint and left hip joint;
the right-thigh key point set includes two key points: the right knee joint and right hip joint.
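For illustration only, the set division of steps 1-4 can be written as a small lookup table. The following is a minimal sketch, in which the joint names, the dictionary-based detector output format and the handling of missing joints are assumptions rather than requirements of the invention.

```python
# Sketch of the key point set division in steps 1-4 (joint names are assumed).
KEYPOINT_SETS = {
    "whole_body": ["left_eye", "right_eye", "mouth", "neck",
                   "left_hip", "right_hip", "left_knee", "right_knee",
                   "left_ankle", "right_ankle"],
    "upper_body": ["left_eye", "right_eye", "mouth", "neck",
                   "left_hip", "right_hip"],
    "lower_body": ["left_hip", "right_hip", "left_knee", "right_knee",
                   "left_ankle", "right_ankle"],
    "left_arm":   ["left_wrist", "left_elbow", "left_shoulder"],
    "right_arm":  ["right_wrist", "right_elbow", "right_shoulder"],
    "left_leg":   ["left_ankle", "left_knee", "left_hip"],
    "right_leg":  ["right_ankle", "right_knee", "right_hip"],
    "left_thigh":  ["left_knee", "left_hip"],
    "right_thigh": ["right_knee", "right_hip"],
}

def split_keypoints(person):
    """person: dict mapping joint name -> (x, y); joints missed by the
    detector are simply absent. Returns the coordinate list of each set."""
    return {name: [person[j] for j in joints if j in person]
            for name, joints in KEYPOINT_SETS.items()}
```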
Further, in one embodiment, with reference to fig. 3, the step 2 of extracting statistical features of geometric distribution of the human body key points in each set includes:
step 2-1-1, calculating the convex hull of each key point set;
step 2-1-2, finding the two key points that are farthest apart among the key points forming the convex hull;
step 2-1-3, calculating the included angle formed by rotating the line connecting these two key points clockwise to the horizontal direction, and taking this angle as the horizontal angle of the convex hull.
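The horizontal angle of a convex hull can be computed, for example, as in the following sketch, which assumes SciPy is available and reports the angle as an acute angle in [0°, 90°] (sufficient for the |tan α| tests used below); sets with fewer than three usable points fall back to the raw points.

```python
import itertools
import math

import numpy as np
from scipy.spatial import ConvexHull

def hull_horizontal_angle(points):
    """Horizontal angle (degrees, 0..90) of a key point set's convex hull."""
    pts = np.asarray(points, dtype=float)
    if len(pts) < 2:
        return None                       # orientation undefined for a single point
    try:
        vertices = pts[ConvexHull(pts).vertices]   # step 2-1-1: convex hull
    except Exception:                     # degenerate set (two points or collinear)
        vertices = pts
    # Step 2-1-2: the two hull key points that are farthest apart.
    p, q = max(itertools.combinations(vertices, 2),
               key=lambda pair: np.linalg.norm(pair[0] - pair[1]))
    # Step 2-1-3: acute angle between their connecting line and the horizontal.
    dx, dy = abs(q[0] - p[0]), abs(q[1] - p[1])
    return 90.0 if dx == 0 else math.degrees(math.atan(dy / dx))
```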
Further, in one embodiment, the step 3 of determining the posture of the human body according to the statistical features specifically includes: judging the postures of the arms, the legs and the whole body according to the horizontal angle of the convex hull of each key point set:
Assume the horizontal included angle of the convex hull of the whole-body key point set is α0; that of the upper-body key point set is α1; that of the lower-body key point set is α2; that of the left-arm key point set is α3; that of the right-arm key point set is α4; that of the left-leg key point set is α5; that of the right-leg key point set is α6; that of the left-thigh key point set is α7; and that of the right-thigh key point set is α8.
If |tan α3| ∈ (a3, b3) or |tan α4| ∈ (a3, b3), the arm-lifting posture is determined; if |tan α5| ∈ (a5, b5) or |tan α6| ∈ (a5, b5), the kicking posture is determined; if |tan α0| ∈ (a0, b0), |tan α1| ∈ (a1, b1) and |tan α2| ∈ (a2, b2), the standing posture is determined; if |tan α7| ∈ [a7, b7] or |tan α8| ∈ [a7, b7], the squatting posture is determined; if |tan α0| ∈ [c0, d0], |tan α1| ∈ [c1, d1] and |tan α2| ∈ [c2, d2], the lying posture is determined.
Preferably, (a3, b3) = (0.25, +∞), (a5, b5) = (0.25, 5), (a0, b0) = (a1, b1) = (a2, b2) = (4, +∞), [a7, b7] = [0, 0.25], and [c0, d0] = [c1, d1] = [c2, d2] = [0, 0.25].
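As an illustration only, the decision rules of this embodiment can be sketched as follows using the preferred intervals given above; the function name, the index-keyed input format and the returned list of labels are assumptions of this sketch (several postures, for example standing with an arm raised, may hold at the same time).

```python
import math

# Preferred threshold intervals on |tan α| from the embodiment above.
ARM   = (0.25, math.inf)   # (a3, b3)
KICK  = (0.25, 5.0)        # (a5, b5)
STAND = (4.0, math.inf)    # (a0, b0) = (a1, b1) = (a2, b2)
SQUAT = (0.0, 0.25)        # [a7, b7]
LIE   = (0.0, 0.25)        # [c0, d0] = [c1, d1] = [c2, d2]

def judge_posture(alpha):
    """alpha: dict index -> convex-hull horizontal angle in degrees, indexed
    0..8 as in the text (0 whole body, 1 upper body, 2 lower body, 3/4 arms,
    5/6 legs, 7/8 thighs); sets without enough key points may be omitted."""
    tan = {k: abs(math.tan(math.radians(a)))
           for k, a in alpha.items() if a is not None}

    def hit(k, rng):
        return k in tan and rng[0] <= tan[k] <= rng[1]

    labels = []
    if hit(3, ARM) or hit(4, ARM):
        labels.append("arm-lifting")
    if hit(5, KICK) or hit(6, KICK):
        labels.append("kicking")
    if hit(0, STAND) and hit(1, STAND) and hit(2, STAND):
        labels.append("standing")
    if hit(7, SQUAT) or hit(8, SQUAT):
        labels.append("squatting")
    if hit(0, LIE) and hit(1, LIE) and hit(2, LIE):
        labels.append("lying")
    return labels
```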
Further, in one embodiment, with reference to fig. 4, the step 2 of extracting statistical features of geometric distribution of the human body key points in each set includes:
step 2-2-1, calculating the minimum circumscribed rectangle of each key point set;
step 2-2-2, calculating the included angles formed by rotating the long side and the short side of the minimum circumscribed rectangle clockwise to the horizontal direction, respectively;
step 2-2-3, taking the smaller of the two included angles as the rotation angle of the minimum circumscribed rectangle.
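A minimal sketch of steps 2-2-1 to 2-2-3 is given below, assuming OpenCV is available; the edge angles are derived from the rectangle's corner points rather than from the angle value returned by cv2.minAreaRect, whose convention varies between OpenCV versions.

```python
import math

import cv2
import numpy as np

def min_rect_rotation_angle(points):
    """Rotation angle (degrees, 0..90) of a key point set's minimum
    circumscribed (minimum-area bounding) rectangle."""
    pts = np.asarray(points, dtype=np.float32)
    if len(pts) < 2:
        return None
    if len(pts) == 2:                       # degenerate rectangle: use the segment itself
        dx, dy = abs(pts[1][0] - pts[0][0]), abs(pts[1][1] - pts[0][1])
        return 90.0 if dx == 0 else math.degrees(math.atan(dy / dx))
    box = cv2.boxPoints(cv2.minAreaRect(pts))   # step 2-2-1: 4 corners of the rectangle

    def edge_angle(a, b):                   # acute angle of an edge w.r.t. the horizontal
        dx, dy = abs(b[0] - a[0]), abs(b[1] - a[1])
        return 90.0 if dx == 0 else math.degrees(math.atan(dy / dx))

    # Step 2-2-2: angles of two adjacent edges (the long side and the short side).
    # Step 2-2-3: the smaller of the two is taken as the rotation angle.
    return min(edge_angle(box[0], box[1]), edge_angle(box[1], box[2]))
```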
Further, in one embodiment, the step 3 of determining the posture of the human body according to the statistical features specifically includes: judging the postures of the arms, the legs and the whole body according to the rotation angle of the minimum circumscribed rectangle of each key point set:
Assume the horizontal included angle of the minimum circumscribed rectangle of the whole-body key point set is β0; that of the upper-body key point set is β1; that of the lower-body key point set is β2; that of the left-arm key point set is β3; that of the right-arm key point set is β4; that of the left-leg key point set is β5; that of the right-leg key point set is β6; that of the left-thigh key point set is β7; and that of the right-thigh key point set is β8.
If β3 or β4 falls within the arm-lifting interval, the arm-lifting posture is determined; if β5 or β6 falls within the kicking interval, the kicking posture is determined; if β0, β1 and β2 all fall within the standing interval, the standing posture is determined; if β1 ∈ [λ1, γ1], the stooping posture is determined; if β7 or β8 falls within the squatting interval, the squatting posture is determined; if β0 ∈ [θ0, Θ0], β1 ∈ [θ1, Θ1] and β2 ∈ [θ2, Θ2], the lying posture is determined.
Preferably, [λ1, γ1] = [0°, 25°] and [θ0, Θ0] = [θ1, Θ1] = [θ2, Θ2] = [0°, 25°].
in one embodiment, there is provided a human pose determination system for a complex multi-person scene, the system comprising:
the key point dividing module is used for detecting key points of a human body and dividing a key point set;
the statistical feature extraction module is used for extracting the statistical features of the geometric distribution of the human body key points in each set;
and the posture judgment module is used for judging the human body posture according to the statistical characteristics.
Further, in one embodiment, the keypoint splitting module comprises:
the training set constructing unit is used for constructing a sample training set, and the set comprises a plurality of human body images marked with human body key points;
a training unit for training a deep convolutional neural network using the sample training set;
the key point detection unit is used for detecting key points in the human body image to be detected by using the trained deep convolutional neural network;
the dividing unit is used for dividing the key point set, and specifically comprises:
the first dividing subunit is used for removing the key points that do not belong to the human trunk (including those of the hands), the remaining key points forming the whole-body key point set;
the second dividing subunit is used for dividing the human body key points into an upper-body key point set, a lower-body key point set, a left-arm key point set and a right-arm key point set;
and the third dividing subunit is used for further dividing the lower-body key points into a left-leg key point set, a right-leg key point set, a left-thigh key point set and a right-thigh key point set.
Further, in one embodiment, the statistical feature extraction module includes:
the first calculating unit is used for calculating convex hulls of all the key point sets;
the key point screening unit is used for finding out two key points with the largest distance from the key points forming the convex hull;
the second calculation unit is used for calculating the included angle formed by rotating the line connecting these two key points clockwise to the horizontal direction, this angle being the horizontal angle of the convex hull;
the gesture judging module is used for judging the gestures of the arms, the legs and the whole body according to the horizontal angles of the convex hulls of the key point sets:
Assume the horizontal included angle of the convex hull of the whole-body key point set is α0; that of the upper-body key point set is α1; that of the lower-body key point set is α2; that of the left-arm key point set is α3; that of the right-arm key point set is α4; that of the left-leg key point set is α5; that of the right-leg key point set is α6; that of the left-thigh key point set is α7; and that of the right-thigh key point set is α8.
If |tan α3| ∈ (a3, b3) or |tan α4| ∈ (a3, b3), the arm-lifting posture is determined; if |tan α5| ∈ (a5, b5) or |tan α6| ∈ (a5, b5), the kicking posture is determined; if |tan α0| ∈ (a0, b0), |tan α1| ∈ (a1, b1) and |tan α2| ∈ (a2, b2), the standing posture is determined; if |tan α7| ∈ [a7, b7] or |tan α8| ∈ [a7, b7], the squatting posture is determined; if |tan α0| ∈ [c0, d0], |tan α1| ∈ [c1, d1] and |tan α2| ∈ [c2, d2], the lying posture is determined.
Further, in one embodiment, the statistical feature extraction module includes:
the third calculation unit is used for calculating the minimum circumscribed rectangle of each key point set;
the fourth calculation unit is used for calculating the included angles formed by rotating the long side and the short side of the minimum circumscribed rectangle clockwise to the horizontal direction, respectively;
the fifth calculation unit is used for taking the smaller value of the two included angles as the rotation angle of the minimum circumscribed rectangle;
the gesture judging module is used for judging the gestures of the arms, the legs and the whole body according to the rotation angle of the minimum circumscribed rectangle of each key point set:
Assume the horizontal included angle of the minimum circumscribed rectangle of the whole-body key point set is β0; that of the upper-body key point set is β1; that of the lower-body key point set is β2; that of the left-arm key point set is β3; that of the right-arm key point set is β4; that of the left-leg key point set is β5; that of the right-leg key point set is β6; that of the left-thigh key point set is β7; and that of the right-thigh key point set is β8.
If β3 or β4 falls within the arm-lifting interval, the arm-lifting posture is determined; if β5 or β6 falls within the kicking interval, the kicking posture is determined; if β0, β1 and β2 all fall within the standing interval, the standing posture is determined; if β1 ∈ [λ1, γ1], the stooping posture is determined; if β7 or β8 falls within the squatting interval, the squatting posture is determined; if β0 ∈ [θ0, Θ0], β1 ∈ [θ1, Θ1] and β2 ∈ [θ2, Θ2], the lying posture is determined.
For specific limitations of the human body posture determination system for a complex multi-person scene, reference may be made to the above limitations on the human body posture determination method for a complex multi-person scene, and details are not repeated here. All modules in the human body posture determination system for the complex multi-person scene can be completely or partially realized through software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
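Purely as an illustration, the three modules can be composed into a simple pipeline as sketched below; the detector interface and the helper functions split_keypoints, hull_horizontal_angle and judge_posture from the earlier sketches are assumptions, not a prescribed implementation.

```python
SET_ORDER = ["whole_body", "upper_body", "lower_body", "left_arm", "right_arm",
             "left_leg", "right_leg", "left_thigh", "right_thigh"]

class PosturePipeline:
    """Toy composition of the key point dividing, statistical feature
    extraction and posture judgment modules described above."""

    def __init__(self, detector):
        # detector(image) is assumed to return, for each person in the image,
        # a dict mapping joint name -> (x, y) for the detected key points.
        self.detector = detector

    def run(self, image):
        results = []
        for person in self.detector(image):
            sets = split_keypoints(person)                    # key point dividing module
            angles = {i: hull_horizontal_angle(sets[name])    # statistical feature module
                      for i, name in enumerate(SET_ORDER)}
            results.append(judge_posture(angles))             # posture judgment module
        return results
```

A detector returning per-person joint dictionaries, for example a wrapper around the key point detection unit of steps 1-1 to 1-3, could then be passed to PosturePipeline and run on each monitoring frame.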
As a specific example, the invention is further verified and explained in one embodiment: human body postures in surveillance video are determined using the minimum circumscribed rectangle method described above, and the test results are shown in FIG. 5, FIG. 6, FIG. 7 and Table 1.
TABLE 1 statistics of human body posture determination results
As can be seen from the determination results for multiple basic single-person postures in FIG. 5 and for multiple complex single-person postures in FIG. 6, the invention determines single-person postures accurately. As can be seen from the determination results for the mutually occluded multi-person complex postures in FIG. 7, the invention can determine the postures of several people at the same time and is insensitive to occlusion and similar influences. As shown in Table 1, the average accuracy of posture determination is 92.833%; the accuracy is higher for clearly distinguishable postures such as standing, squatting, arm raising and lying, and slightly lower for easily confused postures such as stooping and kicking.
The foregoing illustrates and describes the principles, general features, and advantages of the present invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are described in the specification and illustrated only to illustrate the principle of the present invention, but that various changes and modifications may be made therein without departing from the spirit and scope of the present invention, which fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.
Claims (10)
1. A human body posture judgment method for a complex multi-person scene is characterized by comprising the following steps:
step 1, detecting key points of a human body, and dividing a key point set;
step 2, extracting the statistical characteristics of the geometric distribution of the human body key points in each set;
step 3, judging the posture of the human body according to the statistical characteristics.
2. The method for determining the human body posture in the complex multi-person scene according to claim 1, wherein the step 1 of detecting the human body key points and dividing the key point set comprises the following specific processes:
step 1-1, constructing a sample training set, wherein the set comprises a plurality of human body images marked with human body key points;
step 1-2, training a deep convolutional neural network by using the sample training set;
step 1-3, detecting key points in a human body image to be detected by using a trained deep convolution neural network;
step 1-4, dividing a key point set, specifically:
removing the key points that do not belong to the human trunk (including those of the hands); the remaining key points form the whole-body key point set;
dividing the human body key points into an upper-body key point set, a lower-body key point set, a left-arm key point set and a right-arm key point set;
and further dividing the lower-body key points into a left-leg key point set, a right-leg key point set, a left-thigh key point set and a right-thigh key point set.
3. The method for determining human body posture in a complex multi-person scene according to claim 1 or 2, wherein the step 2 of extracting statistical features of the geometric distribution of the human body key points in each set comprises the following specific steps:
step 2-1-1, calculating the convex hull of each key point set;
step 2-1-2, finding the two key points that are farthest apart among the key points forming the convex hull;
step 2-1-3, calculating the included angle formed by rotating the line connecting these two key points clockwise to the horizontal direction, and taking this angle as the horizontal angle of the convex hull.
4. The method for determining human body posture in a complex multi-person scene as claimed in claim 3, wherein the step 3 of determining human body posture according to the statistical features specifically comprises: judging the postures of the arms, the legs and the whole body according to the horizontal angle of the convex hull of each key point set:
Assume the horizontal included angle of the convex hull of the whole-body key point set is α0; that of the upper-body key point set is α1; that of the lower-body key point set is α2; that of the left-arm key point set is α3; that of the right-arm key point set is α4; that of the left-leg key point set is α5; that of the right-leg key point set is α6; that of the left-thigh key point set is α7; and that of the right-thigh key point set is α8.
If |tan α3| ∈ (a3, b3) or |tan α4| ∈ (a3, b3), the arm-lifting posture is determined; if |tan α5| ∈ (a5, b5) or |tan α6| ∈ (a5, b5), the kicking posture is determined; if |tan α0| ∈ (a0, b0), |tan α1| ∈ (a1, b1) and |tan α2| ∈ (a2, b2), the standing posture is determined; if |tan α7| ∈ [a7, b7] or |tan α8| ∈ [a7, b7], the squatting posture is determined; if |tan α0| ∈ [c0, d0], |tan α1| ∈ [c1, d1] and |tan α2| ∈ [c2, d2], the lying posture is determined.
5. The method for determining human body posture in a complex multi-person scene according to claim 1 or 2, wherein the step 2 of extracting statistical features of the geometric distribution of the human body key points in each set comprises the following specific steps:
step 2-2-1, calculating the minimum circumscribed rectangle of each key point set;
step 2-2-2, calculating the included angles formed by rotating the long side and the short side of the minimum circumscribed rectangle clockwise to the horizontal direction, respectively;
step 2-2-3, taking the smaller of the two included angles as the rotation angle of the minimum circumscribed rectangle.
6. The method for determining human body posture in a complex multi-person scene as claimed in claim 5, wherein the step 3 of determining human body posture according to the statistical features specifically comprises: judging the postures of the arms, the legs and the whole body according to the rotation angle of the minimum circumscribed rectangle of each key point set:
Assume the horizontal included angle of the minimum circumscribed rectangle of the whole-body key point set is β0; that of the upper-body key point set is β1; that of the lower-body key point set is β2; that of the left-arm key point set is β3; that of the right-arm key point set is β4; that of the left-leg key point set is β5; that of the right-leg key point set is β6; that of the left-thigh key point set is β7; and that of the right-thigh key point set is β8.
If β3 or β4 falls within the arm-lifting interval, the arm-lifting posture is determined; if β5 or β6 falls within the kicking interval, the kicking posture is determined; if β0, β1 and β2 all fall within the standing interval, the standing posture is determined; if β1 ∈ [λ1, γ1], the stooping posture is determined; if β7 or β8 falls within the squatting interval, the squatting posture is determined; if β0 ∈ [θ0, Θ0], β1 ∈ [θ1, Θ1] and β2 ∈ [θ2, Θ2], the lying posture is determined.
7. A body pose determination system for a complex multi-person scene, the system comprising:
the key point dividing module is used for detecting key points of a human body and dividing a key point set;
the statistical feature extraction module is used for extracting the statistical features of the geometric distribution of the human body key points in each set;
and the posture judgment module is used for judging the human body posture according to the statistical characteristics.
8. The system of claim 7, wherein the keypoint segmentation module comprises:
the training set constructing unit is used for constructing a sample training set, and the set comprises a plurality of human body images marked with human body key points;
a training unit for training a deep convolutional neural network using the sample training set;
the key point detection unit is used for detecting key points in the human body image to be detected by using the trained deep convolutional neural network;
the dividing unit is used for dividing the key point set, and specifically comprises:
the first dividing subunit is used for removing the key points that do not belong to the human trunk (including those of the hands), the remaining key points forming the whole-body key point set;
the second dividing subunit is used for dividing the human body key points into an upper-body key point set, a lower-body key point set, a left-arm key point set and a right-arm key point set;
and the third dividing subunit is used for further dividing the lower-body key points into a left-leg key point set, a right-leg key point set, a left-thigh key point set and a right-thigh key point set.
9. The system of claim 8, wherein the statistical feature extraction module comprises:
the first calculating unit is used for calculating convex hulls of all the key point sets;
the key point screening unit is used for finding out two key points with the largest distance from the key points forming the convex hull;
the second calculation unit is used for calculating the included angle formed by rotating the line connecting these two key points clockwise to the horizontal direction, this angle being the horizontal angle of the convex hull;
the gesture judging module is used for judging the gestures of the arms, the legs and the whole body according to the horizontal angles of the convex hulls of the key point sets:
Assume the horizontal included angle of the convex hull of the whole-body key point set is α0; that of the upper-body key point set is α1; that of the lower-body key point set is α2; that of the left-arm key point set is α3; that of the right-arm key point set is α4; that of the left-leg key point set is α5; that of the right-leg key point set is α6; that of the left-thigh key point set is α7; and that of the right-thigh key point set is α8.
If |tan α3| ∈ (a3, b3) or |tan α4| ∈ (a3, b3), the arm-lifting posture is determined; if |tan α5| ∈ (a5, b5) or |tan α6| ∈ (a5, b5), the kicking posture is determined; if |tan α0| ∈ (a0, b0), |tan α1| ∈ (a1, b1) and |tan α2| ∈ (a2, b2), the standing posture is determined; if |tan α7| ∈ [a7, b7] or |tan α8| ∈ [a7, b7], the squatting posture is determined; if |tan α0| ∈ [c0, d0], |tan α1| ∈ [c1, d1] and |tan α2| ∈ [c2, d2], the lying posture is determined.
10. The system of claim 8, wherein the statistical feature extraction module comprises:
the third calculation unit is used for calculating the minimum circumscribed rectangle of each key point set;
the fourth calculation unit is used for calculating the included angles formed by rotating the long side and the short side of the minimum circumscribed rectangle clockwise to the horizontal direction, respectively;
the fifth calculation unit is used for taking the smaller value of the two included angles as the rotation angle of the minimum circumscribed rectangle;
the gesture judging module is used for judging the gestures of the arms, the legs and the whole body according to the rotation angle of the minimum circumscribed rectangle of each key point set:
Assume the horizontal included angle of the minimum circumscribed rectangle of the whole-body key point set is β0; that of the upper-body key point set is β1; that of the lower-body key point set is β2; that of the left-arm key point set is β3; that of the right-arm key point set is β4; that of the left-leg key point set is β5; that of the right-leg key point set is β6; that of the left-thigh key point set is β7; and that of the right-thigh key point set is β8.
If β3 or β4 falls within the arm-lifting interval, the arm-lifting posture is determined; if β5 or β6 falls within the kicking interval, the kicking posture is determined; if β0, β1 and β2 all fall within the standing interval, the standing posture is determined; if β1 ∈ [λ1, γ1], the stooping posture is determined; if β7 or β8 falls within the squatting interval, the squatting posture is determined; if β0 ∈ [θ0, Θ0], β1 ∈ [θ1, Θ1] and β2 ∈ [θ2, Θ2], the lying posture is determined.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010905107.6A CN112036324A (en) | 2020-09-01 | 2020-09-01 | Human body posture judgment method and system for complex multi-person scene |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010905107.6A CN112036324A (en) | 2020-09-01 | 2020-09-01 | Human body posture judgment method and system for complex multi-person scene |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112036324A true CN112036324A (en) | 2020-12-04 |
Family
ID=73590874
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010905107.6A Pending CN112036324A (en) | 2020-09-01 | 2020-09-01 | Human body posture judgment method and system for complex multi-person scene |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112036324A (en) |
-
2020
- 2020-09-01 CN CN202010905107.6A patent/CN112036324A/en active Pending
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113537234A (en) * | 2021-06-10 | 2021-10-22 | 浙江大华技术股份有限公司 | Quantity counting method and device, electronic device and computer equipment |
CN114155518A (en) * | 2021-11-08 | 2022-03-08 | 西安西光产业发展有限公司 | Expressway visor inclination identification method based on deep semantic segmentation network and image correction |
CN114155518B (en) * | 2021-11-08 | 2024-09-27 | 西安西光产业发展有限公司 | Highway light shield inclination recognition method based on depth semantic segmentation network and image correction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |