CN112733761B - Human body state matching method based on machine learning - Google Patents

Human body state matching method based on machine learning

Info

Publication number: CN112733761B
Application number: CN202110054577.0A
Authority: CN (China)
Prior art keywords: shape, gesture, human body, pose, standard
Legal status: Active (granted)
Other languages: Chinese (zh)
Other versions: CN112733761A
Inventors: 卢书芳, 王宏升, 高飞, 丁雪峰
Current assignee: Zhejiang University of Technology ZJUT
Original assignee: Zhejiang University of Technology ZJUT
Priority/filing date: 2021-01-15
Publication of CN112733761A: 2021-04-30
Grant and publication of CN112733761B: 2024-03-19

Classifications

    • G06V 40/23: Recognition of whole body movements, e.g. for sport training
    • G06F 18/2135: Feature extraction by transforming the feature space, based on approximation criteria, e.g. principal component analysis
    • G06F 18/2411: Classification techniques based on the proximity to a decision surface, e.g. support vector machines
    • G06V 10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
    • G06V 40/103: Static body considered as a whole, e.g. static pedestrian or occupant recognition


Abstract

The invention discloses a human body state matching method based on machine learning, comprising the following steps: (1) preprocessing the sample set; (2) detecting human body key points with OpenPose; (3) aligning the training sample set; (4) establishing a standard human body pose model; (5) reducing dimensionality with principal component analysis (PCA); (6) training an SVM classifier, calculating the deformation factor $b_C$ of each item in the data set, and calculating the threshold range of $b_C$; (7) calculating the deformation factor $b_C$ of the human body pose shape to be verified: if $b_C$ falls within the threshold range, the pose shape to be verified is the predefined standard pose; if $b_C$ falls outside the threshold range, it is not a standard pose. With this method, whether a new pose is a predefined standard pose can be determined quickly and with high matching accuracy.

Description

Human body state matching method based on machine learning
Technical Field
The invention belongs to the technical field of human body posture matching, and particularly relates to a human body state matching method based on machine learning.
Background
Human body pose matching determines whether a new pose is a predefined standard pose. Many applications require this technique, such as virtual fitting and motion-sensing games.
Existing methods are based on template matching: a template library is first established, and the human body target to be recognized is then compared for similarity against each sample in the library.
The Chinese patent document with publication number CN112101243A discloses a human motion recognition method based on key poses and DTW, comprising the following steps: S10, acquiring joint coordinate data of the 3D skeleton of human motion with a depth sensor, and describing the static posture by relative distances between characteristic joints; S20, extracting key pose frames from the original action pose sequence with a time-constrained X-Means clustering algorithm to describe the action; S30, establishing a standard action template library, calculating with a DTW algorithm the similarity distance between the key pose frame sequence of the action to be recognized and each action's key pose frame sequence in the library, and assigning the action to the class with the minimum similarity distance.
The Chinese patent document with publication number CN110598556A discloses a method and apparatus for matching human body shape and posture in RGBD images, comprising: extracting a two-dimensional human mask and two-dimensional key point information from the RGBD image through a convolutional neural network; extracting depth information using the depth map of the RGBD image; fusing the two-dimensional mask information with the depth information to obtain three-dimensional human mask information; fusing the two-dimensional key point information with the depth information to obtain three-dimensional key point information; and comparing the three-dimensional mask and key point information with information in a standard library to obtain the degree of matching of the human body shape and posture.
However, the template-based approach is coarse: each pose in the template library is rigid, and during matching the user must reproduce the preset pose exactly. Given the diversity of human pose actions and the multi-scale structure of sample data, the same pose can still differ greatly in space, so the accuracy of template-based methods is very limited.
Disclosure of Invention
The invention provides a human body state matching method based on machine learning that can quickly determine whether a new pose is a predefined standard pose, with high matching accuracy.
A human body state matching method based on machine learning comprises the following steps:
(1) Collecting front views of human bodies in a plurality of standard poses as positive samples and establishing a standard pose sample set D; collecting front views of human bodies in non-standard poses at the same scale as negative samples; detecting and saving the key point information of all positive samples and negative samples;
(2) Aligning the standard pose sample set to obtain an aligned pose shape data set $\tilde{D}$;
(3) Computing the mean pose shape $\bar{p}$ of the aligned pose shape data set $\tilde{D}$, expressing the deviation between each pose shape $\tilde{p}_i$ in $\tilde{D}$ and the mean pose shape $\bar{p}$ as $dp_i = \tilde{p}_i - \bar{p}$, and calculating the covariance matrix S;
(4) Performing eigendecomposition of the covariance matrix S with principal component analysis (PCA) to find the principal components of pose shape deformation; the transformed pose shape is approximately expressed as

$$\tilde{p} \approx \bar{p} + Q_C b_C$$

where $Q_C$ denotes the matrix of eigenvectors corresponding to the C largest eigenvalues and $b_C$ is the deformation factor; the smaller $b_C$ is, the smaller the deformation of $\tilde{p}$ relative to $\bar{p}$;
(5) Preparing a training data set containing the positive and negative samples, training an SVM classifier, calculating the deformation factor $b_C$ of each item in the data set, labelling standard poses as 1 and non-standard poses as 0, and calculating the threshold range of $b_C$;
(6) For the human body pose shape $p_{new}$ to be verified, first aligning it with the mean pose shape $\bar{p}$; the aligned shape is denoted $\tilde{p}_{new}$;
(7) Calculating the deformation factor $b_C$ of the pose shape to be verified: if $b_C$ falls within the threshold range, the pose shape to be verified is the predefined standard pose; if $b_C$ falls outside the threshold range, it is not a standard pose.
In step (1), detecting and saving the key point information of all positive samples and negative samples comprises: first compressing each picture to below 20 KB, then estimating the human body key points with OpenPose, which provides 25 body key points in total; after key point estimation is completed, the key point information of each picture is saved in a JSON file with the same name.
The specific process of step (2) is as follows:
(2-1) representing the standard pose sample set as:

D = {p_i | 0 ≤ i ≤ n}

where n is the total number of samples and p_i denotes the i-th sample;
(2-2) randomly selecting an initial pose shape p_m ∈ D, and aligning every other pose shape in the sample set with p_m in turn, finally obtaining the aligned pose shape data set $\tilde{D}$.
In step (2-2), the specific process of aligning each pose shape other than $p_m$ with $p_m$ is:
(2-2-1) applying an affine transformation to each $p_t$ in the standard pose sample set D other than $p_m$, where 0 ≤ t ≤ n and t ≠ m:

$$\tilde{p}_t = A p_t, \qquad E = \| \tilde{p}_t - p_m \|^2$$

where A is the affine transformation matrix, $\tilde{p}_t$ is the transformed pose shape, and E measures the difference between the transformed shape $\tilde{p}_t$ and the selected shape $p_m$;
(2-2-2) finding an affine transformation matrix A such that the difference E is sufficiently small;
(2-2-3) obtaining the transformed shape $\tilde{p}_t = A p_t$; at this point $\tilde{p}_t$ and $p_m$ are aligned.
In step (3), the mean pose shape $\bar{p}$ of the aligned pose shape data set $\tilde{D}$ is computed as

$$\bar{p} = \left[\ \bar{x}_0\ \ \bar{y}_0\ \ \bar{x}_1\ \ \bar{y}_1\ \ \cdots\ \right]^T$$

where $[\,\cdot\,]^T$ denotes the transpose, $\bar{x}_k$ denotes the mean of coordinate $x_k$ over all pose shapes in $\tilde{D}$, and $\bar{y}_k$ denotes the mean of coordinate $y_k$ over all pose shapes in $\tilde{D}$.
The specific process of step (4) is as follows:

The procedure of principal component analysis (PCA) is represented as

$$S q_k = \lambda_k q_k, \qquad k = 0, 1, 2, \ldots, 2n-1$$

where $q_k$ is an eigenvector and $\lambda_k$ the corresponding eigenvalue. The new spatial basis is expressed as $Q = [\,q_0\ q_1\ \cdots\ q_{2n-1}\,]$, so any transformed pose shape $\tilde{p}$ can be expressed as

$$\tilde{p} = \bar{p} + Q b$$

Sorting the eigenvectors $q_k$ by eigenvalue so that $\lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \cdots \ge \lambda_{2n}$, and keeping the eigenvector matrix $Q_C$ corresponding to the C largest eigenvalues, the transformed pose shape is approximately expressed as

$$\tilde{p} \approx \bar{p} + Q_C b_C$$

where the mean shape $\bar{p}$ and the eigenvector matrix $Q_C$ are the parameters of the pose shape model, and $b_C$ is the deformation factor.
Compared with the prior art, the invention has the following beneficial effects:
The invention builds a higher-dimensional model and achieves high matching accuracy; at the same time, the algorithm is simple to implement, places low demands on the machine, and runs fast.
Drawings
FIG. 1 is a schematic flow chart of the method of the present invention;
FIG. 2 is a schematic diagram of the human body key points according to an embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and embodiments. It should be noted that the following examples are intended to facilitate understanding of the invention and do not limit it in any way.
As shown in FIG. 1, a human body state matching method based on machine learning includes the following steps:
1) Preprocessing the sample human body pose pictures.
(1.1) Front views of human bodies in several standard poses are collected as positive samples to establish the standard pose sample set. The persons in these photographs have different body proportions but all exhibit the standard pose shape. At the same time, front views of human bodies in non-standard poses are collected at the same scale as negative samples.
(1.2) To save computing resources, each picture is compressed to below 20 KB.
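As a concrete illustration (the patent fixes only the 20 KB budget, not the codec), the compression step might be realized with Pillow by lowering the JPEG quality until the file fits; the quality schedule here is an assumption:

```python
from io import BytesIO
from PIL import Image

def compress_below(src_path: str, dst_path: str, max_bytes: int = 20_000) -> None:
    """Re-encode a picture, lowering JPEG quality until it is under max_bytes."""
    img = Image.open(src_path).convert("RGB")
    for quality in range(85, 10, -5):       # step quality down until small enough
        buf = BytesIO()
        img.save(buf, format="JPEG", quality=quality)
        if buf.tell() <= max_bytes:
            break
    with open(dst_path, "wb") as f:
        f.write(buf.getvalue())
```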
(1.3) Human body key points are estimated using OpenPose, a powerful library for detecting human body key points in real time. It provides 25 body key points, from the head down to the feet. After key point estimation is completed, the key point information of each picture is saved in a JSON file with the same name; the estimated key points are shown in FIG. 2.
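For illustration, a saved JSON file can be turned into the flat shape vector used in the steps below; this sketch assumes OpenPose's documented BODY_25 output format and takes the first detected person:

```python
import json
import numpy as np

def load_pose_shape(json_path: str) -> np.ndarray:
    """Return a (50,) vector [x0, y0, ..., x24, y24] for the first detected person."""
    with open(json_path) as f:
        data = json.load(f)
    # OpenPose stores 25 (x, y, confidence) triplets per person in BODY_25 format
    kp = np.array(data["people"][0]["pose_keypoints_2d"]).reshape(25, 3)
    return kp[:, :2].reshape(-1)    # keep (x, y), drop the confidence column
```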
2) Aligning the standard pose sample set.
(2.1) The human body standard pose sample set is represented as:

D = {p_i | 0 ≤ i ≤ n}

where n is the total number of samples and p_i denotes the i-th sample.
(2.2) Aligning two poses: randomly select an initial pose shape p_m ∈ D, and align each p_t ∈ D (t ≠ m) with p_m.
(2.2.1) Apply an affine transformation to $p_t$:

$$\tilde{p}_t = A p_t, \qquad E = \| \tilde{p}_t - p_m \|^2$$

where A is the affine transformation matrix, $\tilde{p}_t$ is the transformed pose shape, and E measures the difference between the transformed shape $\tilde{p}_t$ and the selected shape $p_m$.
(2.2.2) Find the affine transformation matrix A that makes the difference E sufficiently small.
(2.2.3) Obtain the transformed shape $\tilde{p}_t = A p_t$; at this point $\tilde{p}_t$ and $p_m$ are aligned.
(2.3) The other pose shapes in the sample set are aligned in turn with the initial pose shape $p_m$ of step (2.2), finally yielding the aligned pose shape data set $\tilde{D}$.
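A minimal sketch of steps (2.2)-(2.3), assuming each pose shape is a (25, 2) key point array; solving for A by linear least squares is one plausible way to make the difference E small, since the patent does not prescribe a particular solver:

```python
import numpy as np

def align_to(p_t: np.ndarray, p_m: np.ndarray) -> np.ndarray:
    """Affinely align pose p_t to the reference p_m; both are (k, 2) arrays."""
    X = np.hstack([p_t, np.ones((p_t.shape[0], 1))])   # homogeneous coordinates
    A, *_ = np.linalg.lstsq(X, p_m, rcond=None)        # minimises ||X @ A - p_m||^2
    return X @ A                                       # the transformed shape

# step (2.3): align every other shape to the randomly chosen reference p_m
# aligned_shapes = [align_to(p, p_m) for p in shapes if p is not p_m]
```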
3) Calculate the average pose shape $\bar{p}$ of the aligned data set $\tilde{D}$.
4) Express the deviation between each shape $\tilde{p}_i$ in $\tilde{D}$ and the average shape $\bar{p}$ as $dp_i = \tilde{p}_i - \bar{p}$, and calculate the covariance matrix

$$S = \frac{1}{n} \sum_{i} dp_i\, dp_i^T$$

This matrix represents how the pose shapes in $\tilde{D}$ differ from the average shape.
5) Perform eigendecomposition of S with principal component analysis (PCA) to find the principal components of pose shape deformation. The PCA procedure can be expressed as

$$S q_k = \lambda_k q_k, \qquad k = 0, 1, 2, \ldots, 2n-1$$

where $q_k$ is an eigenvector and $\lambda_k$ the corresponding eigenvalue. The new spatial basis is expressed as $Q = [\,q_0\ q_1\ \cdots\ q_{2n-1}\,]$, so any transformed pose shape $\tilde{p}$ can be expressed as

$$\tilde{p} = \bar{p} + Q b$$

Sorting the eigenvectors $q_k$ by eigenvalue so that $\lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \cdots \ge \lambda_{2n}$, the largest changes in pose shape are described by the first few eigenvectors. The method considers only the C largest eigenvalues, so the transformed pose shape can be approximately expressed as

$$\tilde{p} \approx \bar{p} + Q_C b_C$$

where the average shape $\bar{p}$ and the eigenvector matrix $Q_C$ are the parameters of the pose shape model.
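Continuing the sketch above, the decomposition and truncation might look as follows; the number of retained components C is a modelling choice the patent leaves open, so the value below is purely illustrative:

```python
import numpy as np

eigvals, eigvecs = np.linalg.eigh(S)    # real eigenpairs, since S is symmetric
order = np.argsort(eigvals)[::-1]       # indices sorted by descending eigenvalue
C = 5                                   # illustrative choice, not from the patent
Q_C = eigvecs[:, order[:C]]             # basis of the C largest deformation modes
```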
6) Verifying whether a new human body pose shape matches the standard pose shape. The specific method is as follows:
(6.1) For a new pose shape $p_{new}$, first align it with the average shape $\bar{p}$. The aligned shape is denoted $\tilde{p}_{new}$.
(6.2) $\tilde{p}_{new}$ can be approximately represented by the average shape $\bar{p}$ and the eigenvector matrix $Q_C$:

$$\tilde{p}_{new} \approx \bar{p} + Q_C b_C$$

Rearranging for the parameter gives

$$b_C = Q_C^T (\tilde{p}_{new} - \bar{p})$$

where $b_C$ is the deformation factor; the smaller $b_C$ is, the smaller the deformation of $\tilde{p}_{new}$ relative to $\bar{p}$.
(6.3) Train an SVM classifier to set the threshold range of $b_C$. The specific method is as follows:

Prepare a training data set containing two types of human body pose pictures, namely standard pose pictures and non-standard pose pictures. Calculate the deformation factor $b_C$ for each item in the data set, label standard poses as 1 and non-standard poses as 0, and calculate the threshold range of $b_C$.
(6.4) If the $b_C$ value of the new shape is within the threshold range, the new pose shape is the predefined standard pose; if the $b_C$ value is out of range, the new pose shape is not a standard pose.
The foregoing embodiments describe the technical solution and advantages of the invention in detail. It should be understood that they are merely illustrative and are not intended to limit the invention; any modifications, additions, and equivalent substitutions made within the scope of the principles of the invention shall be included in its scope of protection.

Claims (6)

1. A human body state matching method based on machine learning, characterized by comprising the following steps:
(1) Collecting front views of human bodies in a plurality of standard poses as positive samples and establishing a standard pose sample set D; collecting front views of human bodies in non-standard poses at the same scale as negative samples; detecting and saving the key point information of all positive samples and negative samples;
(2) Aligning the standard pose sample set to obtain an aligned pose shape data set $\tilde{D}$;
(3) Computing the mean pose shape $\bar{p}$ of the aligned pose shape data set $\tilde{D}$, expressing the deviation between each pose shape $\tilde{p}_i$ in $\tilde{D}$ and the mean pose shape $\bar{p}$ as $dp_i = \tilde{p}_i - \bar{p}$, and calculating the covariance matrix S;
(4) Performing eigendecomposition of the covariance matrix S with principal component analysis (PCA) to find the principal components of pose shape deformation; the transformed pose shape is approximately expressed as

$$\tilde{p} \approx \bar{p} + Q_C b_C$$

where $Q_C$ denotes the matrix of eigenvectors corresponding to the C largest eigenvalues and $b_C$ is the deformation factor; the smaller $b_C$ is, the smaller the deformation of $\tilde{p}$ relative to $\bar{p}$;
(5) Preparing a training data set containing the positive and negative samples, training an SVM classifier, calculating the deformation factor $b_C$ of each item in the data set, labelling standard poses as 1 and non-standard poses as 0, and calculating the threshold range of $b_C$;
(6) For the human body pose shape $p_{new}$ to be verified, first aligning it with the mean pose shape $\bar{p}$; the aligned shape is denoted $\tilde{p}_{new}$;
(7) Calculating the deformation factor $b_C$ of the pose shape to be verified: if $b_C$ falls within the threshold range, the pose shape to be verified is the predefined standard pose; if $b_C$ falls outside the threshold range, it is not a standard pose.
2. The machine-learning-based human body state matching method according to claim 1, wherein in step (1), detecting and saving the key point information of all positive samples and negative samples comprises: first compressing each picture to below 20 KB, then estimating the human body key points with OpenPose, which provides 25 body key points in total; after key point estimation is completed, the key point information of each picture is saved in a JSON file with the same name.
3. The machine-learning-based human body state matching method according to claim 1, wherein the specific process of step (2) is:

(2-1) representing the standard pose sample set as:

D = {p_i | 0 ≤ i ≤ n}

where n is the total number of samples and p_i denotes the i-th sample;

(2-2) randomly selecting an initial pose shape p_m ∈ D, and aligning every other pose shape in the sample set with p_m in turn, finally obtaining the aligned pose shape data set $\tilde{D}$.
4. The machine-learning-based human body state matching method according to claim 3, wherein in step (2-2), the specific process of aligning each pose shape other than $p_m$ with $p_m$ is:

(2-2-1) applying an affine transformation to each $p_t$ in the standard pose sample set D other than $p_m$, where 0 ≤ t ≤ n and t ≠ m:

$$\tilde{p}_t = A p_t, \qquad E = \| \tilde{p}_t - p_m \|^2$$

where A is the affine transformation matrix, $\tilde{p}_t$ is the transformed pose shape, and E measures the difference between the transformed shape $\tilde{p}_t$ and the selected shape $p_m$;

(2-2-2) finding an affine transformation matrix A such that the difference E is sufficiently small;

(2-2-3) obtaining the transformed shape $\tilde{p}_t = A p_t$; at this point $\tilde{p}_t$ and $p_m$ are aligned.
5. The machine-learning-based human body state matching method according to claim 3, wherein in step (3), the mean pose shape $\bar{p}$ of the aligned pose shape data set $\tilde{D}$ is computed as

$$\bar{p} = \left[\ \bar{x}_0\ \ \bar{y}_0\ \ \bar{x}_1\ \ \bar{y}_1\ \ \cdots\ \right]^T$$

where $[\,\cdot\,]^T$ denotes the transpose, $\bar{x}_k$ denotes the mean of coordinate $x_k$ over all pose shapes in $\tilde{D}$, and $\bar{y}_k$ denotes the mean of coordinate $y_k$ over all pose shapes in $\tilde{D}$.
6. The machine-learning-based human body state matching method according to claim 5, wherein the specific process of step (4) is as follows:

the procedure of principal component analysis (PCA) is represented as

$$S q_k = \lambda_k q_k, \qquad k = 0, 1, 2, \ldots, 2n-1$$

where $q_k$ is an eigenvector and $\lambda_k$ the corresponding eigenvalue; the new spatial basis is expressed as $Q = [\,q_0\ q_1\ \cdots\ q_{2n-1}\,]$, so any transformed pose shape $\tilde{p}$ is expressed as

$$\tilde{p} = \bar{p} + Q b$$

sorting the eigenvectors $q_k$ by eigenvalue so that $\lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \cdots \ge \lambda_{2n}$, and keeping the eigenvector matrix $Q_C$ corresponding to the C largest eigenvalues, the transformed pose shape is approximately expressed as

$$\tilde{p} \approx \bar{p} + Q_C b_C$$

where the mean shape $\bar{p}$ and the eigenvector matrix $Q_C$ are the parameters of the pose shape model, and $b_C$ is the deformation factor.
CN202110054577.0A (priority and filing date 2021-01-15): Human body state matching method based on machine learning. Status: Active. Granted as CN112733761B.

Priority Applications (1)

Application Number: CN202110054577.0A
Priority Date / Filing Date: 2021-01-15
Title: Human body state matching method based on machine learning

Publications (2)

Publication Number | Publication Date
CN112733761A | 2021-04-30
CN112733761B | 2024-03-19

Family

Family ID: 75591641

Family Applications (1)

Application Number: CN202110054577.0A (granted as CN112733761B)
Title: Human body state matching method based on machine learning
Priority Date / Filing Date: 2021-01-15

Country Status (1)

Country: CN
Publication: CN112733761B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party

Publication number | Priority date | Publication date | Assignee | Title
CN109035329A * | 2018-08-03 | 2018-12-18 | 厦门大学 | Camera pose estimation optimization method based on depth features
CN109949368A * | 2019-03-14 | 2019-06-28 | 郑州大学 | Human body three-dimensional pose estimation method based on image retrieval
CN110147767A * | 2019-05-22 | 2019-08-20 | 深圳市凌云视迅科技有限责任公司 | Three-dimensional gesture pose prediction method based on two-dimensional images
CN111598995A * | 2020-04-23 | 2020-08-28 | 浙江工商大学 | Self-supervised multi-view three-dimensional human body pose estimation method based on prototype analysis

Also Published As

Publication number Publication date
CN112733761A (en) 2021-04-30

Legal Events

Code | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant