CN108392207B - Gesture tag-based action recognition method - Google Patents
- Publication number: CN108392207B (application CN201810133363.0A)
- Authority: CN (China)
- Prior art keywords: label, key node, frame, attitude, tag
- Prior art date: 2018-02-09
- Legal status: Expired - Fee Related
Classifications
- A61B5/1121 - Measuring movement of the entire body or parts thereof; determining geometric values, e.g. centre of rotation or angular range of movement
- A61B5/1128 - Measuring movement of the entire body or parts thereof using a particular sensing technique using image analysis
- A61B5/72 - Signal processing specially adapted for physiological signals or for diagnostic purposes
- G06F18/22 - Pattern recognition; analysing; matching criteria, e.g. proximity measures
Abstract
The invention provides a gesture-tag-based action recognition method, which abstracts action recognition into gesture recognition, abstracts gestures into gesture tags based on the relative positions of key nodes, and identifies the actions performed by a human by comparing how the human's gestures change over a period of time. The method reduces the difficulty of establishing the template library, greatly lowers the computational requirements of action recognition while increasing its speed, and improves how well the recognition generalizes across individuals. The method has important application value in the fields of human-computer interaction, virtual reality, video monitoring and motion characteristic analysis.
Description
Technical Field
The invention belongs to the technical field of motion recognition, and relates to a motion recognition method based on posture labels.
Background
Action recognition has been an active research topic in recent years, and results from the field are applied in areas such as civil air defense and security, studies of human living habits, human-computer interaction and virtual reality, with clear benefits. Traditional action recognition analyzes images (including videos, sequences of photographs and the like) directly with image-processing techniques, achieving recognition through steps such as image segmentation, feature extraction and action feature classification. Although existing action recognition methods have developed considerably, problems remain: the computation load is huge; the action feature library is hard to build and requires professionals to enter material; and accuracy drops sharply when recognizing people whose body shape differs from that of the people who recorded the material.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides an action recognition method based on posture tags, which addresses the large computation load, the difficulty of establishing a template library, and the poor generality of the template library in existing action recognition technology.
In order to achieve the purpose, the invention adopts the following technical scheme:
a method of decomposing an action into gesture tags, comprising the steps of:
step 1, adopting skeleton tracking equipment to obtain position data of key nodes of human trunk actions at each moment, wherein the position data of the key nodes are data in the skeleton tracking device coordinate system; the key nodes at least comprise the key nodes HEAD, SHOULDER CENTER, SPINE, HIP CENTER, SHOULDER RIGHT, SHOULDER LEFT, ELBOW RIGHT, ELBOW LEFT, WRIST RIGHT, WRIST LEFT, HAND RIGHT, HAND LEFT, HIP RIGHT, HIP LEFT, KNEE RIGHT, KNEE LEFT, ANKLE RIGHT, ANKLE LEFT, FOOT RIGHT and FOOT LEFT;
step 2, respectively converting the position data of the key nodes at each moment obtained in the step 1 into position data of the key nodes in a morphological coordinate system; the morphological coordinate system takes the facing direction of the human body trunk as the positive direction of a Z axis, the direction of the morphological upper end of the human body trunk as the positive direction of a Y axis, the left direction of the human body as the positive direction of an X axis and the key node HIP CENTER as the origin;
step 3, respectively obtaining posture labels at each moment by using the position data of the key nodes in the morphological coordinate system at each moment obtained in step 2, wherein the posture labels comprise a main body posture label GL_body, a left forelimb posture label GL_lf, a right forelimb posture label GL_rf, a left hindlimb posture label GL_lb and a right hindlimb posture label GL_rb.
Optionally, the main body posture label GL_body in step 3 is obtained as follows:
among X_F, Y_F and Z_F, find the coordinate value with the largest absolute value, and take the GL_body value corresponding to the interval to which that coordinate value belongs as the main body posture label GL_body, using the following formula:
wherein X_F, Y_F and Z_F are the coordinates of the unit vector F on the three coordinate axes, and the unit vector F is derived from the vector formed by the key node HEAD and the key node HIP CENTER in the skeleton tracking device coordinate system;
the left forelimb posture label GL_lf, right forelimb posture label GL_rf, left hindlimb posture label GL_lb and right hindlimb posture label GL_rb in step 3 are obtained as follows:
each of the four posture labels involves three key nodes, denoted key node 1, key node 2 and key node 3; for the left forelimb posture label GL_lf the three key nodes are ELBOW LEFT, WRIST LEFT and HAND LEFT; for the right forelimb posture label GL_rf they are ELBOW RIGHT, WRIST RIGHT and HAND RIGHT; for the left hindlimb posture label GL_lb they are KNEE LEFT, ANKLE LEFT and FOOT LEFT; and for the right hindlimb posture label GL_rb they are KNEE RIGHT, ANKLE RIGHT and FOOT RIGHT.
The data of the three key nodes in the morphological coordinate system are denoted (X_1, Y_1, Z_1), (X_2, Y_2, Z_2) and (X_3, Y_3, Z_3); the four posture labels each comprise a height label G1, an orientation label G2 and a curl label G3;
the height label G1 is obtained by the following method:
G1 = (g_1 + g_2 + g_3)/3, rounded, wherein,
for n = 1, 2, 3, g_n is obtained from the Y-axis coordinate of key node n, where Y_H is the Y-axis coordinate of the key node HEAD in the morphological coordinate system and Y_SC is the Y-axis coordinate of the key node SHOULDER CENTER in the morphological coordinate system;
the orientation label G2 is obtained as follows:
count the signs of the X-axis and Z-axis coordinates of key node 1, key node 2 and key node 3, and obtain the orientation label G2 from the following formula:
the method for obtaining the crimp label G3 is as follows:
introduce a key node 4 in addition to key node 1, key node 2 and key node 3, and calculate the distances D_1, D_2 and D_3 from key node 1, key node 2 and key node 3 to key node 4 respectively; for the left forelimb posture label GL_lf key node 4 is SHOULDER LEFT, for the right forelimb posture label GL_rf it is SHOULDER RIGHT, for the left hindlimb posture label GL_lb it is HIP LEFT, and for the right hindlimb posture label GL_rb it is HIP RIGHT;
The value of the curl label G3 is given by the following formula:
the invention also provides a method for obtaining the action template library, which comprises the following steps:
step 1, perform the standard action multiple times, and decompose each performance into posture tags at each moment; select the posture tag at the initial moment as the start frame posture tag and the posture tag at the final moment as the end frame posture tag; take the first performance of the standard action as the contrast standard action and the other performances as reference standard actions; the start frame posture tag of the contrast standard action serves as the start frame contrast posture tag, and its end frame posture tag serves as the end frame contrast posture tag;
each performance of the standard action is decomposed into posture tags at each moment, wherein the posture tags are obtained according to the method of claim 1;
step 2, solving the start frame similarity coefficient set; the specific method is as follows:
respectively calculating the similarity Sl1(A)_n of each attribute between the start frame posture tags of the several reference standard actions and the start frame contrast posture tag, using the following formula:
Sl1(A)_n = A_n × z1_n ÷ l_n (n ∈ Z, n ∈ [1,13]; l_n = 5 when n = 1, 4, 7, 10, 13, and l_n = 3 otherwise)
wherein A_n is the initialized similarity coefficient value; n denotes the serial number of the attribute, and serial numbers 1 to 13 respectively denote: the main body posture label GL_body; the height label G1, orientation label G2 and curl label G3 of the left forelimb posture label GL_lf; the height label G1, orientation label G2 and curl label G3 of the left hindlimb posture label GL_lb; the height label G1, orientation label G2 and curl label G3 of the right forelimb posture label GL_rf; and the height label G1, orientation label G2 and curl label G3 of the right hindlimb posture label GL_rb; z1_n is the absolute value of the difference between the corresponding attributes of the start frame posture tag of a reference standard action and the start frame contrast posture tag;
for each attribute n, among the similarities Sl1(A)_n calculated from the start frame posture tags of the several reference standard actions, the second largest value is selected as the similarity coefficient value A1_n under that attribute; the values A1_n for all attributes n form the start frame similarity coefficient set A_star = {A1_n, n ∈ Z, n = 1, 2, ..., 13};
step 3, solving the end frame similarity coefficient set; the specific method is as follows:
respectively calculating the similarity Sl2(A)_n of each attribute between the end frame posture tags of the several reference standard actions and the end frame contrast posture tag, using the following formula:
Sl2(A)_n = A_n × z2_n ÷ l_n (n ∈ Z, n ∈ [1,13]; l_n = 5 when n = 1, 4, 7, 10, 13, and l_n = 3 otherwise)
wherein z2_n is the absolute value of the difference between the corresponding attributes of the end frame posture tag of a reference standard action and the end frame contrast posture tag;
for each attribute n, among the similarities Sl2(A)_n calculated from the end frame posture tags of the several reference standard actions, the second largest value is selected as the similarity coefficient value A2_n under that attribute; the values A2_n for all attributes n form the end frame similarity coefficient set A_stop = {A2_n, n ∈ Z, n = 1, 2, ..., 13};
step 4, for each of several standard actions, obtain the corresponding start frame similarity coefficient set and end frame similarity coefficient set according to steps 1-3; the start frame and end frame similarity coefficient sets of all the standard actions form the action template library.
The invention also provides an action recognition method based on posture tags, which comprises the following steps:
step 1, for the action to be recognized, decompose the action into posture tags at each moment; the posture tag at each moment is obtained according to the method of claim 1;
step 2, select a standard action from the action template library and calculate the per-attribute similarity SL(B)_n between the end frame posture tag obtained in step 1 and the end frame posture tag of the selected standard action, denoting the end frame posture tag as the posture tag of frame t, using the following formula:
SL(B)_n = A1_n × z3_n ÷ l_n (n ∈ Z, n ∈ [1,13]; l_n = 5 when n = 1, 4, 7, 10, 13, and l_n = 3 otherwise)
wherein z3_n is the absolute value of the difference between the corresponding attributes of the end frame posture tag obtained in step 1 and the end frame posture tag of the selected standard action;
then calculate the overall similarity S(B) between the end frame posture tag obtained in step 1 and the end frame posture tag of the selected standard action with the following formula:
step 3, if the overall similarity S(B) is greater than the set threshold MAXBLUR, return to step 2; otherwise, proceed to step 4;
step 4, calculate the per-attribute similarity SL(C)_n between the posture tag of the frame preceding the end frame and the start frame posture tag of the selected standard action, denoting that preceding frame posture tag as the posture tag of frame t-1, using the following formula:
SL(C)_n = A2_n × z4_n ÷ l_n (n ∈ Z, n ∈ [1,13]; l_n = 5 when n = 1, 4, 7, 10, 13, and l_n = 3 otherwise)
wherein z4_n is the absolute value of the difference between the corresponding attributes of the preceding frame posture tag and the start frame posture tag of the selected standard action;
then calculate the overall similarity S(C) between the posture tag of the preceding frame and the start frame posture tag of the selected standard action with the following formula:
step 5, if the overall similarity S(C) is less than the set threshold MAXBLUR, the action to be recognized matches the selected standard action; if the overall similarity S(C) is greater than the set threshold MAXBLUR, return to step 4 with the posture tag of frame t-2 as the processed object in place of the posture tag of frame t-1, and so on frame by frame; if the processed object is already the posture tag of the first frame and S(C) is still greater than the set threshold MAXBLUR, return to step 2.
Compared with the prior art, the invention has the following technical effects: the method abstracts action recognition into gesture recognition, abstracts gestures into gesture tags based on the relative positions of key nodes, and identifies the actions performed by a human by comparing how the human's gestures change over a period of time. The method reduces the difficulty of establishing the template library, greatly lowers the computational requirements of action recognition while increasing its speed, and improves how well the recognition generalizes across individuals. The method has important application value in the fields of human-computer interaction, virtual reality, video monitoring and motion characteristic analysis.
The invention will be explained and explained in more detail below with reference to the figures and exemplary embodiments.
Drawings
FIG. 1 is a schematic representation of a skeletal tracking device coordinate system used in the present invention.
Fig. 2 is a schematic diagram of twenty key bone node positions obtained by the present invention.
Detailed Description
The invention provides a method for decomposing actions into posture tags, which comprises the following steps:
step 1, adopting skeleton tracking equipment to obtain position data of the key nodes of human trunk actions, wherein the position data of the key nodes are data in the skeleton tracking device coordinate system. The skeleton tracking device can be a Kinect: key node data of the actions are acquired with the Kinect at a certain frequency, and the position data of the key nodes represent the positions of twenty specific skeleton nodes, whose names and serial numbers are shown in the following table:
the skeleton tracking equipment coordinate system takes an equipment camera as an original point, the direction opposite to the camera is the positive direction of a Z axis, the reverse direction of gravity is the positive direction of a Y axis, the left direction of the camera is the positive direction of an X axis, and the unit length is 1 meter. The skeletal tracking device coordinate system is a static coordinate system.
Step 2, respectively converting the position data of the key nodes at each moment obtained in step 1 into position data of the key nodes in a morphological coordinate system, using the following formula:
wherein (x, y, z) = (X - X_HC, Y - Y_HC, Z - Z_HC) are the coordinates, in the skeleton tracking device coordinate system, of the vector from the key node HIP CENTER to any key node NODE obtained in step 1; (X, Y, Z) is the position data of the key node NODE, and (X_HC, Y_HC, Z_HC) is the position data of the key node HIP CENTER; α, β and γ are the rotation angles of each coordinate axis of the morphological coordinate system relative to the skeleton tracking device coordinate system.
The position data of the key node in the morphological coordinate system is (x', y', z').
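The conversion formula itself appears only as a figure in the original. The sketch below is a minimal Python illustration of such a step 2 conversion, assuming the rotation is applied as successive rotations about the X, Y and Z axes by α, β and γ after translating the origin to HIP CENTER; the rotation order is an assumption of this sketch, not taken from the patent.

```python
import numpy as np

def to_morphological(node_xyz, hip_center_xyz, alpha, beta, gamma):
    """Translate a key node so HIP CENTER becomes the origin, then rotate by
    the angles alpha, beta, gamma about the X, Y and Z axes respectively.
    The X -> Y -> Z rotation order is an illustrative assumption."""
    x, y, z = np.asarray(node_xyz, float) - np.asarray(hip_center_xyz, float)

    rx = np.array([[1, 0, 0],
                   [0, np.cos(alpha), -np.sin(alpha)],
                   [0, np.sin(alpha),  np.cos(alpha)]])
    ry = np.array([[ np.cos(beta), 0, np.sin(beta)],
                   [0, 1, 0],
                   [-np.sin(beta), 0, np.cos(beta)]])
    rz = np.array([[np.cos(gamma), -np.sin(gamma), 0],
                   [np.sin(gamma),  np.cos(gamma), 0],
                   [0, 0, 1]])
    # (x', y', z'): position of the node in the morphological coordinate system
    return rz @ ry @ rx @ np.array([x, y, z])
```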
The morphological coordinate system uses the facing direction of the human trunk as the positive direction of the Z axis, the direction of the morphological upper end of the human trunk as the positive direction of the Y axis, the left direction of the human as the positive direction of the X axis, and the key node HIP CENTER as the origin.
The morphological upper end of the human trunk is defined by starting at the head and moving downward and outward along the body: a part reached earlier is the morphological upper end of a part reached later. For example, when a person stands upright with hands hanging naturally, among the left shoulder, left elbow and left hand: the left shoulder is the morphological upper end of the left elbow, and the left elbow is the morphological upper end of the left hand.
Step 3, solving the main body posture label GL_body, left forelimb posture label GL_lf, right forelimb posture label GL_rf, left hindlimb posture label GL_lb and right hindlimb posture label GL_rb at each moment.
Specifically, in still another embodiment, the determination method of the facing direction of the human trunk and the morphological upper end direction of the human trunk in step 2 is as follows:
the position data of the key node SHOULDER RIGHT obtained in step 1 is (X_SR, Y_SR, Z_SR), that of the key node SHOULDER LEFT is (X_SL, Y_SL, Z_SL), and that of the key node HIP CENTER is (X_HC, Y_HC, Z_HC); these three key nodes determine a plane, which is the plane of the human body.
The vector formed by the key node HEAD and the key node HIP CENTER in the skeleton tracking device coordinate system is calculated, together with the normal vector of the human body plane. Since the human head always inclines slightly forward toward the Kinect device, the normal vector is multiplied (dot product) with the HEAD-HIP CENTER vector to fix its sign: if the value is positive, the positive sign is taken; if the value is negative, the negative sign is taken. The direction of the signed normal vector is the facing direction of the human trunk, and the direction of the HEAD-HIP CENTER vector gives the morphological upper end direction of the human trunk.
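As a concrete reading of this construction, the following Python sketch derives the three morphological axes from the body-plane nodes. The assumption that a person facing the camera has a facing vector with negative device-Z follows the Kinect convention described above and is used here only to fix the sign of the normal; the exact sign test in the patent is given in terms of the head vector.

```python
import numpy as np

def morphological_axes(shoulder_right, shoulder_left, hip_center, head):
    """Derive the trunk facing direction (Z axis) and morphological upper
    direction (Y axis) from the three body-plane nodes, as in step 2."""
    sr, sl, hc, hd = map(np.asarray, (shoulder_right, shoulder_left,
                                      hip_center, head))

    up = hd - hc                          # HIP CENTER -> HEAD vector
    normal = np.cross(sl - hc, sr - hc)   # normal of the human body plane

    # Fix the sign so the normal points the way the trunk faces; a person
    # facing the camera faces the negative device-Z direction (assumption).
    if normal[2] > 0:
        normal = -normal

    z_axis = normal / np.linalg.norm(normal)   # trunk facing direction
    y_axis = up / np.linalg.norm(up)           # morphological upper direction
    x_axis = np.cross(y_axis, z_axis)          # left direction of the body
    return x_axis, y_axis, z_axis
```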
Specifically, the main body posture label GL_body is obtained as follows:
the vector formed by the key node HEAD and the key node HIP CENTER in the skeleton tracking device coordinate system is calculated, and F is taken as the corresponding unit vector.
Among X_F, Y_F and Z_F, find the coordinate value with the largest absolute value, and take the GL_body value corresponding to the interval to which that coordinate value belongs as the main body posture label GL_body, using the following formula:
since F is a unit vector, thenXF,YFAnd ZFOne of the two values is 0, and when the other two values are equal, the two values which are equal are obtained asThen XF,YFAnd ZFIs greater than
The left forelimb posture label GL_lf, right forelimb posture label GL_rf, left hindlimb posture label GL_lb and right hindlimb posture label GL_rb are obtained as follows:
each of the four posture labels involves three key nodes, denoted key node 1, key node 2 and key node 3. For the left forelimb posture label GL_lf the three key nodes are ELBOW LEFT, WRIST LEFT and HAND LEFT; for the right forelimb posture label GL_rf they are ELBOW RIGHT, WRIST RIGHT and HAND RIGHT; for the left hindlimb posture label GL_lb they are KNEE LEFT, ANKLE LEFT and FOOT LEFT; and for the right hindlimb posture label GL_rb they are KNEE RIGHT, ANKLE RIGHT and FOOT RIGHT.
The data of the three key nodes in the morphological coordinate system are denoted (X_1, Y_1, Z_1), (X_2, Y_2, Z_2) and (X_3, Y_3, Z_3). The four posture labels each comprise a height label G1, an orientation label G2 and a curl label G3.
The height label G1 is obtained by the following method:
G1 = (g_1 + g_2 + g_3)/3, rounded; the smaller the value of G1, the closer the part is to the morphological upper end. The quantities g_n are computed as follows:
for n = 1, 2, 3, g_n is obtained from the Y-axis coordinate of key node n, where Y_H is the Y-axis coordinate of the key node HEAD in the morphological coordinate system, Y_SC is the Y-axis coordinate of the key node SHOULDER CENTER in the morphological coordinate system, and Y_H > Y_SC.
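The formula for g_n survives only as a figure in the original. As a loudly hypothetical reading, the sketch below quantizes each node's height into three bands split at Y_H and Y_SC; the band boundaries and values are guesses, not the patent's formula.

```python
def height_label(ys, y_head, y_shoulder_center):
    """Sketch of the height label G1 for one limb, given the Y coordinates
    of its three key nodes (ys) in the morphological coordinate system.
    The three-band quantisation is an assumption."""
    def g(y):
        if y >= y_head:
            return 1          # at or above the head
        if y >= y_shoulder_center:
            return 2          # between shoulder centre and head
        return 3              # below the shoulder centre

    g1, g2, g3 = (g(y) for y in ys)
    return round((g1 + g2 + g3) / 3)   # smaller = closer to the upper end
```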
The orientation label G2 is obtained as follows:
count the signs of the X-axis and Z-axis coordinates of key node 1, key node 2 and key node 3, and obtain the orientation label G2 from the following formula:
the method for obtaining the crimp label G3 is as follows:
introduce a key node 4 in addition to key node 1, key node 2 and key node 3, and calculate the distances D_1, D_2 and D_3 from key node 1, key node 2 and key node 3 to key node 4 respectively. For the left forelimb posture label GL_lf key node 4 is SHOULDER LEFT, for the right forelimb posture label GL_rf it is SHOULDER RIGHT, for the left hindlimb posture label GL_lb it is HIP LEFT, and for the right hindlimb posture label GL_rb it is HIP RIGHT.
The value of the curl label G3 is given by the following formula:
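Neither the G2 nor the G3 formula is reproduced in this text, so the following Python sketch is an assumption-laden illustration: G2 is derived from a majority vote over the coordinate signs, and G3 bins the mean of the distances D1-D3 into five levels with hypothetical cut points.

```python
import numpy as np

def orientation_label(nodes):
    """Sketch of G2: count the signs of the X and Z coordinates of key
    nodes 1-3. The 3-value encoding below is a placeholder."""
    xs = sum(1 if n[0] > 0 else -1 for n in nodes)
    zs = sum(1 if n[2] > 0 else -1 for n in nodes)
    if abs(xs) >= abs(zs):
        return 1 if xs > 0 else 3        # pointing left / right (assumed)
    return 2 if zs > 0 else 3            # pointing forward / backward (assumed)

def curl_label(nodes, node4):
    """Sketch of G3: distances D1-D3 from key nodes 1-3 to key node 4
    (shoulder or hip). The cut points mapping distances to the 5 label
    values are hypothetical placeholders."""
    d1, d2, d3 = (np.linalg.norm(np.asarray(n, float) - np.asarray(node4, float))
                  for n in nodes)
    spread = (d1 + d2 + d3) / 3
    bands = [0.15, 0.30, 0.45, 0.60]     # metres; hypothetical cut points
    return sum(spread > b for b in bands)  # 0 (fully curled) .. 4 (extended)
```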
another aspect of the present invention provides a method for obtaining an action template library, including the following steps:
Step 1, perform the standard action multiple times, and decompose each performance into posture tags at each moment according to the above method for decomposing an action into posture tags; select the posture tag at the initial moment as the start frame posture tag and the posture tag at the final moment as the end frame posture tag. Take the first performance of the standard action as the contrast standard action and the other performances as reference standard actions. The start frame posture tag of the contrast standard action serves as the start frame contrast posture tag, and its end frame posture tag serves as the end frame contrast posture tag.
Step 2, solving the start frame similarity coefficient set; the specific method is as follows:
respectively calculating the similarity Sl1(A)_n of each attribute between the start frame posture tags of the several reference standard actions and the start frame contrast posture tag, using the following formula:
Sl1(A)_n = A_n × z1_n ÷ l_n (n ∈ Z, n ∈ [1,13]; l_n = 5 when n = 1, 4, 7, 10, 13, and l_n = 3 otherwise) (6)
wherein A_n is the initialized similarity coefficient value; n denotes the serial number of the attribute, and serial numbers 1 to 13 respectively denote: the main body posture label GL_body; the height label G1, orientation label G2 and curl label G3 of the left forelimb posture label GL_lf; the height label G1, orientation label G2 and curl label G3 of the left hindlimb posture label GL_lb; the height label G1, orientation label G2 and curl label G3 of the right forelimb posture label GL_rf; and the height label G1, orientation label G2 and curl label G3 of the right hindlimb posture label GL_rb. z1_n is the absolute value of the difference between the corresponding attributes of the start frame posture tag of a reference standard action and the start frame contrast posture tag; for example, z1_1 is the absolute value of the difference between the main body posture label GL_body in the start frame posture tag of a reference standard action and the main body posture label GL_body in the start frame contrast posture tag.
For each attribute n, among the similarities Sl1(A)_n calculated from the start frame posture tags of the several reference standard actions, the second largest value is selected as the similarity coefficient value A1_n under that attribute. The values A1_n for all attributes n form the start frame similarity coefficient set A_star = {A1_n, n ∈ Z, n = 1, 2, ..., 13}.
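A compact Python sketch of this step follows; the initial value A_n = 1 is an assumption (the text only calls A_n "the initialized similarity coefficient value"), and posture tags are represented as 13-element sequences ordered as the serial numbers above.

```python
def start_frame_coefficients(reference_tags, contrast_tag, a_init=1.0):
    """Sketch of step 2: per-attribute similarity Sl1(A)_n = A_n * z1_n / l_n
    over several reference standard actions, keeping the second-largest value
    per attribute as A1_n."""
    # l_n = 5 for attributes 1, 4, 7, 10, 13 (body and curl labels), else 3.
    L = [5 if n in (1, 4, 7, 10, 13) else 3 for n in range(1, 14)]

    coeffs = []
    for n in range(13):                           # attributes 1..13 (0-indexed)
        sims = sorted(
            a_init * abs(ref[n] - contrast_tag[n]) / L[n]
            for ref in reference_tags             # start-frame tags of references
        )
        # Second largest value becomes A1_n (fall back if only one reference).
        coeffs.append(sims[-2] if len(sims) > 1 else sims[-1])
    return coeffs                                 # A_star = {A1_n}
```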
Step 3, solving the end frame similarity coefficient set; the specific method is as follows:
respectively calculating the similarity Sl2(A)_n of each attribute between the end frame posture tags of the several reference standard actions and the end frame contrast posture tag, using the following formula:
Sl2(A)_n = A_n × z2_n ÷ l_n (n ∈ Z, n ∈ [1,13]; l_n = 5 when n = 1, 4, 7, 10, 13, and l_n = 3 otherwise) (7)
wherein z2_n is the absolute value of the difference between the corresponding attributes of the end frame posture tag of a reference standard action and the end frame contrast posture tag; for example, z2_1 is the absolute value of the difference between the main body posture label GL_body in the end frame posture tag of a reference standard action and the main body posture label GL_body in the end frame contrast posture tag.
For each attribute n, among the similarities Sl2(A)_n calculated from the end frame posture tags of the several reference standard actions, the second largest value is selected as the similarity coefficient value A2_n under that attribute. The values A2_n for all attributes n form the end frame similarity coefficient set A_stop = {A2_n, n ∈ Z, n = 1, 2, ..., 13}.
Step 4, for each of several standard actions, obtain the corresponding start frame similarity coefficient set and end frame similarity coefficient set according to steps 1-3; the start frame and end frame similarity coefficient sets of all the standard actions form the action template library.
A third aspect of the present invention provides a motion recognition method, including the steps of:
step 1, aiming at the action to be recognized, the action to be recognized is decomposed into the attitude tags at each moment according to the method for decomposing the action into the attitude tags.
Step 2, select a standard action from the action template library and calculate the per-attribute similarity SL(B)_n between the end frame posture tag obtained in step 1 and the end frame posture tag of the selected standard action, denoting the end frame posture tag as the posture tag of frame t, using the following formula:
SL(B)_n = A1_n × z3_n ÷ l_n (n ∈ Z, n ∈ [1,13]; l_n = 5 when n = 1, 4, 7, 10, 13, and l_n = 3 otherwise) (8)
wherein z3_n is the absolute value of the difference between the corresponding attributes of the end frame posture tag obtained in step 1 and the end frame posture tag of the selected standard action.
The overall similarity S(B) between the end frame posture tag obtained in step 1 and the end frame posture tag of the selected standard action is then calculated with the following formula:
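The aggregation formula for S(B) (and likewise for S(C) below) is given only as a figure in the original; summing the 13 per-attribute similarities, as sketched below, is an assumption consistent with comparing S(B) against a single threshold MAXBLUR.

```python
def overall_similarity(coeffs, tag, template_tag):
    """Sketch of S(B)/S(C): aggregate the 13 per-attribute similarities
    coeffs[n] * |tag[n] - template_tag[n]| / l_n. The sum as aggregation
    rule is an assumption, not the patent's formula."""
    L = [5 if n in (1, 4, 7, 10, 13) else 3 for n in range(1, 14)]
    return sum(coeffs[n] * abs(tag[n] - template_tag[n]) / L[n]
               for n in range(13))
```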
Step 3, if the overall similarity S(B) is greater than the set threshold MAXBLUR, return to step 2; otherwise, proceed to step 4.
Step 4, calculate the per-attribute similarity SL(C)_n between the posture tag of the frame preceding the end frame and the start frame posture tag of the selected standard action, denoting that preceding frame posture tag as the posture tag of frame t-1, using the following formula:
SL(C)_n = A2_n × z4_n ÷ l_n (n ∈ Z, n ∈ [1,13]; l_n = 5 when n = 1, 4, 7, 10, 13, and l_n = 3 otherwise) (10)
wherein z4_n is the absolute value of the difference between the corresponding attributes of the preceding frame posture tag and the start frame posture tag of the selected standard action.
The overall similarity S(C) between the posture tag of the preceding frame and the start frame posture tag of the selected standard action is calculated with the following formula:
Step 5, if the overall similarity S(C) is less than the set threshold MAXBLUR, the action to be recognized matches the selected standard action. If the overall similarity S(C) is greater than the set threshold MAXBLUR, return to step 4 with the posture tag of frame t-2 as the processed object in place of the posture tag of frame t-1, and so on frame by frame; if the processed object is already the posture tag of the first frame and S(C) is still greater than the set threshold MAXBLUR, return to step 2. MAXBLUR represents the fuzziness of the action matching algorithm and takes a value between 0.05 and 0.25.
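Putting steps 2-5 together, a minimal Python sketch of the recognition loop follows (it reuses the overall_similarity() sketch above). Note that the text pairs A1_n with the end-frame comparison and A2_n with the start-frame comparison, which the sketch follows.

```python
def recognize(frames, templates, maxblur=0.15):
    """Sketch of recognition steps 2-5. frames is the list of posture tags
    (13-element sequences) of the action to be recognized, newest last;
    templates maps an action name to (A1 coefficients, A2 coefficients,
    start frame tag, end frame tag). maxblur is taken from the stated
    0.05-0.25 range."""
    for name, (a1, a2, start_tag, stop_tag) in templates.items():
        # Steps 2-3: does the final frame (frame t) match the end pose?
        if overall_similarity(a1, frames[-1], stop_tag) > maxblur:
            continue                      # not this action; try the next one
        # Steps 4-5: walk backwards (t-1, t-2, ...) looking for the start pose.
        for t in range(len(frames) - 2, -1, -1):
            if overall_similarity(a2, frames[t], start_tag) <= maxblur:
                return name               # action recognized
    return None                           # no standard action matched
```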
Examples
Action recognition using a traditional method:
the using equipment is a single Kinect, when the identification action is right hand saluting, a template library is established by using a traditional method, and the height of a tester a is 173cm, the weight of the tester a is 60kg, the height of a tester b is 191cm, the weight of the tester b is 100kg, the height of a tester c is 181cm, and the weight of the tester c is 80 kg. The first 50 samples are recorded for a tester a, the samples with the sample serial numbers of 51-80 are recorded for a tester b, the time of recording the samples is about 2 minutes each time, the recorded action is selected during recording, the recorder takes right hand salutation action when standing 1.5 meters in front of the device, the sample library and the test points are all 20 skeleton nodes of Kinect
During testing, each tester stood 1.5 metres in front of the equipment and performed the actions as close to standard as possible; each tester performed the right-hand salute ten times after each batch of new samples was entered to refine the template library, and the recognition results were counted. The recognition statistics are shown in Table 1:
TABLE 1
Number of samples | 20 | 30 | 40 | 50 | 60 | 70 | 80
---|---|---|---|---|---|---|---
Tester a | 70% | 80% | 90% | 100% | 90% | 80% | 80%
Tester b | 0% | 30% | 30% | 30% | 70% | 70% | 90%
Tester c | 30% | 40% | 40% | 40% | 50% | 50% | 50%
From the test results it can be seen that when tester a served as both tester and sample recorder, the recognition success rate clearly increased with the number of recordings, reaching 100% at 50 samples, while the success rates of the other testers were basically unchanged; when tester b became the sample recorder, tester b's success rate improved greatly while tester a's decreased. Tester c, who never took part in recording, had a low recognition success rate, although it increased with the number of samples. Test 1 took 4 hours and 20 minutes in total.
Action recognition using the method of the invention:
the used equipment is a single Kinect, the recognized actions are right hand salutation and double hand waving, the method is used for establishing an attitude tag library, an action-attitude library, all 20 nodes are used, and the total time consumed for establishing the template library is 30 minutes, and the method comprises six actions: standing, left hand high lifting, right hand high lifting, both hands high lifting, left arm saluting and right arm saluting. Wherein the right hand high lift is similar to the right arm salute action, the double hand high lift is a complex action of three postures, the action not only meets the left hand high lift requirement, but also meets the right hand high lift requirement, and the test is added for increasing the test difficulty. Tester a was 173cm in height and 60kg in weight, tester b was 191cm in height and 100kg in weight, and tester c was 181cm in height and 80kg in weight, in accordance with test 1.
During testing, each tester stood 1.5 metres in front of the device and performed the actions as close to standard as possible; each person performed the right-hand salute and the two-hand wave ten times each. The template library was not updated during the whole test, so no tester needed to perform multiple rounds. The recognition statistics are shown in Table 2:
TABLE 2
Tester c mistakenly had one right-arm salute recognized as the right hand raised high, and one both-hands-raised action recognized as the right hand raised high, which relates to how the corresponding actions are defined in the action-posture library.
The overall recognition success rate is much higher than in Test 1, with good success rates for all three testers despite their different body types. The whole test took 1 hour and 10 minutes in total, with richer and more difficult actions than Test 1.
The method shows good generality across testers, and entering (designing) the template library is simpler and more convenient.
Claims (3)
1. A method of decomposing an action into gesture tags, comprising the steps of:
step 1, adopting skeleton tracking equipment to obtain position data of key nodes of human trunk actions at each moment, wherein the position data of the key nodes are data in the skeleton tracking device coordinate system; the key nodes at least comprise the key nodes HEAD, SHOULDER CENTER, SPINE, HIP CENTER, SHOULDER RIGHT, SHOULDER LEFT, ELBOW RIGHT, ELBOW LEFT, WRIST RIGHT, WRIST LEFT, HAND RIGHT, HAND LEFT, HIP RIGHT, HIP LEFT, KNEE RIGHT, KNEE LEFT, ANKLE RIGHT, ANKLE LEFT, FOOT RIGHT and FOOT LEFT;
step 2, respectively converting the position data of the key nodes at each moment obtained in the step 1 into position data of the key nodes in a morphological coordinate system; the morphological coordinate system takes the facing direction of the human body trunk as the positive direction of a Z axis, the direction of the morphological upper end of the human body trunk as the positive direction of a Y axis, the left direction of the human body as the positive direction of an X axis and the key node HIP CENTER as the origin;
step 3, respectively obtaining posture labels at each moment by using the position data of the key nodes in the morphological coordinate system at each moment obtained in step 2, wherein the posture labels comprise a main body posture label GL_body, a left forelimb posture label GL_lf, a right forelimb posture label GL_rf, a left hindlimb posture label GL_lb and a right hindlimb posture label GL_rb;
The main body posture label GL_body in step 3 is obtained as follows:
among X_F, Y_F and Z_F, find the coordinate value with the largest absolute value, and take the GL_body value corresponding to the interval to which that coordinate value belongs as the main body posture label GL_body, using the following formula:
wherein X_F, Y_F and Z_F are the coordinates of the unit vector F on the three coordinate axes, and the unit vector F is derived from the vector formed by the key node HEAD and the key node HIP CENTER in the skeleton tracking device coordinate system;
the left forelimb posture label GL_lf, right forelimb posture label GL_rf, left hindlimb posture label GL_lb and right hindlimb posture label GL_rb in step 3 are obtained as follows:
each of the four posture labels involves three key nodes, denoted key node 1, key node 2 and key node 3; for the left forelimb posture label GL_lf the three key nodes are ELBOW LEFT, WRIST LEFT and HAND LEFT; for the right forelimb posture label GL_rf they are ELBOW RIGHT, WRIST RIGHT and HAND RIGHT; for the left hindlimb posture label GL_lb they are KNEE LEFT, ANKLE LEFT and FOOT LEFT; and for the right hindlimb posture label GL_rb they are KNEE RIGHT, ANKLE RIGHT and FOOT RIGHT;
the data of the three key nodes in the morphological coordinate system are denoted (X_1, Y_1, Z_1), (X_2, Y_2, Z_2) and (X_3, Y_3, Z_3); the four posture labels each comprise a height label G1, an orientation label G2 and a curl label G3;
the height label G1 is obtained by the following method:
G1 = (g_1 + g_2 + g_3)/3, rounded, wherein,
for n = 1, 2, 3, g_n is obtained from the Y-axis coordinate of key node n, where Y_H is the Y-axis coordinate of the key node HEAD in the morphological coordinate system and Y_SC is the Y-axis coordinate of the key node SHOULDER CENTER in the morphological coordinate system;
the orientation label G2 is obtained as follows:
count the signs of the X-axis and Z-axis coordinates of key node 1, key node 2 and key node 3, and obtain the orientation label G2 from the following formula:
the method for obtaining the crimp label G3 is as follows:
introduce a key node 4 in addition to key node 1, key node 2 and key node 3, and calculate the distances D_1, D_2 and D_3 from key node 1, key node 2 and key node 3 to key node 4 respectively; for the left forelimb posture label GL_lf key node 4 is SHOULDER LEFT, for the right forelimb posture label GL_rf it is SHOULDER RIGHT, for the left hindlimb posture label GL_lb it is HIP LEFT, and for the right hindlimb posture label GL_rb it is HIP RIGHT;
the value of the curl label G3 is given by the following formula:
2. a method for obtaining an action template library, comprising the steps of:
step 1, perform the standard action multiple times, and decompose each performance into posture tags at each moment; select the posture tag at the initial moment as the start frame posture tag and the posture tag at the final moment as the end frame posture tag; take the first performance of the standard action as the contrast standard action and the other performances as reference standard actions; the start frame posture tag of the contrast standard action serves as the start frame contrast posture tag, and its end frame posture tag serves as the end frame contrast posture tag;
each performance of the standard action is decomposed into posture tags at each moment, wherein the posture tags are obtained according to the method of claim 1;
step 2, solving the start frame similarity coefficient set; the specific method is as follows:
respectively calculating the similarity Sl1(A)_n of each attribute between the start frame posture tags of the several reference standard actions and the start frame contrast posture tag, using the following formula:
Sl1(A)_n = A_n × z1_n ÷ l_n (n ∈ Z, n ∈ [1,13]; l_n = 5 when n = 1, 4, 7, 10, 13, and l_n = 3 otherwise)
wherein A_n is the initialized similarity coefficient value; n denotes the serial number of the attribute, and serial numbers 1 to 13 respectively denote: the main body posture label GL_body; the height label G1, orientation label G2 and curl label G3 of the left forelimb posture label GL_lf; the height label G1, orientation label G2 and curl label G3 of the left hindlimb posture label GL_lb; the height label G1, orientation label G2 and curl label G3 of the right forelimb posture label GL_rf; and the height label G1, orientation label G2 and curl label G3 of the right hindlimb posture label GL_rb; z1_n is the absolute value of the difference between the corresponding attributes of the start frame posture tag of a reference standard action and the start frame contrast posture tag;
for each attribute n, among the similarities Sl1(A)_n calculated from the start frame posture tags of the several reference standard actions, the second largest value is selected as the similarity coefficient value A1_n under that attribute; the values A1_n for all attributes n form the start frame similarity coefficient set A_star = {A1_n, n ∈ Z, n = 1, 2, ..., 13};
step 3, solving the end frame similarity coefficient set; the specific method is as follows:
respectively calculating the similarity Sl2(A)_n of each attribute between the end frame posture tags of the several reference standard actions and the end frame contrast posture tag, using the following formula:
Sl2(A)_n = A_n × z2_n ÷ l_n (n ∈ Z, n ∈ [1,13]; l_n = 5 when n = 1, 4, 7, 10, 13, and l_n = 3 otherwise)
wherein z2_n is the absolute value of the difference between the corresponding attributes of the end frame posture tag of a reference standard action and the end frame contrast posture tag;
for each attribute n, among the similarities Sl2(A)_n calculated from the end frame posture tags of the several reference standard actions, the second largest value is selected as the similarity coefficient value A2_n under that attribute; the values A2_n for all attributes n form the end frame similarity coefficient set A_stop = {A2_n, n ∈ Z, n = 1, 2, ..., 13};
step 4, for each of several standard actions, obtain the corresponding start frame similarity coefficient set and end frame similarity coefficient set according to steps 1-3; the start frame and end frame similarity coefficient sets of all the standard actions form the action template library.
3. A motion recognition method based on posture tags, characterized by comprising the following steps:
step 1, for the action to be recognized, decompose the action into posture tags at each moment; the posture tag at each moment is obtained according to the method of claim 1;
step 2, select a standard action from the action template library and calculate the per-attribute similarity SL(B)_n between the end frame posture tag obtained in step 1 and the end frame posture tag of the selected standard action, denoting the end frame posture tag as the posture tag of frame t, using the following formula:
SL(B)_n = A1_n × z3_n ÷ l_n (n ∈ Z, n ∈ [1,13]; l_n = 5 when n = 1, 4, 7, 10, 13, and l_n = 3 otherwise)
wherein z3_n is the absolute value of the difference between the corresponding attributes of the end frame posture tag obtained in step 1 and the end frame posture tag of the selected standard action, and A1_n is the second largest value, under attribute n, among the similarities Sl1(A)_n calculated from the start frame posture tags of the several reference standard actions;
the overall similarity S(B) between the end frame posture tag obtained in step 1 and the end frame posture tag of the selected standard action is then calculated with the following formula:
step 3, if the overall similarity S(B) is greater than the set threshold MAXBLUR, return to step 2; otherwise, proceed to step 4;
step 4, calculate the per-attribute similarity SL(C)_n between the posture tag of the frame preceding the end frame and the start frame posture tag of the selected standard action, denoting that preceding frame posture tag as the posture tag of frame t-1, using the following formula:
SL(C)_n = A2_n × z4_n ÷ l_n (n ∈ Z, n ∈ [1,13]; l_n = 5 when n = 1, 4, 7, 10, 13, and l_n = 3 otherwise)
wherein z4_n is the absolute value of the difference between the corresponding attributes of the preceding frame posture tag and the start frame posture tag of the selected standard action, and A2_n is the second largest value, under attribute n, among the similarities Sl2(A)_n calculated from the end frame posture tags of the several reference standard actions;
the overall similarity S(C) between the posture tag of the preceding frame and the start frame posture tag of the selected standard action is calculated with the following formula:
step 5, if the overall similarity S(C) is less than the set threshold MAXBLUR, the action to be recognized matches the selected standard action; if the overall similarity S(C) is greater than the set threshold MAXBLUR, return to step 4 with the posture tag of frame t-2 as the processed object in place of the posture tag of frame t-1, and so on frame by frame; if the processed object is already the posture tag of the first frame and S(C) is still greater than the set threshold MAXBLUR, return to step 2.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810133363.0A CN108392207B (en) | 2018-02-09 | 2018-02-09 | Gesture tag-based action recognition method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108392207A CN108392207A (en) | 2018-08-14 |
CN108392207B true CN108392207B (en) | 2020-12-11 |
Family
ID=63096010
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810133363.0A Expired - Fee Related CN108392207B (en) | 2018-02-09 | 2018-02-09 | Gesture tag-based action recognition method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108392207B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110215216B (en) * | 2019-06-11 | 2020-08-25 | 中国科学院自动化研究所 | Behavior identification method and system based on skeletal joint point regional and hierarchical level |
CN110309743A (en) * | 2019-06-21 | 2019-10-08 | 新疆铁道职业技术学院 | Human body attitude judgment method and device based on professional standard movement |
CN112617819A (en) * | 2020-12-21 | 2021-04-09 | 西南交通大学 | Method and system for recognizing lower limb posture of infant |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH11272877A (en) * | 1998-03-25 | 1999-10-08 | Namco Ltd | Skeleton model data representation |
CN104268138A (en) * | 2014-05-15 | 2015-01-07 | 西安工业大学 | Method for capturing human motion by aid of fused depth images and three-dimensional models |
CN105608467A (en) * | 2015-12-16 | 2016-05-25 | 西北工业大学 | Kinect-based non-contact type student physical fitness evaluation method |
CN106022213A (en) * | 2016-05-04 | 2016-10-12 | 北方工业大学 | Human body motion recognition method based on three-dimensional bone information |
KR101722131B1 (en) * | 2015-11-25 | 2017-03-31 | 국민대학교 산학협력단 | Posture and Space Recognition System of a Human Body Using Multimodal Sensors |
CN107115102A (en) * | 2017-06-07 | 2017-09-01 | 西南科技大学 | A kind of osteoarticular function appraisal procedure and device |
CN107174255A (en) * | 2017-06-15 | 2017-09-19 | 西安交通大学 | Three-dimensional gait information gathering and analysis method based on Kinect somatosensory technology |
CN107225573A (en) * | 2017-07-05 | 2017-10-03 | 上海未来伙伴机器人有限公司 | The method of controlling operation and device of robot |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6334925B2 (en) * | 2013-01-18 | 2018-05-30 | キヤノンメディカルシステムズ株式会社 | Motion information processing apparatus and method |
CN103886588B (en) * | 2014-02-26 | 2016-08-17 | 浙江大学 | A kind of feature extracting method of 3 D human body attitude projection |
WO2015162158A1 (en) * | 2014-04-22 | 2015-10-29 | Université Libre de Bruxelles | Human motion tracking |
CN105243375B (en) * | 2015-11-02 | 2018-05-18 | 北京科技大学 | A kind of motion characteristic extracting method and device |
CN106528586A (en) * | 2016-05-13 | 2017-03-22 | 上海理工大学 | Human behavior video identification method |
CN106295616B (en) * | 2016-08-24 | 2019-04-30 | 张斌 | Exercise data analyses and comparison method and device |
CN106874884B (en) * | 2017-03-03 | 2019-11-12 | 中国民航大学 | Human body recognition methods again based on position segmentation |
CN107038430B (en) * | 2017-05-05 | 2020-09-11 | 成都通甲优博科技有限责任公司 | Method and device for constructing human body posture data sample |
Non-Patent Citations (2)
Title |
---|
Kim, Yejin; Baek, Seongmin; Bae, Byung-Chull. "Motion Capture of the Human Body Using Multiple Depth Sensors." ETRI Journal, vol. 39, no. 2, April 2017 (full text). *
Ke, Q.; An, S.; Bennamoun, M.; et al. "SkeletonNet: Mining Deep Part Features for 3-D Action Recognition." IEEE Signal Processing Letters, vol. 24, no. 6, June 2017 (full text). *
Also Published As
Publication number | Publication date |
---|---|
CN108392207A (en) | 2018-08-14 |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |
| CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20201211; Termination date: 20220209