CN112527118B - Head posture recognition method based on dynamic time warping

Head posture recognition method based on dynamic time warping

Info

Publication number
CN112527118B
CN112527118B
Authority
CN
China
Prior art keywords: head, data, time, action, template
Legal status: Active
Application number: CN202011485090.XA
Other languages: Chinese (zh)
Other versions: CN112527118A (en)
Inventor
李淮周
王宏
李森
曹祥红
胡海燕
武东辉
温书沛
吴彦福
李晓彬
Current Assignee: Zhengzhou University of Light Industry
Original Assignee: Zhengzhou University of Light Industry
Application filed by Zhengzhou University of Light Industry
Priority to CN202011485090.XA
Publication of CN112527118A
Application granted
Publication of CN112527118B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/012 Head tracking input arrangements
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C9/00 Measuring inclination, e.g. by clinometers, by levels
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2203/00 Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F2203/01 Indexing scheme relating to G06F3/01
    • G06F2203/011 Emotion or mood input determined on the basis of sensed human body parameters such as pulse, heart rate or beat, temperature of skin, facial expressions, iris, voice pitch, brain activity patterns

Abstract

The invention provides a head posture recognition method based on dynamic time warping, which comprises the following steps: acquiring characteristic data of the acceleration and angular velocity of the head action posture in the X, Y and Z directions through an inertial sensor fixed on the head, and storing the characteristic data in a data set; preprocessing the data in the data set, detecting the start time and end time of head actions, and extracting the movement interval of each head action; constructing head action templates; calculating a warping path from the detected head action data and the obtained head action template data; the standard-template head action type corresponding to the minimum warping-path DTW value is the head action type of the data to be identified. Relying only on the acceleration and angular velocity information measured by the inertial sensor, the method can accurately estimate the head motion type of a test subject and effectively improve the recognition accuracy of human head motions; it is also low in cost, small in data-processing load, and fast in response.

Description

Head posture recognition method based on dynamic time warping
Technical Field
The invention relates to the technical field of pattern recognition, in particular to a head posture recognition method based on dynamic time warping.
Background
With the development of artificial intelligence technology, the way people live and work has changed greatly, and the traditional keyboard-and-mouse input mode cannot meet the needs of all users, such as people with impaired upper limbs. Therefore, the development of motion recognition technology based on head posture has attracted extensive attention from researchers.
The types of devices used for head pose calculation can be divided into two categories. The first category is based on wearable inertial sensors: for example, the invention patent with publication No. CN103076045B provides a head posture sensing apparatus and method, and the invention patent application CN105943052A discloses a fatigue driving detection method and device based on deflection angle. These methods offer high precision and good real-time performance, but the user must wear an inertial sensor, and they are biased toward posture estimation and do not provide a head motion recognition technique. The second category is based on machine vision: for example, Tan et al., in the invention patent application with publication No. CN102737235A, propose a head posture estimation method based on depth information and color images, estimating the head posture through a camera or depth camera. Such methods require no contact with the test subject, but camera imaging is easily affected by illumination, background and expression; moreover, compared with the first category, the image processing generally involves a large amount of computation, and the accuracy rate is low and needs further improvement.
Dynamic Time Warping (DTW) is a method based on dynamic programming and is widely used in the fields of speech and gesture recognition. The algorithm can warp data along the time axis, stretching or compressing a time series to achieve better alignment, which improves accuracy and robustness. Head movements vary in duration with personal habits and the subject's current state, so head action recognition is a typical unequal-length time-series recognition problem.
Disclosure of Invention
Aiming at the technical problems of the large computation amount and low accuracy of existing head gesture recognition, the invention provides a head posture recognition method based on dynamic time warping: different head actions are recognized by evaluating, through the DTW method, the time-series warping-path distance between each action and a standard template, with a small data-processing load and a high recognition accuracy.
In order to achieve this purpose, the technical scheme of the invention is realized as follows: a head posture recognition method based on dynamic time warping comprises the following steps:
step S1: data acquisition: acquiring characteristic data of the acceleration and angular velocity of the head action posture in the X, Y and Z directions through an inertial sensor fixed on the head, and storing the characteristic data in a data set;
step S2: endpoint detection of head motion: preprocessing the data in the data set, detecting the start time and end time of head movement according to the preprocessed head inertial data and angular velocity information, and extracting the movement interval of the head action;
step S3: calculating a head action time-series template: according to the head action data detected by the endpoint detection in step S2 and the associated action labels, constructing head action templates of acceleration and angular velocity in the X, Y and Z directions;
step S4: calculating a warping path: for the test set in the data set, calculating a warping path from the head action data detected in step S2 and the head action template data obtained in step S3;
step S5: judging the head action type: the standard-template head action type corresponding to the minimum warping-path DTW value is the head action type of the data to be identified.
The inertial sensor is mounted on a temple of a pair of glasses, close to the front of the head. When data are collected, the subject wears the glasses, sits on a stool, and naturally performs the head action postures of nodding, pitching, left shaking, right shaking, left turning and right turning. The format of the data set is [data, label], where data is a 6-row matrix holding the accelerations and angular velocities of the sensor's x, y and z axes, whose length is not fixed across different labels, and label is a type variable corresponding to the 6 head action types.
The preprocessing method in the step S2 comprises the following steps:
Step S21: data normalization
y'(t) = arctan(x(t)) * 2/π  (1)
where y'(t) is the normalized data and x(t) is the acceleration or angular velocity data acquired by the inertial sensor;
Step S22: sliding median filtering
y(t) = median( y'(t - (l-1)/2), …, y'(t + (l-1)/2) )  (2)
where l is the median-filter window length (l = 2N-1 gives an odd window, l = 2N an even one, N being a natural number), median() is the median function, and y(t) is the median of the l normalized samples centred on time t.
The method for detecting the endpoints of the head action in the step S2 comprises the following steps: determining the start time of the head action as:
t_start = min{ t | ang(t) > ang_min }  (3)
where
ang(t) = sqrt( ang_x(t)^2 + ang_y(t)^2 + ang_z(t)^2 )
is an overall description of the angular velocity change in each direction at time t and reflects the overall degree of angular change of the head action; ang_x(t), ang_y(t) and ang_z(t) are the angular velocity components along the X, Y and Z axes of the three-dimensional coordinate frame; ang_min is the threshold for the start of a head action; t_start is the start time of the head movement;
determining the head action end time:
t_end = min{ t > t_start | sum(ang([t - t_min, t)) < ang_min) = t_min * fs }  (4)
where sum(ang([t - t_min, t)) < ang_min) counts the samples of ang(t) in the interval [t - t_min, t) that are below the threshold ang_min; t_min is the minimum duration of a head movement; fs is the sampling frequency of the sensor; if, after the head action has started, all sampling points within the minimum duration are below the threshold ang_min, the head movement is considered finished, and the end time is t_end;
judging the validity of the head action:
if (t_end - t_start > t_min) and (t_end - t_start < t_max), a head action exists;
where t_min is the minimum duration of a head movement and t_max is the maximum duration of a head movement.
Head action data of 26 subjects are acquired and randomly divided into a training set of 18 subjects and a test set of 8 subjects.
If the data processed in the step S2 belong to the training set or to person-dependent template data, the head action time-series template is calculated through step S3; if they belong to the test set or to real-time data collected online, the head action type of the action sequence is judged through the DTW values in steps S4 and S5.
The implementation method of the step S3 comprises the following steps:
step S31: according to the head action time series extracted in step S2, obtaining each action time series and its label according to the set thresholds;
step S32: for one head action in the training set, let a group of data in the time series be S_a = {s_1, s_2, …, s_a}; S_a is a 6 × a matrix whose row vectors correspond to the accelerations and angular velocities in the X, Y and Z directions and whose column vectors correspond to the head motion features; the total time-series set of the training set is S = {S_a, S_b, …, S_n}, where n is the number of actions in the training set and a, b, …, n are the lengths of the sequences S_a, S_b, …, S_n;
step S33: let the sequence-length vector be S_len = {a, b, …, n}; the template time-series length is then T_len = median(S_len), where median() is the median function;
step S34: let the standard template of a head action be T_i, where i = 1, 2, …, 6 corresponds to the six head action types; T_i is a 6 × x matrix whose row vectors correspond to the accelerations and angular velocities in the X, Y and Z directions and whose column vectors correspond to the head motion features, the length x being determined by the data lengths in the training set; by means of the mean-value formula
T_ik = ( Σ_j S_jk ) / ( Σ_j binary(S_jk) )
T_ik is obtained, and the first T_len samples of T_ik are taken as the standard template time series of the action, where T_ik is the kth row of data in the ith action template and S_jk is the kth row of data of the jth subject's action of that type; because the durations of S_jk differ between subjects, the binary() function binarizes S_jk to {1, 0} so that the number of valid elements at each position can be counted;
step S35: repeating steps S32 to S34, the standard templates of the other action types are obtained.
The implementation method of the step S4 comprises the following steps:
step S41: calculating a distance matrix D: let the time series of a head action in the test set be S = {s_1, s_2, …, s_n}; the template time series to be matched is the standard template data T = {t_1, t_2, …, t_m}; the Euclidean distance between any two points is
d(s_i, t_j) = sqrt( Σ_{k=1}^{6} (s_ik - t_jk)^2 )
where s_i is the ith column vector in the time series S; t_j is the jth column vector in the standard template T; s_ik is the kth row element of the ith column vector in S; t_jk is the kth row element of the jth column vector in T; computing all possibilities forms an n × m distance matrix D; the problem thus becomes the shortest-path problem from the start point D(1, 1) to the end point D(n, m), solved by applying a dynamic programming method;
step S42: let the warping path be W = {w_1, w_2, w_3, …, w_y}, where w_e represents the distance between a point of the time series S and a point of the standard template T, and y is the warping-path length, with range max(m, n) ≤ y ≤ m + n; the optimal warping path is obtained according to the constraints on the warping path;
step S43: the optimal warping path is solved by calculation with the dynamic-programming idea of the cumulative distance.
Steps S41, S42 and S43 are repeated to calculate the DTW values between the head action time series S and the action time series of each of the 6 standard templates.
The warping path needs to satisfy the following constraints:
(1) boundary condition: the path goes from the start point w_1 = D(1, 1) to the end point w_y = D(n, m);
(2) continuity: if w_{e-1} = D(a, b), the next point of the path w_e = D(a', b') must satisfy |a - a'| ≤ 1 and |b - b'| ≤ 1, i.e. the match cannot skip over points;
(3) monotonicity: if w_{e-1} = D(a, b), the next point of the path w_e = D(a', b') must satisfy a' - a ≥ 0 and b' - b ≥ 0, i.e. the points on the warping path W must advance monotonically in time.
Thus, from w_{e-1} = D(a, b) there are only three possible paths to the next point: D(a+1, b), D(a+1, b+1) and D(a, b+1). The optimal warping path is then
DTW(S, T) = min_W Σ_{e=1}^{y} w_e
and the cumulative distance is
r(e, f) = d(s_e, t_f) + min{ r(e-1, f), r(e-1, f-1), r(e, f-1) }
where e = 1, 2, 3, …, n; f = 1, 2, 3, …, m; s_e is the eth column vector in the matrix S to be detected; t_f is the fth column vector of a head action in the template matrix T to be detected; r(e, f) is the cumulative distance.
The invention has the beneficial effects that: the invention provides a head movement endpoint detection and head action recognition method based on dynamic time warping. An inertial sensor placed at the temple of a pair of glasses or at the ear collects the acceleration and angular velocity of head actions in the X, Y and Z directions; the resultant angular velocity is computed from the collected angular velocity data, endpoints are detected automatically by a threshold method, and abnormal data are rejected. The invention supports generating a person-dependent head action template or importing an empirical head action template; for the automatically detected head action test data, the dynamic-time-warping path distance DTW to each template is then calculated, and the action type is determined by the minimum DTW value. Relying only on the acceleration and angular velocity information measured by the inertial sensor, the head motion type of the test subject, such as nodding, pitching, left shaking and right shaking, can be accurately estimated, effectively improving the recognition accuracy of human head motions. Compared with head action recognition techniques based on cameras and depth cameras, the invention is low in cost, small in data-processing load, fast in response, and high in recognition accuracy.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of the present invention.
FIG. 2 is a schematic diagram of head motion gesture types according to the present invention.
FIG. 3 is a time series template of the head movements of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without inventive effort based on the embodiments of the present invention, are within the scope of the present invention.
As shown in fig. 1, a head pose recognition method based on dynamic time warping includes the following steps:
step S1: data acquisition: the characteristic data of the acceleration and the angular velocity of the head action posture in the X direction, the Y direction and the Z direction are collected through an inertial sensor fixed on the head and are stored in a data set.
The head posture change is sensed by the inertial sensor; the collected data comprise 6 kinds of characteristic data, namely the accelerations and angular velocities in the X, Y and Z directions. The inertial sensor is mounted on a temple of a pair of glasses, close to the front of the head, as shown in the middle view of Fig. 2. When data are collected, the subject wears the glasses, sits on a stool, and naturally performs the head actions of nodding, pitching, left shaking, right shaking, left turning and right turning, shown in the surrounding views of Fig. 2.
To verify the effectiveness of the proposed method, head motion data from 26 subjects were collected and randomly divided into a training set of 18 subjects and a test set of 8 subjects. The data set format is [data, label], where data is a 6-row matrix holding the accelerations and angular velocities of the sensor's x, y and z axes, whose length is not fixed across different labels, and label is a type variable corresponding to the 6 head action types. A minimal sketch of this layout follows.
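The sketch below illustrates this layout under stated assumptions: NumPy is used, and the names sample, label and dataset are placeholders for illustration, not identifiers from the patent.

```python
# Illustrative layout of one recording in the data set described above.
import numpy as np

# data: a 6 x N matrix; rows are acc_x, acc_y, acc_z, ang_x, ang_y, ang_z.
# N is not fixed, because action durations differ between recordings.
sample = np.random.randn(6, 180)   # e.g. about 1.8 s of data at fs = 100 Hz
label = "nod"                      # one of the 6 head-action types

dataset = [(sample, label)]        # list of [data, label] pairs
```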
Step S2: end point detection of head motion: preprocessing data in the data set, detecting the start time and the end time of head movement according to preprocessed head inertia data and angular velocity information, and extracting the movement interval of the head movement, wherein the steps are as follows:
step S21: data normalization
y'(t)=arctan(x(t))*2/π (1)
where y'(t) is the normalized data and x(t) is the acceleration or angular velocity data collected by the inertial sensor.
Step S22: sliding median filtering
y(t) = median( y'(t - (l-1)/2), …, y'(t + (l-1)/2) )  (2)
where l is the median-filter window length (l = 2N-1 gives an odd window, l = 2N an even one, N being a natural number), median() is the median function, and y(t) is the median of the l normalized samples centred on time t. The sliding median filter reduces the salt-and-pepper noise of the inertial sensor, lowering the chance of misjudgment in the subsequent action recognition and endpoint detection. A minimal sketch of this preprocessing stage is given below.
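The following sketch assumes NumPy, a 6 x N sample matrix, and a 0.2 s window at fs = 100 Hz rounded up to an odd 21 samples; the function names normalize and sliding_median are illustrative:

```python
import numpy as np

def normalize(x):
    """Equation (1): squash raw sensor values into (-1, 1)."""
    return np.arctan(x) * 2.0 / np.pi

def sliding_median(x, l=21):
    """Equation (2): centred sliding median filter with odd window length l."""
    half = (l - 1) // 2
    padded = np.pad(x, (half, half), mode="edge")  # replicate edge samples
    return np.array([np.median(padded[t:t + l]) for t in range(len(x))])

# Apply both steps to every one of the 6 channels of a 6 x N sample:
# filtered = np.vstack([sliding_median(normalize(row)) for row in sample])
```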
Step S23: determining the start time of the head action as:
t_start = min{ t | ang(t) > ang_min }  (3)
where
ang(t) = sqrt( ang_x(t)^2 + ang_y(t)^2 + ang_z(t)^2 )
is an overall description of the angular velocity change in each direction at time t and reflects the overall degree of angular change of the head action; ang_x(t), ang_y(t) and ang_z(t) are the angular velocity components along the X, Y and Z axes of the three-dimensional coordinate frame; ang_min is the threshold for the start of a head action; t_start is the start time of the head movement.
Step S24: determining a head action end time:
t_end = min{ t > t_start | sum(ang([t - t_min, t)) < ang_min) = t_min * fs }  (4)
where sum(ang([t - t_min, t)) < ang_min) counts the samples of ang(t) in the interval [t - t_min, t) that are below the threshold ang_min; t_min is the minimum duration of a head movement; fs is the sampling frequency of the sensor. If, after the head action has started, all sampling points within the minimum duration are below the threshold ang_min, the head movement is considered finished, and the end time is t_end.
Step S25: judging that the head action is effective:
if (t_end - t_start > t_min) and (t_end - t_start < t_max), a head action exists  (5)
where t_min is the minimum duration of a head action, used to reject spike noise in the waveform, and t_max is the maximum duration of a head action, used to reject actions of abnormal or incomplete duration. This completes the extraction of the head action data. If the data belong to the training set or to person-dependent template data, the head action time-series template is calculated through step S3; if they belong to the test set or to real-time data collected online, the head action type of the action sequence is judged through the DTW values in steps S4 and S5. A minimal sketch of the endpoint detection of equations (3) to (5) follows.
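This sketch assumes NumPy and the reference thresholds given in step S31 below; the function name detect_endpoints and its return convention are illustrative:

```python
import numpy as np

def detect_endpoints(ang_xyz, fs=100, ang_min=0.2, t_min=0.6, t_max=3.0):
    """Return (t_start, t_end) sample indices of one head action, else None.

    ang_xyz: 3 x N array of filtered angular velocities (rad/s).
    """
    ang = np.sqrt((ang_xyz ** 2).sum(axis=0))  # resultant angular velocity
    n_min = int(t_min * fs)                    # minimum duration in samples

    above = np.nonzero(ang > ang_min)[0]
    if len(above) == 0:
        return None
    t_start = int(above[0])                    # eq. (3): first crossing

    t_end = None
    for t in range(t_start + n_min, len(ang) + 1):
        # eq. (4): every sample in [t - t_min, t) is below the threshold
        if np.all(ang[t - n_min:t] < ang_min):
            t_end = t
            break
    if t_end is None:
        return None

    duration = (t_end - t_start) / fs          # eq. (5): validity check
    return (t_start, t_end) if t_min < duration < t_max else None
```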
And step S3: calculating a head action time series template: according to the head motion data detected by the end point detection in the step S2 and the relevant motion labels, head motion templates of acceleration and angular velocity in the X direction, the Y direction and the Z direction are constructed, and the steps are as follows:
step S31: according to the end point detection method described in the step 2 of the invention, the time sequence of the head action is extracted, and the threshold value selection reference values are respectively as follows: l =0.2s, ang min =0.2rad/s,t min =0.6s,t max Each action time series and its label are obtained by =3s, fs = 100hz.
Step S32: for example, one head movement in the training set is taken as an example, a group of data in the time sequence is S a ={s 1 ,s 2 ,…,s a A is the length of the group of data, S a The matrix is a 6X a matrix, and the row vectors of the matrix respectively correspond to the acceleration and the angular velocity in the X direction, the Y direction and the Z direction; in the column directionThe amount corresponds to the head movement characteristic. The total time series set of the training set is S = { S = } a ,S b ,…,S n N is the number of the actions in the training set; a, b, \ 8230;, k respectively represent the length of the sequences located.
Step S33: let the sequence length vector be S len = { a, b, \8230;, n }, the length of the template time series is T len =median(S len ) Wherein mean () is the median function.
Step S34: let the standard template of the head action be T i Wherein i =1,2, \ 8230, 6, corresponds to six head motion types; the matrix is a 6X matrix, and the row vectors of the matrix respectively correspond to the acceleration and the angular velocity in the X direction, the Y direction and the Z direction; the column vectors correspond to head motion features and the length x is determined according to the length of the data in the training set. Can be calculated by the mean value formula
Figure BDA0002838857660000081
To obtain T ik ,T ik Front T of len The data is used as a standard template time sequence of the action, wherein T ik Representing the kth line of data in the ith action template; s jk A kth line of data representing a jth object action type; due to S jk The duration of the action is not equal among the testers, and the pair S of the binary () function is used jk Binarization {1,0} is performed, thereby calculating the number of same-position elements.
Step S34: repeating the steps S32 and S33, standard templates of other action types can be obtained, as shown in fig. 3. Acc in FIG. 3 x (t)、acc y (t) and acc z (t) and acc (t) respectively represent the acceleration and resultant acceleration in X, Y and Z directions; ang x (t)、ang y (t) and ang z (t)、ang t (t) represents angular velocities and resultant angular velocities in the X, Y and Z directions, respectively.
Guided by system voice prompts, a user can also perform the corresponding head actions several times, and the person-dependent head action time-series template is calculated according to steps S31 to S34, as sketched below.
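A minimal sketch of the template construction of steps S32 to S35, assuming NumPy; sequences of one action type are averaged element-wise over however many sequences still have data at each position (the binary() term) and then truncated to the median length:

```python
import numpy as np

def build_template(sequences):
    """Average unequal-length 6 x len_i sequences of one action type.

    Returns a 6 x T_len standard template, T_len = median sequence length.
    """
    t_len = int(np.median([s.shape[1] for s in sequences]))
    max_len = max(s.shape[1] for s in sequences)

    total = np.zeros((6, max_len))
    count = np.zeros(max_len)
    for s in sequences:
        total[:, :s.shape[1]] += s   # sum of available samples per position
        count[:s.shape[1]] += 1      # binary(): 1 wherever data exists

    template = total / count         # element-wise mean over valid samples
    return template[:, :t_len]       # keep the first T_len columns
```

One template is built per action type from that type's training sequences, yielding the six standard templates T_1, …, T_6.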
And step S4: calculating a warping path: the test set calculates a warping path from the head action data detected in step S2 and each head action template obtained in step S3; the detailed steps are as follows:
step S41: a distance matrix D is calculated. Let the time series of the head actions in the test set be S = { S = { S } 1 ,s 2 ,…,s n Is a 6 × n matrix; the time sequence of the template to be matched is standard template data T = { T = 1 ,t 2 ,…,t m Is a 6 × m matrix; the Euclidean distance between any two points is
Figure BDA0002838857660000082
Wherein s is i Is the ith column vector in the matrix S; t is t j Is the jth column vector in the matrix T; s ik Is the kth row element of any ith column vector in vector S; t is t jk Is the kth row element of any jth column vector in vector T. All the possibilities are calculated to form an n x m distance matrix D. Therefore, the two time series similarity problems are converted into a problem of solving the shortest path from the starting point D (1, 1) to the end point D (n, m) by applying a dynamic programming method, that is, a regular path (warp path), which is denoted by W.
Step S42: let the warping path be W = {w_1, w_2, w_3, …, w_y}, where w_e represents the distance between a point of the time series S and a point of the standard template T, and y is the warping-path length, with range max(m, n) ≤ y ≤ m + n. The path must satisfy the following constraints:
(1) boundary condition: the path goes from the start point w_1 = D(1, 1) to the end point w_y = D(n, m);
(2) continuity: if w_{e-1} = D(a, b), the next point of the path w_e = D(a', b') must satisfy |a - a'| ≤ 1 and |b - b'| ≤ 1, i.e. the match cannot skip over points;
(3) monotonicity: if w_{e-1} = D(a, b), the next point of the path w_e = D(a', b') must satisfy a' - a ≥ 0 and b' - b ≥ 0, i.e. the points on W must advance monotonically in time.
Thus, from w_{e-1} = D(a, b) there are only three possible paths to the next point: D(a+1, b), D(a+1, b+1) and D(a, b+1). The optimal warping path is then
DTW(S, T) = min_W Σ_{e=1}^{y} w_e  (6)
Step S43: to solve the optimal warping path, i.e. equation (6), the dynamic-programming idea of the cumulative distance is used; the cumulative-distance equation is defined as
r(e, f) = d(s_e, t_f) + min{ r(e-1, f), r(e-1, f-1), r(e, f-1) }  (7)
where e = 1, 2, 3, …, n; f = 1, 2, 3, …, m; s_e is the eth column vector in the matrix S to be detected; t_f is the fth column vector of a head action in the template matrix T to be detected; r(e, f) is the cumulative distance. Since r(e, f) is a recurrence relation, the optimal warping-path distance of the two time series S and T is DTW(S, T) = r(n, m), which solves the problem of measuring similarity between time series of unequal length and unaligned feature positions. A minimal sketch of this computation is given below.
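The following sketch of equations (6) and (7) assumes NumPy; the cumulative-distance table r is filled row by row, and the returned value equals DTW(S, T) = r(n, m):

```python
import numpy as np

def dtw(S, T):
    """Cumulative-distance DTW between a 6 x n series S and a 6 x m template T."""
    n, m = S.shape[1], T.shape[1]
    # n x m matrix of column-wise Euclidean distances d(s_i, t_j).
    D = np.sqrt(((S[:, :, None] - T[:, None, :]) ** 2).sum(axis=0))

    r = np.full((n + 1, m + 1), np.inf)   # cumulative distance, eq. (7)
    r[0, 0] = 0.0
    for e in range(1, n + 1):
        for f in range(1, m + 1):
            r[e, f] = D[e - 1, f - 1] + min(r[e - 1, f],
                                            r[e - 1, f - 1],
                                            r[e, f - 1])
    return r[n, m]                        # DTW(S, T) = r(n, m)
```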
Steps S41, S42 and S43 are repeated to calculate the DTW values between the head action time series S and the action time series of each of the 6 standard templates.
Step S5: judging the head action type: the standard-template head action type corresponding to the minimum warping-path DTW value is the head action type of the data to be identified, as sketched after this paragraph.
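A minimal sketch of this decision rule, reusing the dtw() function sketched above; templates is an assumed dict mapping the six action names to their 6 x T_len standard templates:

```python
def classify(S, templates):
    """Return the action name whose standard template has the smallest DTW."""
    scores = {name: dtw(S, T) for name, T in templates.items()}
    return min(scores, key=scores.get)

# Hypothetical usage with templates built as in step S3:
# action = classify(detected_segment,
#                   {"nod": T1, "pitch": T2, "shake_left": T3,
#                    "shake_right": T4, "turn_left": T5, "turn_right": T6})
```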
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (6)

1. A head posture recognition method based on dynamic time warping is characterized by comprising the following steps:
step S1: data acquisition: acquiring characteristic data of acceleration and angular velocity of the head action posture in the X direction, the Y direction and the Z direction through an inertial sensor fixed on the head, and storing the characteristic data in a data set;
step S2: end point detection of head motion: preprocessing data in the data set, detecting the start time and the end time of head movement according to preprocessed head inertia data and angular velocity information, and extracting the movement interval of the head movement;
the method for detecting the end point of the head action in the step S2 comprises the following steps: determining the start time of the head action as:
Figure FDA0003840326350000011
wherein the content of the first and second substances,
Figure FDA0003840326350000012
is the overall description of the angular velocity change of each direction at the time t, and reflects the overall change degree of the angle of the head action, ang x (t)、ang y (t) and ang z (t) representing the angular velocity components in the X direction, the Y direction and the Z direction on the three-dimensional coordinate axis respectively; ang min Is a threshold for head start action; t is t start Is the start time of the head movement;
determining a head action end time:
Figure FDA0003840326350000013
of these, sum (ang ([ t-t ]) min ,t))<ang min ) Calculates the ang (t) at t-t min T) less than a threshold ang in a time interval min The number of (2); t is t min Is the minimum time of duration of the head movement; fs is the sampling frequency of the sensor; if the values of all the sampling points in the minimum duration are less than the threshold ang after the start of the head movement min The head movement is considered to be finished, and the finish time is t end
Judging the validity of the head action:
(t end -t start >t min ) And (t) end -t start >t max ) There is a head action;
wherein, t min Is the minimum time for which the head movement lasts; t is t max Is the maximum time that the head movement lasts;
and step S3: calculating a head action time series template: according to the head motion data detected by the end point detection in the step S2 and the related motion labels, constructing head motion templates of acceleration and angular velocity in the X direction, the Y direction and the Z direction;
the implementation method of the step S3 comprises the following steps:
step S31: according to the time sequence of the head action extracted in the step S2, obtaining each head action time sequence and a label thereof according to a set threshold;
step S32: for a head movement in the training set, let a group of data in the time series be S a ={s 1 ,s 2 ,…,s a },S a The matrix is a 6X a matrix, and the row vectors of the matrix respectively correspond to the acceleration and the angular velocity in the X direction, the Y direction and the Z direction; the column vectors correspond to head motion features; the total time series set of the training set is S = { S = } a ,S b ,…,S n N is the number of the head movements in the training set; a. b and k represent time series S respectively a 、S b 、S n The length of (d);
step S33: let the sequence length vector be S len = { a, b, \8230;, n }, the length of the template time series is T len =median(S len ) Wherein, mean () is a median function;
step S34: let the standard template of the head action be T i Wherein i =1,2, \ 8230, 6, corresponds to six head motion types; the matrix is a 6X matrix, and the row vectors of the matrix respectively correspond to the acceleration and the angular velocity in the X direction, the Y direction and the Z direction; the column vector corresponds to the head motion characteristic, and the length x is determined according to the data length in the training set; by means of the mean value formula
Figure FDA0003840326350000021
To obtain T ik ,T ik Front T of len The data is used as a standard template time sequence of the head action, wherein T ik Representing the kth line of data in the ith action template; s jk A kth line of data representing a jth object action type; due to S jk The duration of the action is not equal among the testers, and the pair S of the binary () function is used jk Carrying out binarization {1,0}, thereby calculating the number of elements at the same position;
step S34: repeating the steps S32 and S33 to obtain standard templates of other action types;
and step S4: calculating a warping path: calculating, for the test set in the data set, a warping path from the head action data detected in the step S2 and the head action template data obtained in the step S3;
step S5: judging the head action type: the standard-template head action type corresponding to the minimum warping-path DTW value is the head action type of the data to be identified.
2. The method for recognizing the head posture based on dynamic time warping as claimed in claim 1, wherein the inertial sensor is mounted on a temple of a pair of glasses close to the front of the head; when collecting data, the subject wears the glasses, sits on a stool, and naturally performs the head actions of nodding, pitching, left shaking, right shaking, left turning and right turning; the format of the data set is [data, label], wherein data is a 6-row matrix holding the accelerations and angular velocities of the sensor's x, y and z axes, whose length is not fixed across different labels, and label is a type variable corresponding to the 6 head action types.
3. The method for recognizing head pose based on dynamic time warping as claimed in claim 1, wherein the preprocessing in the step S2 comprises:
step S21: data normalization
y'(t) = arctan(x(t)) * 2/π  (1)
wherein y'(t) is the normalized data and x(t) is the acceleration or angular velocity data acquired by the inertial sensor;
step S22: sliding median filtering
y(t) = median( y'(t - (l-1)/2), …, y'(t + (l-1)/2) )  (2)
wherein l is the median-filter window length (l = 2N-1 gives an odd window, l = 2N an even one, N being a natural number), median() is the median function, and y(t) is the median of the l normalized samples centred on time t.
4. The head posture recognition method based on dynamic time warping as claimed in claim 1 or 3, wherein head action data of 26 subjects are collected and randomly divided into a training set of 18 subjects and a test set of 8 subjects;
if the data processed in the step S2 belong to the training set or to person-dependent template data, the head action time-series template is calculated through the step S3; if they belong to the test set or to real-time data collected in real time, the head action type of the action sequence formed by the data is judged through the DTW values in the steps S4 and S5.
5. The method for recognizing head pose based on dynamic time warping as claimed in claim 4, wherein the step S4 is realized by:
step S41: calculating a distance matrix D: letting the time series of a head action in the test set be S = {s_1, s_2, …, s_n}; the template time series to be matched is the standard template data T = {t_1, t_2, …, t_m}; the Euclidean distance between any two points is
d(s_i, t_j) = sqrt( Σ_{k=1}^{6} (s_ik - t_jk)^2 )
wherein s_i is the ith column vector in the time series S; t_j is the jth column vector in the standard template T; s_ik is the kth row element of the ith column vector in the time series S; t_jk is the kth row element of the jth column vector in the standard template T; calculating all possibilities forms an n × m distance matrix D; the problem is thus converted into the shortest-path problem from the start point D(1, 1) to the end point D(n, m), solved by applying a dynamic programming method;
step S42: letting the warping path be W = {w_1, w_2, w_3, …, w_e, …, w_y}, wherein w_e represents the distance between a point of the time series S and a point of the standard template T, and y is the warping-path length, with range max(m, n) ≤ y ≤ m + n; obtaining the optimal warping path according to the constraints on the warping path;
step S43: calculating with the dynamic-programming idea of the cumulative distance to solve the optimal warping path;
the steps S41, S42 and S43 are repeated to calculate the DTW values between the head action time series S and the action time series of each of the 6 standard templates.
6. The method for head pose recognition based on dynamic time warping as claimed in claim 5, wherein the warping path needs to satisfy the following constraints:
(1) boundary condition: the path goes from the start point w_1 = D(1, 1) to the end point w_y = D(n, m);
(2) continuity: if w_{e-1} = D(a, b), the next point of the path w_e = D(a', b') must satisfy |a - a'| ≤ 1 and |b - b'| ≤ 1, i.e. the match cannot skip over points;
(3) monotonicity: if w_{e-1} = D(a, b), the next point of the path w_e = D(a', b') must satisfy a' - a ≥ 0 and b' - b ≥ 0, i.e. the points on the warping path W must advance monotonically in time;
thus, from w_{e-1} = D(a, b) there are only three possible paths to the next point: D(a+1, b), D(a+1, b+1) and D(a, b+1); the optimal warping path is then
DTW(S, T) = min_W Σ_{e=1}^{y} w_e
and the cumulative distance is
r(e, f) = d(s_e, t_f) + min{ r(e-1, f), r(e-1, f-1), r(e, f-1) };
wherein e = 1, 2, 3, …, n; f = 1, 2, 3, …, m; s_e represents the eth column vector in the matrix S to be detected; t_f represents the fth column vector of a head action in the template matrix T to be detected; r(e, f) is the cumulative distance.
CN202011485090.XA (priority date 2020-12-16, filing date 2020-12-16): Head posture recognition method based on dynamic time warping; status: Active; granted as CN112527118B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202011485090.XA (granted as CN112527118B) | 2020-12-16 | 2020-12-16 | Head posture recognition method based on dynamic time warping

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202011485090.XA (granted as CN112527118B) | 2020-12-16 | 2020-12-16 | Head posture recognition method based on dynamic time warping

Publications (2)

Publication Number | Publication Date
CN112527118A (en) | 2021-03-19
CN112527118B (en) | 2022-11-25

Family

ID=75000581

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202011485090.XA (Active; granted as CN112527118B) | Head posture recognition method based on dynamic time warping | 2020-12-16 | 2020-12-16

Country Status (1)

Country Link
CN (1) CN112527118B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113158833B (en) * 2021-03-31 2023-04-07 电子科技大学 Unmanned vehicle control command method based on human body posture


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104731307A (en) * 2013-12-20 2015-06-24 孙伯元 Somatic action identifying method and man-machine interaction device
CN105184325A (en) * 2015-09-23 2015-12-23 歌尔声学股份有限公司 Human body action recognition method and mobile intelligent terminal
WO2017050140A1 (en) * 2015-09-23 2017-03-30 歌尔股份有限公司 Method for recognizing a human motion, method for recognizing a user action and smart terminal
CN109091150A (en) * 2017-11-29 2018-12-28 惠州市德赛工业研究院有限公司 Recognition methods, sleep quality appraisal procedure and the intelligent wearable device that body of sleeping moves
CN110348275A (en) * 2018-04-08 2019-10-18 中兴通讯股份有限公司 Gesture identification method, device, smart machine and computer readable storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Online nod detection in human-robot interaction; Eduard Wall et al.; 2017 26th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN); 2017-12-14; pp. 811-817 *
A human action recognition method using 3D skeleton fragment representation (一种使用3D骨架片段表示的人体动作识别方法); Liu Yan (刘沿) et al.; Journal of Chinese Computer Systems (小型微型计算机系统); 2018-03-15, No. 03; full text *

Also Published As

Publication number Publication date
CN112527118A (en) 2021-03-19

Similar Documents

Publication Publication Date Title
CN110711374B (en) Multi-modal dance action evaluation method
US7881524B2 (en) Information processing apparatus and information processing method
Frolova et al. Most probable longest common subsequence for recognition of gesture character input
Jensen et al. Classification of kinematic swimming data with emphasis on resource consumption
WO1999039302A1 (en) Camera-based handwriting tracking
Kang et al. A study on performance evaluation of fingerprint sensors
CN106971130A (en) A kind of gesture identification method using face as reference
CN116226691A (en) Intelligent finger ring data processing method for gesture sensing
CN111998829B (en) Method for judging read-write posture based on sensor
CN116269355B (en) Safety monitoring system based on figure gesture recognition
CN112801859A (en) Cosmetic mirror system with cosmetic guiding function
CN112527118B (en) Head posture recognition method based on dynamic time warping
CN107358646A (en) A kind of fatigue detecting system and method based on machine vision
CN111680660A (en) Human behavior detection method based on multi-source heterogeneous data stream
JP2013003706A (en) Facial-expression recognition device, method, and program
CN109993116B (en) Pedestrian re-identification method based on mutual learning of human bones
Li et al. Posture recognition technology based on kinect
CN113724838B (en) Emotion identification system based on big data
Razzaq et al. Unskem: unobtrusive skeletal-based emotion recognition for user experience
Bevilacqua et al. Rehabilitation exercise segmentation for autonomous biofeedback systems with ConvFSM
CN116311497A (en) Tunnel worker abnormal behavior detection method and system based on machine vision
CN106056080A (en) Visualized biometric information acquisition device and acquisition method
KR101483218B1 (en) Activity diagnosis apparatus
Murao et al. A combined-activity recognition method with accelerometers
Okuno et al. Body posture and face orientation estimation by convolutional network with heterogeneous learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant