CN113052083A

CN113052083A - Action behavior segmentation method for multi-neighbor graph constraint matrix decomposition

Info

Publication number: CN113052083A
Application number: CN202110326354.5A
Authority: CN
Inventors: 张妍
Original assignee: Shaanxi Stride Industrial Co ltd
Current assignee: Shaanxi Stride Industrial Co ltd
Priority date: 2021-03-26
Filing date: 2021-03-26
Publication date: 2021-06-29

Abstract

The invention relates to the technical field of human behavior segmentation and discloses an action behavior segmentation method based on multi-neighbor graph constraint matrix decomposition, comprising the following steps: S1, designing the constraint regularization terms of a multi-graph constraint structure from the behavior action sequence data; S2, constructing a semi-NMF matrix decomposition model constrained by the regularization terms; S3, solving the semi-NMF matrix decomposition model to obtain a low-dimensional representation of the behavior sequence; S4, calculating similarity weights between sequence points from the low-dimensional representation to generate a similarity relation matrix; and S5, obtaining the behavior segmentation by applying a graph segmentation method to the similarity relation matrix. The method produces few wrong segmentations within an action sequence and can accurately distinguish similar actions in human behavior.

Description

Action behavior segmentation method for multi-neighbor graph constraint matrix decomposition

Technical Field

The invention relates to the technical field of human behavior segmentation, in particular to an action behavior segmentation method based on multi-neighbor graph constraint matrix decomposition.

Background

Human behavior and action recognition is an interdisciplinary research direction involving image processing, computer vision, pattern recognition, machine learning, artificial intelligence and other fields, and is an important research topic in computer vision. With the rapid development of image processing technology and intelligent hardware manufacturing, human behavior data has become increasingly abundant and diverse. Behavior recognition can generally be divided into two categories: sensor-based and vision-based. Sensor-based behavior recognition obtains behavior data from sensors worn on the body. Vision-based behavior recognition can be further divided into recognition from single-frame images and recognition from video. Recognition from single-frame images cannot effectively capture the coherence between behavior images and therefore often produces misjudgments, whereas recognition from video captures both the spatial and the temporal information in the video and greatly improves accuracy. Video-based behavior analysis has been widely researched and applied because of its strong extensibility and high flexibility. In practice, human behavior data, whether acquired by sensors or visually, is a sequence data set.

Human behavior segmentation is the accurate segmentation of raw human motion capture data into behaviors. It is the basis for the structural analysis, understanding and application of human motion data and has attracted the attention of researchers at home and abroad. A great deal of research has examined the problem in depth and produced a series of results, which fall roughly into three types of methods.

To address the high dimensionality of raw human motion capture data and the large amount of computation needed to process it, some researchers have proposed segmentation methods based on data dimension reduction. Barbic et al. suggested that different human motion behaviors can be represented by different intrinsic dimensions and proposed methods based on PCA, PPCA and GMM. These methods reduce the dimension of human motion data by Principal Component Analysis (PCA). They assume that the intrinsic dimension of a motion sequence containing a single behavior is smaller than that of a motion sequence containing multiple behaviors, so the position of a behavior segmentation point can be detected from the projection error in the subspace. Yantao et al. proposed a method for extracting keyframes from motion capture data based on hierarchical curve simplification. PCA-based methods place low demands on hardware and are relatively simple to implement, and they work well for applications that satisfy the assumption.

Methods based on deep learning have become a new research focus in recent years. In methods such as TS-CNN, a spatio-temporal CNN (ST-CNN) encodes low-level visual information for fine-grained tasks, capturing object states, positions and the relationships between objects; a semi-Markov model and a Conditional Random Field (CRF) are introduced to capture high-level temporal information, and the actions are jointly segmented and classified. The TCN method introduces ED-TCN and related TCN encoder-decoder networks. Misra I. et al. use temporal coherence information to train a deep model for behavior recognition and pose estimation. In summary, deep-learning-based methods generally combine a CNN (or an autoencoder) with other machine learning methods to perform behavior segmentation. They outperform other methods, but they place high demands on hardware configuration and rely on large amounts of data, which makes them difficult to implement.

Compared with the two approaches above, clustering-based methods are relatively simple and effective, and they are also a key direction in research on human behavior segmentation. ACA (Aligned Cluster Analysis), for example, is based on an extension of k-means clustering that allows a cluster to contain a variable number of features. To overcome the shortcomings of ACA, Zhou et al. proposed the HACA method. Subspace clustering is another important approach that is often used to solve the motion segmentation problem. Xia et al. proposed a method based on SSC (Sparse Subspace Clustering): subspace clustering is performed with SSC, and a triangle constraint is used to solve the problem that similar frames in different time periods cannot be grouped into the same cluster, thereby ensuring temporal continuity. Since a human behavior sequence is time series data and adjacent sequence points carry similarity information, some methods design clustering algorithms specifically for sequence data and apply them to human behavior segmentation. For example, OSC (Ordered Subspace Clustering) embeds neighborhood structure information into the optimization objective of the subspace representation. Li et al. proposed the TSC (Temporal Subspace Clustering) algorithm, which uses different constraint regularization terms to embed the similarity constraints between sequence data into the subspace projection process and thereby obtains a better action representation.

However, human behavior segmentation is difficult for two reasons. 1) It is difficult to design a reasonable and easily computed similarity criterion for behavior sequence points. Because a behavior action is continuous sequence data over a period of time, the similarity of behavior data must consider not only the similarity between sequence points within an action but also the similarity between actions, and comparing actions raises problems such as alignment and clipping. 2) Behavior sequence data is generally high-dimensional, and preprocessing such as similarity measurement, data alignment and dimension reduction must be performed on the high-dimensional data, which makes practical implementation difficult.

Disclosure of Invention

The invention provides an action behavior segmentation method based on multi-neighbor graph constraint matrix decomposition, which has the advantage that few wrong segmentations occur within an action sequence and which can accurately distinguish similar actions in human behavior.

The invention provides an action behavior segmentation method for multi-neighbor graph constraint matrix decomposition, which comprises the following steps of:

S1, designing the constraint regularization terms of the multi-graph constraint structure from the high-dimensional behavior action sequence data;

S2, constructing a semi-non-negative matrix factorization (semi-NMF) model constrained by the regularization terms;

S3, solving the semi-NMF model to obtain the low-dimensional representation matrix H of the behavior sequence;

S4, generating a relation graph G of the sequence points from the low-dimensional representation matrix H;

S5, partitioning the relation graph G of the sequence points with a graph cut method to obtain the segmentation of the action behaviors.

The specific step of designing the constraint regular term of the multi-graph constraint structure in step S1 includes:

S11, calculating the neighbor constraint graph R₁

Let the input be X ∈ R^{d×n} and let the low-dimensional representation be written as H ∈ R^{p×n}, where d is the original dimension of each data point in the sequence, p is the dimension of the data points in the low-dimensional space, and n is the total number of data points in the sequence. Because human behavior is formed by continuous sequences, adjacent sequence points are highly similar: the closer two sequence points are, the higher their similarity; the farther apart they are, the lower their similarity;

Let i be the current data point and let q, a positive even number, be the total number of immediate neighbors considered before and after it. The error between i and these neighbors should be as small as possible, that is, it should approach 0. A structural constraint is constructed from the current data point and its neighbors, and a banded matrix R₁ ∈ R^{n×n} is defined;

When i ≤ q/2, fewer than q/2 neighbors precede the current sequence point, so the set of q neighbors is completed, in order, with further points after it; similarly, when i > n − q/2, fewer than q/2 neighbors follow the current sequence point, so the set of q neighbors is completed, in order, with further points before it;

S12, calculating the similarity graph R₂

For the input X ∈ R^{d×n}, the similarity between each node and all other nodes is calculated, and the q nodes with the highest similarity form the similarity graph R₂ ∈ R^{n×n}, which represents the set of nodes in the data that are similar to each node.

The specific method for constructing the regularization-constrained semi-non-negative matrix factorization (semi-NMF) model in step S2 is as follows:

X ∈ R^{d×n} is generally high-dimensional data. After the multi-graph constraints are obtained, semi-NMF is used to reduce the dimension of the original data. Let Z ∈ R^{d×p} denote the feature space and let H ∈ R^{p×n} denote the coefficient matrix that represents the input data in the feature space; the elements of H are non-negative;

the optimization objective is constructed as follows:

where ‖·‖_F denotes the Frobenius norm, α and β are weight parameters that adjust the weight of the regularization constraint terms in the optimization objective, and τ is the weight vector over the different constraint graphs, with L₁ = D₁ − R₁, D₁ = diag(∑R₁) and L₂ = D₂ − R₂, D₂ = diag(∑R₂).
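The objective formula itself does not appear in this text. Purely as an illustration, one form that would be consistent with the symbols defined above (Z, H, α, β, τ, L₁, L₂) and with the later substitution γ = α/β is sketched below. The trace regularizer, the squared-norm penalty on τ and the simplex constraint on τ are assumptions borrowed from common multi-graph regularized factorization models, not text recovered from the patent.

```latex
\min_{Z,\,H,\,\tau}\;
  \lVert X - ZH \rVert_F^2
  + \alpha \sum_{m=1}^{2} \tau_m \operatorname{Tr}\!\left(H L_m H^{\mathsf T}\right)
  + \beta \lVert \tau \rVert_2^2
\qquad \text{s.t.}\quad H \ge 0,\;\; \tau_m \ge 0,\;\; \sum_{m=1}^{2} \tau_m = 1
```

Under this assumed form, fixing Z and H and dividing by β leaves a problem in τ alone, γ Σ_m τ_m Tr(H L_m H^T) + ‖τ‖², which is a small constrained quadratic program.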

The specific method in step S3 for solving the semi-NMF model to obtain the low-dimensional representation H of the behavior sequence is as follows:

Solving the semi-NMF matrix decomposition model is a non-convex optimization problem for which a globally optimal solution cannot be guaranteed, so a locally optimal solution is obtained by alternating iteration: a number of iterations is set, and in each iteration Z, H and τ are calculated in turn, each with the other variables held fixed, until the set number of iterations is reached; the resulting H is the final output;

at each iteration, Z, H and τ are calculated as follows:

the update formula for Z: by the rules of matrix operations, the closed-form solution is:

Z＝XH^T(HH^T)^-1
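The closed-form update for Z can be checked numerically. The following is a minimal NumPy sketch; the function name `update_Z` and the toy data are illustrative assumptions, and it assumes HH^T is invertible (H has full row rank).

```python
import numpy as np

def update_Z(X, H):
    """Closed-form update Z = X H^T (H H^T)^{-1}, without forming the inverse."""
    # H H^T is symmetric, so solving (H H^T) Z^T = H X^T gives Z^T.
    return np.linalg.solve(H @ H.T, H @ X.T).T

rng = np.random.default_rng(0)
Z_true = rng.normal(size=(6, 3))            # basis may contain negative entries
H = rng.uniform(0.1, 1.0, size=(3, 20))     # non-negative coefficients
X = Z_true @ H                              # exactly factorizable toy data
Z = update_Z(X, H)
```

Because the toy data is exactly factorizable, the update recovers the basis that generated it. Solving the linear system rather than inverting HH^T is a numerical convenience, not something the source specifies.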

the update formula for H: following the rules of non-negative matrix factorization, updating H requires solving the following optimization objective:

Let [·]⁺ denote the matrix that keeps only the non-negative elements of the original matrix and [·]⁻ the matrix that keeps only its non-positive elements; the update formula for H is then:

wherein

The update rule for τ: computing τ requires solving the following optimization objective:

Let γ = α/β. Through the above calculation, the low-dimensional representation matrix H is obtained.

The specific method for generating the relationship graph G of the sequence points by using the low-dimensional representation matrix H in step S4 is as follows:

According to spectral graph theory, H is used to generate a relation graph G ∈ R^{n×n} of the sequence points. The vertices of the graph represent sequence points, and the edges are obtained by calculating the similarity between each sequence point and the other points in H. The k points with the highest similarity to point i form its k-nearest-neighbor set N(i), and the edge G(i, j) is defined as:

Because point i + 1 is very close to point i and the structural constraint makes their similarity high, an edge is generated between them, which ensures as far as possible that points within one action sequence are assigned to the same class. If the similarity between i and i + 1 is still low despite the structural constraint, i is at the end of an action sequence. If the similarity between i and points far from i is also high, those points belong to a repeated action sequence, so repeated action sequences can be assigned to the same class.

The graph segmentation method in step S5 partitions G using the eigenvector corresponding to the second-smallest eigenvalue of the Laplacian of G.

Compared with the prior art, the invention has the beneficial effects that:

The invention designs a reasonable and easily computed similarity criterion for behavior sequence points and uses it to carry out preprocessing such as similarity measurement, data alignment and dimension reduction on the high-dimensional data. As a result, the accuracy of action segmentation is higher, few wrong segmentations occur within an action sequence, and similar actions in human behavior can be accurately distinguished.

Drawings

Fig. 1 is a flow chart of an action behavior segmentation method of multi-neighbor graph constraint matrix decomposition according to the present invention.

Fig. 2 is a schematic diagram of the action segmentation result for data sequence No. 1 according to an embodiment of the present invention.

Fig. 3 is a schematic diagram of the action segmentation result for data sequence No. 2 according to an embodiment of the present invention.

Fig. 4 is a schematic diagram of the action segmentation result for data sequence No. 3 according to an embodiment of the present invention.

Detailed Description

An embodiment of the present invention will be described in detail below with reference to fig. 1-4, but it should be understood that the scope of the present invention is not limited to the embodiment.

As shown in Fig. 1, the action behavior segmentation method based on multi-neighbor graph constraint matrix decomposition according to an embodiment of the present invention uses a multi-graph-constrained semi-non-negative matrix factorization (semi-NMF) to represent action sequence data and then clusters the representation with a graph segmentation method, thereby segmenting human behaviors. The main work comprises the following steps: 1. designing the multi-graph constraint regularization terms; 2. constructing the regularization-constrained semi-NMF matrix decomposition model; 3. solving the model to obtain a low-dimensional representation of the behavior sequence; 4. calculating similarity weights between sequence points from the low-dimensional representation to generate a similarity relation matrix; 5. obtaining the behavior segmentation with a graph segmentation method.

1. Designing multi-graph constraint regularization terms

Firstly: computing a neighbor constraint graph R₁

Let the input be X ∈ R^{d×n} and let the low-dimensional representation be written as H ∈ R^{p×n}, where d is the original dimension of each data point in the sequence, p is the dimension of the data points in the low-dimensional space, and n is the total number of data points in the sequence. Since human behavior is composed of continuous sequences, adjacent sequence points are highly similar. In general, the closer two sequence points are, the higher their similarity; the farther apart they are, the lower their similarity. When constructing the representation of the sequence data, we want neighboring points to be as similar as possible, so that the representation is as compact as possible within each behavior action. Let i be the current data point and let q (a positive even number) be the total number of immediate neighbors considered before and after it; we want the error between i and these neighbors to be as small as possible, namely:

as close to 0 as possible. Accordingly, a structural constraint can be constructed from the current data point and its neighbors, for which a special banded matrix R₁ ∈ R^{n×n} is defined:

When i ≤ q/2, fewer than q/2 neighbors precede the current sequence point, and the set of q neighbors can be completed, in order, with further points after it; similarly, when i > n − q/2, fewer than q/2 neighbors follow the current sequence point, and the set of q neighbors can be completed, in order, with further points before it. For example, when n = 10 and q = 2, the 1st sequence point has no predecessor, so the third sequence point, which follows it, can be included in the structural constraint.
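The boundary rule can be illustrated with a short NumPy sketch. The entries of R₁ are not given in this text, so the sketch assumes simple 0/1 weights; the function name `neighbor_constraint_graph` is likewise hypothetical.

```python
import numpy as np

def neighbor_constraint_graph(n: int, q: int) -> np.ndarray:
    """Banded 0/1 matrix linking each point to q temporal neighbours (assumed weights)."""
    if q <= 0 or q % 2 != 0 or q >= n:
        raise ValueError("q must be a positive even number smaller than n")
    half = q // 2
    R1 = np.zeros((n, n))
    for i in range(n):
        # Shift the window of q + 1 consecutive points so it stays inside the sequence;
        # near either end the missing neighbours are taken from the other side.
        start = min(max(i - half, 0), n - q - 1)
        window = np.arange(start, start + q + 1)
        R1[i, window[window != i]] = 1.0
    return R1

R1 = neighbor_constraint_graph(n=10, q=2)
first_neighbors = (np.flatnonzero(R1[0]) + 1).tolist()   # 1-based indices
last_neighbors = (np.flatnonzero(R1[9]) + 1).tolist()
```

With n = 10 and q = 2, the first point is linked to the 2nd and 3rd points, matching the example above. Under this construction every row has exactly q neighbors, but the matrix is not symmetric at the two ends of the sequence; whether to symmetrize it is left open here.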

Secondly: computing the similarity graph R₂

For the input X ∈ R^{d×n}, the similarity between each node and all other nodes can be calculated, and the q nodes with the highest similarity form the similarity graph R₂ ∈ R^{n×n}, which represents the set of nodes in the data that are similar to each node. For the specific computation of R₂, see (Cai D, He X, Han J, et al. Graph regularized nonnegative matrix factorization for data representation [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011, 33(8): 1548-).

Through the above calculation, two different graphs can be used to characterize the neighbor similarity information of the node.
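As an illustration of the second graph, the sketch below builds a q-nearest-neighbor graph over the columns of X. The similarity measure and edge weights are assumptions: it uses Euclidean distance (smaller distance means higher similarity) and 0/1 weights, and it symmetrizes the result; the function name `similarity_graph` is hypothetical.

```python
import numpy as np

def similarity_graph(X: np.ndarray, q: int) -> np.ndarray:
    """0/1 graph linking each column of X to its q most similar columns."""
    n = X.shape[1]
    # Squared Euclidean distances between all pairs of columns.
    sq_norms = np.sum(X * X, axis=0)
    dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X.T @ X)
    np.fill_diagonal(dist, np.inf)           # a point is not its own neighbour
    nearest = np.argsort(dist, axis=1)[:, :q]
    R2 = np.zeros((n, n))
    R2[np.repeat(np.arange(n), q), nearest.ravel()] = 1.0
    # Keep an edge if either endpoint selected the other.
    return np.maximum(R2, R2.T)

rng = np.random.default_rng(0)
# Two well-separated groups of five 3-dimensional points each.
X = np.hstack([rng.normal(0.0, 0.1, (3, 5)), rng.normal(5.0, 0.1, (3, 5))])
R2 = similarity_graph(X, q=2)
```

Unlike R₁, this graph ignores temporal order, so it can link frames of a repeated action that are far apart in time.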

2. Constructing a regularization term constraint semi-NMF matrix decomposition model

X ∈ R^{d×n} is generally high-dimensional data. After the multi-graph constraints are obtained, semi-NMF is used to reduce the dimension of the original data. Let Z ∈ R^{d×p} denote the feature space, and let H denote the coefficient matrix that represents the input data in the feature space; the elements of H are non-negative.

The optimization objective can be constructed as follows:

where ‖·‖_F denotes the Frobenius norm, α and β are weight parameters that adjust the weight of the regularization constraint terms in the optimization objective, and τ is the weight vector over the different constraint graphs, with L₁ = D₁ − R₁, D₁ = diag(∑R₁) and L₂ = D₂ − R₂, D₂ = diag(∑R₂).
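The Laplacian definitions L = D − R, D = diag(∑R) translate directly into code. The sketch below assumes the sum is taken over each row of R (for a symmetric R, row and column sums coincide). It also checks the standard identity Tr(H L H^T) = ½ Σ R(i, j) ‖h_i − h_j‖² for symmetric R, which is the usual reason a Laplacian appears in a graph regularizer; that the patent's objective uses this trace term is an assumption.

```python
import numpy as np

def graph_laplacian(R):
    """L = D - R with D = diag(row sums of R)."""
    R = np.asarray(R, dtype=float)
    return np.diag(R.sum(axis=1)) - R

# Small symmetric example graph: point 0 is linked to points 1 and 2.
R = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]])
L = graph_laplacian(R)

# Columns of H are the low-dimensional points h_0, h_1, h_2.
H = np.array([[1.0, 2.0, 0.0],
              [0.5, 0.0, 3.0]])
trace_form = np.trace(H @ L @ H.T)
pairwise_form = 0.5 * sum(
    R[i, j] * np.sum((H[:, i] - H[:, j]) ** 2)
    for i in range(3) for j in range(3)
)
```

The pairwise form makes the effect visible: the term is small exactly when points joined by an edge have similar low-dimensional representations.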

3. Solving the model to obtain a low-dimensional representation H of the behavior sequence

Solving the model is a non-convex optimization problem for which a globally optimal solution cannot be guaranteed; a locally optimal solution can be obtained by alternating iteration. The number of iterations can be set to 100; in each iteration Z, H and τ are calculated in turn, each with the other variables held fixed, and after 100 iterations the resulting H is the final output.

At each iteration, Z, H and τ are calculated as follows:

The update formula for Z: since Z may take negative values, its closed-form solution can be obtained by the rules of matrix operations:

Z＝XH^T(HH^T)^-1

The update formula for H: following the rules of non-negative matrix factorization, updating H requires solving the following optimization objective:

where

Let γ = α/β; the optimization objective for τ is then a typical constrained quadratic programming problem that can be solved with any available method, see Jie Chen, Hua Mao, Yongsheng Sang, Zhang Yi.

Through the above calculation, the low-dimensional representation matrix H can be obtained.
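The alternating scheme can be sketched for plain semi-NMF, without the graph regularization terms and without τ, whose update formulas are not reproduced in this text. The H step below is the multiplicative update of Ding, Li and Jordan for semi-NMF, swapped in as a stand-in; it is not the patent's update rule. The names `semi_nmf`, `pos` and `neg` and the random test data are illustrative assumptions.

```python
import numpy as np

def pos(A):
    """Positive part, (|A| + A) / 2."""
    return (np.abs(A) + A) / 2.0

def neg(A):
    """Magnitude of the negative part, (|A| - A) / 2."""
    return (np.abs(A) - A) / 2.0

def semi_nmf(X, p, n_iter=100, seed=0, eps=1e-12):
    """Plain semi-NMF X ~ Z H with H >= 0 (no graph terms, no tau)."""
    rng = np.random.default_rng(seed)
    H = rng.uniform(0.1, 1.0, size=(p, X.shape[1]))
    errors = []
    for _ in range(n_iter):
        # Z step: least-squares solution; equals X H^T (H H^T)^{-1} for full-rank H.
        Z = X @ np.linalg.pinv(H)
        # H step: multiplicative update, which keeps every entry non-negative.
        ZtX, ZtZ = Z.T @ X, Z.T @ Z
        num = pos(ZtX) + neg(ZtZ) @ H
        den = neg(ZtX) + pos(ZtZ) @ H
        H = H * np.sqrt(num / (den + eps))
        errors.append(np.linalg.norm(X - Z @ H) ** 2)
    return Z, H, errors

rng = np.random.default_rng(1)
X = rng.normal(size=(42, 60))        # mixed-sign data, 42-dimensional frames
Z, H, errors = semi_nmf(X, p=15)
```

The sketch records the reconstruction error after each iteration so that the decrease over the 100 iterations can be inspected. A graph-regularized variant would add terms involving R and D to the H step and a separate τ step.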

4. Generation of relational graph G by using H

Following spectral graph theory (Belkin M. Laplacian eigenmaps and spectral techniques for embedding and clustering [A]. Advances in Neural Information Processing Systems [C]. Cambridge, MA, 2001: 585-591), H is used to generate a relation graph G ∈ R^{n×n} of the sequence points. The vertices of the graph represent sequence points, and the edges are obtained by calculating the similarity between each sequence point and the other points in H. The k points with the highest similarity to point i form its k-nearest-neighbor set N(i), and the edge G(i, j) is defined as:

Because point i + 1 is very close to point i and the structural constraint generally makes their similarity high, an edge is generated between them, which ensures as far as possible that points within one action sequence are assigned to the same class. If the similarity between i and i + 1 is still low despite the structural constraint, i may be at the end of an action sequence. Likewise, if the similarity between i and points far from i is high, those points may belong to a repeated action sequence, so repeated action sequences can be assigned to the same class.

5. Partitioning G by graph cut (Normalized Cut, Ncut)

G is partitioned with the graph cut method Normalized Cut (Ncut). Ncut is a commonly used graph partitioning method that divides G using the eigenvector corresponding to the second-smallest eigenvalue of the Laplacian of G. For details, see (Shi J. Normalized cuts and image segmentation [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000, 22(8): 888-).
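A single two-way cut with the second-smallest eigenvector can be sketched as follows. This is a minimal illustration under stated assumptions: it uses the symmetric normalized Laplacian, thresholds the eigenvector at zero, and assumes every vertex has positive degree. The function name `ncut_bipartition` and the toy graph are hypothetical; obtaining more than two segments (by recursive cuts or by clustering several eigenvectors) is not shown.

```python
import numpy as np

def ncut_bipartition(G):
    """Split a graph in two using the second-smallest eigenvector."""
    G = np.asarray(G, dtype=float)
    d = G.sum(axis=1)                      # vertex degrees, assumed > 0
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric normalised Laplacian D^{-1/2} (D - G) D^{-1/2}.
    L_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * G * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_sym)     # eigenvalues in ascending order
    # Map back to the generalised problem (D - G) y = lambda D y.
    y = d_inv_sqrt * eigvecs[:, 1]
    return (y > 0).astype(int)

# Toy graph: two chains of four points joined by one weak edge.
G = np.zeros((8, 8))
for i in range(7):
    G[i, i + 1] = G[i + 1, i] = 1.0
G[3, 4] = G[4, 3] = 0.05
labels = ncut_bipartition(G)
```

On the toy graph the cut falls on the weak edge between the 4th and 5th points, which mirrors the situation described above: a low similarity between consecutive points marks the end of an action. Which side is labeled 0 or 1 depends on the arbitrary sign of the eigenvector.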

Concrete example: sequences No. 1, 2 and 3 of subject 86 in the CMU Mocap dataset are used (see Chu W S, et al. Video co-summarization: video summarization by visual co-occurrence [A]. 2015 IEEE Conference on Computer Vision and Pattern Recognition [C]. Boston, MA, USA: IEEE Computer Society, 2015: 3584-3592). The human motion in each frame of the Mocap dataset is a 42-dimensional vector. In Figs. 2, 3 and 4, the first row shows the true segmentation points and the second row shows the segmentation points obtained by the action decomposition of the invention. The sequences need to be divided into more than 9 parts, as shown in Fig. 4, and contain 7 action behaviors (classes); that is, the sequences contain repeated action behaviors, and they also contain negative-valued data.

Parameter settings: the number of iterations is 100. The dimension p of the H matrix should be larger than the number of classes; we choose p = 15. q is a positive even number such as 2, 4, 6, 8 or 10. k is a positive integer, generally greater than 3. The weight parameters α and β are typically chosen between 0.001 and 1. The segmentation results are shown in Figs. 2, 3 and 4, respectively.

The invention designs a reasonable and easily computed similarity criterion for behavior sequence points and uses it to carry out preprocessing such as similarity measurement, data alignment and dimension reduction on the high-dimensional data. As a result, the accuracy of action segmentation is high (the segmentation start and end points differ little from the true ones), wrong segmentations rarely occur within an action sequence, and similar actions in human behavior can be accurately distinguished.

The above disclosure is only for a few specific embodiments of the present invention, however, the present invention is not limited to the above embodiments, and any variations that can be made by those skilled in the art are intended to fall within the scope of the present invention.

Claims

1. An action behavior segmentation method based on multi-neighbor graph constraint matrix decomposition, characterized by comprising the following steps:

2. The method for motion behavior segmentation in multi-neighbor graph constraint matrix decomposition according to claim 1, wherein the specific step of designing the constraint regular term of the multi-graph constraint structure in step S1 includes:

S11, calculating the neighbor constraint graph R₁

letting i be the current data point and q, a positive even number, be the total number of immediate neighbors considered before and after it, the error between i and these neighbors being as small as possible, namely:

S12, calculating the similarity graph R₂

for the input X ∈ R^{d×n}, calculating the similarity between each node and all other nodes, wherein the q nodes with the highest similarity form the similarity graph R₂ ∈ R^{n×n}, which represents the set of nodes in the data that are similar to each node.

3. The method for motion behavior segmentation in multi-neighbor graph constraint matrix factorization of claim 2, wherein the specific method for constructing the regularization-constrained semi-non-negative matrix factorization (semi-NMF) model in step S2 is as follows:

the optimization objective is constructed as follows:

wherein

4. The method for motion behavior segmentation in multi-neighbor graph constraint matrix factorization of claim 3, wherein the specific method in step S3 for solving the semi-NMF model to obtain the low-dimensional representation H of the behavior sequence is as follows:

at each iteration, Z, H and τ are calculated as follows:

Z＝XH^T(HH^T)^-1

wherein

5. The method for motion behavior segmentation in constraint matrix decomposition of multi-neighbor graph according to claim 1, wherein the specific method for generating the relationship graph G of sequence points by using the low-dimensional representation matrix H in step S4 is as follows:

6. The method for motion behavior segmentation in multi-neighbor graph constraint matrix decomposition according to claim 1, wherein the graph segmentation in step S5 is performed by partitioning G using the eigenvector corresponding to the second-smallest eigenvalue of the Laplacian of G.