CN105574494B - Multi-classifier gesture recognition method and device - Google Patents
- Publication number: CN105574494B
- Application number: CN201510920778.9A
- Authority: CN (China)
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/103—Static body considered as a whole, e.g. static pedestrian or occupant recognition
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/24317—Piecewise classification, i.e. whereby each classification requires several discriminant rules
Abstract
The invention discloses a multi-classifier gesture recognition method and device. The method comprises the following steps: obtaining the distribution centers of all sample contour-point features with a K-means clustering algorithm and projecting them to obtain a first histogram, then obtaining a second histogram from the first histogram and the contour-point features of the image to be recognized; calculating the similarity between the second histogram and the histogram corresponding to each classifier in the sample library, and selecting N classifiers according to a similarity threshold; and obtaining a gesture detection function from the gesture models of the N classifiers and the weight of each of the N classifiers, where the gesture detection function is the function corresponding to the gesture of the image to be recognized. The invention effectively controls the complexity of the sub-models and aggregates samples with similar appearance, thereby improving the effectiveness of model learning, meeting the training and learning demands of massive data, and effectively improving the performance of the gesture recognition method.
Description
Technical Field
The invention relates to the field of human-computer interaction, in particular to a multi-classifier gesture recognition method and device.
Background
Gesture recognition is one of the key technologies of human-computer interaction. Current approaches are mainly based on machine learning: for example, a component-recognition method first recognizes the various parts of the human body, such as the limbs and the head, and then connects these components to form a human-body pose. To ensure the performance of such machine-learning methods, a large number of training samples is usually required.
At present, a single classifier is used for large-scale training, which not only consumes substantial training resources (such as memory and training time) but also makes it difficult to guarantee the performance of the trained classifier.
Disclosure of Invention
Because the current single-classifier gesture recognition method requires a large amount of training resources and makes it difficult to guarantee the performance of the trained classifier, the invention provides a multi-classifier gesture recognition method and device.
In a first aspect, the present invention provides a multi-classifier gesture recognition method, including:
s1, obtaining distribution centers of all sample contour point features according to a K-means clustering algorithm, projecting the distribution centers to obtain a first histogram, and obtaining a second histogram corresponding to the image to be recognized according to the first histogram and the contour point features of the image to be recognized;
s2, calculating the similarity between the second histogram and the corresponding histogram of each classifier in the sample library, sorting all the classifiers according to the similarity from large to small, and acquiring the first N classifiers in the sorted classifiers according to a similarity threshold, wherein N is an integer greater than 0;
s3, obtaining a posture detection function according to the posture models of the first N classifiers and the weight of each classifier in the first N classifiers; the gesture detection function is a function corresponding to the gesture of the image to be recognized.
Preferably, step S1 is preceded by:
and S0, clustering all the images in the sample library according to the contour point characteristics to obtain a plurality of classifiers, and processing the histograms corresponding to all the images in each classifier to obtain the histogram corresponding to each classifier.
Preferably, step S1 includes: and performing soft projection on the distribution center.
Preferably, step S4 includes: and carrying out normalization processing on the similarity of the histograms corresponding to all the classifiers to obtain the similarity weight of the histograms corresponding to all the classifiers.
Preferably, step S5 includes: the gesture detection function is:

P(X|I) = Σ_{k=1}^{N} q(X|c_k, I) · p(c_k|I)

wherein c_k represents the kth classifier among the sorted classifiers, I represents the image to be recognized, X represents the pose model, q(X|c_k, I) denotes the pose function of the kth classifier, and p(c_k|I) represents the similarity weight of the kth classifier.
In a second aspect, the present invention further provides a multi-classifier gesture recognition apparatus, including:
the characteristic alignment module is used for obtaining distribution centers of all sample contour point characteristics according to a K-means clustering algorithm, projecting the distribution centers to obtain a first histogram, and obtaining a second histogram corresponding to the image to be recognized according to the first histogram and the contour point characteristics of the image to be recognized;
the similarity calculation module is used for calculating the similarity between the second histogram and the histogram corresponding to each classifier in the sample library, sorting all the classifiers according to the similarity from large to small, and acquiring the first N classifiers in the sorted classifiers according to a similarity threshold, wherein N is an integer greater than 0;
the gesture recognition module is used for obtaining a gesture detection function according to the gesture models of the first N classifiers and the weight of each classifier in the first N classifiers; the gesture detection function is a function corresponding to the gesture of the image to be recognized.
Preferably, the method further comprises the following steps:
and the classifier histogram acquisition module is used for clustering all the images in the sample base according to the contour point characteristics to obtain a plurality of classifiers, and processing the histograms corresponding to all the images in each classifier to obtain the histogram corresponding to each classifier.
Preferably, the feature alignment module is further configured to soft project the distribution center.
Preferably, the similarity calculation module is further configured to perform normalization processing on the similarities of the histograms corresponding to all the classifiers to obtain similarity weights of the histograms corresponding to all the classifiers.
Preferably, the gesture detection function in the gesture recognition module is:

P(X|I) = Σ_{k=1}^{N} q(X|c_k, I) · p(c_k|I)

wherein c_k represents the kth classifier among the sorted classifiers, I represents the image to be recognized, X represents the pose model, q(X|c_k, I) denotes the pose function of the kth classifier, and p(c_k|I) represents the similarity weight of the kth classifier.
According to the technical scheme above, training samples are clustered using aligned features that describe the human pose contour. This effectively controls the complexity of each sub-model and aggregates samples with similar appearance, thereby improving the effectiveness of model learning, meeting the training and learning demands of massive data, and effectively improving the performance of the gesture recognition method.
Drawings
To illustrate the embodiments of the present invention and the technical solutions of the prior art more clearly, the drawings required in their description are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart illustrating a multi-classifier gesture recognition method according to an embodiment of the present invention;
FIG. 2 is a feature alignment method of a multi-classifier gesture recognition method according to an embodiment of the present invention;
FIG. 3 is a gesture inference method of the multi-classifier gesture recognition method according to an embodiment of the present invention;
FIG. 4 is a flowchart of a multi-classifier gesture recognition method according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of a multi-classifier gesture recognition apparatus according to an embodiment of the present invention.
Detailed Description
The following further describes embodiments of the invention with reference to the drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.
Fig. 1 shows a flowchart of a multi-classifier gesture recognition method provided in this embodiment, which includes:
s1, obtaining distribution centers of all sample contour point features according to a K-means clustering algorithm, projecting the distribution centers to obtain a first histogram, and obtaining a second histogram corresponding to the image to be recognized according to the first histogram and the contour point features of the image to be recognized;
s2, calculating the similarity between the second histogram and the corresponding histogram of each classifier in the sample library, sorting all the classifiers according to the similarity from large to small, and acquiring the first N classifiers in the sorted classifiers according to a similarity threshold, wherein N is an integer greater than 0;
s3, obtaining a posture detection function according to the posture models of the first N classifiers and the weight of each classifier in the first N classifiers; the gesture detection function is a function corresponding to the gesture of the image to be recognized.
In this embodiment, by adopting a sample library with multiple classifiers, the complexity of each sub-model is effectively controlled and samples with similar appearance are aggregated, thereby improving the effectiveness of model learning, meeting the training and learning demands of massive data, and effectively improving the performance of the gesture recognition method.
As a preferred solution of this embodiment, step S1 is preceded by:
and S0, clustering all the images in the sample library according to the contour point characteristics to obtain a plurality of classifiers, and processing the histograms corresponding to all the images in each classifier to obtain the histogram corresponding to each classifier.
Using multiple classifiers effectively controls the complexity of each sub-model and aggregates samples with similar appearance, improving the effectiveness of model learning; building a histogram for each classifier makes it convenient and fast to find the classifier most similar to the image to be detected.
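Step S0 might be sketched as follows, treating each sample image's aligned feature histogram as its clustering feature. This is a plain Lloyd-style K-means written for illustration; the function and parameter names are assumptions, not from the patent.

```python
import numpy as np

def cluster_sample_library(histograms, n_classifiers, n_iters=50, seed=0):
    """Cluster sample images into classifiers by their aligned
    contour-feature histograms (step S0), using plain K-means.

    Returns the classifier label assigned to each image.
    """
    H = np.asarray(histograms, dtype=float)
    rng = np.random.default_rng(seed)
    # initialise centers with randomly chosen sample histograms
    centers = H[rng.choice(len(H), n_classifiers, replace=False)]
    for _ in range(n_iters):
        # assign every image to the nearest classifier center
        d = np.linalg.norm(H[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # move each center to the mean histogram of its members
        for k in range(n_classifiers):
            if np.any(labels == k):
                centers[k] = H[labels == k].mean(axis=0)
    return labels
```

In practice a library implementation (e.g. a standard K-means routine) would replace this loop; the point is only that images with similar contour histograms end up in the same classifier.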
Further, step S1 includes: performing soft projection onto the distribution centers.
Soft projection is a common noise-reduction technique, widely used in histogram statistics for features such as HOG and SIFT.
Specifically, step S3 includes: processing the histograms of all the images in each classifier using an averaging method.
The averaging method is a simple and effective choice: the histograms of all images in a classifier are accumulated and averaged to represent the average contour-point features of the images in that classifier. Other processing methods may also be used.
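As a minimal sketch (the array library and function name are illustrative, not from the patent), the averaging step is just an element-wise mean:

```python
import numpy as np

def classifier_histogram(member_histograms):
    """Representative histogram of one classifier: the element-wise
    mean of the aligned feature histograms of all images assigned to
    that classifier's cluster."""
    member_histograms = np.asarray(member_histograms, dtype=float)
    return member_histograms.mean(axis=0)
```

For example, `classifier_histogram([[0, 1], [1, 0]])` yields `[0.5, 0.5]`.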
Further, step S4 includes: and carrying out normalization processing on the similarity of the histograms corresponding to all the classifiers to obtain the similarity weight of the histograms corresponding to all the classifiers.
Normalizing the similarities makes it convenient to set a unified threshold value later; this threshold is used to select classifiers in the gesture detection function.
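A minimal sketch of this normalization, in plain Python with illustrative names:

```python
def normalize_similarities(similarities):
    """Scale raw histogram similarities so they sum to 1, giving the
    similarity weights p(c_k | I); a single unified threshold can then
    be applied regardless of the raw similarity scale."""
    total = sum(similarities)
    if total == 0:  # degenerate case: no classifier is similar at all
        return [0.0] * len(similarities)
    return [s / total for s in similarities]
```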
Further, step S5 includes: the gesture detection function is:

P(X|I) = Σ_{k=1}^{N} q(X|c_k, I) · p(c_k|I)    (1)

wherein c_k represents the kth classifier among the sorted classifiers, I represents the image to be recognized, X represents the pose model, q(X|c_k, I) denotes the pose function of the kth classifier, and p(c_k|I) represents the similarity weight of the kth classifier.
The gesture detection function jointly considers the pose function of each classifier among the sorted classifiers and its corresponding similarity weight, multiplies the two, and thereby selects the pose closest to the image to be recognized more objectively.
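The weighted combination just described can be sketched as follows. The candidate-pose representation and the function names are assumptions for illustration; the patent does not specify how poses or pose functions are encoded.

```python
def detect_pose(candidate_poses, pose_funcs, weights):
    """Fused gesture detection: score each candidate pose X by
    sum_k q(X | c_k, I) * p(c_k | I) over the selected classifiers
    and return the highest-scoring pose.

    pose_funcs: one callable per selected classifier, mapping a pose
    to its score q(X | c_k, I).
    weights: the similarity weights p(c_k | I), in the same order.
    """
    def fused_score(pose):
        return sum(q(pose) * p for q, p in zip(pose_funcs, weights))
    return max(candidate_poses, key=fused_score)
```

With two classifiers weighted 0.7 and 0.3, a pose strongly favored by the higher-weight classifier wins even if the other classifier is indifferent.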
Fig. 2 illustrates the feature alignment method of the multi-classifier gesture recognition method provided in this embodiment. Each contour-point feature describes only the local information near that point; for the whole pose, the feature description is the set of all contour points forming the human-body contour. The points in this set are unordered, however, so they cannot be used directly to compare the similarity of two human poses. This embodiment therefore adopts a codebook of the feature distribution, and uses it to align features and compare the similarity of two human poses. B distribution centers (bins) are learned as the codebook by applying K-means clustering to the contour-point features of all training samples. The learned codebook is then used to align the human-contour features: each contour point is soft-projected onto the two distribution centers (bins) closest to it, with projection weights inversely proportional to the distance from the point's feature to each center. Soft projection is a common noise-reduction technique, widely used in histogram statistics for features such as HOG and SIFT. After alignment, the features of a human contour are represented by a feature histogram of length B, and the aligned features are denoted BH; the similarity of two contours can then be compared directly as the distance between their two histograms.
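The soft projection step might look like the sketch below. The inverse-distance weighting is one plausible reading of the patent's weight rule, and all names are illustrative.

```python
import numpy as np

def soft_project(contour_features, codebook):
    """Align one contour's point features against a K-means codebook.

    contour_features: (n_points, d) local descriptors of the contour.
    codebook: (B, d) distribution centers (bins) learned by K-means.
    Each point votes into its two nearest bins, with weights inversely
    proportional to the point-to-center distances, producing the
    length-B aligned histogram BH.
    """
    codebook = np.asarray(codebook, dtype=float)
    hist = np.zeros(len(codebook))
    for f in np.asarray(contour_features, dtype=float):
        dists = np.linalg.norm(codebook - f, axis=1)
        i, j = np.argsort(dists)[:2]          # the two closest bins
        w = 1.0 / (dists[[i, j]] + 1e-8)      # inverse-distance weights
        w /= w.sum()
        hist[i] += w[0]
        hist[j] += w[1]
    return hist / max(hist.sum(), 1e-8)       # normalised histogram BH
```

Two contours can then be compared by any histogram distance over their BH vectors, regardless of how many points each contour originally had.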
Fig. 3 illustrates the gesture inference method of the multi-classifier gesture recognition method provided in this embodiment. For an image I to be detected, its contour feature H is first computed, feature alignment is completed with the feature alignment method above, and the one-dimensional aligned feature histogram BH(I) is extracted. This histogram is compared with the distribution center BH(c_k) of each sub-class model to obtain the similarity value p(c_k|I) between the image I and that sub-class model, computed as:

p(c_k|I) ∝ 1 / dst(BH(I), BH(c_k))

where c_k represents the kth sub-class model and dst(·, ·) is a histogram distance. The sub-class models are then sorted by their similarity values, and the first N sub-models are chosen for pose estimation according to the following similarity accumulation threshold formula:

Σ_{i=1}^{N} p(c_i|I) > T

The final detection function is equation (1).
Fig. 4 is a flowchart of the multi-classifier gesture recognition method provided in this embodiment. For all images in the sample library, three-dimensional shape features are first extracted, aligned features are then extracted with the feature alignment method above, the samples are clustered according to the aligned features, and a model of each subclass is learned with a machine-learning method. During detection, the sample to be detected is compared with each sub-model in turn, and pose detection is performed with the similarity-accumulation-threshold method, so that the number of sub-models used is adjusted dynamically during detection.
Fig. 5 is a schematic structural diagram of a multi-classifier gesture recognition device provided in this embodiment, including:
the feature alignment module 11 is configured to obtain distribution centers of all sample contour point features according to a K-means clustering algorithm, project the distribution centers to obtain a first histogram, and obtain a second histogram corresponding to an image to be recognized according to the first histogram and the contour point features of the image to be recognized;
the similarity calculation module 12 is configured to calculate a similarity between the second histogram and a histogram corresponding to each classifier in the sample library, sort all the classifiers according to a similarity value from large to small, and obtain the top N classifiers in the sorted classifiers according to a similarity threshold, where N is an integer greater than 0;
a gesture recognition module 13, configured to obtain a gesture detection function according to the gesture models of the first N classifiers and the weight of each classifier in the first N classifiers; the gesture detection function is a function corresponding to the gesture of the image to be recognized.
As a preferable aspect of this embodiment, the method further includes:
and the classifier histogram acquisition module is used for clustering all the images in the sample base according to the contour point characteristics to obtain a plurality of classifiers, and processing the histograms corresponding to all the images in each classifier to obtain the histogram corresponding to each classifier.
Using multiple classifiers effectively controls the complexity of each sub-model and aggregates samples with similar appearance, improving the effectiveness of model learning; building a histogram for each classifier makes it convenient and fast to find the classifier most similar to the image to be detected.
Further, the feature alignment module is also configured to perform soft projection onto the distribution centers.
Soft projection is a common noise-reduction technique, widely used in histogram statistics for features such as HOG and SIFT.
Specifically, the classifier histogram acquisition module is further configured to process the histograms of all images in each classifier using an averaging method.
The averaging method is a simple and effective choice: the histograms of all images in a classifier are accumulated and averaged to represent the average contour-point features of the images in that classifier. Other processing methods may also be used.
Further, the similarity calculation module is further configured to perform normalization processing on the similarities of the histograms corresponding to all the classifiers to obtain similarity weights of the histograms corresponding to all the classifiers.
Normalizing the similarities makes it convenient to set a unified threshold value later; this threshold is used to select classifiers in the gesture detection function.
Still further, the gesture detection function in the gesture recognition module is formula (1).
The gesture detection function jointly considers the pose function of each classifier among the sorted classifiers and its corresponding similarity weight, multiplies the two, and thereby selects the pose closest to the image to be recognized more objectively.
In the description of the present invention, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Claims (10)
1. A multi-classifier gesture recognition method, comprising:
s1, obtaining distribution centers of all sample contour point features according to a K-means clustering algorithm, projecting the distribution centers to obtain a first histogram, and obtaining a second histogram corresponding to the image to be recognized according to the first histogram and the contour point features of the image to be recognized;
s2, calculating the similarity between the second histogram and the corresponding histogram of each classifier in the sample library, sorting all the classifiers according to the similarity from large to small, and acquiring the first N classifiers in the sorted classifiers according to a similarity threshold, wherein N is an integer greater than 0;
s3, obtaining a posture detection function according to the posture models of the first N classifiers and the similarity weight of each classifier in the first N classifiers; the gesture detection function is a function corresponding to the gesture of the image to be recognized;
in the gesture detection process, images to be recognized are respectively compared with the sorted classifiers, and gesture detection is carried out according to a similarity accumulation threshold formula so as to dynamically adjust the number of the classifiers in the detection process;
the similarity accumulation threshold formula is:

Σ_{i=1}^{N} p(c_i|I) > T

wherein p(c_i|I) is the similarity value between the image I to be recognized and the classifier c_i, c_i is the ith classifier among the sorted classifiers, and T is the similarity threshold used for selecting classifiers in the gesture detection function.
2. The method according to claim 1, wherein step S1 is preceded by:
and S0, clustering all the images in the sample library according to the contour point characteristics to obtain a plurality of classifiers, and processing the histograms corresponding to all the images in each classifier to obtain the histogram corresponding to each classifier.
3. The method according to claim 2, wherein step S1 includes: and performing soft projection on the distribution center.
4. The method according to claim 3, wherein step S4 includes: and carrying out normalization processing on the similarity of the histograms corresponding to all the classifiers to obtain the similarity weight of the histograms corresponding to all the classifiers.
5. The method of claim 4, wherein the gesture detection function is:

P(X|I) = Σ_{k=1}^{N} q(X|c_k, I) · p(c_k|I)

wherein c_k represents the kth classifier among the sorted classifiers, I represents the image to be recognized, X represents the pose model, q(X|c_k, I) denotes the pose function of the kth classifier, and p(c_k|I), the similarity value between the image I to be recognized and the classifier c_k, represents the similarity weight of the kth classifier.
6. A multi-classifier gesture recognition apparatus, comprising:
the characteristic alignment module is used for obtaining distribution centers of all sample contour point characteristics according to a K-means clustering algorithm, projecting the distribution centers to obtain a first histogram, and obtaining a second histogram corresponding to the image to be recognized according to the first histogram and the contour point characteristics of the image to be recognized;
the similarity calculation module is used for calculating the similarity between the second histogram and the histogram corresponding to each classifier in the sample library, sorting all the classifiers according to the similarity from large to small, and acquiring the first N classifiers in the sorted classifiers according to a similarity threshold, wherein N is an integer greater than 0;
the gesture recognition module is used for obtaining a gesture detection function according to the gesture models of the first N classifiers and the similarity weight of each classifier in the first N classifiers; the gesture detection function is a function corresponding to the gesture of the image to be recognized;
in the gesture detection process, images to be recognized are respectively compared with the sorted classifiers, and gesture detection is carried out according to a similarity accumulation threshold formula so as to dynamically adjust the number of the classifiers in the detection process;
the similarity accumulation threshold formula is:

Σ_{i=1}^{N} p(c_i|I) > T

wherein p(c_i|I) is the similarity value between the image I to be recognized and the classifier c_i, c_i is the ith classifier among the sorted classifiers, and T is the similarity threshold used for selecting classifiers in the gesture detection function.
7. The apparatus of claim 6, further comprising:
and the classifier histogram acquisition module is used for clustering all the images in the sample base according to the contour point characteristics to obtain a plurality of classifiers, and processing the histograms corresponding to all the images in each classifier to obtain the histogram corresponding to each classifier.
8. The apparatus of claim 7, wherein the feature alignment module is further configured to soft project the distribution center.
9. The apparatus according to claim 8, wherein the similarity calculation module is further configured to perform normalization processing on the similarities of the histograms corresponding to all the classifiers to obtain similarity weights of the histograms corresponding to all the classifiers.
10. The apparatus of claim 9, wherein the gesture detection function in the gesture recognition module is:

P(X|I) = Σ_{k=1}^{N} q(X|c_k, I) · p(c_k|I)

wherein c_k represents the kth classifier among the sorted classifiers, I represents the image to be recognized, X represents the pose model, q(X|c_k, I) denotes the pose function of the kth classifier, and p(c_k|I), the similarity value between the image I to be recognized and the classifier c_k, represents the similarity weight of the kth classifier.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510920778.9A | 2015-12-11 | 2015-12-11 | Multi-classifier gesture recognition method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105574494A | 2016-05-11 |
CN105574494B | 2020-01-17 |