CN111709441A

CN111709441A - Behavior recognition feature selection method based on improved feature subset discrimination

Info

Publication number: CN111709441A
Application number: CN202010377788.3A
Authority: CN
Inventors: 王怀军; 王瑞杰; 李军怀; 张发存; 王侃
Original assignee: Xian University of Technology
Current assignee: Xian University of Technology
Priority date: 2020-05-07
Filing date: 2020-05-07
Publication date: 2020-09-25
Anticipated expiration: 2040-05-07
Also published as: CN111709441B

Abstract

The invention discloses a behavior recognition feature selection method based on improved feature subset discrimination. The method comprises: establishing a sample feature set based on the DFS feature subset discrimination criterion and obtaining the feature subset discrimination measurement formula DFS_S; obtaining the mutual information between two random variables based on information theory and probability theory, and from it an expression for the minimum redundancy of intra-class features; combining the formula DFS_S with the intra-class feature minimum redundancy expression to define a joint function of maximum correlation and minimum redundancy; and training the joint function of maximum correlation and minimum redundancy to complete the selection process. On the one hand, the method adds redundancy analysis and deletes redundant features, which improves classification accuracy and further reduces computational complexity; on the other hand, by computing the maximum correlation and minimum redundancy, it reduces the redundancy between features while preserving the inter-class discrimination ability of the feature subset.

Description

Behavior recognition feature selection method based on improved feature subset discrimination

Technical Field

The invention belongs to the technical field of behavior recognition selection methods, and particularly relates to a behavior recognition feature selection method based on improved feature subset discrimination.

Background

In recent years, with the rapid development of computer science and sensor technology, sensors of many forms have gradually come to influence every aspect of people's lives, and recognizing and understanding human actions and behaviors from sensor data is a key task in future human-centered computing.

Research on human behavior recognition can provide more user-friendly services, such as elderly monitoring, motion-sensing games, and health care. Sensor-based human behavior recognition is a new branch of behavior recognition. Compared with image-based behavior recognition, it is more convenient, unconstrained, and safe, depends less on the external environment, can be worn freely, and better protects the privacy of user data.

In acceleration-sensor-based human behavior recognition research, time-domain and frequency-domain features are generally extracted. A small feature subset may lead to a relatively high classification error rate, whereas a large feature subset may lead to a relatively low one. However, the number of extracted features cannot be excessive: as the number of features increases, the amount of computation grows exponentially, leading to the curse of dimensionality. In addition, because the extracted features include irrelevant and redundant ones, dimensionality reduction is required: relevant features are selected and irrelevant and redundant features are removed, so that a better classification result is finally achieved.

According to the relationship between the feature selection method and the classifier, feature selection methods are generally divided into four categories: filter, wrapper, embedded, and hybrid. Filter algorithms are independent of the classifier; they typically use distance, correlation, consistency, or information metrics to measure the relevance between features and class labels and the redundancy between features, and different evaluation criteria may yield different optimal feature subsets. Wrapper methods take the interaction among features into account, depend on the performance of the classifier, and directly output the optimal feature subset when the algorithm finishes. Embedded algorithms treat feature selection as part of building the learning algorithm and perform it together with classification, embedding the selection into the construction of the classifier so as to select a feature subset efficiently. Hybrid algorithms combine the advantages of filter and wrapper algorithms: a filter algorithm first generates a candidate feature subset, and a wrapper algorithm then further reduces it.

Considering that the hybrid mode offers higher classification accuracy, Xie Juanying's feature selection method based on the Discernibility of Feature Subsets (DFS) criterion takes the correlation between features into account and, by computing the joint contribution of multiple features to classification, combines a search strategy with a classifier to optimize the feature subset. However, during feature selection that method does not consider the influence of redundancy among features on the classification result, so the selected feature subset still contains redundant features.

Disclosure of Invention

The invention aims to provide a behavior recognition feature selection method based on improved feature subset discrimination that solves the problems of existing behavior recognition feature selection methods: many redundant features, low classification accuracy, and high computational complexity.

The invention adopts the technical scheme that the behavior recognition feature selection method based on the improved feature subset discrimination comprises the following steps:

Step 1, establishing a sample feature set based on the DFS feature subset discrimination criterion and obtaining the feature subset discrimination measurement formula DFS_S;

Step 2, obtaining the mutual information between two random variables based on information theory and probability theory, and thereby obtaining the intra-class feature minimum redundancy expression for the sample feature set of step 1;

Step 3, combining the formula DFS_S of step 1 with the intra-class feature minimum redundancy expression of step 2 to define a joint function of maximum correlation and minimum redundancy;

Step 4, training the joint function of maximum correlation and minimum redundancy from step 3 to complete the selection process.

The present invention is also characterized in that,

the step 1 specifically comprises the following steps:

let the m-dimensional real space be denoted R^m. For any k-class problem (k ≥ 2), let the total number of samples be n, the number of samples in class i be n_i, and the spatial dimension of the samples be m; the training set T can then be expressed as formula (1):

T＝{(x_t,y_t)|x_t∈R^m,y_t∈{1,2...k},t∈{1,2...n}} (1)

The training set thus contains k categories. Let |S| denote the number of elements in the feature set S, with 0 < |S| < m; the feature subset discrimination measurement formula DFS_S for the |S| features is formula (2):

wherein the three parameters denote, respectively, the mean vector obtained by averaging all samples, the mean vector obtained by averaging the samples of class i, and the feature vector of the j-th sample of class i; all three vectors contain |S| < m features.
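Formula (2) itself is not reproduced in this text. A form consistent with the three parameters just described, and with the later remark that the numerator measures separation between classes while the denominator measures scatter within classes, might look as follows. The symbols $\bar{x}_S$ (overall mean), $\bar{x}^{(i)}_S$ (mean of class i), and $x^{(i)}_{j,S}$ (j-th sample of class i), as well as the $1/(n_i-1)$ normalisation, are assumptions made for this sketch and are not taken from the source.

```latex
DFS_S \;=\;
\frac{\displaystyle\sum_{i=1}^{k} \left\lVert \bar{x}^{(i)}_S - \bar{x}_S \right\rVert^{2}}
     {\displaystyle\sum_{i=1}^{k} \frac{1}{n_i - 1} \sum_{j=1}^{n_i}
      \left\lVert x^{(i)}_{j,S} - \bar{x}^{(i)}_S \right\rVert^{2}}
```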

The step 2 specifically comprises the following steps:

let p(X), p(Y), and p(X, Y) denote, in sequence, the probability densities of any two random variables X and Y and their joint probability density; the mutual information between the two random variables is then defined as in equation (3):

$$I(X;Y) = \iint p(X,Y)\,\log\frac{p(X,Y)}{p(X)\,p(Y)}\,dX\,dY \qquad (3)$$

the minimum redundancy of intra-class features is computed as in formula (4):

wherein the parameters u and v denote any two features in the feature set, and R(S) denotes the mutual information between all features in the set S, i.e. the redundancy between features.
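Formula (4) is not reproduced in this text. One form consistent with the description of R(S) is the averaged pairwise mutual information used by the widely known minimum-redundancy criterion, sketched below. The symbol $I(u;v)$ and the $1/|S|^{2}$ normalisation are assumptions, not the patent's own expression, and the intra-class variant R'(S) used in the joint function may differ.

```latex
R(S) \;=\; \frac{1}{|S|^{2}} \sum_{u \in S} \sum_{v \in S} I(u; v)
```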

The step 3 specifically comprises the following steps:

combining formulas (2) and (4), the joint function of maximum correlation and minimum redundancy is given by formula (5):

f(DFS_S,R'(S))＝DFS_S-R'(S)

wherein the parameter k represents the number of categories; the two sample-value parameters denote, respectively, the actual values of features α₁ and α₂ in the j-th sample of class i; and DFS_S denotes the DFS value of the feature subset containing |S| features.

The step 4 specifically comprises the following steps:

first, starting from the empty set, the feature with the largest DFS value according to the feature subset discrimination measurement formula DFS_S is selected and added to the initially empty optimal feature subset X;

then, whether a newly added feature is retained is decided by the accuracy of the random forest classifier on the optimal feature subset X after the feature is added: if the accuracy increases, the new feature is retained; otherwise, it is deleted. The iteration continues until all features have been tested, and the optimal feature subset X at the end of the iteration is the final selection result.
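The selection loop of step 4 can be sketched as a greedy forward search. This is a minimal illustration under stated assumptions, not the patented implementation: `score` and `accuracy` are hypothetical callbacks, where `score` stands in for the joint function evaluated on a candidate subset and `accuracy` stands in for the cross-validated accuracy of a random forest trained on that subset.

```python
from typing import Callable, List, Sequence


def greedy_select(features: Sequence[int],
                  score: Callable[[List[int]], float],
                  accuracy: Callable[[List[int]], float]) -> List[int]:
    """Forward selection: rank candidates by `score`, keep them by `accuracy`."""
    selected: List[int] = []          # the initially empty optimal subset X
    best_accuracy = float("-inf")     # so the first candidate is always kept
    remaining = list(features)
    while remaining:
        # Pick the candidate whose addition gives the largest score.
        candidate = max(remaining, key=lambda f: score(selected + [f]))
        remaining.remove(candidate)
        trial = selected + [candidate]
        trial_accuracy = accuracy(trial)
        if trial_accuracy > best_accuracy:
            # The classifier improved, so the new feature is retained.
            selected, best_accuracy = trial, trial_accuracy
        # Otherwise the candidate is dropped and not revisited.
    return selected
```

Passing the scoring and evaluation steps as callbacks keeps the loop independent of any particular classifier library; each feature is tested exactly once, as in the description above.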

The invention has the beneficial effects that: according to the behavior recognition feature selection method based on the improved feature subset discrimination, on one hand, redundancy analysis is added in the feature selection process, redundant features are deleted, and the calculation complexity is further reduced while the classification accuracy is improved; on the other hand, through the calculation of the maximum correlation and the minimum redundancy, the distinguishing capability among the categories of the feature subsets is ensured, meanwhile, the redundancy among the features is reduced, and the method has good practical value.

Detailed Description

The present invention will be described in detail below with reference to specific embodiments.

The behavior recognition feature selection method based on improved feature subset discrimination of the invention comprises the following steps:

Step 1, establishing a sample feature set based on the DFS feature subset discrimination criterion and obtaining the feature subset discrimination measurement formula DFS_S; specifically:

let the m-dimensional real space be denoted R^m. For any k-class problem (k ≥ 2), let the total number of samples be n, the number of samples in class i be n_i, and the spatial dimension of the samples be m; the training set T can then be expressed as formula (1):

T＝{(x_t,y_t)|x_t∈R^m,y_t∈{1,2...k},t∈{1,2...n}} (1)

wherein the parameters denote, respectively, the mean vector obtained by averaging all samples and the mean vector obtained by averaging the samples of class i.

A larger numerator indicates that the classes are more widely separated under the feature subset, and a smaller denominator indicates that samples within each class are more tightly clustered. Thus, the larger the value of DFS_S, the stronger the inter-class discrimination ability of the feature subset, the better the classification and recognition performance, and the greater its influence on the classification result.
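The ratio described above can be sketched in a few lines of Python. The sketch assumes an F-score-style form: the numerator sums the squared distances between each class mean and the overall mean, and the denominator sums the per-class scatter with a 1/(n_i - 1) normalisation. That normalisation and the function name `dfs_score` are assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def dfs_score(samples: Sequence[Sequence[float]],
              labels: Sequence[int],
              subset: Sequence[int]) -> float:
    """DFS of the features indexed by `subset` (assumed F-score-style form)."""
    # Restrict every sample to the chosen feature subset S.
    projected = [[row[f] for f in subset] for row in samples]

    def mean_vector(rows: List[List[float]]) -> List[float]:
        return [sum(col) / len(rows) for col in zip(*rows)]

    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((p - q) ** 2 for p, q in zip(a, b))

    # Group the projected samples by class label.
    by_class: Dict[int, List[List[float]]] = {}
    for row, label in zip(projected, labels):
        by_class.setdefault(label, []).append(row)

    overall_mean = mean_vector(projected)
    between = 0.0  # numerator: spread of class means around the overall mean
    within = 0.0   # denominator: scatter of samples inside each class
    for rows in by_class.values():
        class_mean = mean_vector(rows)
        between += sq_dist(class_mean, overall_mean)
        # Assumption: 1 / (n_i - 1) normalisation, so each class needs >= 2 samples.
        within += sum(sq_dist(r, class_mean) for r in rows) / (len(rows) - 1)
    return between / within
```

A feature that separates the classes well yields a large value, while a feature whose class means coincide yields zero. The sketch does not guard against a zero denominator, which occurs when every class is constant on the chosen subset.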

Step 2, acquiring mutual information between two random variables based on information theory and probability theory concepts, thereby obtaining an intra-class feature minimum redundancy expression in the sample feature set in the step 1; the method specifically comprises the following steps:

In information theory and probability theory, the mutual information of two random variables measures the degree of interdependence between the two variables. More specifically, it quantifies the amount of information obtained about one random variable by observing the other. Unlike the correlation coefficient, it is not limited to real-valued random variables; it measures how similar the joint distribution is to the product of the marginal distributions. Using mutual information can effectively reduce feature redundancy and further improve classification accuracy, which makes it a valuable tool for feature optimization.

Mutual information indicates the degree of correlation, i.e. the degree of redundancy, between two features. Let p(X), p(Y), and p(X, Y) denote, in sequence, the probability densities of any two random variables X and Y and their joint probability density; the mutual information between the two random variables is then defined as in equation (3):

$$I(X;Y) = \iint p(X,Y)\,\log\frac{p(X,Y)}{p(X)\,p(Y)}\,dX\,dY \qquad (3)$$

The process of feature selection can be viewed as a process of searching for the most representative feature subset, namely: on the basis of maximizing the accuracy, the computational complexity is reduced. Therefore, the selected feature subset should not only maximize the correlation with the classification category, but also minimize the redundancy between features.
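To make the redundancy term concrete, the sketch below estimates mutual information from paired observations and averages it over a feature set. It assumes the features have already been discretised (continuous sensor features would first need binning), uses a simple plug-in estimate in nats, and averages over all ordered pairs with a 1/|S|^2 normalisation. The function names and the normalisation are assumptions, not the patent's own intra-class expression.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def mutual_information(x: Sequence[Hashable], y: Sequence[Hashable]) -> float:
    """Plug-in estimate of I(X;Y) in nats from paired discrete observations."""
    n = len(x)
    count_x, count_y = Counter(x), Counter(y)
    count_xy = Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in count_xy.items():
        # p(a,b) * log(p(a,b) / (p(a) * p(b))), written in terms of counts.
        mi += (c / n) * math.log(c * n / (count_x[a] * count_y[b]))
    return mi


def redundancy(columns: Sequence[Sequence[Hashable]]) -> float:
    """Average mutual information over all ordered feature pairs in the set."""
    size = len(columns)
    if size == 0:
        return 0.0
    total = sum(mutual_information(u, v) for u in columns for v in columns)
    return total / (size * size)
```

Two identical binary features share log 2 nats of information, while two independent ones share none, so adding a duplicate feature raises the redundancy of the set.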

Step 3, combining the feature subset discrimination measurement formula DFS_S of step 1 with the intra-class feature minimum redundancy expression of step 2 to define a joint function of maximum correlation and minimum redundancy; specifically:

wherein the parameter k represents the number of categories;

Step 4, training the joint function of maximum correlation and minimum redundancy from step 3 to complete the selection process; specifically:

Examples

First, experimental data

The experiment uses the UCI HAR Dataset: 30 subjects of different ages, heights, and weights carried a smartphone at the waist, and acceleration sensor data for six types of human activity were collected at a constant rate of 50 Hz. The six activities are walking, ascending stairs, descending stairs, sitting, standing, and lying down.

A sliding window (window size 110%, overlap 50%) is used to extract features from the denoised dataset. Fifteen types of features are extracted, numbered 1 to 15: mean, variance, root mean square, mean absolute deviation, interquartile range, inter-axis correlation coefficient, kurtosis, skewness, energy, maximum, minimum, median absolute deviation, signal magnitude area, peak-to-peak value, and median.
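The windowing step can be illustrated with a short sketch that cuts a single-axis signal into half-overlapping windows and computes a few of the listed time-domain features. The window length in samples is left as a parameter, and the use of the population variance is an assumption; the function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def sliding_windows(signal: Sequence[float], size: int,
                    overlap: float = 0.5) -> List[Sequence[float]]:
    """Split `signal` into fixed-size windows that overlap by `overlap`."""
    step = max(1, int(size * (1.0 - overlap)))
    return [signal[start:start + size]
            for start in range(0, len(signal) - size + 1, step)]


def window_features(window: Sequence[float]) -> Dict[str, float]:
    """A few of the time-domain features listed above, for one axis."""
    n = len(window)
    mean = sum(window) / n
    return {
        "mean": mean,
        # Assumption: population variance (divide by n, not n - 1).
        "variance": sum((v - mean) ** 2 for v in window) / n,
        "rms": math.sqrt(sum(v * v for v in window) / n),
        "max": max(window),
        "min": min(window),
        "peak_to_peak": max(window) - min(window),
    }
```

Trailing samples that do not fill a complete window are discarded in this sketch.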

To obtain a reliable and stable classification model, 10-fold cross-validation is used. To obtain evenly distributed experimental data, the sample order is first randomly shuffled, and the samples of each class are then added one by one, in turn, to 10 initially empty sample sets until every sample of that class has been assigned, so that the samples are divided randomly and evenly into 10 parts. Then 1 part is used as the test set and the other 9 parts as the training set, rotating in turn, which yields the 10-fold cross-validation.
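The partitioning procedure described above can be sketched as follows: shuffle the sample indices, deal the indices of each class round-robin into the folds, then rotate the test fold. The function names and the fixed random seed are assumptions made for illustration.

```python
import random
from typing import Dict, List, Sequence, Tuple


def stratified_folds(labels: Sequence[int], n_folds: int = 10,
                     seed: int = 0) -> List[List[int]]:
    """Deal the shuffled sample indices of each class round-robin into folds."""
    rng = random.Random(seed)
    order = list(range(len(labels)))
    rng.shuffle(order)  # randomly shuffle the sample order first

    by_class: Dict[int, List[int]] = {}
    for idx in order:
        by_class.setdefault(labels[idx], []).append(idx)

    folds: List[List[int]] = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            folds[position % n_folds].append(idx)
    return folds


def train_test_rounds(
        folds: List[List[int]]) -> List[Tuple[List[int], List[int]]]:
    """Each fold is the test set once; the remaining folds form the training set."""
    rounds = []
    for k, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        rounds.append((train, test))
    return rounds
```

Dealing each class separately keeps the class proportions nearly identical across the ten parts, which is what makes the per-fold results comparable.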

Second, preprocessing of experimental data

For the acceleration sensor of a smartphone, noise in the sensor hardware and the influence of the external environment cause the collected raw data to deviate from the true values. A moving-average filter is therefore used for smoothing and denoising, with the following calculation formula:

wherein the parameter Original is the raw data collected by the acceleration sensor, the parameter Result is the calculated result, i denotes the i-th moment, and n denotes the window length used for smoothing.
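The filter formula itself is not reproduced in this text. A minimal sketch of a moving-average filter with the parameters named above is given below; it assumes a trailing window (the current sample and up to n - 1 preceding samples) and a shortened window at the start of the signal, neither of which is confirmed by the source.

```python
from typing import List, Sequence


def moving_average(original: Sequence[float], n: int) -> List[float]:
    """Smooth `original` with a trailing window of at most `n` samples."""
    result: List[float] = []
    running = 0.0
    for i, value in enumerate(original):
        running += value
        if i >= n:
            running -= original[i - n]  # drop the sample leaving the window
        # Assumption: the first n - 1 outputs average over a shorter window.
        result.append(running / min(i + 1, n))
    return result
```

A centred window would be an equally plausible reading; the choice only shifts the smoothed signal by a few samples.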

Third, analysis of experimental results

To verify the effectiveness of the behavior recognition feature selection method R-DFS, three metrics are used: recall R, precision P, and the F1 score.
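The three metrics can be computed per class from a confusion matrix, as in the sketch below. It assumes rows hold the true classes and columns the predicted classes, and uses the standard definition of F1 as the harmonic mean of precision and recall; the function name is hypothetical.

```python
from typing import List, Sequence, Tuple


def per_class_metrics(
        confusion: Sequence[Sequence[int]]) -> List[Tuple[float, float, float]]:
    """Return (precision, recall, F1) per class; rows = true, columns = predicted."""
    k = len(confusion)
    metrics = []
    for c in range(k):
        true_positive = confusion[c][c]
        predicted = sum(confusion[r][c] for r in range(k))  # column total
        actual = sum(confusion[c])                           # row total
        precision = true_positive / predicted if predicted else 0.0
        recall = true_positive / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append((precision, recall, f1))
    return metrics
```

Classes that are never predicted or never occur are reported as 0.0 rather than raising a division error.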

10-fold cross-validation is performed with each of five classifiers: k-nearest neighbors (KNN), support vector machine (SVM), decision tree (DT), naive Bayes (NB), and random forest (RF). The five classifiers are compared via confusion matrices under the optimal feature subset, and the experimental results are shown in Tables 1 to 6.

Table 1. Recall comparison data based on DFS

Classifier	Walking	Ascending stairs	Descending stairs	Sitting	Standing	Lying down
KNN	0.9	1	0.8	0.9	1	1
SVM	0.8	0.6	0.9	0.9	1	1
DT	0.7	0.6	0.8	0.9	1	0.9
NB	0.9	0.7	1	0.9	1	1
RF	0.9	1	1	1	1	1

Table 2. Precision comparison data based on DFS

Classifier	Walking	Ascending stairs	Descending stairs	Sitting	Standing	Lying down
KNN	0.9	0.9	0.9	1	1	0.9
SVM	0.9	0.8	0.7	1	1	0.9
DT	0.8	0.7	0.6	0.9	0.9	0.9
NB	0.9	0.9	0.8	1	0.9	1
RF	1	0.9	0.9	1	1	1

Table 3. F1 value comparison data based on DFS

Classifier	Walking	Ascending stairs	Descending stairs	Sitting	Standing	Lying down
KNN	0.95	0.91	0.87	0.95	0.95	0.95
SVM	0.85	0.72	0.83	0.94	0.96	0.96
DT	0.77	0.61	0.69	0.89	0.95	0.93
NB	0.9	0.79	0.86	0.95	0.98	0.98
RF	0.95	0.94	0.96	0.99	0.99	0.99

Table 4. Recall comparison data based on R-DFS

Table 5. Precision comparison data based on R-DFS

Classifier	Walking	Ascending stairs	Descending stairs	Sitting	Standing	Lying down
KNN	1	0.9	1	1	1	1
SVM	0.9	0.8	0.8	1	1	1
DT	0.8	0.7	0.7	0.9	0.9	1
NB	0.9	0.9	0.8	1	1	1
RF	1	1	1	1	1	1

Table 6. F1 value comparison data based on R-DFS

Classifier	Walking	Ascending stairs	Descending stairs	Sitting	Standing	Lying down
KNN	0.96	0.94	0.9	0.97	0.97	0.98
SVM	0.87	0.76	0.87	0.99	0.99	1
DT	0.81	0.65	0.75	0.91	0.97	0.95
NB	0.93	0.83	0.89	0.97	0.99	0.99
RF	0.98	0.98	0.99	1	1	1

The comparison of Tables 1 to 6 shows that the R-DFS feature selection method is generally superior to the DFS method on all three evaluation metrics: precision, recall, and F1 score.

Furthermore, RF is verified to perform best among the five classifiers. Compared with the DFS feature selection method, R-DFS improves average performance by 1.9% in precision, 1.8% in recall, and 1.9% in F1 score.

The behavior recognition feature selection method based on improved feature subset discrimination was analyzed experimentally on the UCI HAR Dataset, and the results show that, compared with the DFS method, removing redundancy further improves classification performance; in addition, among the five classifiers, the RF classifier achieves the highest accuracy.

Claims

1. The behavior recognition feature selection method based on the improved feature subset discrimination is characterized by comprising the following steps of:

2. The behavior recognition feature selection method based on the improved feature subset discrimination as claimed in claim 1, wherein the step 1 specifically comprises:

T＝{(x_t,y_t)|x_t∈R^m,y_t∈{1,2...k},t∈{1,2...n}} (1)

wherein the parameters denote, respectively, the mean vector obtained by averaging all samples and the mean vector obtained by averaging the samples of class i.

3. The behavior recognition feature selection method based on the improved feature subset discrimination as claimed in claim 2, wherein the step 2 is specifically:

4. The behavior recognition feature selection method based on the improved feature subset discrimination as claimed in claim 3, wherein the step 3 is specifically:

f(DFS_S,R'(S))＝DFS_S-R'(S)

wherein the parameter k represents the number of categories;

5. The behavior recognition feature selection method based on the improved feature subset discrimination as claimed in claim 4, wherein the step 4 is specifically: