CN107025461B - Matrix classification model based on inter-class discrimination - Google Patents

Matrix classification model based on inter-class discrimination

Info

Publication number
CN107025461B
CN107025461B · CN201611124167.4A
Authority
CN
China
Prior art keywords
matrix
class
cbcmatmhks
model
inter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611124167.4A
Other languages
Chinese (zh)
Other versions
CN107025461A (en)
Inventor
王喆
李冬冬
张国威
高大启
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
East China University of Science and Technology
Original Assignee
East China University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by East China University of Science and Technology filed Critical East China University of Science and Technology
Priority to CN201611124167.4A priority Critical patent/CN107025461B/en
Publication of CN107025461A publication Critical patent/CN107025461A/en
Application granted granted Critical
Publication of CN107025461B publication Critical patent/CN107025461B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The present invention provides a matrix classification model based on inter-class discrimination. First, a data set is collected and the collected samples are converted into matrix-pattern samples. Second, a regularization term $R_{BC}$ is constructed. The regularization term $R_{BC}$ is then introduced into MatMHKS, producing a new matrix-pattern-oriented classification model, CBCMatMHKS; the new model is trained on a training set, and a gradient descent method is used to obtain its optimal solution. The obtained solutions are then evaluated on a test set to select the optimal decision function. Finally, an input matrix sample to be classified is evaluated with the optimal decision function and assigned a class according to the output. Compared with traditional matrix classification models, the method introduces inter-class discrimination information: cluster centers are used to represent the samples of a local region, so that the distance between local samples of different classes is maximized and the classification accuracy is improved.

Description

Matrix classification model based on inter-class discrimination
Technical Field
The invention relates to the field of pattern recognition, and in particular to a matrix learning machine model based on inter-class discrimination.
Background
Most classifiers can only process vector-pattern samples, so matrix-pattern samples must first be converted into vectors. For example, before a vector-pattern classifier can process a face picture, the picture has to be flattened into a vector sample, which loses part of the structural discrimination information contained in the single sample. A matrix-pattern classifier design method, in contrast, can classify matrix samples directly. Experiments have also shown that matrix-pattern-oriented classifier design can, to a certain extent, outperform vectorization-oriented classifier design.
The original matrix-pattern classifier design methods ignore discrimination information between classes; a typical linear algorithm used for comparison is MatMHKS (Matrix-pattern-oriented Ho-Kashyap classifier). At present there is no method in the field of matrix-pattern classifier design that overcomes this defect. We therefore propose a new regularization term $R_{BC}$ that introduces inter-class discrimination information into the matrix-pattern classifier design method. The regularization term $R_{BC}$ is constructed by maximizing the distance between classes: a clustering algorithm is first applied to the samples of each class to obtain cluster centers, and the distances between cluster centers of different classes are then maximized in the projection space.
The regularization term $R_{BC}$ is introduced into the two-sided matrix classifier MatMHKS, producing a new classification algorithm, CBCMatMHKS. CBCMatMHKS not only captures discrimination information between classes but also improves the classification accuracy of MatMHKS.
Disclosure of Invention
Aiming at the problem that existing matrix-pattern-oriented classifier design methods do not consider inter-class discrimination information between matrix patterns, the technical scheme of the invention is to design a new regularization term on the original matrix-pattern classifier design framework so as to take inter-class discrimination information into account, thereby generating a discrimination-aware matrix learning model. The framework is applied to MatMHKS, our previous work, and the resulting model is named CBCMatMHKS; its optimal solution is obtained with a gradient descent method. Since the model adopts a one-versus-one voting scheme, a data set with M classes can be decomposed into M(M-1)/2 binary problems.
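As an illustration of this one-versus-one decomposition (not code from the patent), the sketch below pairs the M classes into M(M-1)/2 binary problems and combines the binary outputs by voting; `fit_binary` and `predict_binary` are hypothetical stand-ins for training and applying one two-class CBCMatMHKS model.

```python
from itertools import combinations
import numpy as np

def one_vs_one_train(X, y, fit_binary):
    """Train one binary model per class pair: M classes -> M*(M-1)/2 models.

    fit_binary(Xp, yp) is a hypothetical callable that trains a two-class
    model on samples Xp with labels in {+1, -1} and returns the model.
    """
    classes = np.unique(y)
    models = {}
    for a, b in combinations(classes, 2):          # M*(M-1)/2 pairs
        mask = (y == a) | (y == b)
        yp = np.where(y[mask] == a, 1, -1)         # a -> +1, b -> -1
        models[(a, b)] = fit_binary(X[mask], yp)
    return models

def one_vs_one_predict(X, models, predict_binary):
    """Each pairwise model votes; the class with the most votes wins."""
    classes = sorted({c for pair in models for c in pair})
    index = {c: i for i, c in enumerate(classes)}
    votes = np.zeros((len(X), len(classes)))
    for (a, b), m in models.items():
        out = predict_binary(m, X)                 # +1 means class a, -1 means class b
        votes[:, index[a]] += (out > 0)
        votes[:, index[b]] += (out <= 0)
    return np.array(classes)[votes.argmax(axis=1)]
```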
The technical scheme adopted by the invention to solve the technical problem is as follows. First, a data set is collected and the collected samples are converted into matrix-pattern samples; non-numerical data sets are digitized, and picture data sets are grayscaled and dimension-reduced to remove noise. Second, the regularization term $R_{BC}$ is constructed. The regularization term $R_{BC}$ is then introduced into MatMHKS, producing a new matrix-pattern-oriented classification model, CBCMatMHKS; the new model is trained on a training set, and a gradient descent method is used to solve CBCMatMHKS for its optimal solution. The obtained solutions are then evaluated on a test set to select the optimal decision function. Finally, an input matrix sample to be classified is evaluated with the optimal decision function and assigned a class according to the output.
The technical scheme adopted by the invention can be further refined. When constructing the regularization term $R_{BC}$, a clustering algorithm is first applied to the samples of each class to obtain the center of every cluster, and the distances between different cluster centers are then maximized in the projection space. The method is a matrix-pattern classification model, but it can also process vector-pattern data.
The invention has the following beneficial effects. Inter-cluster discrimination information is obtained by partitioning each class into several clusters with a clustering method and then maximizing the distances between cluster centers of different classes. By introducing this inter-class discrimination information, each cluster center represents the samples of a local region, so the distance between local samples of different classes is maximized; adding this information to the traditional matrix-pattern-oriented classification model improves the classification accuracy to a certain extent. The method also alleviates, to some degree, the overfitting problem on small samples, and it can directly process both image data sets and vector data sets.
Drawings
FIG. 1 is the system framework of the matrix learning machine model based on inter-class discrimination according to the present invention.
Detailed Description
The invention is further described below with reference to the figures and examples; the method of the invention is divided into four major steps.
The first step: data set acquisition and conversion.
The collected data set is processed first: data sets that are not numerical are digitized, and picture data sets are grayscaled and then dimension-reduced with a dimension-reduction algorithm for subsequent processing. The collected data set is then matrixized; for example, a sample $x \in R^{1 \times N}$ is converted into the matrix pattern $A \in R^{d_1 \times d_2}$, where $d_1 \times d_2 = N$.
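A minimal sketch of this conversion step, assuming naive channel-averaging for grayscale and PCA for dimension reduction (the patent does not prescribe a specific dimension-reduction algorithm); the helper names are illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA

def to_matrix_pattern(x, d1, d2):
    """Reshape a vector sample x in R^{1xN} into a d1 x d2 matrix (d1*d2 = N)."""
    x = np.asarray(x).ravel()
    assert x.size == d1 * d2, "d1*d2 must equal the vector length N"
    return x.reshape(d1, d2)

def preprocess_images(images, n_components=64):
    """Grayscale RGB images, then reduce dimension with PCA (an assumed choice)."""
    gray = np.asarray(images, dtype=float).mean(axis=-1)   # average the color channels
    flat = gray.reshape(len(gray), -1)
    return PCA(n_components=n_components).fit_transform(flat)

# Example: turn a 64-dimensional reduced sample into an 8 x 8 matrix pattern.
sample = np.arange(64, dtype=float)
A = to_matrix_pattern(sample, 8, 8)
```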
The second step: model training.
1) First, the regularization term $R_{BC}$ is constructed.
Assume a two-class set of matrix patterns $\{A_i \in R^{d_1 \times d_2}\}_{i=1}^{N}$, where each pattern carries a class label $y_i \in \{+1, -1\}$. A clustering method is applied to each class separately, and the cluster centers are computed as in equation (1):

$\bar{A}_i^{m} = \frac{1}{N_i^{m}} \sum_{A \in C_i^{m}} A, \qquad i = 1, \dots, k_m, \; m = 1, 2, \qquad (1)$

where the number of clusters of class $m$ is $k_m$, $m = 1, 2$, $C_i^{m}$ denotes the $i$-th cluster of class $m$, and the number of patterns in that cluster is $N_i^{m}$. The distances between cluster centers of different classes are maximized in the projection space, so $R_{BC}$ is defined as in equation (2):

$R_{BC} = \sum_{i=1}^{k_1} \sum_{j=1}^{k_2} \left\| f(\bar{A}_i^{1}) - f(\bar{A}_j^{2}) \right\|^2, \qquad (2)$

where $f(\cdot)$ is the discriminant function.
2) The structural-risk-minimization framework of the conventional matrix-pattern method is shown in equation (3):

$\min J = R_{emp} + c\,R_{reg}, \qquad (3)$

where $R_{emp}$ is the empirical risk term and $R_{reg}$ is a regularization term whose purpose is to control the smoothness and computational complexity of the whole framework; the regularization parameter $c$ balances $R_{emp}$ against $R_{reg}$. Introducing the regularization term $R_{BC}$ into (3) gives a new matrix-pattern framework, as shown in equation (4):

$\min J = R_{emp} + c\,R_{reg} - \lambda R_{BC}, \qquad (4)$

where the first two terms $R_{emp}$ and $R_{reg}$ are the same as in equation (3), and $R_{BC}$ is the same as in equation (2).
3) Introducing the new framework into the matrix-pattern-oriented classifier MatMHKS gives CBCMatMHKS, whose objective function, shown in equation (5), instantiates equation (4) with the MatMHKS empirical risk and regularization terms together with the term $-\lambda R_{BC}$ of equation (2); here $u \in R^{d_1 \times 1}$ and $v \in R^{d_2 \times 1}$ are the two-sided weight vectors, $v_0 \in R$ is the offset, and $y_i$ is the labeled class number of each matrix pattern.
4) Equation (5) is first written in matrix form, and the optimal weight vectors of the CBCMatMHKS model are then solved with a gradient descent method. The partial derivatives of equation (5) with respect to $u$ and $v$, $\partial J / \partial u$ and $\partial J / \partial v$, are computed and each set to zero, which yields the update formulas for the weight vectors $u$ and $v$ given in equations (6) and (7). The update used as the iteration termination condition is shown in equation (8):

$b_{N \times 1}(iter+1) = b_{N \times 1}(iter) + \rho\,\bigl(e(iter) + |e(iter)|\bigr), \qquad (8)$

where $b_i \ge 0$, $i = 1, 2, 3, \dots, N$, $\rho > 0$, $iter$ is the number of iteration steps, $|\cdot|$ is taken element-wise, and $e(iter) = Y\,\tilde{v}(iter) - 1_{N \times 1} - b_{N \times 1}(iter)$.
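The sketch below illustrates, under stated assumptions, an alternating Ho-Kashyap-style optimization of the kind described above: with $u$ fixed, the augmented weight $[v; v_0]$ is obtained from a regularized least-squares problem; $u$ is then updated analogously; and the margin vector $b$ is updated by equation (8). The exact closed forms of equations (6) and (7), including the $-\lambda R_{BC}$ term, are not reproduced from the patent, so plain ridge-style regularization is used here and the code only approximates the CBCMatMHKS updates.

```python
import numpy as np

def cbcmatmhks_like_fit(patterns, labels, c=0.1, rho=0.99, max_iter=100, xi=1e-4):
    """Alternating Ho-Kashyap-style training on matrix patterns (illustrative only).

    patterns: (N, d1, d2) array; labels: (N,) array with values in {+1, -1}.
    Returns weight vectors u, v and offset v0 of the decision f(A) = u^T A v + v0.
    Note: the -lambda*R_BC term of equation (4) is omitted in this simplified sketch.
    """
    patterns = np.asarray(patterns, dtype=float)
    labels = np.asarray(labels, dtype=float)
    N, d1, d2 = patterns.shape
    u = np.ones(d1) / d1                       # assumed initialization of u
    v, v0 = np.ones(d2) / d2, 0.0
    b = 1e-6 * np.ones(N)                      # boundary vector b(1) = 1e-6 * 1_{Nx1}

    for _ in range(max_iter):
        # Fix u: rows of Y are y_i * [u^T A_i, 1]; ridge least squares for [v; v0].
        Y = labels[:, None] * np.hstack([patterns.transpose(0, 2, 1) @ u,
                                         np.ones((N, 1))])
        w = np.linalg.solve(Y.T @ Y + c * np.eye(d2 + 1), Y.T @ (1.0 + b))
        v, v0 = w[:-1], w[-1]

        # Fix v: rows of Z are y_i * (A_i v); ridge least squares for u.
        Z = labels[:, None] * (patterns @ v)
        u = np.linalg.solve(Z.T @ Z + c * np.eye(d1),
                            Z.T @ (1.0 + b - labels * v0))

        # Margin update of equation (8) with element-wise |e|.
        e = labels * (np.einsum('i,nij,j->n', u, patterns, v) + v0) - 1.0 - b
        delta = rho * (e + np.abs(e))
        b = b + delta
        if np.linalg.norm(delta) < xi:         # minimum stop error
            break
    return u, v, v0
```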
The third step: model testing.
After the weight vectors are obtained, they are evaluated on the test set in order to select the optimal decision function.
The fourth step: prediction.
A sample of unknown class is identified with the optimal decision function obtained in the previous step. Assume the sample of unknown class is $A \in R^{d_1 \times d_2}$; the decision function is

$\hat{y} = \operatorname{sign}\bigl(u^{T} A\, v + v_0\bigr),$

where $\hat{y}$ is the predicted category of the sample.
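A short illustration of this decision rule under the assumed two-sided form $f(A) = u^{T} A v + v_0$:

```python
import numpy as np

def predict_matrix_pattern(A, u, v, v0):
    """Classify one matrix pattern with the two-sided decision f(A) = u^T A v + v0."""
    score = float(u @ np.asarray(A, dtype=float) @ v) + v0
    return 1 if score >= 0 else -1
```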
Hereinbefore, specific embodiments of the present invention are described with reference to the drawings. It will be understood by those skilled in the art that various changes and substitutions may be made to the specific embodiments of the invention without departing from the spirit and scope of the invention. Such modifications and substitutions are intended to be included within the scope of the present invention as defined by the appended claims.
Experimental results
To verify the effectiveness and feasibility of the proposed method, we collected 5 vector data sets from the UCI and KEEL repositories. The data sets are listed in Table 1, which gives the dimension, number of classes, training-set size and test-set size of each data set. Each data set is split into a training set and a test set, each accounting for 0.5 of the data, and 5 rounds of Monte Carlo cross-validation are used to obtain the classification accuracy. The model parameters are set by experiment and prior experience: the number of clusters $k \in \{1, 3, 6, 9, 25, 50, 100\}$, the regularization parameters $c \in \{0.01, 0.1, 1, 10, 100\}$ and $\lambda \in \{0.01, 0.1, 1, 10, 100\}$, the boundary vector initialized as $b(1) = 10^{-6} \cdot 1_{N \times 1}$, the weight vector initialized to a fixed constant vector, the maximum number of iterations $maxIter = 100$, the minimum stop error $\xi = 0.0001$, and the iteration step $\rho = 0.99$.
TABLE 1 Data sets (dimension, number of classes, training-set size and test-set size of Bal, Bup, Cle, Iri and Led)
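For reference, a minimal sketch of the evaluation protocol described above: 5 rounds of Monte Carlo cross-validation, each with a random 0.5/0.5 train/test split, averaging the test accuracy. `train_model` and `evaluate` are hypothetical stand-ins for CBCMatMHKS training and testing.

```python
import numpy as np

def monte_carlo_cv(patterns, labels, train_model, evaluate,
                   rounds=5, train_frac=0.5, seed=0):
    """Average accuracy over `rounds` random train/test splits of fraction train_frac."""
    patterns, labels = np.asarray(patterns), np.asarray(labels)
    rng = np.random.default_rng(seed)
    accuracies = []
    for _ in range(rounds):
        order = rng.permutation(len(labels))
        cut = int(train_frac * len(labels))
        tr, te = order[:cut], order[cut:]
        model = train_model(patterns[tr], labels[tr])
        accuracies.append(evaluate(model, patterns[te], labels[te]))
    return float(np.mean(accuracies)), float(np.std(accuracies))
```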
The parameter settings of the comparison algorithm are as follows:
MatMHKS and the proposed CBCMatMHKS use the same parameter settings to facilitate comparison. For MatMHKS, the regularization parameter $c \in \{0.01, 0.1, 1, 10, 100\}$, the boundary vector is initialized as $b(1) = 10^{-6} \cdot 1_{N \times 1}$, the weight vector is initialized to a fixed constant vector, the maximum number of iterations $maxIter = 100$, the minimum stop error $\xi = 0.0001$, and the iteration step $\rho = 0.99$.
The experimental results are shown in Table 2. The classification accuracy of CBCMatMHKS is higher than that of MatMHKS on all five data sets, which verifies the effectiveness and feasibility of the proposed method.
Table 2 data set accuracy (%)
Data set CBCMatMHKS MatMHKS
Bal 91.21±1.74 89.17±0.94
Bup 68.95±3.69 67.67±2.59
Cle 58.91±4.01 58.78±3.52
Iri 98.67±1.33 98.40±0.12
Led 74.55±0.84 73.66±0.93
Remark: all experiments were run on an Intel Xeon CPU E5-2407 @ 2.20 GHz with 16 GB DDR3 RAM, Windows Server 2012, and Matlab.

Claims (2)

1. A matrix classification method based on inter-class discrimination comprises the following specific steps:
1) firstly, acquiring an image data set: converting the collected image sample into a matrix mode so that a subsequent algorithm can process the image sample, and performing gray level processing on the image data set and performing dimension reduction processing by using a traditional dimension reduction algorithm so as to remove noise;
2) secondly, clustering each class in the training set by using a clustering method to obtain a cluster center;
3) then maximizing the distance between the cluster centers of different classes in the projection space, thereby constructing a new regularization term $R_{BC}$;
4) then combining the regularization term $R_{BC}$ with the matrix-pattern-oriented classifier MatMHKS to construct a new matrix-pattern-oriented classification method, CBCMatMHKS, whose framework is $\min J = R_{emp} + c\,R_{reg} - \lambda R_{BC}$, where $R_{emp}$ is the empirical risk term, $R_{reg}$ is a regularization term used to control the smoothness and computational complexity of the whole CBCMatMHKS matrix-pattern classification model, the regularization parameter $c$ balances $R_{emp}$ and $R_{reg}$, and $\lambda$ is the parameter regulating $R_{BC}$; training CBCMatMHKS with the training set, and solving the model CBCMatMHKS for its optimal solution with a gradient descent method;
5) then, testing an optimal solution by using the test set, and obtaining an optimal decision function;
6) and finally, calculating the input unknown matrix mode by using the obtained optimal decision function, and classifying the unknown matrix mode according to the output result.
2. The matrix classification method based on inter-class discrimination according to claim 1, characterized in that: the regularization term $R_{BC}$ discovers the discrimination information between classes by maximizing the distance between cluster centers of different classes in the projection space, and takes the form

$R_{BC} = \sum_{i=1}^{k_1} \sum_{j=1}^{k_2} \left\| f(\bar{A}_i^{1}) - f(\bar{A}_j^{2}) \right\|^2,$

where the number of clusters of each class is $k_m$, $m = 1, 2$, $\bar{A}_i^{1}$ is the $i$-th cluster center of the first class, and $\bar{A}_j^{2}$ is the $j$-th cluster center of the second class.
CN201611124167.4A 2016-12-08 2016-12-08 Matrix classification model based on inter-class discrimination Active CN107025461B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611124167.4A CN107025461B (en) 2016-12-08 2016-12-08 Matrix classification model based on inter-class discrimination

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611124167.4A CN107025461B (en) 2016-12-08 2016-12-08 Matrix classification model based on inter-class discrimination

Publications (2)

Publication Number Publication Date
CN107025461A CN107025461A (en) 2017-08-08
CN107025461B true CN107025461B (en) 2020-07-14

Family

ID=59526069

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611124167.4A Active CN107025461B (en) 2016-12-08 2016-12-08 Matrix classification model based on inter-class discrimination

Country Status (1)

Country Link
CN (1) CN107025461B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102932826A (en) * 2012-11-30 2013-02-13 北京邮电大学 Cell interruption detection positioning method in self-organizing network of cell mobile communication system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8849790B2 (en) * 2008-12-24 2014-09-30 Yahoo! Inc. Rapid iterative development of classifiers
US20150347927A1 (en) * 2014-06-03 2015-12-03 Nec Laboratories America, Inc. Canonical co-clustering analysis

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102932826A (en) * 2012-11-30 2013-02-13 北京邮电大学 Cell interruption detection positioning method in self-organizing network of cell mobile communication system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Research on Regularization Techniques in Classifier Design; Xue Hui; Wanfang Dissertations; 2010-10-29; pp. 35-68 *
Matrix-Pattern-Oriented Regularized Ho-Kashyap Algorithm; Tian Yongjun et al.; Journal of Computer Research and Development; 2005-12-31; Vol. 42, No. 9; pp. 1628-1631 *

Also Published As

Publication number Publication date
CN107025461A (en) 2017-08-08

Similar Documents

Publication Publication Date Title
CN106599797B (en) A kind of infrared face recognition method based on local parallel neural network
Miyato et al. Distributional smoothing with virtual adversarial training
He et al. ℓ2,1 regularized correntropy for robust feature selection
Gao et al. HEp-2 cell image classification with deep convolutional neural networks
Ye et al. Adaptive distance metric learning for clustering
Lin et al. Cir-net: Automatic classification of human chromosome based on inception-resnet architecture
CN105184298B (en) A kind of image classification method of quick local restriction low-rank coding
Chen et al. Solving partial least squares regression via manifold optimization approaches
Sogi et al. A method based on convex cone model for image-set classification with CNN features
Gao et al. Robust principal component analysis based on discriminant information
Tin et al. Race identification for face images
CN111444937B (en) Crowd-sourced quality improvement method based on integrated TSK fuzzy classifier
Yousefnezhad et al. Weighted spectral cluster ensemble
Mandal et al. Unsupervised non-redundant feature selection: a graph-theoretic approach
Ilea et al. Texture image classification with Riemannian Fisher vectors
CN107729945A (en) Discriminating recurrence, sorting technique and system based on rarefaction representation between class
CN110287973B (en) Image feature extraction method based on low-rank robust linear discriminant analysis
CN107025461B (en) Matrix classification model based on inter-class discrimination
Camassa et al. A geodesic landmark shooting algorithm for template matching and its applications
CN105678265A (en) Manifold learning-based data dimensionality-reduction method and device
Phusomsai et al. Brain tumor cell recognition schemes using image processing with parallel ELM classifications on GPU
Liu et al. Learning kernel in kernel-based LDA for face recognition under illumination variations
CN112154453A (en) Apparatus and method for clustering input data
CN106485286B (en) Matrix classification model based on local sensitivity discrimination
CN114037931A (en) Multi-view discrimination method of self-adaptive weight

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant