CN111639686B - Semi-supervised classification method based on dimension weighting and visual angle feature consistency - Google Patents


Info

Publication number
CN111639686B
Authority
CN
China
Prior art keywords
matrix
label
representing
view
semi
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010416737.7A
Other languages
Chinese (zh)
Other versions
CN111639686A (en
Inventor
聂飞平
石少君
王榕
李学龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwestern Polytechnical University
Priority to CN202010416737.7A
Publication of CN111639686A
Application granted
Publication of CN111639686B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2155 Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a semi-supervised classification method based on dimension weighting and view feature consistency. First, a similarity matrix is constructed for each view of the multi-view data using adaptive local structure learning. Next, the average of the per-view similarity matrices is taken as the initial consistency similarity matrix, and a multi-view semi-supervised classification model based on dimension weighting and view feature consistency is constructed. The model is then solved by alternating iterative updates until the final label matrix is obtained. Finally, each sample's label is read from the label matrix, completing the classification. Because the model couples similarity-matrix construction with label inference, the influence of graph-construction quality on the classification result is reduced; and because the feature dimensions within each view are weighted and the local structure of the data is taken into account, better classification results can be obtained.

Description

Semi-supervised classification method based on dimension weighting and visual angle feature consistency
Technical Field
The invention belongs to the technical field of machine learning and data mining, and particularly relates to a semi-supervised classification method based on dimension weighting and view angle feature consistency.
Background
With the advent of the big-data era, information about many real-world scenes can be obtained through different channels, angles, modalities, and features. For such multi-source data, how to fuse the information efficiently and accurately to complete a specific task is of significant research interest in practical settings.
Multi-view learning describes an object of study from multiple angles and then integrates the information from those angles for learning, under the assumption that multi-view data sets exhibit "complementarity" and "consistency". Semi-supervised classification trains a classifier with a small number of labeled samples together with unlabeled samples, and then uses the learned classifier to infer the labels of the unlabeled samples. In practice, a multi-view data set usually comes with only a few labels, and labeling the data set consumes considerable manpower and material resources. It is therefore of great research value to label unlabeled samples using a small number of labels combined with information from multiple views.
Traditional multi-view semi-supervised classification methods fall into three main categories: 1) co-training; 2) graph-based multi-view semi-supervised classification; 3) regression-based multi-view semi-supervised classification. In graph-based semi-supervised classification, samples are the nodes of a graph, and the similarity between any two nodes gives the weight of the edge between them. The graph-based semi-supervised learning process is thus equivalent to a coloring process on the graph. Nie et al., in "F. Nie, J. Li, and X. Li, Parameter-free auto-weighted multiple graph learning: A framework for multiview clustering and semi-supervised classification, in Proc. IJCAI, 2016, pp. 1881-1887", first construct a similarity matrix and then infer the labels of the unlabeled samples from the labeled-sample information and the constructed similarity graph. Yang et al., in "M. Yang, C. Deng, and F. Nie, Adaptive-weighting discriminative regression for multi-view classification, Pattern Recognition, vol. 88, pp. 236-245, 2019", classify multi-view data sets using the idea of adaptive discriminative regression. Considering that real data sets have few labels, Tao et al., in "H. Tao, C. Hou, F. Nie, J. Zhu, and D. Yi, Scalable Multi-View Semi-Supervised Classification via Adaptive Regression, IEEE Transactions on Image Processing, vol. 26, no. 9, pp. 4283-4296", establish a regression-based semi-supervised classification model for each view. To make the model robust to noise and outliers, that method also uses the $\ell_{2,1}$ norm, and the model adaptively assigns view weights because each view has a different importance for the classification result. However, these regression-based methods consider only the linear relationship between samples and labels, which does not hold when the relationship is nonlinear.
In graph-based semi-supervised classification, the quality of the constructed similarity graph greatly affects the final classification result; yet because graph construction and label inference are treated as two separate steps, the relationship between them is ignored. In addition, these methods consider only the differences between the features of different views and ignore the differences between dimensions within a view, thereby neglecting the local structure of the data. The classification accuracy of these methods suffers as a result.
Disclosure of Invention
To overcome the shortcomings of the prior art, the invention provides a semi-supervised classification method based on dimension weighting and view feature consistency. First, a similarity matrix is constructed for each view of the multi-view data using adaptive local structure learning. Next, the average of the per-view similarity matrices is taken as the initial consistency similarity matrix, and a multi-view semi-supervised classification model based on dimension weighting and view feature consistency is constructed. The model is then solved by alternating iterative updates until the final label matrix is obtained. Finally, each sample's label is read from the label matrix, completing the classification. The method learns the similarity matrix and propagates labels simultaneously, which reduces the dependence of label propagation on the quality of the similarity matrix. In addition, weighting the feature dimensions within each view when the similarity matrix is updated can improve the semi-supervised classification result.
A semi-supervised classification method based on dimension weighting and visual angle feature consistency is characterized by comprising the following steps:
Step 1: Let $\mathcal{X} = \{X^1, X^2, \ldots, X^V\}$ denote a multi-view data set, where $X^v \in \mathbb{R}^{n \times d(v)}$ is the feature matrix of the v-th view, $v = 1, 2, \ldots, V$; V is the number of views, n is the number of samples, and d(v) is the feature dimensionality of the v-th view. The number of categories in the data set is C.
According to
Figure GDA0003308847840000022
compute the distance, in the v-th view, from the i-th sample point $x_i^v$ to the j-th sample point $x_j^v$, $i, j = 1, 2, \ldots, n$. For each sample point, sort the distances from all other sample points in ascending order and select the k nearest sample points as its neighbors. Then compute the similarity between the i-th sample point $x_i^v$ and the j-th sample point $x_j^v$ as follows:
Figure GDA0003308847840000026
where
Figure GDA0003308847840000027
denotes the distance between the sample point $x_i^v$ and its (k+1)-th nearest sample point, and k takes a value in the range 5 to 15. When i = j,
Figure GDA00033088478400000210
Taking $s_{ij}^v$ as the element in row i, column j gives the similarity matrix $S^v \in \mathbb{R}^{n \times n}$ of the v-th view, $v = 1, 2, \ldots, V$.
Step 2: adding the similarity matrixes of all V visual angles, and averaging to obtain an initial consistency similarity matrix S; then, according to LS=DSS is calculated to obtain an initial Laplace matrix LSWherein D isSIs a degree matrix, is a diagonal matrix, the ith diagonal element of which is
Figure GDA0003308847840000032
1,2, n; tag matrix F ═ Fl;Fu]T,Fl=Yl,Yl∈Rl×CA label matrix representing known samples, Fu∈Ru×CA label matrix representing unlabeled samples, wherein u is n-l, l is the number of known sample labels, and F is taken as the first C eigenvectors of the matrix S at the beginning; according to the formula theta v ii1/d (v) initialize the weight matrix Θ for the v-th view anglev,Θv∈Rd(v)×d(v)Is a diagonal matrix, Θv iiIs thetavI 1,2, d (V), V1, 2, 1.., V;
Step 3: Construct the multi-view semi-supervised classification model as follows:
Figure GDA0003308847840000033
where $s_{ij}$ is the element in row i, column j of the consistency similarity matrix S, $\|\cdot\|_F$ denotes the Frobenius norm of a matrix, $\theta^v$ denotes the vector formed by the diagonal elements of the weight matrix $\Theta^v$, $\mathbf{1}$ denotes a column vector whose elements are all 1, and $\gamma$ and $\lambda$ are regularization parameters with $\gamma > 0$ and $\lambda > 0$.
Step 4: Taking all the matrices obtained in step 2 as initial values, solve the semi-supervised classification model of step 3 by alternating iterative updates, as follows, until the final label matrix F is obtained:
Step 4.1: Fix Θ and F, and update S by solving:
Figure GDA0003308847840000034
where $s_i$ denotes the i-th row vector of the matrix S and $d_i$ denotes a vector whose j-th element is computed as:
Figure GDA0003308847840000035
where $f_i$ and $f_j$ denote the i-th and j-th row vectors of the matrix F, respectively, $i, j = 1, 2, \ldots, n$.
Step 4.2: Fix S and Θ, and update F.
First, update the degree matrix $D_S$ as follows:
Figure GDA0003308847840000041
where $D_{ii}$ is the i-th diagonal element of the diagonal matrix $D_S$, $i = 1, 2, \ldots, n$.
Then update the Laplacian matrix:
$L_S = D_S - S$    (6)
Partition the Laplacian matrix $L_S$ into blocks at row l and column l:
Figure GDA0003308847840000042
where $L_{ll}$ is a matrix of size $l \times l$, $L_{lu}$ is a matrix of size $l \times u$, $L_{ul}$ is a matrix of size $u \times l$, and $L_{uu}$ is a matrix of size $u \times u$.
Partition the consistency similarity matrix S and the degree matrix $D_S$ in the same way:
Figure GDA0003308847840000043
Figure GDA0003308847840000044
Update the label matrix $F_u$ of the unlabeled samples as follows:
$F_u = (I - P_{uu})^{-1} P_{ul} F_l$    (10)
where
Figure GDA0003308847840000045
Finally, update the label matrix F according to $F = [F_l; F_u]^T$.
Step 4.3: Fix F and S, and update Θ by solving:
Figure GDA0003308847840000046
where $\theta^v$ denotes the vector formed by the diagonal elements of the weight matrix $\Theta^v$, and $W^v$ is a diagonal matrix for the v-th view whose i-th diagonal element is
Figure GDA0003308847840000047
Figure GDA0003308847840000048
is the i-th diagonal element of the matrix $M^v$, with $M^v = (X^v)^T L_S X^v$.
Step 4.4: Iteration stopping test.
Substitute S, F, $L_S$, and Θ obtained in the previous update and in the current update, respectively, into the following objective function:
Figure GDA0003308847840000049
If the difference between the two objective function values Z is smaller than a set threshold, stop iterating; the current F is the final label matrix. Otherwise, return to step 4.1 and continue the iterative updates.
Step 5: Obtain the label of each sample as follows:
$y_i = \arg\max_{1 \le j \le C} F_{ij}, \quad i = 1, 2, \ldots, n$    (13)
where $y_i$ denotes the label of the i-th sample point
Figure GDA0003308847840000051
and $F_{ij}$ is the element in row i, column j of the final label matrix F obtained in step 4.
Samples with the same label are assigned to the same class, which gives the classification result.
Further, the regularization parameter γ in step 3 is
Figure GDA0003308847840000052
Further, the threshold set in step 4 is $10^{-8}$.
The beneficial effects of the invention are as follows. Because similarity-matrix construction is combined with label inference, labels are propagated while the similarity matrix is learned, which reduces the influence of graph-construction quality on the classification result. Because the feature dimensions within each view are weighted, the differences between dimensions within a view are taken into account and the relationships among them can be better exploited. Because the local structure of the data is considered, a better neighbor assignment is obtained, and hence better classification results.
Drawings
FIG. 1 is a flow chart of a semi-supervised classification method based on dimension weighting and view feature consistency according to the present invention;
FIG. 2 is a schematic diagram of a simulation data set;
In the figure: (a) first view of the simulated data set; (b) second view of the simulated data set.
Detailed Description
The invention is further described below with reference to the drawings and embodiments; the invention includes, but is not limited to, the following embodiments.
As shown in fig. 1, the present invention provides a semi-supervised classification method based on dimension weighting and view feature consistency, which is implemented as follows:
1. initializing a similarity matrix
Let $\mathcal{X} = \{X^1, X^2, \ldots, X^V\}$ denote a multi-view data set, where $X^v \in \mathbb{R}^{n \times d(v)}$ is the feature matrix of the v-th view, $v = 1, 2, \ldots, V$; V is the number of views, n is the number of samples, and d(v) is the feature dimensionality of the v-th view.
In Euclidean space, the closer two samples are, the higher their similarity, and the more likely they are to share the same output class. Furthermore, multi-view data sets exhibit complementarity and consistency. The initial similarity matrix $S^v$ of the v-th view can therefore be obtained by solving the following objective function:
Figure GDA0003308847840000061
where $s_{ij}^v$ denotes the element in row i, column j of the matrix $S^v$, $i, j = 1, 2, \ldots, n$.
The first term of equation (14) measures the correlation between the multi-view data; the second term is a regularization term that avoids the trivial solution in which, in the v-th view, the sample point nearest to $x_i^v$ is assigned probability 1 and all other sample points are assigned 0. To improve the efficiency of label propagation, the invention uses this model to construct a sparse similarity matrix. Specifically, for a sample point $x_i^v$, the distances from the other sample points are measured according to
Figure GDA0003308847840000065
and the sample points are sorted by distance in ascending order. The k nearest sample points are then selected as its neighbors, and weights are assigned by the k-nearest-neighbor rule: when j ≤ k,
Figure GDA0003308847840000066
and when j > k,
Figure GDA0003308847840000067
where j is the index of the sample point after sorting by distance,
Figure GDA0003308847840000068
denotes the distance between the (k+1)-th sample point after sorting and the sample point $x_i^v$, and k is a parameter set in advance, usually in the range 5 ≤ k ≤ 15.
In addition, the regularization parameter γ can be derived from the Lagrangian function of equation (14) together with the KKT conditions, giving
Figure GDA00033088478400000610
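The neighbor-based weight assignment described above can be sketched in Python. This is a minimal sketch under stated assumptions, since the source gives the formulas only as images: distances are taken to be squared Euclidean, and the weights follow the closed form commonly used in adaptive-neighbor graph learning, $s_{ij} = (d_{i,k+1} - d_{ij}) / (k\,d_{i,k+1} - \sum_{h=1}^{k} d_{ih})$ for the k nearest neighbors and 0 otherwise. The function name `adaptive_neighbor_similarity` and the uniform fallback for tied distances are illustrative choices, not part of the patent.

```python
import numpy as np

def adaptive_neighbor_similarity(X, k=5):
    """Build a sparse similarity matrix for one view.

    X is an (n, d) feature matrix and k the number of neighbors
    (requires n >= k + 2). Assumed, not taken from the source:
    squared Euclidean distances and the adaptive-neighbor closed form.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances, shape (n, n).
    sq = np.sum(X ** 2, axis=1)
    D = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    S = np.zeros((n, n))
    for i in range(n):
        d = D[i].copy()
        d[i] = np.inf                       # exclude the point itself, so s_ii = 0
        order = np.argsort(d)               # other points, nearest first
        nearest, d_k1 = order[:k], d[order[k]]
        denom = k * d_k1 - d[nearest].sum()
        if denom > 1e-12:
            S[i, nearest] = (d_k1 - d[nearest]) / denom
        else:
            S[i, nearest] = 1.0 / k         # all k+1 distances tie: uniform weights
    return S
```

Under these assumptions each row of the result is nonnegative, sums to 1, and has at most k nonzero entries, which is what makes the graph sparse.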
2. Construction of multi-view semi-supervised classification model
After the initial similarity matrix $S^v$ of each view is obtained, a consistency similarity matrix S must be learned for the multi-view data set, so the initial consistency similarity matrix is set to $S = \frac{1}{V} \sum_{v=1}^{V} S^v$. The degree matrix of S is denoted $D_S$; it is a diagonal matrix whose i-th diagonal element is
Figure GDA00033088478400000612
$i = 1, 2, \ldots, n$.
To propagate labels while learning the similarity matrix, the following model (15) is solved:
Figure GDA00033088478400000613
where $s_{ij}$ is the element in row i, column j of the consistency similarity matrix S, $\|\cdot\|_F$ denotes the Frobenius norm of a matrix, $\theta^v$ is the vector formed from the weight matrix $\Theta^v$, and $\gamma$ is a regularization parameter with $\gamma > 0$. $L_S$ is the Laplacian matrix, which is positive semidefinite and is initially computed as $L_S = D_S - S$. $F = [F_l; F_u]^T$ is the label matrix and consists of two parts: $F_l = Y_l$, where $Y_l \in \mathbb{R}^{l \times C}$ is the label matrix of the first l samples, and $F_u$, which holds the labels of the unlabeled samples. The initial value of F is obtained from the first C eigenvectors of the similarity matrix S above; $u = n - l$, and l is the number of labeled samples.
The first and second terms of the objective function above represent similarity-graph learning, and the third term represents label propagation. Combining them allows labels to be propagated while the similarity graph is constructed.
In the semi-supervised classification model above, only the "complementarity" and "consistency" information among the views is used when constructing the similarity matrix. However, within each view's features there are also differences between dimensions. To account for the influence of the different dimensions within a view on the classification result, the feature dimensions within each view can be weighted adaptively. Let the weight matrix of the v-th view be $\Theta^v$, where $\Theta^v \in \mathbb{R}^{d(v) \times d(v)}$ is a diagonal matrix whose diagonal elements are initialized as $\Theta^v_{ii} = 1/d(v)$. This gives the multi-view semi-supervised classification model based on dimension weighting and view feature consistency:
Figure GDA0003308847840000072
where $\theta^v$ denotes the vector formed by the diagonal elements of the weight matrix $\Theta^v$. In the formula above, Θ, F, and S are the unknowns to be learned; that is, equation (16) can be solved by an alternating iterative update algorithm.
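The initial quantities used by the model (the averaged consistency matrix, its degree and Laplacian matrices, and the uniform dimension weights) can be sketched as follows. This is an illustrative sketch: the function names are hypothetical, and the degree of node i is assumed to be the i-th row sum of S, because the source shows that formula only as an image.

```python
import numpy as np

def init_consistency_graph(S_views):
    """Average the per-view similarity matrices and form L_S = D_S - S."""
    S = np.mean(np.stack(S_views), axis=0)
    D = np.diag(S.sum(axis=1))   # assumption: degree = row sum of S
    return S, D, D - S

def init_dimension_weights(dims):
    """Uniform diagonal weights, Theta^v_ii = 1 / d(v), one matrix per view."""
    return [np.eye(d) / d for d in dims]
```

With row-sum degrees, every row of the Laplacian sums to zero, and each initial weight matrix has trace 1.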
3. Alternating iteration updating and solving multi-view semi-supervised classification model
Initial values of Θ, F, S, and $L_S$ have been obtained above; the final label matrix F is obtained by alternating iterative updates, as follows:
(1) Fix F and Θ, update S
When F and Θ are fixed, model (16) is equivalent to the following minimization problem:
Figure GDA0003308847840000073
Since
Figure GDA0003308847840000074
and
Figure GDA0003308847840000075
the above equation can be converted into:
Figure GDA0003308847840000081
Let
Figure GDA0003308847840000082
Since the problem is independent for each i, the above equation is equivalent to solving:
Figure GDA0003308847840000083
Forming the Lagrangian function of equation (19) and differentiating it with respect to $s_i$ gives
Figure GDA0003308847840000084
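One common way to solve a row-wise subproblem of this kind is a Euclidean projection onto the probability simplex. The sketch below assumes, since the source shows equation (19) only as an image, that the subproblem has the form $\min_{s_i} \|s_i + d_i / (2\gamma)\|_2^2$ subject to $s_i^T \mathbf{1} = 1$ and $s_{ij} \ge 0$; the function names are hypothetical and the sort-based projection is a standard technique, not necessarily the patent's closed form.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {s : s >= 0, sum(s) = 1}."""
    u = sorted(v, reverse=True)
    cumulative, shift = 0.0, 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        candidate = (cumulative - 1.0) / j
        if u_j - candidate > 0:      # holds for j = 1 .. rho, fails afterwards
            shift = candidate
    return [max(x - shift, 0.0) for x in v]

def update_similarity_row(d_i, gamma):
    """Assumed row update: project -d_i / (2 * gamma) onto the simplex."""
    return project_to_simplex([-d / (2.0 * gamma) for d in d_i])
```

Small distances receive large weights, and sufficiently distant points are clipped to exactly zero, so the updated row stays sparse.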
(2) Fix S and Θ, update F
When S and Θ are fixed, the first two terms of the objective function in equation (16) are constants, so the problem is equivalent to solving:
Figure GDA0003308847840000085
Write the matrices F, $D_S$, S, and $L_S = D_S - S$ in block form, i.e. $F = [F_l; F_u]^T$,
Figure GDA0003308847840000086
where $F_l$ is a matrix of size $l \times C$, $F_u$ is a matrix of size $u \times C$, $L_{ll}$ is a matrix of size $l \times l$, $L_{lu}$ is a matrix of size $l \times u$, $L_{ul}$ is a matrix of size $u \times l$, $L_{uu}$ is a matrix of size $u \times u$, and $u = n - l$.
Equation (20) can then be converted to:
Figure GDA0003308847840000087
Differentiating the Lagrangian function of equation (21) with respect to $F_u$ and setting the derivative to 0 gives $F_u = -L_{uu}^{-1} L_{ul} Y_l = (D_{uu} - S_{uu})^{-1} S_{ul} Y_l$, where the second equality uses $L_{uu} = D_{uu} - S_{uu}$ and $L_{ul} = -S_{ul}$ (the off-diagonal blocks of the diagonal matrix $D_S$ are zero). Suppose that
Figure GDA0003308847840000091
Then $F_u$ can be written as
Figure GDA0003308847840000092
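The label update for the unlabeled samples can be sketched as a single linear solve. The sketch assumes the first l rows of S correspond to the labeled samples and takes $P = D_S^{-1} S$ (row-normalized S), so that $F_u = (I - P_{uu})^{-1} P_{ul} Y_l$ matches $(D_{uu} - S_{uu})^{-1} S_{ul} Y_l$; the definition of P appears only as an image in the source, and the function name is hypothetical.

```python
import numpy as np

def update_unlabeled_labels(S, Y_l):
    """Solve (I - P_uu) F_u = P_ul Y_l with P = D^{-1} S.

    S   -- (n, n) consistency similarity matrix, labeled samples first
    Y_l -- (l, C) label matrix of the labeled samples
    """
    l = Y_l.shape[0]
    P = S / S.sum(axis=1, keepdims=True)    # assumption: P is row-normalized S
    P_uu, P_ul = P[l:, l:], P[l:, :l]
    u = P_uu.shape[0]
    # Solving the linear system avoids forming the explicit inverse.
    return np.linalg.solve(np.eye(u) - P_uu, P_ul @ Y_l)
```

For example, an unlabeled sample connected with equal weight to one labeled sample of each of two classes receives the scores (0.5, 0.5).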
(3) Fix F and S, update Θ
When F and S are fixed, the second and third terms of the objective function in equation (16) are constants, so the problem becomes:
Figure GDA0003308847840000093
Since the views are independent of one another, solving equation (22) is equivalent to solving:
Figure GDA0003308847840000094
where $(X^v)^T$ denotes the transpose of the v-th view's feature matrix, $M^v = (X^v)^T L_S X^v$, and $W^v$ is a diagonal matrix for the v-th view whose i-th diagonal element is
Figure GDA0003308847840000095
Figure GDA0003308847840000096
is the i-th diagonal element of the matrix $M^v$.
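A possible form of the dimension-weight update is sketched below. It rests on an assumption, because equations (22) and (23) are images in the source: the per-view subproblem is taken to be $\min_{\theta} \sum_i \theta_i^2 m_{ii}$ subject to $\sum_i \theta_i = 1$ and $\theta_i \ge 0$, whose Lagrangian solution is $\theta_i = (1/m_{ii}) / \sum_j (1/m_{jj})$. The function name and the `eps` guard against division by zero are illustrative.

```python
import numpy as np

def update_dimension_weights(X, L_S, eps=1e-12):
    """Return theta for one view under the assumed subproblem.

    X   -- (n, d) feature matrix of the view
    L_S -- (n, n) Laplacian of the consistency similarity matrix
    """
    # m_ii is the i-th diagonal element of M = X^T L_S X,
    # computed column by column without forming the full d x d matrix.
    m = np.sum(X * (L_S @ X), axis=0)
    inverse = 1.0 / np.maximum(m, eps)
    return inverse / inverse.sum()
```

Under this assumption, a feature dimension that varies little across strongly connected samples (small $m_{ii}$) receives a larger weight.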
(4) Substitute the matrices obtained in this update into the objective function of equation (16) to compute the objective function value, and subtract the value obtained in the previous iteration. If the difference is smaller than a set threshold, stop iterating; the F obtained is the final label matrix. Otherwise, return to step (1) for the next iteration. The threshold can generally be set to $10^{-8}$.
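The stopping rule in (4) amounts to a generic alternating loop, sketched here independently of the specific update formulas. The function name, the `state` container, and the `max_iter` safety cap are assumptions added for illustration; only the threshold of $10^{-8}$ comes from the text.

```python
def alternate_until_converged(state, update_steps, objective, tol=1e-8, max_iter=100):
    """Apply the update steps in turn until the objective stops changing.

    state        -- any object holding the current variables (e.g. S, F, Theta)
    update_steps -- callables state -> state, applied in order each iteration
    objective    -- callable state -> float
    max_iter     -- safety cap on iterations (an assumption, not from the source)
    """
    previous = objective(state)
    iterations = 0
    for iterations in range(1, max_iter + 1):
        for step in update_steps:
            state = step(state)
        current = objective(state)
        if abs(previous - current) < tol:
            break
        previous = current
    return state, iterations
```

In the setting above, `update_steps` would hold the S, F, and Θ updates in that order, and `objective` would evaluate equation (16).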
4. Sample classification
From the final label matrix F, the column index of the maximum value in each row is taken as the sample's label: if the maximum of row i is $F_{ik}$, then k is the label of the i-th sample. Samples with the same label are assigned to the same class, giving the final classification result.
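The label read-out can be sketched in a few lines. Class labels are 1-based to match the range 1 ≤ j ≤ C used in the text; the function names are hypothetical, and ties are broken in favor of the lowest column index, which the source does not specify.

```python
def labels_from_label_matrix(F):
    """Return 1-based class labels: y_i = argmax_j F[i][j]."""
    return [max(range(len(row)), key=lambda j: row[j]) + 1 for row in F]

def group_by_label(labels):
    """Map each label to the list of sample indices that carry it."""
    groups = {}
    for index, label in enumerate(labels):
        groups.setdefault(label, []).append(index)
    return groups
```

Grouping the sample indices by label gives the final classes directly.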
To verify the effect of the method, classification experiments were carried out on a simulated data set and on real data sets, and the results were evaluated with classification accuracy (ACC), mean average precision (Map), F-measure, precision (Precision), and recall (Re-call). A larger ACC means more samples are classified correctly; Map is the arithmetic mean of the average precision as used in information retrieval; F-measure is a trade-off between precision and recall. Fig. 2 shows the simulated data set for two views, and Table 1 gives a brief description of the real data sets, where #v1-#v6 denote the feature dimensions of the first through sixth views of each data set. MSRC-v1 is a data set of 240 images in 8 classes, seven of which were selected for the experiment: trees, buildings, airplanes, cattle, portraits, cars, and bicycles. With 30 images per class, this gives 210 images in total. Six features were extracted from these images: 48-dimensional color moments (CMT), 256-dimensional local binary pattern features (LBP), 100-dimensional histogram of oriented gradients (HOG), 200-dimensional scale-invariant features (SIFT), 512-dimensional GIST grayscale features, and 1320-dimensional CENTRIST features. HW is a handwritten digit image data set covering the digits 0 to 9, with 200 images per class and 2000 images in total.
Six features of the HW data set were extracted for classification: 240-dimensional pixel features (PIX) with a sliding window of size 2 × 3, 76-dimensional Fourier coefficient features (FOU), 216-dimensional contour correlation features (FAC), 47-dimensional transform coefficient features (ZER), 64-dimensional KAR coefficient features, and 6-dimensional morphological features (MOR). The Cal101 data set contains object recognition images in 101 categories. Two subsets were used: the first selects 7 widely used categories with 1474 images in total to form the Cal101-7 data set; the second selects 20 widely used categories with 2386 images in total to form the Cal101-20 data set. For these two data sets, six commonly used features were extracted: 48-dimensional Gabor features, 40-dimensional wavelet moments, 254-dimensional CENTRIST features, 1984-dimensional histogram of oriented gradients, 512-dimensional GIST features, and 928-dimensional local binary pattern features (LBP). For the simulated data set, the number of labeled samples per class is set to 1. The number of neighbors is set to 5 for all data sets, and the regularization parameter λ is taken from $\{10^{-3}, 10^{-2}, 10^{-1}, 10^{0}, 10^{1}, 10^{2}, 10^{3}\}$. For each value of λ the experiment was run 10 times, and the best result on each data set was taken as the final classification result; Gaussian Fields and Harmonic Functions (GFHF) was chosen as the comparison method. Table 2 gives the single-view and multi-view classification results on the simulated data set. Tables 3-6 give the classification results on the four real data sets MSRC-v1, HW, Cal101-7, and Cal101-20 with 10%, 20%, 30%, and 40% of the sample labels known, respectively. As Tables 2-6 show, on the simulated data set the method of the invention outperforms single-view classification.
In addition, on the four real data sets, the accuracy of the method increases gradually as the proportion of known sample labels increases, indicating a good classification effect.
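Several of the evaluation indexes named above can be computed as in the sketch below. The source does not say how precision and recall are averaged over classes, so macro averaging is an assumption here, as are the function name and the choice to compute F-measure as the harmonic mean of the averaged precision and recall.

```python
def classification_scores(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F-measure."""
    classes = sorted(set(y_true))
    precisions, recalls = [], []
    for c in classes:
        true_positive = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        predicted_c = sum(p == c for p in y_pred)
        actual_c = sum(t == c for t in y_true)
        precisions.append(true_positive / predicted_c if predicted_c else 0.0)
        recalls.append(true_positive / actual_c)
    precision = sum(precisions) / len(classes)
    recall = sum(recalls) / len(classes)
    total = precision + recall
    f_measure = 2.0 * precision * recall / total if total else 0.0
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return {"acc": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}
```

Other averaging conventions (for example, micro averaging or per-class F-measure) would give different numbers, so these scores are not guaranteed to be comparable with the tables that follow.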
TABLE 1
Figure GDA0003308847840000101
Figure GDA0003308847840000111
TABLE 2
Data set                     ACC(%)  Map(%)  F-measure(%)  Precision(%)  Re-call(%)
View 1 (GFHF)                85.35   87.29   75.79         79.08         72.77
View 2 (GFHF)                77.27   67.80   67.35         73.18         62.37
The method of the invention  97.98   98.68   96.00         96.02         95.98
TABLE 3
Data set   ACC(%)  Map(%)  F-measure(%)  Precision(%)  Re-call(%)
MSRC-v1    91.38   85.56   83.81         84.71         82.94
HW         97.48   95.74   95.01         95.09         94.93
Cal101-7   95.37   73.19   95.36         98.89         92.10
Cal101-20  84.13   55.80   86.56         91.34         82.27
TABLE 4
Data set   ACC(%)  Map(%)  F-measure(%)  Precision(%)  Re-call(%)
MSRC-v1    91.79   86.03   84.33         84.91         83.76
HW         97.74   96.14   95.53         95.57         95.49
Cal101-7   96.51   79.34   97.04         99.15         95.02
Cal101-20  87.60   62.87   89.92         92.15         87.82
TABLE 5
Data set   ACC(%)  Map(%)  F-measure(%)  Precision(%)  Re-call(%)
MSRC-v1    93.06   87.92   86.57         87.04         86.11
HW         98.03   96.67   96.09         96.12         96.06
Cal101-7   97.12   82.92   97.75         99.00         96.53
Cal101-20  88.51   65.20   90.81         92.22         89.46
TABLE 6
Data set   ACC(%)  Map(%)  F-measure(%)  Precision(%)  Re-call(%)
MSRC-v1    93.49   89.31   87.28         87.67         86.90
HW         98.03   96.62   96.11         96.13         96.08
Cal101-7   97.37   82.83   98.11         98.89         97.35
Cal101-20  89.23   67.33   91.41         92.35         90.50

Claims (3)

1. A semi-supervised classification method based on dimension weighting and visual angle feature consistency is characterized by comprising the following steps:
Step 1: Let $\mathcal{X} = \{X^1, X^2, \ldots, X^V\}$ denote a multi-view data set, where $X^v \in \mathbb{R}^{n \times d(v)}$ is the feature matrix of the v-th view, $v = 1, 2, \ldots, V$; V is the number of views, n is the number of samples, and d(v) is the feature dimensionality of the v-th view. The number of categories in the data set is C.
According to
Figure FDA0003308847830000012
compute the distance, in the v-th view, from the i-th sample point $x_i^v$ to the j-th sample point $x_j^v$, $i, j = 1, 2, \ldots, n$. For each sample point, sort the distances from all other sample points in ascending order and select the k nearest sample points as its neighbors. Then compute the similarity between the i-th sample point $x_i^v$ and the j-th sample point $x_j^v$ as follows:
Figure FDA0003308847830000017
where
Figure FDA0003308847830000018
denotes the distance between the sample point $x_i^v$ and its (k+1)-th nearest sample point, and k takes a value in the range 5 to 15. When i = j,
Figure FDA00033088478300000111
Taking $s_{ij}^v$ as the element in row i, column j gives the similarity matrix $S^v \in \mathbb{R}^{n \times n}$ of the v-th view, $v = 1, 2, \ldots, V$.
Step 2: Sum the similarity matrices of all V views and average them to obtain the initial consistency similarity matrix S; then compute the initial Laplacian matrix $L_S$ according to $L_S = D_S - S$, where $D_S$ is the degree matrix, a diagonal matrix whose i-th diagonal element is
Figure FDA00033088478300000113
The label matrix is $F = [F_l; F_u]^T$ with $F_l = Y_l$, where $Y_l \in \mathbb{R}^{l \times C}$ is the label matrix of the samples with known labels, $F_u \in \mathbb{R}^{u \times C}$ is the label matrix of the unlabeled samples, $u = n - l$, and l is the number of samples with known labels; F is initialized as the first C eigenvectors of the matrix S; according to $\Theta^v_{ii} = 1/d(v)$, initialize the weight matrix $\Theta^v$ of the v-th view, where $\Theta^v \in \mathbb{R}^{d(v) \times d(v)}$ is a diagonal matrix and $\Theta^v_{ii}$ is the i-th diagonal element of $\Theta^v$, $i = 1, 2, \ldots, d(v)$, $v = 1, 2, \ldots, V$;
Step 3: Construct the multi-view semi-supervised classification model as follows:
Figure FDA00033088478300000114
where $s_{ij}$ denotes the element in row i, column j of the consistency similarity matrix S, $\|\cdot\|_F$ denotes the Frobenius norm of a matrix, $\theta^v$ denotes the vector formed by the diagonal elements of the weight matrix $\Theta^v$, $\mathbf{1}$ denotes a column vector whose elements are all 1, and $\gamma$ and $\lambda$ are regularization parameters with $\gamma > 0$, $\lambda > 0$;
Step 4: Taking all the matrices obtained in step 2 as initial values, solve the semi-supervised classification model of step 3 by alternating iteration, according to the following process, until the final label matrix F is obtained:
Step 4.1: fix $\Theta$ and F, and update S by solving the following formula:
Figure FDA0003308847830000021
where $s_i$ denotes the i-th row vector of the matrix S, and $d_i$ denotes a vector whose j-th element is calculated according to the following formula:
Figure FDA0003308847830000022
where $f_i$ and $f_j$ denote the i-th and j-th row vectors of the matrix F, respectively;
Step 4.2: fix S and $\Theta$, and update F:
First, update the degree matrix $D_S$ as follows:
Figure FDA0003308847830000023
where $D_{ii}$ is the i-th diagonal element of the diagonal matrix $D_S$, $i = 1, 2, \ldots, n$;
Then update the Laplacian matrix as follows:
$L_S = D_S - S$ (6)
Partition the Laplacian matrix $L_S$ into blocks at row l and column l:
$L_S = \begin{bmatrix} L_{ll} & L_{lu} \\ L_{ul} & L_{uu} \end{bmatrix}$ (7)
where $L_{ll}$ denotes a matrix of size $l \times l$, $L_{lu}$ a matrix of size $l \times u$, $L_{ul}$ a matrix of size $u \times l$, and $L_{uu}$ a matrix of size $u \times u$;
Partition the consistency similarity matrix S and the degree matrix $D_S$ into blocks:
Figure FDA0003308847830000025
Figure FDA0003308847830000026
Update the label matrix $F_u$ of the unlabeled samples as follows:
$F_u = (I - P_{uu})^{-1} P_{ul} F_l$ (10)
where
Figure FDA0003308847830000031
Finally, update the label matrix F according to $F = [F_l; F_u]^T$;
Step 4.3: fix F and S, and update $\Theta$ by solving the following formula:
Figure FDA0003308847830000032
where $\theta^v$ denotes the vector formed by the diagonal elements of the weight matrix $\Theta^v$, and $W^v$ is a diagonal matrix for the v-th view whose i-th diagonal element is
Figure FDA0003308847830000033
where
Figure FDA0003308847830000034
is the i-th diagonal element of the matrix $M^v$, $M^v = (X^v)^T L_S X^v$;
Step 4.4: iteration stop check:
Substitute the S, F, $L_S$ and $\Theta$ obtained in the previous update and in the current update, respectively, into the following objective function:
Figure FDA0003308847830000035
If the difference between the two resulting objective function values Z is smaller than a set threshold, stop iterating, and the F at that point is the final label matrix F; otherwise, return to step 4.1 and continue the iterative update;
Step 5: Obtain the label of each sample as follows:
$y_i = \arg\max_{1 \le j \le C} F_{ij}, \quad i = 1, 2, \ldots, n$ (13)
where $y_i$ denotes the label of the i-th sample point
Figure FDA0003308847830000036
and $F_{ij}$ denotes the element in row i, column j of the final label matrix F obtained in step 4;
Samples with the same label are then grouped into one class to obtain the classification result.
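The following is a minimal NumPy sketch of parts of claim 1, not the patented procedure itself. It builds a k-nearest-neighbor similarity matrix per view (step 1), averages the views (step 2), performs a single label update of the form of equation (10) (step 4.2), and reads off labels by argmax (step 5). Several pieces are assumptions, because the corresponding formula images are not reproduced in this text: the squared Euclidean distance, the closed-form neighbor weights in `knn_similarity`, the row-sum degree matrix, and $P = D_S^{-1} S$. The updates of S and Θ (steps 4.1 and 4.3) are omitted, and all function names are hypothetical.

```python
import numpy as np


def knn_similarity(X, k=5):
    """k-NN similarity matrix for one view; rows sum to 1 (assumed weighting)."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    # Squared Euclidean distance between every pair of rows (assumption).
    dist = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    np.fill_diagonal(dist, np.inf)  # a point is never its own neighbor
    S = np.zeros((n, n))
    for i in range(n):
        order = np.argsort(dist[i])
        nbrs, d_k1 = order[:k], dist[i, order[k]]  # k nearest, (k+1)-th distance
        denom = k * d_k1 - dist[i, nbrs].sum()
        # Uniform weights if the k+1 nearest distances all coincide.
        S[i, nbrs] = (d_k1 - dist[i, nbrs]) / denom if denom > 1e-12 else 1.0 / k
    return S


def propagate_labels(S, Y_l):
    """F_u = (I - P_uu)^{-1} P_ul F_l with P = D^{-1} S (assumed); labeled rows first."""
    l = Y_l.shape[0]
    deg = S.sum(axis=1)              # diagonal of the degree matrix (row sums)
    P = S / deg[:, None]
    P_uu, P_ul = P[l:, l:], P[l:, :l]
    F_u = np.linalg.solve(np.eye(P_uu.shape[0]) - P_uu, P_ul @ Y_l)
    return np.vstack([Y_l, F_u])


def predict(F):
    """Step 5: label of each sample is the column with the largest score."""
    return np.argmax(F, axis=1)
```

For example, with `views` a list of n × d(v) arrays whose first l rows are the labeled samples and `Y_l` their one-hot labels, `predict(propagate_labels(np.mean([knn_similarity(X, k) for X in views], axis=0), Y_l))` returns one class index per sample. The linear solve requires that every unlabeled sample can reach a labeled one through the graph.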
2. The semi-supervised classification method based on dimension weighting and visual angle feature consistency as claimed in claim 1, wherein the regularization parameter $\gamma$ in step 3 is
Figure FDA0003308847830000037
3. The semi-supervised classification method based on dimension weighting and visual angle feature consistency as claimed in claim 1, wherein the threshold in step 4 is set to $10^{-8}$.
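Claim 3 fixes the stopping threshold of step 4.4 at $10^{-8}$. A generic sketch of such a stopping rule is shown below; `update` and `objective` are hypothetical callables standing in for one pass of steps 4.1 to 4.3 and for the objective Z, and the `max_iter` safeguard is an addition not found in the claims.

```python
def alternate_until_converged(update, objective, state, tol=1e-8, max_iter=1000):
    """Repeat `update` until the objective changes by less than `tol`."""
    prev = objective(state)
    for iteration in range(1, max_iter + 1):
        state = update(state)
        curr = objective(state)
        # Stop when two consecutive objective values are closer than tol.
        if abs(prev - curr) < tol:
            return state, iteration
        prev = curr
    return state, max_iter


# Toy stand-ins: halve a scalar and use its square as the "objective".
final_state, n_iter = alternate_until_converged(
    lambda x: 0.5 * x, lambda x: x * x, 1.0
)
```

In the toy run the objective shrinks geometrically, so the loop ends well before `max_iter`; in the method itself the state would bundle S, F, $L_S$ and Θ.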
CN202010416737.7A 2020-05-17 2020-05-17 Semi-supervised classification method based on dimension weighting and visual angle feature consistency Active CN111639686B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010416737.7A CN111639686B (en) 2020-05-17 2020-05-17 Semi-supervised classification method based on dimension weighting and visual angle feature consistency

Publications (2)

Publication Number Publication Date
CN111639686A CN111639686A (en) 2020-09-08
CN111639686B true CN111639686B (en) 2022-03-15

Family

ID=72332795

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010416737.7A Active CN111639686B (en) 2020-05-17 2020-05-17 Semi-supervised classification method based on dimension weighting and visual angle feature consistency

Country Status (1)

Country Link
CN (1) CN111639686B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114399649B (en) * 2021-11-30 2023-09-19 西安交通大学 Rapid multi-view semi-supervised learning method and system based on learning graph
CN117274726B (en) * 2023-11-23 2024-02-23 南京信息工程大学 Picture classification method and system based on multi-view supplementary tag

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106228027A (en) * 2016-08-26 2016-12-14 西北大学 A kind of semi-supervised feature selection approach of various visual angles data
CN108280163A (en) * 2018-01-18 2018-07-13 厦门美图之家科技有限公司 Video features learning method, device, electronic equipment and readable storage medium storing program for executing
CN110334777A (en) * 2019-07-15 2019-10-15 广西师范大学 A kind of unsupervised attribute selection method of weighting multi-angle of view
CN110941734A (en) * 2019-11-07 2020-03-31 南京理工大学 Depth unsupervised image retrieval method based on sparse graph structure
CN111160387A (en) * 2019-11-28 2020-05-15 广东工业大学 Graph model based on multi-view dictionary learning

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10436896B2 (en) * 2015-11-29 2019-10-08 Vayyar Imaging Ltd. System, device and method for imaging of objects using signal clustering
US11275900B2 (en) * 2018-05-09 2022-03-15 Arizona Board Of Regents On Behalf Of Arizona State University Systems and methods for automatically assigning one or more labels to discussion topics shown in online forums on the dark web

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
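Semi-Supervised Learning with Auto-Weighting Feature and Adaptive Graph; Feiping Nie et al.; IEEE Transactions on Knowledge and Data Engineering; 2019-02-26; Vol. 32, No. 6; pp. 1167-1178 *
Research on Multi-View Semi-Supervised Feature Selection Algorithms (基于多视角的半监督特征选择算法研究); 汪荆琪 (Wang Jingqi); China Excellent Doctoral and Master's Theses Full-text Database (Master), Information Science and Technology; 2014-10-15; No. 10; pp. I140-35 *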

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant