CN111639686B - Semi-supervised classification method based on dimension weighting and visual angle feature consistency - Google Patents
- Publication number
- CN111639686B
- Authority
- CN
- China
- Prior art keywords
- matrix
- label
- representing
- view
- semi
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/906—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/2155—Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Databases & Information Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a semi-supervised classification method based on dimension weighting and view feature consistency. First, a similarity matrix is constructed for each view of the multi-view data using an adaptive local structure learning method. Next, the average of the similarity matrices of all views is taken as the initial consistency similarity matrix, and a multi-view semi-supervised classification model based on dimension weighting and view feature consistency is constructed. The model is then solved by alternating iterative updates until the final label matrix is obtained. Finally, the label of each sample is read from the label matrix, which completes the classification. Because the model couples the construction of the similarity matrix with label inference, the influence of graph construction quality on the classification result is reduced; because the feature dimensions within each view are weighted and the local structure of the data is taken into account, better classification results can be obtained.
Description
Technical Field
The invention belongs to the technical field of machine learning and data mining, and particularly relates to a semi-supervised classification method based on dimension weighting and view angle feature consistency.
Background
With the advent of the big data era, information in many real-world scenarios can be obtained through different channels, angles, modalities, and features. For such multi-source data, how to fuse the information efficiently and accurately to complete a specific task is an important research question in practice.
Multi-view learning describes an object of study from multiple views and then integrates the information from those views for learning, under the assumption that multi-view data sets exhibit "complementarity" and "consistency". Semi-supervised classification trains a classifier on a small number of labeled samples together with unlabeled samples, and then uses the learned classifier to infer the labels of the unlabeled samples. In practice, a multi-view data set usually comes with only a few labels, and labeling it consumes considerable manpower and material resources. Labeling the unlabeled samples from a small number of labels by combining information from multiple views therefore has significant research value.
Traditional multi-view semi-supervised classification methods fall into three main categories: 1) co-training; 2) graph-based multi-view semi-supervised classification; 3) regression-based multi-view semi-supervised classification. In graph-based semi-supervised classification, samples are the nodes of a graph, and the similarity between any two nodes is the weight of the edge between them. The graph-based semi-supervised learning process is thus equivalent to a coloring process in which labels spread along the edges. Nie et al., in "F. Nie, J. Li, and X. Li, Parameter-free auto-weighted multiple graph learning: A framework for multiview clustering and semi-supervised classification, in Proc. IJCAI, 2016, pp. 1881-1887", first construct a similarity matrix and then infer the labels of the unlabeled samples from the labeled sample information and the constructed similarity graph. Yang et al., in "M. Yang, C. Deng, and F. Nie, Adaptive-weighting discriminative regression for multi-view classification, Pattern Recognition, vol. 88, pp. 236-245, 2019", classify multi-view data sets using adaptive discriminative regression. Considering that real data sets have only a small number of labels, Tao et al., in "H. Tao, C. Hou, F. Nie, J. Zhu, and D. Yi, Scalable Multi-View Semi-Supervised Classification via Adaptive Regression, IEEE Transactions on Image Processing, vol. 26, no. 9, pp. 4283-4296", establish a regression-based semi-supervised classification model for each view. To make the model robust to noise and outliers, that method also uses the $L_{2,1}$ norm, and, since each view contributes differently to the classification result, the model assigns view weights adaptively. However, these regression-based methods only consider a linear relationship between samples and labels and cannot capture nonlinear relationships.
In graph-based semi-supervised classification, the quality of the constructed similarity graph strongly affects the final classification result; because graph construction and label inference are treated as two separate steps, the relationship between them is ignored. In addition, these methods only consider differences between the features of different views and ignore differences between the dimensions within a view, and thereby ignore the local structure of the data. The classification accuracy of these methods suffers as a result.
Disclosure of Invention
To overcome the shortcomings of the prior art, the invention provides a semi-supervised classification method based on dimension weighting and view feature consistency. First, a similarity matrix is constructed for each view of the multi-view data using an adaptive local structure learning method. Next, the average of the similarity matrices of all views is taken as the initial consistency similarity matrix, and a multi-view semi-supervised classification model based on dimension weighting and view feature consistency is constructed. The model is then solved by alternating iterative updates until the final label matrix is obtained. Finally, the label of each sample is read from the label matrix, which completes the classification. The method learns the similarity matrix and propagates labels at the same time, which reduces the dependence of label propagation on the quality of the similarity matrix. In addition, weighting the feature dimensions within each view when the similarity matrix is updated improves the semi-supervised classification result.
A semi-supervised classification method based on dimension weighting and visual angle feature consistency is characterized by comprising the following steps:
Step 1: Let $\chi = \{X^1, X^2, \ldots, X^V\}$ denote a multi-view data set, where $X^v \in R^{n \times d(v)}$ represents the features of the v-th view, v = 1, 2, ..., V, V is the number of views, n is the number of samples, and d(v) is the dimensionality of the v-th view features; the number of categories of the data set is C;
The distance from the i-th sample point $x_i^v$ to the j-th sample point $x_j^v$ in the v-th view is calculated, i, j = 1, 2, ..., n; for each sample point, the distances from all other sample points to it are sorted in ascending order, and the k sample points with the smallest distances are selected as its neighbors; then the similarity $s_{ij}^v$ between the i-th sample point $x_i^v$ and the j-th sample point $x_j^v$ is calculated as follows:
where the reference distance in the formula is the distance between sample point $x_i^v$ and its (k+1)-th nearest sample point; the value range of k is 5 to 15; when i = j, the similarity is 0;
Taking $s_{ij}^v$ as the element in row i, column j, the similarity matrix $S^v \in R^{n \times n}$ of the v-th view is obtained, v = 1, 2, ..., V;
Step 2: Add the similarity matrices of all V views and average them to obtain the initial consistency similarity matrix S; then calculate the initial Laplacian matrix $L_S$ according to $L_S = D_S - S$, where $D_S$ is the degree matrix of S, a diagonal matrix whose i-th diagonal element is the degree of the i-th sample, i = 1, 2, ..., n; the label matrix is $F = [F_l; F_u]^T$ with $F_l = Y_l$, where $Y_l \in R^{l \times C}$ is the label matrix of the samples with known labels, $F_u \in R^{u \times C}$ is the label matrix of the unlabeled samples, u = n - l, and l is the number of samples with known labels; F is initialized with the first C eigenvectors of the matrix S; the weight matrix $\Theta^v$ of the v-th view is initialized according to $\Theta^v_{ii} = 1/d(v)$, where $\Theta^v \in R^{d(v) \times d(v)}$ is a diagonal matrix and $\Theta^v_{ii}$ is its i-th diagonal element, i = 1, 2, ..., d(v), v = 1, 2, ..., V;
Step 3: The multi-view semi-supervised classification model is constructed as follows:
where $s_{ij}$ denotes the element in row i, column j of the consistency similarity matrix S, $\|\cdot\|_F$ denotes the Frobenius norm of a matrix, $\theta^v$ denotes the vector formed by the diagonal elements of the weight matrix $\Theta^v$, $\mathbf{1}$ denotes a column vector with all elements equal to 1, and γ and λ are regularization parameters, γ > 0, λ > 0;
Step 4: Taking all the matrices obtained in Step 2 as initial values, solve the semi-supervised classification model of Step 3 by alternating iteration according to the following process until the final label matrix F is obtained:
Step 4.1: Fix Θ and F, and update S by solving the following formula:
where $s_i$ denotes the i-th row vector of the matrix S and $d_i$ denotes a vector whose j-th element is calculated according to the following equation:
where $f_i$ and $f_j$ denote the i-th and j-th row vectors of the matrix F, respectively, i, j = 1, 2, ..., n;
Step 4.2: Fix S and Θ, and update F:
First, the degree matrix $D_S$ is updated as follows:
where $D_{ii}$ is the i-th diagonal element of the diagonal matrix $D_S$, i = 1, 2, ..., n;
The Laplacian matrix is then updated as follows:
$L_S = D_S - S$ (6)
The Laplacian matrix $L_S$ is partitioned into blocks at row l and column l:
where $L_{ll}$ is a matrix of size l × l, $L_{lu}$ is a matrix of size l × u, $L_{ul}$ is a matrix of size u × l, and $L_{uu}$ is a matrix of size u × u;
The consistency similarity matrix S and the degree matrix $D_S$ are partitioned in the same way:
The label matrix $F_u$ of the unlabeled samples is updated as follows:
$F_u = (I - P_{uu})^{-1} P_{ul} F_l$ (10)
Finally, the label matrix F is updated according to $F = [F_l; F_u]^T$;
Step 4.3: Fix F and S, and update Θ by solving the following formula:
where $\theta^v$ denotes the vector formed by the diagonal elements of the weight matrix $\Theta^v$, and $W^v$ is a diagonal matrix for the v-th view whose i-th diagonal element is determined by the i-th diagonal element of the matrix $M^v$, with $M^v = (X^v)^T L_S X^v$;
Step 4.4: Check the stopping condition:
The values of S, F, $L_S$, and Θ obtained in the previous update and in the current update are each substituted into the following objective function:
If the difference between the two resulting objective function values Z is smaller than a set threshold, the iteration stops and the current F is the final label matrix; otherwise, return to Step 4.1 and continue the iterative update;
Step 5: The label of each sample is obtained as follows:
$y_i = \arg\max_{1 \le j \le C} F_{ij}, \quad i = 1, 2, \ldots, n$ (13)
where $y_i$ denotes the label of the i-th sample point and $F_{ij}$ denotes the element in row i, column j of the final label matrix F obtained in Step 4;
Samples with the same label are grouped into one class to obtain the classification result.
Further, the threshold set in Step 4 is $10^{-8}$.
The beneficial effects of the invention are as follows. Because the construction of the similarity matrix is combined with label inference, labels are propagated while the similarity matrix is learned, which reduces the influence of graph construction quality on the classification result. Because the feature dimensions within each view are weighted, differences among the dimensions within a view are taken into account and the relationships among those feature dimensions can be better exploited. Because the local structure of the data is considered, a better neighborhood assignment is obtained, and hence better classification results.
Drawings
FIG. 1 is a flow chart of a semi-supervised classification method based on dimension weighting and view feature consistency according to the present invention;
FIG. 2 is a schematic diagram of a simulation data set;
In the figure: (a) the first view of the simulated data set; (b) the second view of the simulated data set.
Detailed Description
The invention is further described below with reference to the drawings and examples; the invention includes, but is not limited to, the following examples.
As shown in fig. 1, the present invention provides a semi-supervised classification method based on dimension weighting and view feature consistency, which is implemented as follows:
1. initializing a similarity matrix
Let $\chi = \{X^1, X^2, \ldots, X^V\}$ denote a multi-view data set, where $X^v \in R^{n \times d(v)}$ represents the features of the v-th view, v = 1, 2, ..., V, V is the number of views, n is the number of samples, and d(v) is the dimensionality of the v-th view features.
In Euclidean space, the closer two samples are, the higher their similarity and the more likely they are to belong to the same output class. Furthermore, multi-view data sets exhibit complementarity and consistency. The initial similarity matrix $S^v$ of the v-th view can therefore be obtained by solving the following objective function:
where $s_{ij}^v$ denotes the element in row i, column j of the matrix $S^v$, i, j = 1, 2, ..., n.
The first term of equation (14) measures the correlation between the multi-view data sets; the second term is a regularization term that avoids the trivial solution in which, in the v-th view, the sample point nearest to $x_i^v$ is assigned probability 1 and all other sample points are assigned 0. To improve the efficiency of label propagation, the invention uses this model to construct a sparse similarity matrix. Specifically, for a sample point $x_i^v$, the distances from the other sample points to it are measured and sorted in ascending order, the k sample points with the smallest distances are selected as its neighbors, and weights are assigned by the k-nearest-neighbor method: a sample point with rank j ≤ k after sorting receives a weight computed from its distance, and a sample point with rank j > k receives weight 0, where j is the rank of the sample point after sorting by distance and the weight also depends on the distance between $x_i^v$ and the (k+1)-th sample point after sorting. k is a parameter set in advance, usually in the range 5 ≤ k ≤ 15.
In addition, the regularization parameter γ can be determined by forming the Lagrangian function of equation (14) and applying the KKT conditions.
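The neighbor-weight assignment described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes squared Euclidean distances and the closed-form weight $s_{ij} = (d_{i,k+1} - d_{ij}) / (k\,d_{i,k+1} - \sum_{h=1}^{k} d_{ih})$ that is commonly used in adaptive-neighbor graph learning. The function name `adaptive_knn_similarity` and the uniform fallback for tied distances are hypothetical choices introduced here.

```python
import numpy as np

def adaptive_knn_similarity(X, k=5):
    """Sparse similarity matrix for one view; X has shape (n, d), n >= k + 2.

    Assumption: squared Euclidean distances and the closed-form
    k-nearest-neighbor weights stated in the lead-in.
    """
    n = X.shape[0]
    sq = np.sum(X * X, axis=1)
    # Pairwise squared Euclidean distances, clipped to avoid tiny negatives.
    D = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    S = np.zeros((n, n))
    for i in range(n):
        d = D[i].copy()
        d[i] = np.inf                  # a point is never its own neighbor
        order = np.argsort(d)          # ascending distance
        nbrs = order[:k]               # the k nearest neighbors
        d_k1 = d[order[k]]             # distance to the (k+1)-th neighbor
        denom = k * d_k1 - d[nbrs].sum()
        if denom > 1e-12:
            S[i, nbrs] = (d_k1 - d[nbrs]) / denom
        else:
            S[i, nbrs] = 1.0 / k       # tied distances: uniform weights
    return S
```

Under these assumptions every row is non-negative, sums to 1, and has at most k nonzero entries, which is what makes the resulting graph sparse.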
2. Construction of multi-view semi-supervised classification model
After the initial similarity matrix $S^v$ of each view is obtained, a consistency similarity matrix S needs to be learned for the multi-view data set, so the initial consistency similarity matrix is set to the average of the similarity matrices of all views, $S = \frac{1}{V}\sum_{v=1}^{V} S^v$. The degree matrix of S is denoted $D_S$; it is a diagonal matrix whose i-th diagonal element is the degree of the i-th sample, i = 1, 2, ..., n.
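A short NumPy sketch of this initialization follows. It assumes that the degree of sample i is the i-th row sum of S, which is a common convention but is not confirmed by the text here; `initial_consensus` is a hypothetical helper name.

```python
import numpy as np

def initial_consensus(S_views):
    """Average per-view similarity matrices and build D_S and L_S = D_S - S.

    Assumption: the i-th diagonal element of D_S is the i-th row sum of S.
    """
    S = np.mean(np.stack(S_views), axis=0)   # element-wise average over views
    D = np.diag(S.sum(axis=1))               # diagonal degree matrix
    L = D - S                                # graph Laplacian
    return S, D, L
```

With row-sum degrees, every row of the Laplacian sums to zero, which is a quick sanity check on the construction.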
To enable label propagation while the similarity matrix is learned, the following model (15) is solved:
where $s_{ij}$ denotes the element in row i, column j of the consistency similarity matrix S, $\|\cdot\|_F$ denotes the Frobenius norm of a matrix, $\theta^v$ is the vector formed by the diagonal elements of the weight matrix $\Theta^v$, and γ is a regularization parameter, γ > 0; $L_S$ is the Laplacian matrix, which is positive semidefinite and is initially calculated as $L_S = D_S - S$. $F = [F_l; F_u]^T$ is the label matrix, which consists of two parts: $F_l = Y_l$, where $Y_l \in R^{l \times C}$ is the label matrix of the first l samples, and $F_u$ holds the labels of the unlabeled samples. The initial value of F is obtained from the first C eigenvectors of the similarity matrix S above; u = n - l, where l is the number of samples with known labels.
The first and second terms of the objective function above represent the learning of the similarity graph, and the third term represents label propagation. Combining them allows labels to be propagated while the similarity graph is constructed.
In the semi-supervised classification model described above, only the "complementarity" and "consistency" information among the views is used to construct the similarity matrix. However, the different dimensions of each view's features also differ from one another. To take into account the influence of the individual dimensions within a view on the classification result, the feature dimensions within each view can be weighted adaptively. Let the weight matrix of the v-th view be $\Theta^v$, where $\Theta^v \in R^{d(v) \times d(v)}$ is a diagonal matrix whose diagonal elements are initially calculated according to $\Theta^v_{ii} = 1/d(v)$. The multi-view semi-supervised classification model based on dimension weighting and view feature consistency is then obtained as follows:
where $\theta^v$ denotes the vector formed by the diagonal elements of the weight matrix $\Theta^v$. In the formula above, Θ, F, and S are the unknowns and can be obtained by learning; that is, equation (16) can be solved by an alternating iterative update algorithm.
3. Alternating iteration updating and solving multi-view semi-supervised classification model
Starting from the initial values of Θ, F, S, and $L_S$ obtained above, the final label matrix F is obtained by alternating iterative updates, as follows:
(1) Fix F and Θ, update S
When F and Θ are fixed, the model (16) is equivalent to the following minimization problem:
Since the problem is independent for each i, the above equation is equivalent to solving the following problem for each i:
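Row-wise subproblems of this kind are often simplex-constrained least-squares problems. The sketch below assumes, without confirmation from the text, that the subproblem for row i has the form $\min_{s_i} \|s_i + d_i/(2\gamma)\|_2^2$ subject to $s_i \ge 0$ and $s_i^T \mathbf{1} = 1$, and solves it by Euclidean projection onto the probability simplex using the standard sort-based algorithm. The names `project_to_simplex` and `update_similarity_row` are hypothetical.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of vector v onto {s : s >= 0, sum(s) = 1}."""
    u = np.sort(v)[::-1]                     # entries in descending order
    css = np.cumsum(u) - 1.0                 # shifted cumulative sums
    idx = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    tau = css[rho] / (rho + 1.0)             # threshold enforcing sum = 1
    return np.maximum(v - tau, 0.0)

def update_similarity_row(d_i, gamma):
    """Assumed row update: project -d_i / (2 * gamma) onto the simplex."""
    return project_to_simplex(-np.asarray(d_i, dtype=float) / (2.0 * gamma))
```

For example, with `d_i = [0, 1, 10]` and `gamma = 1` the projection gives `[0.75, 0.25, 0.0]`: the far-away third point receives zero weight, which is how the update keeps S sparse.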
(2) Fix S and Θ, update F
When S and Θ are fixed, the first two terms of the objective function in equation (16) are constants, so the model is equivalent to solving the following problem:
The matrices F, $D_S$, S, and $L_S = D_S - S$ are each written in block form, i.e., $F = [F_l; F_u]^T$,
where $F_l$ is a matrix of size l × C, $F_u$ is a matrix of size u × C, $L_{ll}$ is a matrix of size l × l, $L_{lu}$ is a matrix of size l × u, $L_{ul}$ is a matrix of size u × l, $L_{uu}$ is a matrix of size u × u, and u = n - l;
Equation (20) can then be converted to:
Taking the derivative of the Lagrangian function of equation (21) with respect to $F_u$ and setting it to 0 gives $F_u = -L_{uu}^{-1} L_{ul} Y_l = (D_{uu} - S_{uu})^{-1} S_{ul} Y_l$, where the second equality uses $L_{ul} = -S_{ul}$ because $D_S$ is diagonal. Letting $P = D_S^{-1} S$, $F_u$ can be written as $F_u = (I - P_{uu})^{-1} P_{ul} F_l$.
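The closed-form update of $F_u$ can be sketched as follows. The sketch assumes that the labeled samples occupy the first l rows, that the degree of each sample is the row sum of S, and that every unlabeled sample is connected, directly or indirectly, to a labeled one so that $I - P_{uu}$ is invertible; `propagate_labels` is a hypothetical name.

```python
import numpy as np

def propagate_labels(S, Y_l):
    """Return F = [Y_l; F_u] with F_u = (I - P_uu)^{-1} P_ul Y_l.

    Assumptions: labeled samples come first, degrees are row sums of S,
    and all degrees are positive.
    """
    n, l = S.shape[0], Y_l.shape[0]
    deg = S.sum(axis=1)
    P = S / deg[:, None]                 # P = D_S^{-1} S
    P_uu = P[l:, l:]                     # unlabeled-to-unlabeled block
    P_ul = P[l:, :l]                     # unlabeled-to-labeled block
    # Solve the linear system instead of forming the inverse explicitly.
    F_u = np.linalg.solve(np.eye(n - l) - P_uu, P_ul @ Y_l)
    return np.vstack([Y_l, F_u])
```

Solving the linear system rather than inverting $I - P_{uu}$ is a numerical design choice of this sketch; the result is the same matrix $F_u$.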
(3) Fix F and S, update Θ
When F and S are fixed, the second and third terms of the objective function in equation (16) are constants, so the model reduces to solving the following problem:
Since each view v is independent, solving equation (22) is equivalent to solving:
where $(X^v)^T$ denotes the transpose of the feature matrix of the v-th view, $M^v = (X^v)^T L_S X^v$, and $W^v$ is a diagonal matrix for the v-th view whose i-th diagonal element is determined by the i-th diagonal element of the matrix $M^v$.
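One way such a per-view weight update can look is sketched below. It rests on an assumption that is not confirmed by the text: that the subproblem is $\min_{\theta^v} \sum_i (\theta^v_i)^2 m_{ii}$ subject to $(\theta^v)^T \mathbf{1} = 1$, where $m_{ii}$ is the i-th diagonal element of $M^v = (X^v)^T L_S X^v$. Under that assumption the Lagrangian gives $\theta^v_i \propto 1/m_{ii}$. The name `update_dimension_weights` and the `eps` guard are hypothetical.

```python
import numpy as np

def update_dimension_weights(X, L, eps=1e-12):
    """Assumed closed form: theta_i = (1 / m_ii) / sum_j (1 / m_jj).

    X is the (n, d) feature matrix of one view and L is the (n, n) Laplacian.
    """
    # Diagonal of X^T L X, computed without forming the full d x d matrix.
    m = np.sum(X * (L @ X), axis=0)
    inv = 1.0 / np.maximum(m, eps)       # guard against zero diagonal entries
    return inv / inv.sum()
```

In this form, a dimension that varies strongly across similar samples (large $m_{ii}$) receives a small weight, and the weights always sum to 1.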
(4) The matrices obtained in the current update are substituted into the objective function in equation (16) to compute the objective function value, from which the value obtained in the previous iteration is subtracted. If the difference is smaller than a set threshold, the iterative update stops and the current F is the final label matrix; otherwise, return to step (1) for the next iteration. The threshold can generally be set to $10^{-8}$.
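The stopping rule can be illustrated with a generic alternating loop. This is a skeleton only: the update steps and the objective are passed in as callables, the `max_iter` safeguard is an assumption added here, and the name `alternate_until_converged` is hypothetical.

```python
def alternate_until_converged(state, update_steps, objective,
                              tol=1e-8, max_iter=100):
    """Apply the update steps in turn until the objective change is below tol.

    state        : any object holding the current variables (e.g. S, F, Theta)
    update_steps : list of functions, each mapping state -> new state
    objective    : function mapping state -> float
    """
    prev = objective(state)
    it = 0
    for it in range(1, max_iter + 1):
        for step in update_steps:        # e.g. update S, then F, then Theta
            state = step(state)
        cur = objective(state)
        if abs(prev - cur) < tol:        # difference of successive values
            break
        prev = cur
    return state, it
```

For instance, with a toy state `x = 1.0`, a single step `x -> x / 2`, and objective `x ** 2`, the loop stops well before `max_iter` once successive objective values differ by less than `1e-8`.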
4. Sample classification
According to the final label matrix F, the column index of the maximum value in each row is taken as the label of the corresponding sample: if the maximum of row i is $F_{ik}$, then k is the label of the i-th sample. Samples with the same label are grouped into one class to obtain the final classification result.
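The label read-out is a row-wise argmax, sketched below. The 1-based class numbering (labels 1 to C) is an assumption chosen to match the index range in the formula; `labels_from_matrix` is a hypothetical name.

```python
import numpy as np

def labels_from_matrix(F):
    """Label of sample i = column index (1..C) of the largest entry in row i."""
    return np.argmax(F, axis=1) + 1      # shift 0-based argmax to 1-based labels
```

Usage: `labels_from_matrix(np.array([[0.1, 0.9], [0.7, 0.3]]))` assigns the first sample to class 2 and the second to class 1.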
To verify the effect of the method, classification experiments were carried out on a simulated data set and on real data sets. The classification results were evaluated with classification accuracy (ACC), the arithmetic mean of average precision (Map), F-measure, precision (Precision), and recall (Re-call). The larger the ACC value, the more samples are classified correctly; Map is the arithmetic mean of the average precision used in information retrieval systems; F-measure is a trade-off between precision and recall.
Fig. 2 shows the two views of the simulated data set, and Table 1 gives a brief description of the real data sets, where #v1 to #v6 denote the feature dimensions of the first through sixth views of each data set. MSRC-v1 contains 240 images in 8 classes, seven of which were selected for the experiment: trees, buildings, airplanes, cattle, portraits, cars, and bicycles. With 30 images per class, this gives 210 images in total. Six features were extracted from these images: 48-dimensional color moments (CMT), 256-dimensional local binary pattern features (LBP), 100-dimensional histogram of oriented gradients (HOG), 200-dimensional scale-invariant features (SIFT), 512-dimensional GIST grayscale features, and 1320-dimensional CENTRIST features. HW is an image data set of the digits 0 to 9, with 200 images per class for a total of 2000 images. Six features of this data set were extracted for classification: 240-dimensional pixel features (PIX) with a sliding window size of 2 × 3, 76-dimensional Fourier coefficient features (FOU), 216-dimensional contour correlation features (FAC), 47-dimensional transform coefficient features (ZER), 64-dimensional KAR coefficient features, and 6-dimensional morphological features (MOR). The Cal101 data set contains object recognition images in 101 categories. Two subsets were used: Cal101-7, formed from 7 widely used categories with 1474 images in total, and Cal101-20, formed from 20 widely used categories with 2386 images in total. For both subsets, six commonly used features were extracted: 48-dimensional Gabor features, 40-dimensional wavelet moments, 254-dimensional CENTRIST features, 1984-dimensional histogram of oriented gradients, 512-dimensional GIST features, and 928-dimensional local binary pattern features (LBP).
For the simulated data set, the number of known sample labels per class is set to 1. The number of neighbors is set to 5 for all data sets, and the regularization parameter λ is chosen from $\{10^{-3}, 10^{-2}, 10^{-1}, 10^{0}, 10^{1}, 10^{2}, 10^{3}\}$. For each value of λ, the experiment was run 10 times, and the best result for each data set was taken as the final classification result. Gaussian Fields and Harmonic Functions (GFHF) was used as the comparison method.
Table 2 gives the single-view and multi-view classification results for the simulated data set. Tables 3-6 present the classification results on the four real data sets MSRC-v1, HW, Cal101-7, and Cal101-20 with 10%, 20%, 30%, and 40% of the sample labels known, respectively. As can be seen from Tables 2-6, on the simulated data set the classification results of the method of the invention are superior to the single-view classification results. In addition, for the four real data sets, the accuracy of the classification result of the method increases as the proportion of known sample labels increases, and the method achieves a good classification effect.
TABLE 1
TABLE 2 (simulated data set)
Data set | ACC | Map | F-measure | Precision | Re-call |
---|---|---|---|---|---|
View 1(GFHF) | 85.35% | 87.29% | 75.79% | 79.08% | 72.77% |
View 2(GFHF) | 77.27% | 67.80% | 67.35% | 73.18% | 62.37% |
The method of the invention | 97.98% | 98.68% | 96.00% | 96.02% | 95.98% |
TABLE 3 (10% of sample labels known)
Data set | ACC | Map | F-measure | Precision | Re-call |
---|---|---|---|---|---|
MSRC-v1 | 91.38% | 85.56% | 83.81% | 84.71% | 82.94% |
HW | 97.48% | 95.74% | 95.01% | 95.09% | 94.93% |
Cal101-7 | 95.37% | 73.19% | 95.36% | 98.89% | 92.10% |
Cal101-20 | 84.13% | 55.80% | 86.56% | 91.34% | 82.27% |
TABLE 4 (20% of sample labels known)
Data set | ACC | Map | F-measure | Precision | Re-call |
---|---|---|---|---|---|
MSRC-v1 | 91.79% | 86.03% | 84.33% | 84.91% | 83.76% |
HW | 97.74% | 96.14% | 95.53% | 95.57% | 95.49% |
Cal101-7 | 96.51% | 79.34% | 97.04% | 99.15% | 95.02% |
Cal101-20 | 87.60% | 62.87% | 89.92% | 92.15% | 87.82% |
TABLE 5 (30% of sample labels known)
Data set | ACC | Map | F-measure | Precision | Re-call |
---|---|---|---|---|---|
MSRC-v1 | 93.06% | 87.92% | 86.57% | 87.04% | 86.11% |
HW | 98.03% | 96.67% | 96.09% | 96.12% | 96.06% |
Cal101-7 | 97.12% | 82.92% | 97.75% | 99.00% | 96.53% |
Cal101-20 | 88.51% | 65.20% | 90.81% | 92.22% | 89.46% |
TABLE 6 (40% of sample labels known)
Data set | ACC | Map | F-measure | Precision | Re-call |
---|---|---|---|---|---|
MSRC-v1 | 93.49% | 89.31% | 87.28% | 87.67% | 86.90% |
HW | 98.03% | 96.62% | 96.11% | 96.13% | 96.08% |
Cal101-7 | 97.37% | 82.83% | 98.11% | 98.89% | 97.35% |
Cal101-20 | 89.23% | 67.33% | 91.41% | 92.35% | 90.50% |
Claims (3)
1. A semi-supervised classification method based on dimension weighting and visual angle feature consistency is characterized by comprising the following steps:
Step 1: Let $\chi = \{X^1, X^2, \ldots, X^V\}$ denote a multi-view data set, where $X^v \in R^{n \times d(v)}$ represents the features of the v-th view, v = 1, 2, ..., V, V is the number of views, n is the number of samples, and d(v) is the dimensionality of the v-th view features; the number of categories of the data set is C;
The distance from the i-th sample point $x_i^v$ to the j-th sample point $x_j^v$ in the v-th view is calculated, i, j = 1, 2, ..., n; for each sample point, the distances from all other sample points to it are sorted in ascending order, and the k sample points with the smallest distances are selected as its neighbors; then the similarity $s_{ij}^v$ between the i-th sample point $x_i^v$ and the j-th sample point $x_j^v$ is calculated as follows:
where the reference distance in the formula is the distance between sample point $x_i^v$ and its (k+1)-th nearest sample point; the value range of k is 5 to 15; when i = j, the similarity is 0;
Taking $s_{ij}^v$ as the element in row i, column j, the similarity matrix $S^v \in R^{n \times n}$ of the v-th view is obtained, v = 1, 2, ..., V;
Step 2: Add the similarity matrices of all V views and average them to obtain the initial consistency similarity matrix S; then calculate the initial Laplacian matrix $L_S$ according to $L_S = D_S - S$, where $D_S$ is the degree matrix of S, a diagonal matrix whose i-th diagonal element is the degree of the i-th sample; the label matrix is $F = [F_l; F_u]^T$ with $F_l = Y_l$, where $Y_l \in R^{l \times C}$ is the label matrix of the samples with known labels, $F_u \in R^{u \times C}$ is the label matrix of the unlabeled samples, u = n - l, and l is the number of samples with known labels; F is initialized with the first C eigenvectors of the matrix S; the weight matrix $\Theta^v$ of the v-th view is initialized according to $\Theta^v_{ii} = 1/d(v)$, where $\Theta^v \in R^{d(v) \times d(v)}$ is a diagonal matrix and $\Theta^v_{ii}$ is its i-th diagonal element, i = 1, 2, ..., d(v), v = 1, 2, ..., V;
Step 3: The multi-view semi-supervised classification model is constructed as follows:
where $s_{ij}$ denotes the element in row i, column j of the consistency similarity matrix S, $\|\cdot\|_F$ denotes the Frobenius norm of a matrix, $\theta^v$ denotes the vector formed by the diagonal elements of the weight matrix $\Theta^v$, $\mathbf{1}$ denotes a column vector with all elements equal to 1, and γ and λ are regularization parameters, γ > 0, λ > 0;
Step 4: Taking all the matrices obtained in Step 2 as initial values, solve the semi-supervised classification model of Step 3 by alternating iteration according to the following process until the final label matrix F is obtained:
Step 4.1: Fix Θ and F, and update S by solving the following formula:
where $s_i$ denotes the i-th row vector of the matrix S and $d_i$ denotes a vector whose j-th element is calculated according to the following equation:
where $f_i$ and $f_j$ denote the i-th and j-th row vectors of the matrix F, respectively, i, j = 1, 2, ..., n;
Step 4.2: Fix S and Θ, and update F:
First, the degree matrix $D_S$ is updated as follows:
where $D_{ii}$ is the i-th diagonal element of the diagonal matrix $D_S$, i = 1, 2, ..., n;
The Laplacian matrix is then updated as follows:
$L_S = D_S - S$ (6)
The Laplacian matrix $L_S$ is partitioned into blocks at row l and column l:
where $L_{ll}$ is a matrix of size l × l, $L_{lu}$ is a matrix of size l × u, $L_{ul}$ is a matrix of size u × l, and $L_{uu}$ is a matrix of size u × u;
The consistency similarity matrix S and the degree matrix $D_S$ are partitioned in the same way:
The label matrix $F_u$ of the unlabeled samples is updated as follows:
$F_u = (I - P_{uu})^{-1} P_{ul} F_l$ (10)
Finally, the label matrix F is updated according to $F = [F_l; F_u]^T$;
Step 4.3: Fix F and S, and update Θ by solving the following formula:
where $\theta^v$ denotes the vector formed by the diagonal elements of the weight matrix $\Theta^v$, and $W^v$ is a diagonal matrix for the v-th view whose i-th diagonal element is determined by the i-th diagonal element of the matrix $M^v$, with $M^v = (X^v)^T L_S X^v$;
Step 4.4: Iteration stopping criterion:
Substitute the S, F, $L_S$, and $\Theta$ obtained from the previous update and from the current update, respectively, into the following objective function:
If the difference between the two resulting objective function values Z is smaller than a set threshold, stop iterating and take the current F as the final label matrix F; otherwise, return to step 4.1 and continue the iterative updates;
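The control flow of step 4 can be sketched generically in Python. This is only a skeleton under stated assumptions: `update_step` stands in for steps 4.1 to 4.3 and `objective` for the objective function Z, neither of which is implemented here; `max_iter` is an added safeguard that the source does not mention; and the default tolerance mirrors the threshold given in claim 3.

```python
def alternate_until_converged(update_step, objective, state,
                              tol=1e-8, max_iter=1000):
    """Repeat update_step until the objective changes by less than tol.

    update_step and objective are caller-supplied placeholders;
    max_iter is a safety cap that is not part of the source method.
    """
    z_prev = objective(state)
    for iteration in range(1, max_iter + 1):
        state = update_step(state)        # stands in for steps 4.1 to 4.3
        z_curr = objective(state)
        if abs(z_prev - z_curr) < tol:    # step 4.4 stopping test
            return state, iteration
        z_prev = z_curr
    return state, max_iter

# Toy stand-in: minimise Z(x) = (x - 3)^2 by halving the distance to 3.
final_x, n_iter = alternate_until_converged(
    update_step=lambda x: x + 0.5 * (3.0 - x),
    objective=lambda x: (x - 3.0) ** 2,
    state=0.0,
)
```

In the method itself, `state` would bundle S, F, $L_S$, and $\Theta$, and the returned state's F would be the final label matrix.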
Step 5: Obtain the label of each sample as follows:
$y_i = \arg\max_{1 \le j \le C} F_{ij}, \quad i = 1, 2, \ldots, n \quad (13)$
where $y_i$ denotes the label of the i-th sample point, and $F_{ij}$ denotes the element in row i, column j of the final label matrix F obtained in step 4;
Samples with the same label are then grouped into one class to obtain the classification result.
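Step 5 amounts to a row-wise argmax followed by grouping. A small pure-Python sketch is given below; the function names and the toy label matrix are hypothetical, and class indices are 0-based here whereas equation (13) counts from 1 to C.

```python
from collections import defaultdict

def assign_labels(F):
    """Equation (13): the label of sample i is the column with the largest F[i][j]."""
    return [max(range(len(row)), key=row.__getitem__) for row in F]

def group_by_label(labels):
    """Collect sample indices that share the same label into one class."""
    groups = defaultdict(list)
    for i, y in enumerate(labels):
        groups[y].append(i)
    return dict(groups)

# Toy final label matrix: 4 samples, 3 classes.
F = [[0.9, 0.1, 0.0],
     [0.2, 0.7, 0.1],
     [0.6, 0.3, 0.1],
     [0.1, 0.2, 0.7]]
labels = assign_labels(F)
classes = group_by_label(labels)
```

When a row has tied maxima, this sketch picks the lowest class index; the source does not specify a tie-breaking rule.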
3. The semi-supervised classification method based on dimension weighting and view angle feature consistency as claimed in claim 1, wherein the threshold in step 4 is set to $10^{-8}$.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010416737.7A CN111639686B (en) | 2020-05-17 | 2020-05-17 | Semi-supervised classification method based on dimension weighting and visual angle feature consistency |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111639686A (en) | 2020-09-08 |
CN111639686B (en) | 2022-03-15 |
Family
ID=72332795
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010416737.7A Active CN111639686B (en) | 2020-05-17 | 2020-05-17 | Semi-supervised classification method based on dimension weighting and visual angle feature consistency |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111639686B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114399649B (en) * | 2021-11-30 | 2023-09-19 | 西安交通大学 | Rapid multi-view semi-supervised learning method and system based on learning graph |
CN117274726B (en) * | 2023-11-23 | 2024-02-23 | 南京信息工程大学 | Picture classification method and system based on multi-view supplementary tag |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106228027A (en) * | 2016-08-26 | 2016-12-14 | 西北大学 | A kind of semi-supervised feature selection approach of various visual angles data |
CN108280163A (en) * | 2018-01-18 | 2018-07-13 | 厦门美图之家科技有限公司 | Video features learning method, device, electronic equipment and readable storage medium storing program for executing |
CN110334777A (en) * | 2019-07-15 | 2019-10-15 | 广西师范大学 | A kind of unsupervised attribute selection method of weighting multi-angle of view |
CN110941734A (en) * | 2019-11-07 | 2020-03-31 | 南京理工大学 | Depth unsupervised image retrieval method based on sparse graph structure |
CN111160387A (en) * | 2019-11-28 | 2020-05-15 | 广东工业大学 | Graph model based on multi-view dictionary learning |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10436896B2 (en) * | 2015-11-29 | 2019-10-08 | Vayyar Imaging Ltd. | System, device and method for imaging of objects using signal clustering |
US11275900B2 (en) * | 2018-05-09 | 2022-03-15 | Arizona Board Of Regents On Behalf Of Arizona State University | Systems and methods for automatically assigning one or more labels to discussion topics shown in online forums on the dark web |
Non-Patent Citations (2)
Title |
---|
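Semi-Supervised Learning with Auto-Weighting Feature and Adaptive Graph; Feiping Nie et al.; IEEE Transactions on Knowledge and Data Engineering; 2019-02-26; Vol. 32, No. 6; pp. 1167-1178 * |
Research on Multi-View Semi-Supervised Feature Selection Algorithms (基于多视角的半监督特征选择算法研究); Wang Jingqi (汪荆琪); China Excellent Doctoral and Master's Dissertations Full-text Database (Master's), Information Science and Technology series; 2014-10-15 (No. 10); p. I140-35 * |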
Also Published As
Publication number | Publication date |
---|---|
CN111639686A (en) | 2020-09-08 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||