CN103942214A

CN103942214A - Natural image classification method and device on basis of multi-modal matrix filling

Info

Publication number: CN103942214A
Application number: CN201310021734.3A
Authority: CN
Inventors: 罗勇; 许超
Original assignee: Peking University
Current assignee: Peking University
Priority date: 2013-01-21
Filing date: 2013-01-21
Publication date: 2014-07-23
Anticipated expiration: 2033-01-21
Also published as: CN103942214B

Abstract

The invention relates to a natural image classification method and device based on multi-modal matrix completion. The method comprises: extracting features from labeled natural image data, unlabeled natural image data and test natural image data to obtain several different feature representations; using a matrix completion algorithm to generate estimated labels of the labeled data for each feature; linearly combining the estimated labels so that the combination approximates the corresponding known true labels, which yields the combination coefficients; for each feature, using the labeled natural image data and the matrix completion algorithm to predict the labels of the unlabeled and test natural image data; combining the labels predicted from all features with the combination coefficients to obtain labels that fuse multiple features; and classifying the natural image data based on the fused labels. The method and device are easy to implement and achieve high classification accuracy while inheriting the advantages of matrix-completion-based image classification, and they are suitable for fields such as summarizing and classifying web pictures and image retrieval.

Description

Natural image classification method and device based on multi-modal matrix completion

Technical field

The invention belongs to the technical field of image classification and multi-modal data analysis (multi-feature fusion). It relates to multi-label classification based on matrix completion, and in particular to an image classification method and device that use multi-modal matrix completion.

Background technology

Unlike images with a single, consistent content such as faces, fingerprints or forms, a natural image usually contains several objects, each with its own appearance. In natural image classification it is therefore often necessary to assign several class labels to one image. As shown in Fig. 1, (a) a "person" is riding a "bicycle", (b) "sky" and "ocean" often appear together, and (c) a "dog" is a kind of "animal". Most traditional single-label classification algorithms (where each sample has exactly one class label) cannot be applied directly to multi-label classification. A more feasible approach is the "one-vs-rest" strategy from multi-class classification: a binary classifier is built for each class, samples belonging to that class are treated as positive examples, and all remaining samples are treated as negative examples. An obvious drawback of this approach is that it easily causes severe class imbalance, and it also ignores the relationships between classes (for example, the co-occurrence of "sky" and "ocean", or the subordination of "dog" to "animal"). Many new algorithms have therefore been proposed in recent years to solve the multi-label problem. Among them, multi-label classification algorithms based on matrix completion allow the input data (features and labels) to be partially missing and are highly robust to noise and outliers.

Matrix completion, as the name suggests, means filling in a matrix M that has missing values. Without any assumption or prior knowledge about the matrix, the missing values cannot be filled in. The matrix to be recovered is therefore usually assumed to be low-rank (E. Candes and B. Recht, Exact matrix completion via convex optimization, Found. Comput. Math., 9:717-772, 2009). The goal of matrix completion is to find a matrix X whose error with respect to M on the known entries is as small as possible, while the rank of X is as low as possible. This rank minimization problem is NP-hard and therefore of little practical use. Fortunately, rank(X) can be replaced by its convex envelope, the nuclear norm ||X||_* (M. Fazel, Matrix rank minimization with applications, Ph.D. thesis, Stanford University, 2002). Many matrix completion algorithms have been developed on this basis. For example, Candes and Recht (2009, cited above) showed that, under suitable conditions, minimizing the nuclear norm ||X||_* and minimizing rank(X) have the same unique solution, and proved that only a limited number of sampled entries is needed to recover a matrix. The same authors also proposed a semidefinite programming algorithm for the nuclear norm minimization problem. To handle large matrices and matrices whose rank is not very low, researchers proposed the singular value thresholding (SVT) algorithm (J. Cai, E. Candes and Z. Shen, A singular value thresholding algorithm for matrix completion, SIAM J. Optim., 20(4):1956-1982, 2010) and the fixed point continuation algorithm (S. Ma, D. Goldfarb and L. Chen, Fixed point and Bregman iterative methods for matrix rank minimization, Math. Program., 128(1):321-353, 2009). More recently, matrix completion has been applied to transductive learning (A. Goldberg, X. Zhu, B. Recht, J. Xu and R. Nowak, Transduction with matrix completion: three birds with one stone, NIPS, pp. 757-765, 2010) and to multi-label image classification (R. Cabral, F. Torre, J. Costeira and A. Bernardino, Matrix completion for multi-label image classification, NIPS, pp. 190-198, 2011). The basic idea is to stack the sample feature matrix and the sample label matrix together and then use a matrix completion algorithm to estimate the unknown features and labels.
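The role of the nuclear norm can be illustrated with the shrinkage step at the heart of singular value thresholding. The following is a minimal sketch, not the full SVT algorithm of the cited paper: the function names `nuclear_norm` and `shrink_singular_values`, the threshold `tau` and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def nuclear_norm(X):
    """Sum of singular values, the convex surrogate for rank(X)."""
    return np.linalg.svd(X, compute_uv=False).sum()

def shrink_singular_values(X, tau):
    """Soft-threshold the singular values of X by tau (hypothetical helper)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)   # small singular values become exactly 0
    return (U * s_shrunk) @ Vt            # scale the columns of U, then recombine

# Synthetic example: a rank-2 matrix plus a little noise is numerically full rank.
rng = np.random.default_rng(0)
low_rank = rng.standard_normal((8, 2)) @ rng.standard_normal((2, 6))
noisy = low_rank + 0.01 * rng.standard_normal((8, 6))
denoised = shrink_singular_values(noisy, tau=0.5)

print("rank before:", np.linalg.matrix_rank(noisy))
print("rank after: ", np.linalg.matrix_rank(denoised))
print("nuclear norm before/after:", nuclear_norm(noisy), nuclear_norm(denoised))
```

Shrinking every singular value by the same amount lowers the nuclear norm and removes the directions that carry only noise, which is why the step favours low-rank solutions.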

Such matrix-completion-based image classification algorithms can only handle data with a single type of feature. In practice, no single feature has so far been able to describe all the categories of natural images well. Several features are therefore usually required, such as SIFT (D. Lowe, Distinctive image features from scale-invariant keypoints, Int. J. Comput. Vis., 60(2):91-110, 2004), GIST (A. Torralba, K. Murphy and W. Freeman, Modeling the shape of the scene: A holistic representation of the spatial envelope, Int. J. Comput. Vis., 42(3):145-175, 2001) and RGB. The most direct way to fuse multiple features is to concatenate them into one long vector. This greatly reduces computational efficiency, can cause the curse of dimensionality, lacks a physical interpretation, and harms classification accuracy.

Summary of the invention

In view of the above problems, the object of the invention is to provide an image classification method and device based on multi-modal matrix completion. A multi-feature-fusion matrix completion algorithm is used to exploit the complementarity between the different features, achieving efficient and fast multi-label image classification.

Multi-feature-fusion classification algorithms fall broadly into three categories: feature-level fusion (M. White, Y. Yu, X. Zhang and D. Schuurmans, Convex multi-view subspace learning, NIPS, pp. 1682-1690, 2012), interactive fusion (A. Blum and T. Mitchell, Combining labeled and unlabeled data with co-training, COLT, pp. 92-100, 1998) and classifier-level fusion (C. Snoek, M. Worring and A. Smeulders, Early versus late fusion in semantic video analysis, Multimedia, pp. 399-402, 2005). The invention adopts the classifier-level fusion strategy.

Specifically, the image classification method based on multi-modal matrix completion of the invention comprises the following steps:

1) extracting features from the labeled, unlabeled and test natural image data to obtain different feature representations;

2) using a matrix completion algorithm to generate estimated labels of the labeled data for each feature;

3) linearly combining the estimated labels to approximate the corresponding known true labels, thereby obtaining the combination coefficients;

4) for each feature, using the labeled natural image data and the matrix completion algorithm to predict the labels of the unlabeled and test natural image data;

5) combining the labels predicted for all features in step 4) with the combination coefficients to obtain labels that fuse multiple features;

6) classifying the natural image data based on the fused multi-feature labels.

Further, the feature extraction is performed with feature extraction algorithms such as SIFT and GIST.

Further, each feature representation obtained in step 1) is pre-processed before step 2) is carried out. Kernel principal component analysis (KPCA) is preferably used for the pre-processing; other methods such as random projection can also be used. Pre-processing is not a required step of the invention, but it greatly improves the execution efficiency of the algorithm and can also improve classification accuracy to some extent.
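The pre-processing step can be sketched as follows. This is a minimal illustration of kernel PCA with an RBF kernel, assuming samples are stored as the columns of X; the kernel choice, the parameter `gamma`, the number of components and the function name `kernel_pca` are assumptions, since the text does not specify them.

```python
import numpy as np

def kernel_pca(X, n_components, gamma=1.0):
    """Project the columns of X (shape d x n) onto the leading kernel principal components."""
    n = X.shape[1]
    sq_norms = (X ** 2).sum(axis=0)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X.T @ X
    K = np.exp(-gamma * np.maximum(sq_dists, 0.0))        # RBF kernel matrix, n x n

    # Centre the kernel matrix in feature space.
    one_n = np.full((n, n), 1.0 / n)
    K_centered = K - one_n @ K - K @ one_n + one_n @ K @ one_n

    eigvals, eigvecs = np.linalg.eigh(K_centered)          # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_components]         # indices of the largest ones
    lam = np.maximum(eigvals[top], 0.0)                    # guard against tiny negatives
    return (eigvecs[:, top] * np.sqrt(lam)).T              # shape n_components x n

rng = np.random.default_rng(0)
X_raw = rng.standard_normal((5, 20))                       # 20 samples, 5 raw dimensions
X_reduced = kernel_pca(X_raw, n_components=3, gamma=0.1)
print(X_reduced.shape)
```

The reduced matrix keeps one column per sample, so it can be used directly as the feature block of the matrix that is later completed.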

Further, step 2) is implemented as follows. Let X^{0(v)}, v = 1, ..., V, be the data obtained after pre-processing, where V is the number of feature types, X⁰ denotes the original data matrix and X^{0(v)} the one for the v-th feature. The labeled natural image data are divided into two parts; the labels of the first part are assumed to be unknown and the labels of the second part known. The matrix completion algorithm uses the second part to estimate the labels of the first part, and estimated labels for the second part are obtained in the same way. The two sets of estimated labels are concatenated to give the estimated labels for the v-th feature representation. Applying this procedure to every feature type gives estimated labels for v = 1, ..., V.
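The two-part estimation can be sketched as below. The sketch only shows the data flow; `least_squares_predictor` is a hypothetical stand-in for the matrix completion predictor, and the random half/half split is an assumption (the text only says the labeled data are divided into two parts).

```python
import numpy as np

def least_squares_predictor(X_known, Y_known, X_query):
    """Hypothetical stand-in for the matrix completion step: fits Y ~ W [X; 1]."""
    A_known = np.vstack([X_known, np.ones((1, X_known.shape[1]))])
    A_query = np.vstack([X_query, np.ones((1, X_query.shape[1]))])
    W = Y_known @ np.linalg.pinv(A_known)
    return W @ A_query

def cross_estimate_labels(X_labeled, Y_labeled, predict_labels, rng):
    """Estimate the labels of every labeled sample from the other half of the labeled set."""
    n = X_labeled.shape[1]
    perm = rng.permutation(n)
    part1, part2 = perm[: n // 2], perm[n // 2:]
    Y_est = np.full(Y_labeled.shape, np.nan)
    # Pretend part 1 is unlabeled and estimate it from part 2, then swap the roles.
    Y_est[:, part1] = predict_labels(X_labeled[:, part2], Y_labeled[:, part2], X_labeled[:, part1])
    Y_est[:, part2] = predict_labels(X_labeled[:, part1], Y_labeled[:, part1], X_labeled[:, part2])
    return Y_est

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 30))                                   # one feature type, 30 samples
Y = np.sign(rng.standard_normal((3, 4)) @ X + rng.standard_normal((3, 1)))
Y_est = cross_estimate_labels(X, Y, least_squares_predictor, rng)  # soft estimates, 3 x 30
print(Y_est.shape)
```

Running the same routine once per feature type gives one matrix of estimated labels per feature, each aligned with the same known true labels.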

Further, let the combination coefficients be {θ_v}. The coefficients {θ_v} are found by solving the following optimization problem, so that

f(p_k) = \sum_{v=1}^{V} \theta_v p_k^{(v)}

approximates the true value y_k^0 as closely as possible:

\arg\min_{\theta} \frac{1}{2N} \sum_{k=1}^{N} L(f(p_k), y_k^0) + \frac{\eta}{2} \|\theta\|_2^2,

\text{s.t.} \; \sum_{v=1}^{V} \theta_v = 1, \; \theta_v \geq 0,

where

N = |\Omega_{Y_l}|

is the number of elements in Ω_{Y_l},

p_k = [p_k^{(1)}, \ldots, p_k^{(V)}]^T,

θ = [θ_1, ..., θ_V]^T, the regularization term (η/2)||θ||_2^2 prevents the weights {θ_v} from concentrating on a single best feature, η ≥ 0 is a trade-off parameter, and L can be any convex loss function, such as the squared error function or the hinge loss used by the support vector machine (SVM).
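To make the objective concrete, the sketch below evaluates it for a given θ under the two loss functions named in the text. The function names and the tiny data set are illustrative assumptions; P stores one column p_k per labeled entry.

```python
import numpy as np

def squared_loss(f, y):
    return (f - y) ** 2

def hinge_loss(f, y):
    return np.maximum(0.0, 1.0 - y * f)

def fusion_objective(theta, P, y, eta, loss):
    """(1/2N) * sum_k L(theta^T p_k, y_k) + (eta/2) * ||theta||_2^2, with P of shape V x N."""
    f = P.T @ theta                       # fused prediction for every labeled entry
    return loss(f, y).sum() / (2.0 * len(y)) + 0.5 * eta * theta @ theta

# Two features that both predict the two labels perfectly.
P = np.array([[1.0, -1.0],
              [1.0, -1.0]])
y = np.array([1.0, -1.0])
eta = 0.2

uniform = np.array([0.5, 0.5])
one_hot = np.array([1.0, 0.0])
print(fusion_objective(uniform, P, y, eta, squared_loss))   # only the regularizer remains
print(fusion_objective(one_hot, P, y, eta, squared_loss))   # larger: weight on one feature
```

Both weight vectors fit the labels exactly here, so the difference between the two values comes entirely from the regularization term, which penalizes putting all the weight on one feature.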

The invention also provides an image classification device that uses the above method, comprising:

a feature extraction unit, for extracting features from labeled natural image data and natural image data with unknown labels to obtain different feature representations;

a training data generation unit, connected to the feature extraction unit, for using a matrix completion algorithm to generate estimated labels of the labeled data for each feature;

a combination coefficient calculation unit, connected to the training data generation unit, for linearly combining the estimated labels to approximate the corresponding known true labels, thereby obtaining the combination coefficients;

a label prediction unit, connected to the feature extraction unit, for using the labeled natural image data and the matrix completion algorithm to predict the labels of the natural image data with unknown labels;

an output fusion unit, connected to the combination coefficient calculation unit and the label prediction unit, for combining the labels obtained by the label prediction unit with the combination coefficients to obtain labels that fuse multiple features;

an image classification unit, connected to the output fusion unit, for classifying the natural image data based on the fused multi-feature labels.

Further, the image classification device may also comprise a pre-processing unit for pre-processing each feature representation with kernel principal component analysis (KPCA). In that case the pre-processing unit is connected to the feature extraction unit, and the training data generation unit and the label prediction unit are each connected to the pre-processing unit to receive the pre-processed data.

Compared with the prior art, the advantages and positive effects of the invention are as follows. It is simple and easy to implement, has low computational complexity, and achieves high classification accuracy. It also inherits the advantages of matrix-completion-based image classification: it is highly robust to noise and missing data, and it supports multi-label classification (an image may belong to several classes at once). The main applications of the invention are: 1) image retrieval is currently still mainly keyword-based; an unlabeled image uploaded to the web can be annotated with keywords automatically using the multi-label image classification of the invention and can then be found by a keyword-based image retrieval system; 2) web pictures can be summarized and classified, making them easier for users to browse.

Brief description of the drawings

Fig. 1 shows some natural image samples with multiple class labels.

Fig. 2 is a flow chart of the steps of the image classification method based on multi-modal matrix completion according to an embodiment of the invention.

Fig. 3 is a structural diagram of the image classification device according to an embodiment of the invention.

Fig. 4 shows experimental results comparing the invention with other matrix-completion-based methods.

Embodiment

The invention is described in detail below through a specific embodiment with reference to the accompanying drawings.

The flow of the image classification method based on multi-modal matrix completion of this embodiment is shown in Fig. 2. The specific steps are as follows:

1) Different feature extraction algorithms (SIFT, GIST, etc.) are applied to all natural images (labeled data, unlabeled data and test data) to obtain different feature representations.

In classification, data are usually divided into training data and test data: the training data are used to train the classifier, and the test data are used to test its performance. Unlabeled data belong to the training data but, unlike labeled training data, carry no labels; they can nevertheless be used to improve the performance of the classifier. In the invention, a large amount of unlabeled data helps to uncover the structure of the data and thus leads to better test results.

2) To reduce the noise in the input data, to account for the non-linearity of image features, and to improve the efficiency of matrix completion, the data (the feature representations of the images) are first pre-processed with kernel principal component analysis (KPCA), giving X^{0(v)}, v = 1, ..., V, where V is the number of feature types, X⁰ denotes an original data matrix, and X^{0(v)} denotes the input (original) data matrix of the v-th feature. The values of this data matrix change while the matrix completion algorithm runs, and a new data matrix X^{(v)} is finally obtained that has minimal rank while staying as close as possible to X^{0(v)} on its known entries.

3) Generating training data: the training data here are used to learn the combination coefficients of the different features. For the v-th feature representation, the labeled data are divided into two parts. The labels of the first part are first assumed to be unknown, and the second part (with known labels) is used with the matrix completion algorithm to estimate them. Estimated labels for the second part are obtained in the same way. The true labels corresponding to both sets of estimates are known. The two sets of estimates are concatenated to give the estimated labels of all labeled data for the v-th feature.

4) Step 3) is applied to every feature type, giving estimated labels for v = 1, ..., V, whose corresponding true labels are known. The estimated labels can therefore be linearly combined to approximate the true labels, and the resulting combination coefficients {θ_v} reflect the importance of each feature.

5) Then, for the v-th feature, v ∈ {1, ..., V}, the labeled data and the matrix completion algorithm are used to predict the labels of the unlabeled data and the test data.

6) Finally, all outputs are combined using {θ_v} to obtain the final predicted labels Y_u and Y_t that fuse multiple features, which completes the classification of the unlabeled data and the test data.
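The final combination in step 6) is a weighted sum of the per-feature outputs. A minimal sketch, assuming each feature type yields a score matrix of shape m x n (classes by images) and that `theta` has already been learned; the function name and the numbers are illustrative only.

```python
import numpy as np

def fuse_predictions(per_feature_scores, theta):
    """Weighted sum of V score matrices (each m x n) using the combination coefficients."""
    theta = np.asarray(theta, dtype=float)
    stacked = np.stack(per_feature_scores)        # shape V x m x n
    return np.tensordot(theta, stacked, axes=1)   # shape m x n

scores_feature_1 = np.array([[0.8, -0.2]])        # one class, two images
scores_feature_2 = np.array([[0.4, 0.6]])
fused = fuse_predictions([scores_feature_1, scores_feature_2], theta=[0.75, 0.25])
print(fused)
```

The same call is applied to the unlabeled block and to the test block, since both are predicted per feature in step 5).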

The invention focuses on the "multi-modal" aspect, that is, on how to combine multiple features in matrix-completion-based image classification. The basic idea is to first obtain classification results from each single feature and then learn a set of coefficients {θ_v} to combine these results. The classification results are "soft labels": real values that express the degree to which a test image belongs to a class. To decide which classes an image actually belongs to (that is, to give the image "hard labels"), a threshold can be set: an image is assigned a class label when its predicted value for that class is greater than the threshold.
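Converting soft labels to hard labels can be sketched as follows. The threshold value 0.0 is an assumption that suits labels coded as -1/+1; the text leaves the threshold open. The class names and scores are illustrative.

```python
import numpy as np

def harden_labels(soft_scores, class_names, threshold=0.0):
    """Return, for each image (column), the class names whose score exceeds the threshold."""
    assigned = soft_scores > threshold                      # boolean matrix, classes x images
    return [[class_names[c] for c in np.flatnonzero(assigned[:, j])]
            for j in range(soft_scores.shape[1])]

class_names = ["sky", "ocean", "dog"]
soft_scores = np.array([[0.9, -0.3],     # sky
                        [0.2,  0.1],     # ocean
                        [-0.5, -0.7]])   # dog
hard_labels = harden_labels(soft_scores, class_names)
print(hard_labels)
```

Because every class is thresholded independently, an image can receive several labels or none, which is what multi-label classification requires.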

Steps 3) and 5) above use a matrix completion algorithm to predict labels. The following explains how matrix completion is used for multi-label classification.

Given n samples

D_n = \{(x_j^0, y_j^0)\}_{j=1}^{n},

where x_j^0 is a feature vector of dimension d and the corresponding label is

y_j^0 \in \{-1, 1\}^m

(unknown for unlabeled and test data), and m is the number of class labels. To predict the unknown labels, the feature matrix X⁰ = [x_1^0, ..., x_n^0] and the corresponding label matrix Y⁰ = [y_1^0, ..., y_n^0] are first constructed, which turns the problem into filling in the unknown entries of Y⁰. Consider the linear classification model y_j = W x_j + b, and let the predicted feature matrix be X = [x_1, ..., x_n] and the predicted label matrix be Y = [y_1, ..., y_n]. Then the matrix Z = [Y; X; 1^T], formed by stacking Y, X and a row of ones, is low-rank. This is first because every row of Y is a linear combination of the rows of [X; 1^T]. In addition, X itself is usually low-rank, which is also what makes linear dimensionality reduction methods (such as PCA) work. Let Z⁰ = [Y⁰; X⁰; 1^T], and let Ω_X and Ω_Y be the index sets of its known entries. The unknown entries of Y⁰ can then be estimated by minimizing the nuclear norm of Z together with the difference between Z and Z⁰ on Ω_X and Ω_Y. The optimization problem can be written as:
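The low-rank claim for the stacked matrix can be checked numerically. The sketch below builds Z = [Y; X; 1^T] from a synthetic linear model and also constructs Z⁰ with a mask of known entries; all sizes, the random data and the convention of storing hidden labels as 0 are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, n, n_labeled = 5, 3, 40, 30            # feature dim, classes, samples, labeled samples

X = rng.standard_normal((d, n))
W = rng.standard_normal((m, d))
b = rng.standard_normal((m, 1))
Y = W @ X + b                                # soft labels produced by y_j = W x_j + b

Z = np.vstack([Y, X, np.ones((1, n))])       # stacked matrix, (m + d + 1) x n
rank_Z = np.linalg.matrix_rank(Z)
print("rows of Z:", Z.shape[0], "rank of Z:", rank_Z)

# Observed version: labels of the last n - n_labeled samples are hidden.
known = np.ones(Z.shape, dtype=bool)         # True where an entry is observed
known[:m, n_labeled:] = False
Z0 = np.where(known, np.vstack([np.sign(Y), X, np.ones((1, n))]), 0.0)
```

Although Z has m + d + 1 rows, its rank cannot exceed d + 1, because the label rows add no new directions beyond the feature rows and the row of ones.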

\arg\min \; \mu \|Z\|_* + \frac{1}{|\Omega_X|} \sum_{(i,j) \in \Omega_X} c_x(Z_{ij}, Z_{ij}^0) + \frac{\lambda}{|\Omega_Y|} \sum_{(i,j) \in \Omega_Y} c_y(Z_{ij}, Z_{ij}^0), \quad (1)

\text{s.t.} \; Z_{(m+d+1)} = 1^T,

where c_x and c_y are loss functions, μ and λ are trade-off parameters greater than zero, and the constraint fixes the last row of Z (row m + d + 1) to all ones. The detailed solution procedure can be found in (A. Goldberg, X. Zhu, B. Recht, J. Xu and R. Nowak, Transduction with matrix completion: three birds with one stone, NIPS, pp. 757-765, 2010).
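A simplified solver for a problem of this shape is sketched below. It is not the procedure of the cited paper: it assumes squared error for both c_x and c_y with equal weight, uses a soft-impute style iteration (fill in the unknown entries, shrink the singular values, repeat), and re-imposes the row of ones after every step as a heuristic. The names `complete_stacked_matrix`, `tau` and `n_iter` are hypothetical.

```python
import numpy as np

def complete_stacked_matrix(Z0, known, tau=0.1, n_iter=200):
    """Estimate the hidden entries of Z0; `known` is a boolean mask of observed entries."""
    Z = np.where(known, Z0, 0.0)
    for _ in range(n_iter):
        filled = np.where(known, Z0, Z)               # observed values plus current guesses
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        Z = (U * np.maximum(s - tau, 0.0)) @ Vt       # nuclear-norm shrinkage step
        Z[-1, :] = 1.0                                # keep the last row equal to 1^T
    return Z

# Synthetic data: labels follow a linear model, the last 10 samples are unlabeled.
rng = np.random.default_rng(0)
d, m, n, n_labeled = 5, 3, 40, 30
X = rng.standard_normal((d, n))
Y_true = np.sign(rng.standard_normal((m, d)) @ X + rng.standard_normal((m, 1)))

known = np.ones((m + d + 1, n), dtype=bool)
known[:m, n_labeled:] = False
Z0 = np.where(known, np.vstack([Y_true, X, np.ones((1, n))]), 0.0)

Z = complete_stacked_matrix(Z0, known)
soft_labels = Z[:m, n_labeled:]                       # estimates for the hidden labels
agreement = np.mean(np.sign(soft_labels) == Y_true[:, n_labeled:])
print("sign agreement on hidden labels:", agreement)
```

The first m rows of the returned matrix hold the soft labels; the columns of the unlabeled and test samples are the predictions used in steps 3) and 5).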

The following explains how the combination coefficients {θ_v} are learned in step 4). The invention refers to this learned combination of multiple features as "MVMC".

For ease of derivation, this embodiment converts the two-dimensional index (i, j) over samples into a one-dimensional index k by vectorization. Ω_{Y_l} then denotes the index set of the known-label data, and p_k^{(v)} denotes the training data generated from the v-th feature (the predicted label values of the labeled data). The goal is to find a set of coefficients {θ_v} such that f(p_k) = \sum_{v=1}^{V} θ_v p_k^{(v)} approximates the true value y_k^0 as closely as possible. The optimization problem can be written as:

\arg\min_{\theta} \frac{1}{2N} \sum_{k=1}^{N} L(f(p_k), y_k^0) + \frac{\eta}{2} \|\theta\|_2^2, \quad (2)

\text{s.t.} \; \sum_{v=1}^{V} \theta_v = 1, \; \theta_v \geq 0,

where

N = |\Omega_{Y_l}|

is the number of elements in Ω_{Y_l},

p_k = [p_k^{(1)}, \ldots, p_k^{(V)}]^T,

θ = [θ_1, ..., θ_V]^T, and the regularization term (η/2)||θ||_2^2 prevents the weights {θ_v} from concentrating on a single best feature (one θ_v equal to 1 and all others 0). η ≥ 0 is a trade-off parameter. L can be any convex loss function, such as the squared error function or the hinge loss used by the support vector machine (SVM).

When the squared error function is chosen, i.e. L(f(x), y) = (f(x) - y)², the optimization problem takes the following form:

\arg\min \frac{1}{2N} \|P^T \theta - y^0\|_F^2 + \frac{\eta}{2} \|\theta\|_2^2, \quad (3)

\text{s.t.} \; \sum_{v=1}^{V} \theta_v = 1, \; \theta_v \geq 0,

where P = [p_1, ..., p_N]. This problem can be solved with a coordinate descent algorithm. Specifically, only two variables θ_i and θ_j are selected and updated at a time, until the change in the objective function between two consecutive iterations falls below a threshold. Taking the equality constraint into account with the method of Lagrange multipliers in each iteration gives the following update rule:

\begin{cases} \theta_i^* = \dfrac{\eta (\theta_i + \theta_j) + (h_i - h_j) + \epsilon_{ij}}{(H_{ii} - H_{ij} - H_{ji} + H_{jj}) + 2 \eta} \\ \theta_j^* = \theta_i + \theta_j - \theta_i^* \end{cases} \quad (4)

where H is a positive semidefinite matrix, p_{(i)} denotes the i-th row of the matrix P, and

\epsilon_{ij} = (H_{ii} - H_{ij} - H_{ji} + H_{jj}) \theta_i - \sum_{k} (H_{ik} - H_{jk}) \theta_k .
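The update rule can be checked by a short derivation. The definitions of H and h are not legible in this text, so the derivation assumes H = P P^T / N and h = P y^0 / N, which is the choice that turns (3) into a quadratic in θ with exactly the terms appearing in (4).

```latex
% Assumption: H = \tfrac{1}{N} P P^T, \quad h = \tfrac{1}{N} P y^0 .
% Then (3) equals, up to a constant,
J(\theta) = \tfrac{1}{2}\,\theta^T H \theta - h^T \theta + \tfrac{\eta}{2}\,\theta^T \theta .

% Update only \theta_i and \theta_j, keeping s = \theta_i + \theta_j fixed,
% so that \theta_j^* = s - \theta_i^* and \sum_v \theta_v = 1 is preserved.
% Stationarity along this direction:
\big[(H\theta^*)_i - (H\theta^*)_j\big] - (h_i - h_j) + \eta\,(\theta_i^* - \theta_j^*) = 0 .

% Expand the first bracket in terms of \theta_i^*:
(H\theta^*)_i - (H\theta^*)_j
  = (H_{ii} - H_{ij} - H_{ji} + H_{jj})\,\theta_i^*
  + (H_{ij} - H_{jj})\,s
  + \sum_{k \neq i,j} (H_{ik} - H_{jk})\,\theta_k .

% With \epsilon_{ij} as defined in the text,
\epsilon_{ij} = -(H_{ij} - H_{jj})\,s - \sum_{k \neq i,j} (H_{ik} - H_{jk})\,\theta_k ,

% and \eta(\theta_i^* - \theta_j^*) = 2\eta\,\theta_i^* - \eta s, this gives
\big[(H_{ii} - H_{ij} - H_{ji} + H_{jj}) + 2\eta\big]\,\theta_i^*
  = \eta\,(\theta_i + \theta_j) + (h_i - h_j) + \epsilon_{ij} ,
% which is the first line of (4).
```

The second line of (4) simply restates that the sum of the two updated coefficients is unchanged, which keeps the equality constraint satisfied after every update.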

The constraint θ_v ≥ 0 is additionally taken into account in each update.
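The pairwise update can be sketched in code as follows. Two points are assumptions rather than statements of the text: H = P P^T / N and h = P y^0 / N (the definitions are not legible here), and the non-negativity constraint is handled by clipping the updated θ_i to the interval [0, θ_i + θ_j]. The function name, the starting point and the stopping parameters are hypothetical.

```python
import numpy as np

def learn_fusion_weights(P, y, eta=0.1, tol=1e-10, max_sweeps=200):
    """Pairwise coordinate descent for problem (3); P has shape V x N, y has length N."""
    V, N = P.shape
    H = P @ P.T / N                        # assumed definition of H
    h = P @ y / N                          # assumed definition of h
    theta = np.full(V, 1.0 / V)            # start from equal weights (feasible point)

    def objective(t):
        r = P.T @ t - y
        return r @ r / (2.0 * N) + 0.5 * eta * t @ t

    previous = objective(theta)
    for _ in range(max_sweeps):
        for i in range(V):
            for j in range(i + 1, V):
                s = theta[i] + theta[j]
                curvature = H[i, i] - H[i, j] - H[j, i] + H[j, j]
                denom = curvature + 2.0 * eta
                if denom <= 0.0:
                    continue               # flat direction, nothing to update
                eps = curvature * theta[i] - (H[i] - H[j]) @ theta
                new_i = (eta * s + (h[i] - h[j]) + eps) / denom   # update rule (4)
                new_i = min(max(new_i, 0.0), s)                   # assumed clipping step
                theta[i], theta[j] = new_i, s - new_i
        current = objective(theta)
        if previous - current < tol:       # stop when the objective no longer decreases
            break
        previous = current
    return theta

rng = np.random.default_rng(0)
N = 200
y = rng.choice([-1.0, 1.0], size=N)
P = np.vstack([y + 0.1 * rng.standard_normal(N),   # informative feature
               rng.standard_normal(N)])            # uninformative feature
theta = learn_fusion_weights(P, y)
print(theta)
```

Each pairwise step moves weight between two features while keeping their sum fixed, so the coefficients stay on the simplex throughout.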

When the hinge loss function is used, i.e. L(f(x), y) = (1 - y f(x))_+, the following optimization problem is obtained:

\arg\min \frac{1}{2} \|\theta\|_2^2 + C \sum_{k=1}^{N} \xi_k,

\text{s.t.} \; y_k^0 (\theta^T p_k) \geq 1 - \xi_k, \; \xi_k \geq 0, \; k = 1, \ldots, N, \quad (6)

\sum_{v=1}^{V} \theta_v = 1, \; \theta_v \geq 0, \; v = 1, \ldots, V,

where C ≥ 0 is a trade-off parameter and ξ_k are slack variables. This problem can be converted into its dual with the method of Lagrange multipliers and then solved by alternately optimizing two subproblems. Subproblem 1:

\arg\max_{\alpha} \; \alpha^T (1_N - Y_d^0 P^T \delta) - \frac{1}{2} \alpha^T H \alpha, \quad (7)

\text{s.t.} \; 0 \leq \alpha_k \leq C,

where α and δ are the dual variables to be solved, H is an N × N matrix with entries H_{ij} = y_i^0 y_j^0 p_i^T p_j, Y_d^0 = diag(y_1^0, ..., y_N^0), and 1_N is a column vector of length N whose elements are all 1. Subproblem 2:

\arg\max_{\delta} \; (- \alpha^T Y_d^0 P^T) \delta - \frac{1}{2} \delta^T \delta, \quad (8)

\text{s.t.} \; \delta_v \geq 0 .

Alternately optimizing (7) and (8) gives α^* and δ^*. The combination coefficients are θ = \sum_{k=1}^{N} α_k^* y_k^0 p_k + δ^*, and normalizing θ gives the final solution.
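The alternating scheme can be sketched as below. The text does not say how each subproblem is solved, so the sketch makes two assumptions: subproblem 1 is handled by projected gradient ascent on the box 0 ≤ α_k ≤ C, and subproblem 2 uses its closed-form maximizer, which follows because the objective in (8) separates over the components of δ. The function name and iteration counts are hypothetical.

```python
import numpy as np

def learn_weights_hinge(P, y, C=1.0, n_outer=20, n_inner=100):
    """Alternate between subproblems (7) and (8); P has shape V x N, y has entries in {-1, +1}."""
    V, N = P.shape
    G = P * y                                  # column k is y_k * p_k, i.e. P @ diag(y)
    H = G.T @ G                                # H_ij = y_i y_j p_i^T p_j
    step = 1.0 / max(np.linalg.norm(H, 2), 1e-12)
    alpha = np.zeros(N)
    delta = np.zeros(V)
    for _ in range(n_outer):
        # Subproblem 1: maximize alpha^T (1 - Y P^T delta) - 0.5 alpha^T H alpha on the box.
        linear = 1.0 - G.T @ delta
        for _ in range(n_inner):
            alpha = np.clip(alpha + step * (linear - H @ alpha), 0.0, C)
        # Subproblem 2: maximize -(G alpha)^T delta - 0.5 ||delta||^2 with delta >= 0.
        delta = np.maximum(0.0, -(G @ alpha))
    theta = G @ alpha + delta                  # theta = sum_k alpha_k y_k p_k + delta
    total = theta.sum()
    return theta / total if total > 0 else np.full(V, 1.0 / V)

rng = np.random.default_rng(0)
N = 200
y = rng.choice([-1.0, 1.0], size=N)
P = np.vstack([y + 0.1 * rng.standard_normal(N),   # informative feature
               rng.standard_normal(N)])            # uninformative feature
theta = learn_weights_hinge(P, y)
print(theta)
```

Because δ is updated last, θ equals the positive part of Σ_k α_k y_k p_k and is therefore non-negative before normalization; the fallback to equal weights only covers the degenerate case where every component is zero.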

Fig. 3 is a structural diagram of an image classification device that uses the above method. The device comprises: a feature extraction unit, for extracting features from labeled natural image data and natural image data with unknown labels to obtain different feature representations; a pre-processing unit, connected to the feature extraction unit, for pre-processing each feature representation with kernel principal component analysis (KPCA); a training data generation unit, connected to the pre-processing unit, for using a matrix completion algorithm to generate estimated labels of the labeled data for each feature; a combination coefficient calculation unit, connected to the training data generation unit, for linearly combining the estimated labels to approximate the corresponding known true labels, thereby obtaining the combination coefficients; a label prediction unit, connected to the pre-processing unit, for using the labeled natural image data and the matrix completion algorithm to predict the labels of the natural image data with unknown labels; an output fusion unit, connected to the combination coefficient calculation unit and the label prediction unit, for combining the labels obtained by the label prediction unit with the combination coefficients to obtain labels that fuse multiple features; and an image classification unit, connected to the output fusion unit, for classifying the natural image data based on the fused multi-feature labels. It should be noted that the pre-processing unit is not essential. In other embodiments it may be omitted, in which case the training data generation unit and the label prediction unit are connected directly to the feature extraction unit.

Fig. 4 compares several matrix-completion-based algorithms: MVMC-LS (using the squared error loss), MVMC-SVM (using the hinge loss) and three other algorithms, namely 1) BMC, which uses the single best-performing feature; 2) CMC, which concatenates the features into one long vector; and 3) AMC, which combines the features with equal (average) combination coefficients. The experiments were carried out on the PASCAL VOC'07 data set (M. Everingham, L. Van Gool, C. Williams, J. Winn and A. Zisserman, The PASCAL Visual Object Classes Challenge 2007 (VOC2007) Results), using mAP (M. Zhu, Recall, precision and average precision, University of Waterloo, Tech. Rep., 2004) as the evaluation criterion, with 100, 150 and 200 labeled samples. The results show that using several features at once, whether by concatenating them into a long vector (CMC) or by averaging the outputs (AMC), is better than using a single feature. Combining the outputs with the coefficients learned by the proposed MVMC algorithm improves performance further, and the hinge-loss results are slightly better than the squared-error results (the latter has lower algorithmic complexity).
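For readers unfamiliar with the evaluation criterion, mAP can be computed from soft labels as sketched below. The sketch uses the plain (non-interpolated) average precision, which is an assumption; the official PASCAL VOC 2007 protocol uses an interpolated variant, so values from this sketch are not directly comparable with published VOC numbers. The function names and the toy scores are illustrative.

```python
import numpy as np

def average_precision(scores, labels):
    """Non-interpolated average precision for one class; labels are -1 or +1."""
    order = np.argsort(-scores, kind="stable")          # rank images by decreasing score
    hits = labels[order] > 0
    if hits.sum() == 0:
        return 0.0
    precision_at_k = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    return float(precision_at_k[hits].mean())           # average over positive positions

def mean_average_precision(score_matrix, label_matrix):
    """Mean of the per-class average precision; both matrices are classes x images."""
    return float(np.mean([average_precision(s, l)
                          for s, l in zip(score_matrix, label_matrix)]))

scores = np.array([[0.9, 0.8, 0.3, 0.1],
                   [0.2, 0.7, 0.4, 0.6]])
labels = np.array([[1, -1, 1, -1],
                   [-1, 1, -1, 1]])
map_value = mean_average_precision(scores, labels)
print(map_value)
```

In the first class the positives are ranked first and third, giving precisions 1 and 2/3; in the second class both positives are ranked on top, so the class-wise values are 5/6 and 1.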

Table 1 compares MVMC-SVM with several other multi-feature fusion algorithms: 1) HierSVM (J. Kludas, E. Bruno and S. Marchand-Maillet, Information fusion in multimedia information retrieval, Adaptive Multimedia Retrieval, pp. 147-159, 2009), which trains one SVM classifier per feature and then fuses the outputs with another SVM; 2) SimpleMKL (A. Rakotomamonjy, F. Bach, S. Canu and Y. Grandvalet, SimpleMKL, JMLR, 9:2491-2521, 2008), a popular multiple kernel learning (MKL) algorithm that builds one kernel per feature and then uses a linear combination of all kernels for classification; 3) LpMKL (M. Kloft, U. Brefeld, S. Sonnenburg and A. Zien, Lp-norm multiple kernel learning, JMLR, 12:953-997, 2011), a more recent multiple kernel learning algorithm that generalizes MKL to the l_p-norm, p ≥ 1. The experimental results show that MVMC-SVM outperforms the other algorithms. In particular, compared with LpMKL, the invention improves mAP (mean average precision) by 7.2%, 9.7% and 11.4% when the number of labeled samples is 100, 150 and 200, respectively.

Table 1. Experimental comparison of the method of the invention with other multi-modal methods

The above embodiments only illustrate the technical solution of the invention and do not limit it. A person of ordinary skill in the art may modify the technical solution of the invention or replace it with equivalents without departing from the spirit and scope of the invention; the scope of protection of the invention is defined by the claims.

Claims

1. A natural image classification method based on multi-modal matrix completion, comprising the following steps:

2. The method of claim 1, characterized in that each feature representation obtained in step 1) is pre-processed before step 2) is carried out.

3. The method of claim 2, characterized in that the pre-processing is performed with kernel principal component analysis or random projection.

4. The method of claim 2, characterized in that step 2) is implemented as follows:

let X^{0(v)}, v = 1, ..., V, be the data obtained after pre-processing, where V is the number of feature types, X⁰ denotes the original data matrix and X^{0(v)} the one for the v-th feature; the labeled natural image data are divided into two parts, the labels of the first part being assumed unknown and the labels of the second part known;

the matrix completion algorithm uses the second part to estimate the labels of the first part, and estimated labels for the second part are obtained in the same way; the two sets of estimated labels are concatenated to give the estimated labels for the v-th feature representation; this procedure is applied to every feature type, giving estimated labels for v = 1, ..., V.

5. The method of claim 4, characterized in that the combination coefficients are denoted {θ_v} and are found by solving the following optimization problem, so that

f(p_k) = \sum_{v=1}^{V} \theta_v p_k^{(v)}

approximates the true value y_k^0 as closely as possible:

\arg\min_{\theta} \frac{1}{2N} \sum_{k=1}^{N} L(f(p_k), y_k^0) + \frac{\eta}{2} \|\theta\|_2^2,

\text{s.t.} \; \sum_{v=1}^{V} \theta_v = 1, \; \theta_v \geq 0,

where

N = |\Omega_{Y_l}|

is the number of elements in Ω_{Y_l},

p_k = [p_k^{(1)}, \ldots, p_k^{(V)}]^T,

θ = [θ_1, ..., θ_V]^T, the regularization term (η/2)||θ||_2^2 prevents the weights {θ_v} from concentrating on a single best feature, η ≥ 0 is a trade-off parameter, and L is any convex loss function.

6. The method of claim 5, characterized in that L is the squared error function, i.e. L(f(x), y) = (f(x) - y)², the problem is solved with a coordinate descent algorithm in which each iteration updates only two variables, and the update rule is:

\begin{cases} \theta_i^* = \dfrac{\eta (\theta_i + \theta_j) + (h_i - h_j) + \epsilon_{ij}}{(H_{ii} - H_{ij} - H_{ji} + H_{jj}) + 2 \eta} \\ \theta_j^* = \theta_i + \theta_j - \theta_i^* \end{cases}

and

\epsilon_{ij} = (H_{ii} - H_{ij} - H_{ji} + H_{jj}) \theta_i - \sum_{k} (H_{ik} - H_{jk}) \theta_k .

7. The method of claim 5, characterized in that L is the hinge loss function, i.e. L(f(x), y) = (1 - y f(x))_+, and the problem is solved by alternately optimizing two subproblems,

Subproblem 1:

\arg\max_{\alpha} \; \alpha^T (1_N - Y_d^0 P^T \delta) - \frac{1}{2} \alpha^T H \alpha,

\text{s.t.} \; 0 \leq \alpha_k \leq C,

where H is an N × N matrix with entries

H_{ij} = y_i^0 y_j^0 p_i^T p_j,

Y_d^0 = \mathrm{diag}(y_1^0, \ldots, y_N^0),

and 1_N is a column vector of length N whose elements are all 1;

Subproblem 2:

\arg\max_{\delta} \; (- \alpha^T Y_d^0 P^T) \delta - \frac{1}{2} \delta^T \delta,

\text{s.t.} \; \delta_v \geq 0,

the combination coefficients being

\theta = \sum_{k=1}^{N} \alpha_k^* y_k^0 p_k + \delta^*,

which are normalized to give the final solution.

8. The method of claim 1 or 2, characterized in that the feature extraction algorithm used for the feature extraction is SIFT or GIST.

9. An image classification device using the method of claim 1, characterized by comprising:

a training data generation unit, for using a matrix completion algorithm to generate estimated labels of the labeled data for each feature;

a label prediction unit, for using the labeled natural image data and the matrix completion algorithm to predict the labels of the natural image data with unknown labels;

10. The device of claim 9, characterized by further comprising a pre-processing unit, connected to the feature extraction unit, for pre-processing each feature representation; the training data generation unit and the label prediction unit are each connected to the pre-processing unit to receive the pre-processed data.