CN107563410A - Classification method and device based on locally class-consistent clustering and multi-task learning - Google Patents
- Publication number
- CN107563410A CN107563410A CN201710662859.2A CN201710662859A CN107563410A CN 107563410 A CN107563410 A CN 107563410A CN 201710662859 A CN201710662859 A CN 201710662859A CN 107563410 A CN107563410 A CN 107563410A
- Authority
- CN
- China
- Prior art keywords
- cluster
- svms
- data
- parameter
- linear
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The present invention relates to the field of computer data processing and proposes a classification method and device based on locally class-consistent clustering and multi-task learning, aiming to solve the problem that support vector machines classify slowly in data classification because of their large computational cost. One embodiment of the method comprises: clustering the data to be classified with K-means and training one linear support vector machine in each cluster to initialize the model parameters; then dividing the data to be processed into multiple clusters with a locally consistent clustering method and training one linear support vector machine in each cluster; fusing the clustering and the training of the support vector machines in one generative graphical model; learning the above support vector machines simultaneously with a multi-task learning method; solving the parameters of the initial classification model with the expectation-maximization algorithm, obtaining the classification model from those parameters, and classifying the data to be classified with the classification model. The embodiment improves the classification speed on large-scale data to be classified.
Description
Technical field
The present invention relates to the field of machine learning, in particular to non-linear classification technology, and more particularly to a classification method and processing device based on locally class-consistent clustering and multi-task learning.
Background technology
With the arrival of the big data era, selecting data with the same or similar features from large data collections has become more and more important; through cluster analysis, the data are divided into different classes. The support vector machine (Support Vector Machine, SVM) is a machine learning method built on the basis of statistical learning theory. It can analyze data and recognize patterns, is used for classification and regression analysis, and seeks the optimal trade-off between learning ability and model complexity. By replacing the inner product between data points with a kernel function, a linear classifier can be converted into a non-linear classifier, i.e. a linear SVM is converted into a kernel SVM.
However, both training and testing a kernel SVM are slow. The training process of a kernel SVM corresponds to solving a convex quadratic optimization problem that involves a large amount of matrix computation and necessarily takes considerable time. Because the kernel matrix must be stored and the size of the matrix grows quadratically with the number of training samples, the space complexity is also very high. The test speed of a kernel SVM is likewise slow: for each test sample, the kernel function must be computed between it and all support vectors, and the results summed with weights before the final classification decision can be obtained, so the test complexity is proportional to the number of support vectors. The test complexity of such an SVM is therefore also high.
Because linear SVMs are fast to train and test, there currently exists a class of methods that replace the kernel SVM with an ensemble of multiple linear SVMs to classify non-linear data. Even a non-linear decision boundary should be smooth, i.e. locally linear, so this class of methods learns linear SVMs locally to classify non-linear data. Specifically, a divide-and-conquer strategy is used: the data are first divided into several clusters, and one linear SVM is then trained in each cluster. However, the above methods have the following two problems: (1) clustering and training an SVM in each cluster are usually two independent steps that cannot reinforce each other; if the clustering result is poor, the linear SVMs trained in the clusters will also perform poorly; (2) the multiple linear SVMs are usually learned independently, and training each single SVM in isolation easily leads to over-fitting.
The content of the invention
To solve the above problems of the prior art, namely that kernel SVMs are slow to train and test, and that when multiple linear SVMs replace a kernel SVM to classify non-linear data the clustering and the in-cluster SVM training cannot reinforce each other and training a single SVM leads to over-fitting, the present invention adopts the following technical scheme:
In a first aspect, this application provides a classification method based on locally class-consistent clustering and multi-task learning, comprising the following steps:
Step 1: cluster the data to be classified with K-means to generate multiple first clusters, train one linear first support vector machine in each first cluster, and generate the parameters of an initial classification model from the first clusters and the first support vector machines.
Step 2: divide the data to be classified into multiple second clusters with the locally class-consistent clustering method, and train one linear second support vector machine in each second cluster.
Step 3: fuse the second clusters in one generative graphical model and train the second support vector machines.
Step 4: learn the second support vector machines simultaneously with a multi-task learning method, and transfer knowledge between the second support vector machines, the knowledge being feature information composed of attributes or features.
Step 5: solve the parameters of the locally class-consistent clustering and the parameters of the second support vector machines with the expectation-maximization algorithm, update the parameters of the initial classification model with the parameters of the locally class-consistent clustering and the parameters of the second support vector machines, and generate the classification model.
Step 6: classify the data to be classified with the classification model.
In some examples, Step 1 comprises:
Step 11: cluster the data to be classified with K-means to generate multiple first clusters.
Step 12: compute the first mixing coefficient, the first mean vector and the first covariance matrix of each first cluster.
Step 13: train one linear first support vector machine in each cluster, obtaining K linear support vector machines.
Step 14: compute the parameters of the initial classification model from the first mixing coefficients, the first mean vectors, the first covariance matrices and the weight vectors of the first support vector machines.
In some examples, Step 2 comprises:
Step 21: cluster the data to be classified with a locally consistent Gaussian mixture model to generate the second clusters, where the locally consistent Gaussian mixture model is a Gaussian model with a local-consistency regularization term;
Step 22: compute the second mixing coefficient, the second mean vector and the second covariance matrix of each second cluster, generating the parameters of the second clusters.
Step 23: train one linear second support vector machine in each second cluster.
In some examples, Step 3 comprises:
Step 31: fuse, in a generative model, the second clusters and the linear second support vector machines trained in the second clusters.
Step 32: compute the joint probability from the sample distribution of the generative model, and derive the regularized log-likelihood function from the joint probability.
Step 33: fold the linear second support vector machines into the log-likelihood function and optimize the objective function

J(W) = Ω(W) + Σ_{k=1}^{K} Σ_{i∈C_k} L(y_i, w_k^T x_i)

to train the linear second support vector machines, where the first term of the objective function is the regularization term on the linear SVM weight vectors, the second term is the total loss function on the training set, and L(y_i, w_k^T x_i) in the second term is the loss function of sample (x_i, y_i) in its cluster C_k.
In some examples, Step 4 comprises:
Step 41: train one linear second support vector machine in each second cluster based on the multi-task learning method.
Step 42: aggregate all the second support vector machines into multiple groups according to the weight vectors of the second support vector machines.
Step 43: transfer knowledge between the second support vector machines through the average weight vector of each group and the regularization term on the weight vectors of the second support vector machines.
In some examples, "aggregating all the second support vector machines into multiple groups according to the weight vectors of the second support vector machines" in Step 4 comprises:
aggregating the weight vectors of the K second support vector machines into r groups and establishing the correspondence

Ω(W) = λ₁ Σ_{l=1}^{r} Σ_{k∈I_l} ‖w_k − w̄_l‖² + λ₂ Σ_{k=1}^{K} ‖w_k‖²,  with  w̄_l = (1/m_l) Σ_{k∈I_l} w_k,

where I_l denotes the set of tasks in the l-th group, w̄_l denotes the average weight vector of the tasks in the l-th group, m_l denotes the number of tasks in the l-th group, and Ω(W) denotes the regularization term on the linear SVM weight vectors.
In some examples, solving the parameters of the initial classification model with the expectation-maximization algorithm comprises: computing the probability that each sample of the data to be classified is assigned to each second support vector machine; assigning each sample of the data to be classified to a second support vector machine according to that probability; and updating the parameters of the second clusters and the parameters of the second support vector machines by maximizing the lower bound of the log-likelihood function.
In some examples, the method further comprises classifying the data to be classified with the weighted average of the results of the multiple second support vector machines.
In a second aspect, this application provides a storage device storing a plurality of programs, characterized in that the programs are adapted to be loaded and executed by a processor to implement the classification method based on locally class-consistent clustering and multi-task learning described in the first aspect.
In a third aspect, this application provides a processing device comprising a processor and a storage device, the processor being adapted to execute programs and the storage device being adapted to store a plurality of programs, the programs being adapted to be loaded and executed by the processor to implement any classification method based on locally class-consistent clustering and multi-task learning described in the first aspect.
In the classification method and processing device based on locally class-consistent clustering and multi-task learning provided by this application, the locally class-consistent clustering and the training of a linear SVM in each cluster are fused in one generative graphical model, so that clustering and training of the linear SVMs can reinforce each other. Meanwhile, the linear SVMs are learned simultaneously with multi-task learning, avoiding the over-fitting that occurs when an SVM is trained in a single cluster. Fast and accurate classification of large, complex data is thereby achieved.
Brief description of the drawings
Fig. 1 is an exemplary system architecture to which the application can be applied;
Fig. 2 is a schematic flowchart of an embodiment of the classification method based on locally class-consistent clustering and multi-task learning according to the application;
Fig. 3 is a schematic diagram of fusing the clustering and the linear SVMs trained in the clusters into one generative graphical model;
Fig. 4 is a schematic diagram of the semi-supervised learning obtained by fusing the clustering and the linear SVMs trained in the clusters into one generative graphical model.
Embodiments
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. Those skilled in the art will understand that these embodiments are only used to explain the technical principle of the present invention and are not intended to limit the scope of the invention.
It should be noted that, where there is no conflict, the embodiments of the application and the features in the embodiments may be combined with each other. The application is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture to which embodiments of the non-linear classification method, and of the non-linear classification processing device, based on locally class-consistent clustering and multi-task learning of the application can be applied.
As shown in Fig. 1, the system architecture may include data-end devices 101, a network 104 and a server 105. The data-end devices 101 may be multiple identical or different devices, and may be one or more of a first data terminal 1011, a second data terminal 1012 and a third data terminal 1013. The network 104 provides the medium of the communication link between the data-end devices 101 and the server 105, and may include various connection types, such as wired or wireless communication links or fiber-optic cables.
The data-end devices 101 may exchange information with the server 105 through the network 104 to receive or send information. The first data terminal 1011, the second data terminal 1012 and the third data terminal 1013 may also exchange information with each other through the network 104.
The data-end devices 101 may be various electronic devices that have a display screen and support network communication, including but not limited to smart phones, tablet computers, laptop and desktop computers, and computer processing systems composed of multiple computers. It should be noted that a data-end device may include a data storage unit and be provided with software or applications for data processing. The above application may, for example, be business application software that discovers different customer groups from a customer database and characterizes the different customer groups by purchasing patterns, or an application that analyzes road congestion from the temporal and road-section features of big data.
The server 105 may be a processor or server providing various services, such as a data processing server that analyzes and classifies the data transmitted or provided by the data-end devices 101. The data processing server may process the data to be processed and return the generated result (for example, according to a user request, the classification result obtained after cluster analysis of the data to be classified) to the user, or store it for later use. It should be noted that in this application the server 105 may be an independently arranged server, or one of the first data terminal 1011, the second data terminal 1012 and the third data terminal 1013 may be selected or designated as the server.
It should be noted that the classification method based on locally class-consistent clustering and multi-task learning provided by the embodiments of the application is generally executed by the server 105; correspondingly, the classification device based on locally class-consistent clustering and multi-task learning is generally arranged in the server 105.
It should be understood that the numbers of data-end devices, networks and servers in Fig. 1 are only schematic. Any number of data-end devices, networks and servers may be present as required by the implementation.
Continuing with Fig. 2, Fig. 2 shows the flow of one embodiment of the classification method based on locally class-consistent clustering and multi-task learning according to the application. The classification method based on locally class-consistent clustering and multi-task learning comprises the following steps:
Step 1: cluster the data to be classified with K-means to generate multiple first clusters, train one linear first support vector machine in each first cluster, and generate the parameters of the initial classification model from the first clusters and the linear first support vector machines.
In this embodiment, the electronic device on which the classification method based on locally class-consistent clustering and multi-task learning runs (such as the server shown in Fig. 1) may first obtain the data to be processed from a data-end device through a wired or wireless connection, or obtain the data to be processed from a preset database. Clustering the data to be classified with K-means means first randomly selecting K objects as initial cluster centers, then computing the distance between each object and each cluster center and assigning each object to the cluster center nearest to it; a cluster center together with the objects assigned to it forms one cluster. One linear SVM is trained in each cluster, and the parameters of the initial classification model are obtained from the result of the clustering and the linear SVMs.
In some specific embodiments, Step 1 comprises:
Step 11: cluster the data to be classified with K-means to generate multiple first clusters. The number K of first clusters may be set between 13 and 20.
Step 12: compute the first mixing coefficient π_k of each first cluster (the proportion of the number of samples in the cluster to the total number of samples), the first mean vector μ_k and the first covariance matrix Σ_k, for k = 1, ..., K. From these coefficients, the initialization of the first clustering parameters is {π_k, μ_k, Σ_k | k = 1, ..., K}.
Step 13: train one linear SVM in each cluster, obtaining K linear SVMs, where the weight vector of the k-th linear SVM is denoted w_k; the weight vectors of the K linear SVMs are {w_k | k = 1, ..., K}.
Step 14: from the mixing coefficients, mean vectors, covariance matrices and linear SVM weight vectors, the parameters of the initial model are:
Θ = {π, μ, Σ, W} = {π_k, μ_k, Σ_k, w_k | k = 1, ..., K}.
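The initialization of Step 1 can be sketched as follows. This is a minimal illustration only: the patent does not fix the K-means variant or the SVM solver, so plain Lloyd iterations and a small stochastic subgradient solver for the L2-regularized hinge loss are assumed, and the names `kmeans`, `train_linear_svm` and `init_model` are illustrative, not from the patent.

```python
import numpy as np

def kmeans(X, K, iters=50, seed=0):
    """Plain Lloyd K-means; returns the cluster label of each sample."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=K, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for k in range(K):
            if (labels == k).any():
                centers[k] = X[labels == k].mean(axis=0)
    return labels

def train_linear_svm(X, y, lam=0.01, epochs=300, seed=0):
    """Tiny stochastic subgradient solver for the L2-regularized hinge loss."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for t in range(1, epochs + 1):
        i = rng.integers(len(X))
        grad = lam * w
        if y[i] * (w @ X[i]) < 1:            # hinge subgradient is active
            grad -= y[i] * X[i]
        w -= (1.0 / (lam * t)) * grad
    return w

def init_model(X, y, K):
    """Step 1: K-means first clusters, one linear SVM per cluster,
    then Theta = {pi_k, mu_k, Sigma_k, w_k | k = 1..K}."""
    labels = kmeans(X, K)
    d = X.shape[1]
    pi, mu, Sigma, W = [], [], [], []
    for k in range(K):
        Xk, yk = X[labels == k], y[labels == k]
        pi.append(len(Xk) / len(X))                      # mixing coefficient
        mu.append(Xk.mean(axis=0))                       # mean vector
        Sigma.append(np.cov(Xk.T) + 1e-6 * np.eye(d))    # covariance matrix
        W.append(train_linear_svm(Xk, yk))               # per-cluster SVM
    return {"pi": np.array(pi), "mu": np.array(mu),
            "Sigma": np.array(Sigma), "W": np.array(W)}
```

The returned dictionary corresponds term by term to Θ = {π, μ, Σ, W} above.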
Step 2: divide the data to be classified into multiple second clusters with the locally class-consistent clustering method, and train one linear second support vector machine in each second cluster.
In this embodiment, the server divides the data to be classified into multiple second clusters with the locally class-consistent clustering method and trains a linear second support vector machine in each second cluster. Here, the number of clusters of the locally class-consistent clustering is obtained in the same way as in Step 11, and the number of clusters is the same. Using the clustering parameters obtained in Step 1, the data are clustered with the locally class-consistent clustering method, i.e. local rather than global features are used to cluster the data.
In some specific embodiments, Step 2 comprises:
Step 21: cluster the data to be classified with a locally consistent Gaussian mixture model to generate the second clusters, where the locally consistent Gaussian mixture model is a Gaussian model with a local-consistency regularization term. The standard Gaussian mixture model (GMM) simply fits the data in Euclidean space; the distribution of the data, however, may not conform to Euclidean space, and the data may lie on an underlying submanifold. To improve clustering performance, the local submanifold structure is taken into account and the data are clustered with the locally consistent Gaussian mixture model. The locally consistent Gaussian mixture model adds one local-consistency regularization term on the basis of the Gaussian mixture model.
The local-consistency regularization term may be built as follows: a nearest-neighbor graph is established on the training data; each element of the edge weight matrix A of the neighbor graph indicates whether the corresponding two samples are neighbors, the element being 1 if they are neighbors and 0 otherwise. Based on the matrix A of the neighbor graph, the local-consistency regularization term is defined as

R = Σ_{i,j} A_ij D(P_i ‖ P_j),  D(P_i ‖ P_j) = Σ_c P_i(c) log( P_i(c) / P_j(c) ),

where A_ij is the element of the neighbor graph A indicating whether samples x_i and x_j are neighbors, P_i(c) denotes the probability that the i-th sample is assigned to the c-th Gaussian component, and D(P_i ‖ P_j) denotes the KL-divergence between the distributions P_i(c) and P_j(c). The KL-divergence (Kullback-Leibler divergence), also called relative entropy, is a way of describing the difference between two probability distributions P and Q in probability theory or information theory. The smaller the regularization term R, the smoother the probabilities P_i(c) are on the neighbor graph. That is, the closer two samples are on the manifold, the more similar their distributions over the different Gaussian components, i.e. the more likely they are to be assigned to the same Gaussian component.
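A short numerical check of this regularizer, under the assumption (the original formula image is not reproduced here) that it is the neighbor-weighted sum of one-directional KL-divergences between assignment distributions; the names `kl_div` and `local_consistency` are illustrative:

```python
import numpy as np

def kl_div(p, q, eps=1e-12):
    """KL-divergence D(p || q) between two discrete distributions."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    return float(np.sum(p * np.log(p / q)))

def local_consistency(A, P):
    """R = sum_{i,j} A_ij * D(P_i || P_j): small when neighboring samples
    (A_ij = 1) have similar distributions over the Gaussian components.
    A: (n, n) 0/1 neighbor matrix;  P: (n, K) assignment probabilities."""
    n = P.shape[0]
    R = 0.0
    for i in range(n):
        for j in range(n):
            if A[i, j]:
                R += kl_div(P[i], P[j])
    return R
```

As the text describes, R vanishes when neighboring samples share the same component-assignment distribution and grows as those distributions diverge.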
Step 22: compute the mixing coefficient π_k, mean vector μ_k and covariance matrix Σ_k of each second cluster, generating the parameters of the second clusters {π_k, μ_k, Σ_k | k = 1, ..., K};
Step 23: train one linear second support vector machine in each second cluster.
Step 3: fuse the second clusters in one generative model and train the second support vector machines.
In this embodiment, the second clusters generated by the locally class-consistent clustering and the linear SVMs trained in each second cluster are fused in one generative model, which may be a generative graphical model.
In a specific example, Step 3 comprises:
Step 31: fuse, in the generative model, the second clusters and the linear second support vector machines trained in the second clusters. The second clusters and the linear SVM trained in each second cluster are fused in one generative model, as shown in Fig. 3: taking the dotted line as the boundary, the upper half corresponds to clustering the training data with the locally consistent Gaussian mixture model, where π, μ and Σ are respectively the mixing coefficients, mean vectors and covariance matrices of the Gaussian mixture model; the lower half corresponds to training one linear SVM in each cluster, where W is the matrix of the weight vectors of the linear SVMs.
Step 32: compute the joint probability from the sample distribution of the generative model, and derive the regularized log-likelihood function from the joint probability.
From the model in Fig. 3, the joint probability between X and Y can be derived:

P(X, Y) = Π_{i=1}^{N} Σ_{k=1}^{K} π_k N(x_i | z_i = k, μ_k, Σ_k) P(y_i | x_i, w_k),

where N(x_i | z_i = k, μ_k, Σ_k) is the probability that the k-th Gaussian component of the Gaussian mixture model generates sample x_i, and P(y_i | x_i, w_k) is the probability that the k-th linear SVM classifies sample x_i. From this joint probability, the regularized log-likelihood function can be obtained:

L = Σ_{i=1}^{N} log Σ_{k=1}^{K} π_k N(x_i | μ_k, Σ_k) P(y_i | x_i, w_k) − Ω(W) − λR,

where Ω(W) denotes the regularization term on the linear SVM weight vectors W = {w_k}_{k=1,...,K}, and R is the regularization term of the locally class-consistent clustering described above.
Step 33: fold the linear second support vector machines into the log-likelihood function and optimize the objective function

J(W) = Ω(W) + Σ_{k=1}^{K} Σ_{i∈C_k} L(y_i, w_k^T x_i)

to train the linear second support vector machines, where the first term of the objective function is the regularization term on the linear SVM weight vectors and the second term is the total loss function on the training set. A correspondence is established between the log-likelihood function and the SVM minimization so that maximizing the likelihood simultaneously minimizes the SVM objective, thereby achieving the purpose of training the linear SVMs. The first correspondence is between the posterior probability P(y_i | x_i, w_k) and the loss function L(y_i, w_k^T x_i), expressed as

P(y_i | x_i, w_k) = exp( −L(y_i, w_k^T x_i) ):

if the loss function L(y_i, w_k^T x_i) equals 0, the posterior probability P(y_i | x_i, w_k) will equal 1; otherwise the probability P(y_i | x_i, w_k) will be less than 1.
The second correspondence is between Ω(W) and ‖w‖², the regularization term on the linear SVM weight vectors.
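The first correspondence can be checked numerically. Assuming the loss is the hinge loss (the text leaves the loss generic), P(y|x,w) = exp(−loss) behaves exactly as described:

```python
import math

def hinge_loss(y, score):
    """Hinge loss of a sample with label y in {-1, +1} and score w.x."""
    return max(0.0, 1.0 - y * score)

def label_posterior(y, score):
    """P(y | x, w) = exp(-loss): exactly 1 when the sample is classified
    with margin >= 1 (zero loss), and strictly below 1 otherwise."""
    return math.exp(-hinge_loss(y, score))
```

A correctly classified sample with full margin contributes probability 1 to the likelihood, while misclassified or margin-violating samples contribute less, which is what couples the clustering to the SVM training.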
In some concrete implementations, the learning of the regularized log-likelihood function can be extended to semi-supervised learning; the semi-supervised graphical model is shown in Fig. 4. The right half of the model corresponds to the supervised model of Fig. 3, and the left half corresponds to the unsupervised model; it can be seen that the two halves share the parameters {π, μ, Σ} of the locally class-consistent clustering. By estimating the clustering parameters in this semi-supervised way, using labeled data and a large amount of unlabeled data simultaneously, the clustering achieves a better result, which ultimately improves classification accuracy.
From the semi-supervised graphical model in Fig. 4, the regularized log-likelihood function can be expressed as

L_semi = Σ_{i=1}^{N_l} log Σ_{k=1}^{K} π_k N(x_i | μ_k, Σ_k) P(y_i | x_i, w_k) + Σ_{j=1}^{N_u} log Σ_{k=1}^{K} π_k N(x_j | μ_k, Σ_k) − Ω(W) − λR,

where the first term corresponds to the log-likelihood of the labeled data and the second term to the log-likelihood of the unlabeled data. This log-likelihood function can likewise be maximized with the expectation-maximization algorithm to obtain the parameters of the model.
Step 4: learn the linear SVMs simultaneously with the multi-task learning method, and transfer knowledge between the support vector machines.
In this embodiment, the knowledge is feature information composed of attributes or features, from which an SVM can be trained. All linear SVMs are trained with the multi-task learning method: training the linear SVM in a single cluster is treated as one task, so training the linear SVMs of all clusters corresponds to a multi-task learning problem.
In some specific embodiments, Step 4 further comprises:
Step 41: train one linear second support vector machine in each second cluster based on the multi-task learning method.
Step 42: aggregate all the second support vector machines into multiple groups according to the weight vectors of the second support vector machines.
Step 43: transfer knowledge between the support vector machines through the average weight vector of each group and the regularization term on the weight vectors of the linear second support vector machines.
Specifically, the weight vectors of all SVMs, W = {w_k}_{k=1,...,K}, are aggregated into r groups, and the following correspondence is established:

Ω(W) = λ₁ Σ_{l=1}^{r} Σ_{k∈I_l} ‖w_k − w̄_l‖² + λ₂ Σ_{k=1}^{K} ‖w_k‖²,  w̄_l = (1/m_l) Σ_{k∈I_l} w_k,

where the first term on the right aggregates the K tasks into r groups: I_l denotes the set of tasks in the l-th group, w̄_l denotes the average weight vector of the tasks in the l-th group, and m_l denotes the number of tasks in the l-th group. The first term measures the variance of the weight vectors of the tasks in the same group; maximizing the likelihood minimizes this variance, so tasks in the same group have similar weight vectors.
The second term on the right is the respective regularization term on the K linear SVM weight vectors; it improves generalization, keeps the decision boundary smooth, and avoids over-fitting.
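A minimal sketch of this group regularizer, under the assumption that it is a within-group variance term plus a per-task L2 penalty with trade-off weights (the names `multitask_reg`, `lam1` and `lam2` are hypothetical, not from the patent):

```python
import numpy as np

def multitask_reg(W, groups, lam1=1.0, lam2=0.1):
    """Omega(W) = lam1 * sum_l sum_{k in I_l} ||w_k - wbar_l||^2
               + lam2 * sum_k ||w_k||^2.
    The first term pulls the weight vectors of tasks in the same group
    toward the group-average weight vector wbar_l; the second is the
    usual L2 penalty on each task's weights."""
    omega = 0.0
    for I_l in groups:                     # I_l: task indices of group l
        wbar = W[I_l].mean(axis=0)         # average weight vector of group l
        omega += lam1 * float(((W[I_l] - wbar) ** 2).sum())
    omega += lam2 * float((W ** 2).sum())
    return omega
```

When all tasks in a group share one weight vector the variance term vanishes, which is precisely the "similar weight vectors within a group" behavior the text describes.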
Step 5: solve the parameters of the locally class-consistent clustering and the parameters of the support vector machines with the expectation-maximization algorithm, update the parameters of the initial classification model with the parameters of the locally class-consistent clustering and the parameters of the support vector machines, and generate the classification model.
In this embodiment, the parameters of the initial model are solved with the expectation-maximization algorithm: the parameters of the locally class-consistent clustering and the parameters of the support vector machines are obtained with the expectation-maximization algorithm, and the parameters of the classification model are generated from these parameters.
In some specific embodiments, solving the parameters of the initial classification model with the expectation-maximization algorithm comprises: computing the probability that each sample of the data to be classified is assigned to each linear second support vector machine; assigning each sample of the data to be classified to a linear second support vector machine according to that probability; and updating the parameters of the second clusters and the parameters of the linear second support vector machines by maximizing the lower bound of the log-likelihood function.
The model parameters are solved with the expectation-maximization (EM) algorithm. Let Θ^(t) denote the parameter set at iteration t. In the E-step of the EM algorithm, the probability that the i-th sample x_i is assigned to the k-th linear SVM is computed.
From this formula, when the samples in a cluster k are not linearly separable, some samples x_i in the cluster will be misclassified by the cluster's linear SVM (with weight vector w_k); the corresponding likelihood then becomes small, which makes the posterior probability P(c_k|x_i)^(t) on the left-hand side of the equation small. That is, the probability of assigning sample x_i to the k-th cluster decreases, and x_i is eventually assigned to another cluster. Iterating in this way, each cluster tends toward linear separability.
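A minimal sketch of this E-step follows. Since the exact probability expression is not reproduced in the text above, the sketch assumes (as one plausible reading) that the responsibility of cluster k for sample x_i combines a Gaussian gating density with an SVM-based likelihood of the form exp(-hinge loss); misclassified samples then receive a small posterior, which is exactly the behavior described:

```python
import numpy as np

def e_step(X, y, pi, mu, sigma2, W, lam=1.0):
    """E-step sketch: responsibility R[i, k] of cluster k for sample i, assuming
    P(c_k | x_i) proportional to pi_k * N(x_i | mu_k, sigma2_k I) * exp(-lam * hinge).
    Samples misclassified by cluster k's linear SVM (weight vector w_k) incur a
    large hinge loss, get a small posterior, and drift to other clusters."""
    n, d = X.shape
    K = len(pi)
    R = np.zeros((n, K))
    for k in range(K):
        diff = X - mu[k]
        gauss = np.exp(-0.5 * (diff ** 2).sum(axis=1) / sigma2[k]) \
                / (2.0 * np.pi * sigma2[k]) ** (d / 2.0)   # isotropic Gaussian density
        hinge = np.maximum(0.0, 1.0 - y * (X @ W[k]))       # SVM loss under w_k
        R[:, k] = pi[k] * gauss * np.exp(-lam * hinge)
    R /= R.sum(axis=1, keepdims=True)                       # normalize rows to posteriors
    return R
```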
In the M-step of the EM algorithm, the parameters of the topic-category-consistent clustering and the parameters of the support vector machines are updated by maximizing a lower bound of the log-likelihood function. By maximizing this lower bound, the parameters are updated to Θ^(t+1); in particular, the parameters {π, μ, Σ} of the topic-category-consistent clustering are updated by three corresponding closed-form formulas.
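The three update formulas for {π, μ, Σ} are not reproduced in the text above. Shown below are the standard responsibility-weighted EM updates that such a mixture model ordinarily uses (an assumption about the form of the patent's formulas, not a verbatim reconstruction):

```python
import numpy as np

def m_step_mixture(X, R):
    """Standard responsibility-weighted M-step updates for a Gaussian mixture.
    X: (n, d) data; R: (n, K) posteriors from the E-step.
    Returns mixing coefficients pi, means mu, covariances Sigma."""
    Nk = R.sum(axis=0)                         # effective cluster sizes
    pi = Nk / R.shape[0]                       # mixing coefficients
    mu = (R.T @ X) / Nk[:, None]               # responsibility-weighted means
    K, d = mu.shape
    Sigma = np.zeros((K, d, d))
    for k in range(K):
        diff = X - mu[k]
        Sigma[k] = (R[:, k, None] * diff).T @ diff / Nk[k]   # weighted covariance
    return pi, mu, Sigma
```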
To update the weight vectors W of the linear SVMs, an optimization problem is solved whose first term is equivalent to a weighted loss function; the problem is therefore equivalent to a clustered multi-task learning problem and can be solved with off-the-shelf methods, yielding the SVM weight vectors W.
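The text states only that this weighted-loss problem "can be solved with off-the-shelf methods." One such stand-in (a sketch; the patent's exact loss is not reproduced above) is plain subgradient descent on a responsibility-weighted hinge loss with an L2 term:

```python
import numpy as np

def weighted_linear_svm(X, y, sample_weight, beta=0.1, lr=0.05, n_iter=500):
    """Sketch of the per-cluster M-step weight update: minimize
    beta * ||w||^2 + sum_i sample_weight[i] * hinge(y_i, w . x_i)
    by subgradient descent. sample_weight would be the E-step responsibilities."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        margin = y * (X @ w)
        active = margin < 1                      # samples inside the margin
        grad = 2 * beta * w - (sample_weight[active] * y[active]) @ X[active]
        w -= lr * grad
    return w
```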
Step 6: classify the data to be classified using the above classification model.
In the present embodiment, based on the classification model solved in step 5 and its parameters, the data to be classified are classified according to the corresponding decision formula.
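The decision formula itself is not reproduced in the text above, but claim 8 states that classification uses a weighted average of the results of the plurality of second support vector machines. A sketch consistent with that claim, assuming (hypothetically) Gaussian gating posteriors as the weights:

```python
import numpy as np

def predict(X, pi, mu, sigma2, W):
    """Label each sample by the gating-weighted average of the cluster SVMs'
    decision values: y(x) = sign( sum_k P(c_k | x) * (w_k . x) ),
    with P(c_k | x) proportional to pi_k * N(x | mu_k, sigma2_k I)."""
    n, K = X.shape[0], len(pi)
    G = np.zeros((n, K))
    for k in range(K):
        diff = X - mu[k]
        G[:, k] = pi[k] * np.exp(-0.5 * (diff ** 2).sum(axis=1) / sigma2[k])
    G /= G.sum(axis=1, keepdims=True)                 # gating posteriors P(c_k | x)
    scores = (G * (X @ np.asarray(W).T)).sum(axis=1)  # weighted SVM decision values
    return np.sign(scores)
```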
The method provided by the above embodiments of the present application replaces a kernel SVM with an ensemble of multiple linear SVMs: by clustering the data with the locally consistent clustering method based on local features, nonlinear data can be classified while the computation is simplified, avoiding the large number of matrix operations involved in training a kernel SVM. Multi-task learning further allows multiple SVMs to be trained simultaneously, improving classification speed.
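To see why an ensemble of linear SVMs can handle nonlinear data, consider a toy XOR problem: no single hyperplane separates it, but after splitting into two local clusters each piece is linearly separable. This is a deliberately simplified sketch: the cluster split is hard-coded (standing in for k-means) and a least-squares fit stands in for a linear SVM.

```python
import numpy as np

# XOR-style data: not linearly separable by any single hyperplane.
X = np.array([[1., 1.], [1., -1.], [-1., 1.], [-1., -1.]])
y = np.array([1, -1, -1, 1])

# Step-1 analogue: split into two local clusters by the sign of the first
# coordinate (a stand-in for k-means with K = 2 on this toy set).
clusters = [np.where(X[:, 0] > 0)[0], np.where(X[:, 0] < 0)[0]]

# Train one linear classifier per cluster (least squares as a dependency-free
# stand-in for a linear SVM; each cluster is linearly separable).
W = []
for idx in clusters:
    w, *_ = np.linalg.lstsq(X[idx], y[idx].astype(float), rcond=None)
    W.append(w)

def predict(x):
    k = 0 if x[0] > 0 else 1          # gate: pick the cluster's local classifier
    return int(np.sign(W[k] @ x))

preds = [predict(x) for x in X]       # piecewise-linear model solves XOR
```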
As another aspect, the present invention also provides a storage device storing a plurality of programs, the programs being adapted to be loaded and executed by a processor to implement: performing a clustering operation on the data to be classified using K-means to generate a plurality of first clusters, training one linear first support vector machine on each first cluster, and generating the parameters of an initial classification model from the first clusters and the first support vector machines; dividing the data to be classified into a plurality of second clusters according to the topic-category-consistent clustering method, and training one linear second support vector machine on each second cluster; fusing the second clusters in a generative graphical model and training the second support vector machines; learning the second support vector machines simultaneously with a multi-task learning method and migrating knowledge between the second support vector machines, the knowledge being feature information composed of attributes or features; solving, with the expectation-maximization algorithm, the parameters of the topic-category-consistent clustering and the parameters of the second support vector machines, updating the parameters of the initial classification model accordingly, and generating the classification model; and classifying the data to be classified using the classification model.
As yet another aspect, the present invention also provides a processing device including a processor and a storage device. The processor is adapted to execute programs; the storage device is adapted to store a plurality of programs; and the programs are adapted to be loaded and executed by the processor to implement the same steps: performing a clustering operation on the data to be classified using K-means to generate a plurality of first clusters, training one linear first support vector machine on each first cluster, and generating the parameters of an initial classification model from the first clusters and the first support vector machines; dividing the data to be classified into a plurality of second clusters according to the topic-category-consistent clustering method, and training one linear second support vector machine on each second cluster; fusing the second clusters in a generative graphical model and training the second support vector machines; learning the second support vector machines simultaneously with a multi-task learning method and migrating knowledge between the second support vector machines, the knowledge being feature information composed of attributes or features; solving, with the expectation-maximization algorithm, the parameters of the topic-category-consistent clustering and the parameters of the second support vector machines, updating the parameters of the initial classification model accordingly, and generating the classification model; and classifying the data to be classified using the classification model.
The technical solutions of the present invention have thus been described with reference to the preferred embodiments shown in the drawings. However, those skilled in the art will readily understand that the scope of protection of the present invention is clearly not limited to these specific embodiments. Without departing from the principle of the present invention, those skilled in the art can make equivalent changes or substitutions to the relevant technical features, and the technical solutions after such changes or substitutions still fall within the scope of protection of the present invention.
Claims (10)
1. A classification method based on topic-category-consistent clustering and multi-task learning, characterized in that the method comprises the following steps:
Step 1: clustering the data to be classified using the K-means clustering method to generate a plurality of first clusters, training one linear first support vector machine on each first cluster, and generating the parameters of an initial classification model from the first clusters and the first support vector machines;
Step 2: dividing the data to be classified into a plurality of second clusters according to the topic-category-consistent clustering method, and training one linear second support vector machine on each second cluster;
Step 3: fusing the second clusters in a generative model and training the second support vector machines;
Step 4: learning the second support vector machines simultaneously using a multi-task learning method and migrating knowledge between the second support vector machines, the knowledge being feature information composed of attributes or features;
Step 5: solving, using the expectation-maximization algorithm, the parameters of the topic-category-consistent clustering and the parameters of the second support vector machines, updating the parameters of the initial classification model according to the solved parameters, and generating a classification model;
Step 6: classifying the data to be classified using the classification model.
2. The method according to claim 1, characterized in that step 1 comprises:
Step 11: performing a clustering operation on the data to be classified using the K-means clustering method to generate a plurality of first clusters;
Step 12: calculating the first mixing coefficient, first mean vector and first covariance matrix of each first cluster;
Step 13: training one linear first support vector machine on each cluster to obtain K linear first support vector machines;
Step 14: calculating the parameters of the initial classification model from the first mixing coefficients, first mean vectors, first covariance matrices and the weight vectors of the first support vector machines.
3. The method according to claim 2, characterized in that step 2 comprises:
Step 21: performing a clustering operation on the data to be classified using a locally consistent Gaussian mixture model to generate the second clusters, wherein the locally consistent Gaussian mixture model is a Gaussian model with a locally consistent regularization term;
Step 22: calculating the second mixing coefficient, second mean vector and second covariance matrix of each second cluster to generate the parameters of the second clusters;
Step 23: training one linear second support vector machine on each second cluster.
4. The method according to claim 3, characterized in that step 3 comprises:
Step 31: fusing, in the generative model, the second clusters and the trained linear second support vector machines;
Step 32: calculating the joint probability over the second clusters according to the sample distribution of the generative model, and deriving the regularized log-likelihood function from the joint probability;
Step 33: incorporating the linear second support vector machines into the log-likelihood function and optimizing the objective function to train the linear second support vector machines, wherein the objective function comprises a regularization term on the linear SVM weight vectors, the total loss function over the training set, and the loss function of assigning data x to cluster y.
5. The method according to claim 4, characterized in that step 4 comprises:
Step 41: training one linear second support vector machine on each second cluster based on the multi-task learning method;
Step 42: clustering all the second support vector machines into a plurality of groups according to their weight vectors;
Step 43: migrating knowledge between the second support vector machines according to the average weight vector of each group and a regularization term on the weight vectors of the second support vector machines.
6. The method according to claim 5, characterized in that, in step 4, "clustering all the second support vector machines into a plurality of groups according to their weight vectors" comprises:
clustering the weight vectors of the K second support vector machines into r groups and establishing the correspondence:
Ω(W) = -α Σ_{l=1}^{r} Σ_{k∈I_l} ||w_k - w̄_l||₂² - β Σ_{k=1}^{K} ||w_k||₂²
wherein I_l denotes the set of tasks in the l-th group, w̄_l denotes the average weight vector of the tasks in the l-th group, m_l denotes the number of tasks in the l-th group, and Ω(W) denotes the regularization term on the linear SVM weight vectors.
7. The method according to claim 5, characterized in that solving the parameters of the initial classification model using the expectation-maximization algorithm comprises:
calculating the probability that each sample in the data to be classified is assigned to each second support vector machine;
assigning each sample in the data to be classified to a second support vector machine according to the probability; and
updating the parameters of the second clusters and the parameters of the second support vector machines by maximizing a lower bound of the log-likelihood function.
8. The method according to claim 6, characterized in that the method further comprises classifying the data to be classified using a weighted average of the results of the plurality of second support vector machines.
9. A storage device storing a plurality of programs, characterized in that the programs are adapted to be loaded and executed by a processor to implement the classification method based on topic-category-consistent clustering and multi-task learning according to any one of claims 1-8.
10. A processing device, comprising:
a processor adapted to execute programs; and
a storage device adapted to store a plurality of programs;
characterized in that the programs are adapted to be loaded and executed by the processor to implement the classification method based on topic-category-consistent clustering and multi-task learning according to any one of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710662859.2A CN107563410A (en) | 2017-08-04 | 2017-08-04 | The sorting technique and equipment with multi-task learning are unanimously clustered based on topic categories |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107563410A true CN107563410A (en) | 2018-01-09 |
Family
ID=60975167
Cited By (10)

Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN108664986A * | 2018-01-16 | 2018-10-16 | 北京工商大学 | Multi-task learning image classification method and system based on l_p-norm regularization
CN108710948A * | 2018-04-25 | 2018-10-26 | 佛山科学技术学院 | Transfer learning method based on cluster balance and weight matrix optimization
FR3074123A1 * | 2018-05-29 | 2019-05-31 | Continental Automotive France | Evaluating the driving style of the driver of a moving road vehicle by machine learning
CN110363359A * | 2019-07-23 | 2019-10-22 | 中国联合网络通信集团有限公司 | Occupation prediction method and system
CN110490027A * | 2018-05-15 | 2019-11-22 | 触景无限科技(北京)有限公司 | Face feature extraction training method and system for face recognition
CN110532384A * | 2019-08-02 | 2019-12-03 | 广东工业大学 | Multi-task dictionary single-classification method, system, device and storage medium
CN111797862A * | 2019-04-09 | 2020-10-20 | Oppo广东移动通信有限公司 | Task processing method and device, storage medium and electronic device
CN113642623A * | 2021-08-05 | 2021-11-12 | 深圳大学 | Complex support vector machine classification method based on unitary-space multi-feature fusion
CN114127698A * | 2019-07-18 | 2022-03-01 | 日本电信电话株式会社 | Learning device, detection system, learning method, and learning program
CN117472587A * | 2023-12-26 | 2024-01-30 | 广东奥飞数据科技股份有限公司 | Resource scheduling system for an AI intelligent computing center
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20180109 |