CN107563410A - Classification method and device based on locally class-consistent clustering and multi-task learning - Google Patents
- Publication number
- CN107563410A CN107563410A CN201710662859.2A CN201710662859A CN107563410A CN 107563410 A CN107563410 A CN 107563410A CN 201710662859 A CN201710662859 A CN 201710662859A CN 107563410 A CN107563410 A CN 107563410A
- Authority
- CN
- China
- Prior art keywords
- cluster
- svms
- data
- parameter
- linear
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The present invention relates to the field of computer data processing and proposes a classification method and device based on locally class-consistent clustering and multi-task learning, aiming to solve the problem that support vector machines classify slowly in data classification because of their large computational cost. One embodiment of the method comprises: clustering the data to be classified with K-means and training one linear support vector machine in each cluster to initialize the model parameters; then dividing the data to be processed into multiple clusters with a locally consistent clustering method and training one linear support vector machine in each cluster; fusing the clustering and the training of the support vector machines in one generative graphical model; learning the above support vector machines simultaneously with a multi-task learning method; solving the parameters of the initial classification model with the expectation-maximization algorithm, obtaining the classification model from those parameters, and classifying the data to be classified with the classification model. The embodiment improves the classification speed on large-scale data to be classified.
Description
Technical field
The present invention relates to the field of machine learning, in particular to non-linear classification technology, and more particularly to a classification method and processing device based on locally class-consistent clustering and multi-task learning.
Background technology
With the arrival of the big data era, selecting data with the same or similar features from large data collections has become more and more important; through cluster analysis, the data are divided into different classes. The support vector machine (Support Vector Machine, SVM) is a machine learning method built on the basis of statistical learning theory. It can analyze data and recognize patterns, is used for classification and regression analysis, and seeks the optimal trade-off between learning ability and model complexity. By replacing the inner product between data points with a kernel function, a linear classifier can be converted into a non-linear classifier, i.e. a linear SVM is converted into a kernel SVM.
However, both training and testing a kernel SVM are slow. The training process of a kernel SVM corresponds to solving a convex quadratic optimization problem that involves a large amount of matrix computation and necessarily takes considerable time. Because the kernel matrix must be stored and the size of the matrix grows quadratically with the number of training samples, the space complexity is also very high. The test speed of a kernel SVM is likewise slow: for each test sample, the kernel function must be computed between it and all support vectors, and the results summed with weights before the final classification decision can be obtained, so the test complexity is proportional to the number of support vectors. The test complexity of such an SVM is therefore also high.
Because linear SVMs are fast to train and test, there currently exists a class of methods that replace the kernel SVM with an ensemble of multiple linear SVMs to classify non-linear data. Even a non-linear decision boundary should be smooth, i.e. locally linear, so this class of methods learns linear SVMs locally to classify non-linear data. Specifically, a divide-and-conquer strategy is used: the data are first divided into several clusters, and one linear SVM is then trained in each cluster. However, the above methods have the following two problems: (1) clustering and training an SVM in each cluster are usually two independent steps that cannot reinforce each other; if the clustering result is poor, the linear SVMs trained in the clusters will also perform poorly; (2) the multiple linear SVMs are usually learned independently, and training each single SVM in isolation easily leads to over-fitting.
The content of the invention
To solve the above problems of the prior art, namely that kernel SVMs are slow to train and test, and that when multiple linear SVMs replace a kernel SVM to classify non-linear data the clustering and the in-cluster SVM training cannot reinforce each other and training a single SVM leads to over-fitting, the present invention adopts the following technical scheme:
In a first aspect, this application provides a classification method based on locally class-consistent clustering and multi-task learning, comprising the following steps:
Step 1: cluster the data to be classified with K-means to generate multiple first clusters, train one linear first support vector machine in each first cluster, and generate the parameters of an initial classification model from the first clusters and the first support vector machines.
Step 2: divide the data to be classified into multiple second clusters with the locally class-consistent clustering method, and train one linear second support vector machine in each second cluster.
Step 3: fuse the second clusters in one generative graphical model and train the second support vector machines.
Step 4: learn the second support vector machines simultaneously with a multi-task learning method, and transfer knowledge between the second support vector machines, the knowledge being feature information composed of attributes or features.
Step 5: solve the parameters of the locally class-consistent clustering and the parameters of the second support vector machines with the expectation-maximization algorithm, update the parameters of the initial classification model with the parameters of the locally class-consistent clustering and the parameters of the second support vector machines, and generate the classification model.
Step 6: classify the data to be classified with the classification model.
In some examples, Step 1 comprises:
Step 11: cluster the data to be classified with K-means to generate multiple first clusters.
Step 12: compute the first mixing coefficient, the first mean vector and the first covariance matrix of each first cluster.
Step 13: train one linear first support vector machine in each cluster, obtaining K linear support vector machines.
Step 14: compute the parameters of the initial classification model from the first mixing coefficients, the first mean vectors, the first covariance matrices and the weight vectors of the first support vector machines.
In some examples, Step 2 comprises:
Step 21: cluster the data to be classified with a locally consistent Gaussian mixture model to generate the second clusters, where the locally consistent Gaussian mixture model is a Gaussian model with a local-consistency regularization term;
Step 22: compute the second mixing coefficient, the second mean vector and the second covariance matrix of each second cluster, generating the parameters of the second clusters.
Step 23: train one linear second support vector machine in each second cluster.
In some examples, Step 3 comprises:
Step 31: fuse, in a generative model, the second clusters and the linear second support vector machines trained in the second clusters.
Step 32: compute the joint probability from the sample distribution of the generative model, and derive the regularized log-likelihood function from the joint probability.
Step 33: fold the linear second support vector machines into the log-likelihood function and optimize the objective function

J(W) = Ω(W) + Σ_{k=1}^{K} Σ_{i∈C_k} L(y_i, w_k^T x_i)

to train the linear second support vector machines, where the first term of the objective function is the regularization term on the linear SVM weight vectors, the second term is the total loss function on the training set, and L(y_i, w_k^T x_i) in the second term is the loss function of sample (x_i, y_i) in its cluster C_k.
In some examples, Step 4 comprises:
Step 41: train one linear second support vector machine in each second cluster based on the multi-task learning method.
Step 42: aggregate all the second support vector machines into multiple groups according to the weight vectors of the second support vector machines.
Step 43: transfer knowledge between the second support vector machines through the average weight vector of each group and the regularization term on the weight vectors of the second support vector machines.
In some examples, "aggregating all the second support vector machines into multiple groups according to the weight vectors of the second support vector machines" in Step 4 comprises:
aggregating the weight vectors of the K second support vector machines into r groups and establishing the correspondence

Ω(W) = λ₁ Σ_{l=1}^{r} Σ_{k∈I_l} ‖w_k − w̄_l‖² + λ₂ Σ_{k=1}^{K} ‖w_k‖²,  with  w̄_l = (1/m_l) Σ_{k∈I_l} w_k,

where I_l denotes the set of tasks in the l-th group, w̄_l denotes the average weight vector of the tasks in the l-th group, m_l denotes the number of tasks in the l-th group, and Ω(W) denotes the regularization term on the linear SVM weight vectors.
In some examples, solving the parameters of the initial classification model with the expectation-maximization algorithm comprises: computing the probability that each sample of the data to be classified is assigned to each second support vector machine; assigning each sample of the data to be classified to a second support vector machine according to that probability; and updating the parameters of the second clusters and the parameters of the second support vector machines by maximizing the lower bound of the log-likelihood function.
In some examples, the method further comprises classifying the data to be classified with the weighted average of the results of the multiple second support vector machines.
In a second aspect, this application provides a storage device storing a plurality of programs, characterized in that the programs are adapted to be loaded and executed by a processor to implement the classification method based on locally class-consistent clustering and multi-task learning described in the first aspect.
In a third aspect, this application provides a processing device comprising a processor and a storage device, the processor being adapted to execute programs and the storage device being adapted to store a plurality of programs, the programs being adapted to be loaded and executed by the processor to implement any classification method based on locally class-consistent clustering and multi-task learning described in the first aspect.
In the classification method and processing device based on locally class-consistent clustering and multi-task learning provided by this application, the locally class-consistent clustering and the training of a linear SVM in each cluster are fused in one generative graphical model, so that clustering and training of the linear SVMs can reinforce each other. Meanwhile, the linear SVMs are learned simultaneously with multi-task learning, avoiding the over-fitting that occurs when an SVM is trained in a single cluster. Fast and accurate classification of large, complex data is thereby achieved.
Brief description of the drawings
Fig. 1 is an exemplary system architecture to which the application can be applied;
Fig. 2 is a schematic flowchart of an embodiment of the classification method based on locally class-consistent clustering and multi-task learning according to the application;
Fig. 3 is a schematic diagram of fusing the clustering and the linear SVMs trained in the clusters into one generative graphical model;
Fig. 4 is a schematic diagram of the semi-supervised learning obtained by fusing the clustering and the linear SVMs trained in the clusters into one generative graphical model.
Embodiments
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. Those skilled in the art will understand that these embodiments are only used to explain the technical principle of the present invention and are not intended to limit the scope of the invention.
It should be noted that, where there is no conflict, the embodiments of the application and the features in the embodiments may be combined with each other. The application is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture to which embodiments of the non-linear classification method, and of the non-linear classification processing device, based on locally class-consistent clustering and multi-task learning of the application can be applied.
As shown in Fig. 1, the system architecture may include data-end devices 101, a network 104 and a server 105. The data-end devices 101 may be multiple identical or different devices, and may be one or more of a first data terminal 1011, a second data terminal 1012 and a third data terminal 1013. The network 104 provides the medium of the communication link between the data-end devices 101 and the server 105, and may include various connection types, such as wired or wireless communication links or fiber-optic cables.
The data-end devices 101 may exchange information with the server 105 through the network 104 to receive or send information. The first data terminal 1011, the second data terminal 1012 and the third data terminal 1013 may also exchange information with each other through the network 104.
The data-end devices 101 may be various electronic devices that have a display screen and support network communication, including but not limited to smart phones, tablet computers, laptop and desktop computers, and computer processing systems composed of multiple computers. It should be noted that a data-end device may include a data storage unit and be provided with software or applications for data processing. The above application may, for example, be business application software that discovers different customer groups from a customer database and characterizes the different customer groups by purchasing patterns, or an application that analyzes road congestion from the temporal and road-section features of big data.
The server 105 may be a processor or server providing various services, such as a data processing server that analyzes and classifies the data transmitted or provided by the data-end devices 101. The data processing server may process the data to be processed and return the generated result (for example, according to a user request, the classification result obtained after cluster analysis of the data to be classified) to the user, or store it for later use. It should be noted that in this application the server 105 may be an independently arranged server, or one of the first data terminal 1011, the second data terminal 1012 and the third data terminal 1013 may be selected or designated as the server.
It should be noted that the classification method based on locally class-consistent clustering and multi-task learning provided by the embodiments of the application is generally executed by the server 105; correspondingly, the classification device based on locally class-consistent clustering and multi-task learning is generally arranged in the server 105.
It should be understood that the numbers of data-end devices, networks and servers in Fig. 1 are only schematic. Any number of data-end devices, networks and servers may be present as required by the implementation.
Continuing with Fig. 2, Fig. 2 shows the flow of one embodiment of the classification method based on locally class-consistent clustering and multi-task learning according to the application. The classification method based on locally class-consistent clustering and multi-task learning comprises the following steps:
Step 1: cluster the data to be classified with K-means to generate multiple first clusters, train one linear first support vector machine in each first cluster, and generate the parameters of the initial classification model from the first clusters and the linear first support vector machines.
In this embodiment, the electronic device on which the classification method based on locally class-consistent clustering and multi-task learning runs (such as the server shown in Fig. 1) may first obtain the data to be processed from a data-end device through a wired or wireless connection, or obtain the data to be processed from a preset database. Clustering the data to be classified with K-means means first randomly selecting K objects as initial cluster centers, then computing the distance between each object and each cluster center and assigning each object to the cluster center nearest to it; a cluster center together with the objects assigned to it forms one cluster. One linear SVM is trained in each cluster, and the parameters of the initial classification model are obtained from the result of the clustering and the linear SVMs.
In some specific embodiments, Step 1 comprises:
Step 11: cluster the data to be classified with K-means to generate multiple first clusters. The number K of first clusters may be set between 13 and 20.
Step 12: compute the first mixing coefficient π_k of each first cluster (the proportion of the number of samples in the cluster to the total number of samples), the first mean vector μ_k and the first covariance matrix Σ_k, for k = 1, ..., K. From these coefficients, the initialization of the first clustering parameters is {π_k, μ_k, Σ_k | k = 1, ..., K}.
Step 13: train one linear SVM in each cluster, obtaining K linear SVMs, where the weight vector of the k-th linear SVM is denoted w_k; the weight vectors of the K linear SVMs are {w_k | k = 1, ..., K}.
Step 14: from the mixing coefficients, mean vectors, covariance matrices and linear SVM weight vectors, the parameters of the initial model are:
Θ = {π, μ, Σ, W} = {π_k, μ_k, Σ_k, w_k | k = 1, ..., K}.
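The initialization of Step 1 can be sketched as follows. This is a minimal illustration only: the patent does not fix the K-means variant or the SVM solver, so plain Lloyd iterations and a small stochastic subgradient solver for the L2-regularized hinge loss are assumed, and the names `kmeans`, `train_linear_svm` and `init_model` are illustrative, not from the patent.

```python
import numpy as np

def kmeans(X, K, iters=50, seed=0):
    """Plain Lloyd K-means; returns the cluster label of each sample."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=K, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for k in range(K):
            if (labels == k).any():
                centers[k] = X[labels == k].mean(axis=0)
    return labels

def train_linear_svm(X, y, lam=0.01, epochs=300, seed=0):
    """Tiny stochastic subgradient solver for the L2-regularized hinge loss."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for t in range(1, epochs + 1):
        i = rng.integers(len(X))
        grad = lam * w
        if y[i] * (w @ X[i]) < 1:            # hinge subgradient is active
            grad -= y[i] * X[i]
        w -= (1.0 / (lam * t)) * grad
    return w

def init_model(X, y, K):
    """Step 1: K-means first clusters, one linear SVM per cluster,
    then Theta = {pi_k, mu_k, Sigma_k, w_k | k = 1..K}."""
    labels = kmeans(X, K)
    d = X.shape[1]
    pi, mu, Sigma, W = [], [], [], []
    for k in range(K):
        Xk, yk = X[labels == k], y[labels == k]
        pi.append(len(Xk) / len(X))                      # mixing coefficient
        mu.append(Xk.mean(axis=0))                       # mean vector
        Sigma.append(np.cov(Xk.T) + 1e-6 * np.eye(d))    # covariance matrix
        W.append(train_linear_svm(Xk, yk))               # per-cluster SVM
    return {"pi": np.array(pi), "mu": np.array(mu),
            "Sigma": np.array(Sigma), "W": np.array(W)}
```

The returned dictionary corresponds term by term to Θ = {π, μ, Σ, W} above.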
Step 2: divide the data to be classified into multiple second clusters with the locally class-consistent clustering method, and train one linear second support vector machine in each second cluster.
In this embodiment, the server divides the data to be classified into multiple second clusters with the locally class-consistent clustering method and trains a linear second support vector machine in each second cluster. Here, the number of clusters of the locally class-consistent clustering is obtained in the same way as in Step 11, and the number of clusters is the same. Using the clustering parameters obtained in Step 1, the data are clustered with the locally class-consistent clustering method, i.e. local rather than global features are used to cluster the data.
In some specific embodiments, Step 2 comprises:
Step 21: cluster the data to be classified with a locally consistent Gaussian mixture model to generate the second clusters, where the locally consistent Gaussian mixture model is a Gaussian model with a local-consistency regularization term. The standard Gaussian mixture model (GMM) simply fits the data in Euclidean space; the distribution of the data, however, may not conform to Euclidean space, and the data may lie on an underlying submanifold. To improve clustering performance, the local submanifold structure is taken into account and the data are clustered with the locally consistent Gaussian mixture model. The locally consistent Gaussian mixture model adds one local-consistency regularization term on the basis of the Gaussian mixture model.
The local-consistency regularization term may be built as follows: a nearest-neighbor graph is established on the training data; each element of the edge weight matrix A of the neighbor graph indicates whether the corresponding two samples are neighbors, the element being 1 if they are neighbors and 0 otherwise. Based on the matrix A of the neighbor graph, the local-consistency regularization term is defined as

R = Σ_{i,j} A_ij D(P_i ‖ P_j),  D(P_i ‖ P_j) = Σ_c P_i(c) log( P_i(c) / P_j(c) ),

where A_ij is the element of the neighbor graph A indicating whether samples x_i and x_j are neighbors, P_i(c) denotes the probability that the i-th sample is assigned to the c-th Gaussian component, and D(P_i ‖ P_j) denotes the KL-divergence between the distributions P_i(c) and P_j(c). The KL-divergence (Kullback-Leibler divergence), also called relative entropy, is a way of describing the difference between two probability distributions P and Q in probability theory or information theory. The smaller the regularization term R, the smoother the probabilities P_i(c) are on the neighbor graph. That is, the closer two samples are on the manifold, the more similar their distributions over the different Gaussian components, i.e. the more likely they are to be assigned to the same Gaussian component.
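A short numerical check of this regularizer, under the assumption (the original formula image is not reproduced here) that it is the neighbor-weighted sum of one-directional KL-divergences between assignment distributions; the names `kl_div` and `local_consistency` are illustrative:

```python
import numpy as np

def kl_div(p, q, eps=1e-12):
    """KL-divergence D(p || q) between two discrete distributions."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    return float(np.sum(p * np.log(p / q)))

def local_consistency(A, P):
    """R = sum_{i,j} A_ij * D(P_i || P_j): small when neighboring samples
    (A_ij = 1) have similar distributions over the Gaussian components.
    A: (n, n) 0/1 neighbor matrix;  P: (n, K) assignment probabilities."""
    n = P.shape[0]
    R = 0.0
    for i in range(n):
        for j in range(n):
            if A[i, j]:
                R += kl_div(P[i], P[j])
    return R
```

As the text describes, R vanishes when neighboring samples share the same component-assignment distribution and grows as those distributions diverge.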
Step 22: compute the mixing coefficient π_k, mean vector μ_k and covariance matrix Σ_k of each second cluster, generating the parameters of the second clusters {π_k, μ_k, Σ_k | k = 1, ..., K};
Step 23: train one linear second support vector machine in each second cluster.
Step 3: fuse the second clusters in one generative model and train the second support vector machines.
In this embodiment, the second clusters generated by the locally class-consistent clustering and the linear SVMs trained in each second cluster are fused in one generative model, which may be a generative graphical model.
In a specific example, Step 3 comprises:
Step 31: fuse, in the generative model, the second clusters and the linear second support vector machines trained in the second clusters. The second clusters and the linear SVM trained in each second cluster are fused in one generative model, as shown in Fig. 3: taking the dotted line as the boundary, the upper half corresponds to clustering the training data with the locally consistent Gaussian mixture model, where π, μ and Σ are respectively the mixing coefficients, mean vectors and covariance matrices of the Gaussian mixture model; the lower half corresponds to training one linear SVM in each cluster, where W is the matrix of the weight vectors of the linear SVMs.
Step 32: compute the joint probability from the sample distribution of the generative model, and derive the regularized log-likelihood function from the joint probability.
From the model in Fig. 3, the joint probability between X and Y can be derived:

P(X, Y) = Π_{i=1}^{N} Σ_{k=1}^{K} π_k N(x_i | z_i = k, μ_k, Σ_k) P(y_i | x_i, w_k),

where N(x_i | z_i = k, μ_k, Σ_k) is the probability that the k-th Gaussian component of the Gaussian mixture model generates sample x_i, and P(y_i | x_i, w_k) is the probability that the k-th linear SVM classifies sample x_i. From this joint probability, the regularized log-likelihood function can be obtained:

L = Σ_{i=1}^{N} log Σ_{k=1}^{K} π_k N(x_i | μ_k, Σ_k) P(y_i | x_i, w_k) − Ω(W) − λR,

where Ω(W) denotes the regularization term on the linear SVM weight vectors W = {w_k}_{k=1,...,K}, and R is the regularization term of the locally class-consistent clustering described above.
Step 33: fold the linear second support vector machines into the log-likelihood function and optimize the objective function

J(W) = Ω(W) + Σ_{k=1}^{K} Σ_{i∈C_k} L(y_i, w_k^T x_i)

to train the linear second support vector machines, where the first term of the objective function is the regularization term on the linear SVM weight vectors and the second term is the total loss function on the training set. A correspondence is established between the log-likelihood function and the SVM minimization so that maximizing the likelihood simultaneously minimizes the SVM objective, thereby achieving the purpose of training the linear SVMs. The first correspondence is between the posterior probability P(y_i | x_i, w_k) and the loss function L(y_i, w_k^T x_i), expressed as

P(y_i | x_i, w_k) = exp( −L(y_i, w_k^T x_i) ):

if the loss function L(y_i, w_k^T x_i) equals 0, the posterior probability P(y_i | x_i, w_k) will equal 1; otherwise the probability P(y_i | x_i, w_k) will be less than 1.
The second correspondence is between Ω(W) and ‖w‖², the regularization term on the linear SVM weight vectors.
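The first correspondence can be checked numerically. Assuming the loss is the hinge loss (the text leaves the loss generic), P(y|x,w) = exp(−loss) behaves exactly as described:

```python
import math

def hinge_loss(y, score):
    """Hinge loss of a sample with label y in {-1, +1} and score w.x."""
    return max(0.0, 1.0 - y * score)

def label_posterior(y, score):
    """P(y | x, w) = exp(-loss): exactly 1 when the sample is classified
    with margin >= 1 (zero loss), and strictly below 1 otherwise."""
    return math.exp(-hinge_loss(y, score))
```

A correctly classified sample with full margin contributes probability 1 to the likelihood, while misclassified or margin-violating samples contribute less, which is what couples the clustering to the SVM training.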
In some concrete implementations, the learning of the regularized log-likelihood function can be extended to semi-supervised learning; the semi-supervised graphical model is shown in Fig. 4. The right half of the model corresponds to the supervised model of Fig. 3, and the left half corresponds to the unsupervised model; it can be seen that the two halves share the parameters {π, μ, Σ} of the locally class-consistent clustering. By estimating the clustering parameters in this semi-supervised way, using labeled data and a large amount of unlabeled data simultaneously, the clustering achieves a better result, which ultimately improves classification accuracy.
From the semi-supervised graphical model in Fig. 4, the regularized log-likelihood function can be expressed as

L_semi = Σ_{i=1}^{N_l} log Σ_{k=1}^{K} π_k N(x_i | μ_k, Σ_k) P(y_i | x_i, w_k) + Σ_{j=1}^{N_u} log Σ_{k=1}^{K} π_k N(x_j | μ_k, Σ_k) − Ω(W) − λR,

where the first term corresponds to the log-likelihood of the labeled data and the second term to the log-likelihood of the unlabeled data. This log-likelihood function can likewise be maximized with the expectation-maximization algorithm to obtain the parameters of the model.
Step 4: learn the linear SVMs simultaneously with the multi-task learning method, and transfer knowledge between the support vector machines.
In this embodiment, the knowledge is feature information composed of attributes or features, from which an SVM can be trained. All linear SVMs are trained with the multi-task learning method: training the linear SVM in a single cluster is treated as one task, so training the linear SVMs of all clusters corresponds to a multi-task learning problem.
In some specific embodiments, Step 4 further comprises:
Step 41: train one linear second support vector machine in each second cluster based on the multi-task learning method.
Step 42: aggregate all the second support vector machines into multiple groups according to the weight vectors of the second support vector machines.
Step 43: transfer knowledge between the support vector machines through the average weight vector of each group and the regularization term on the weight vectors of the linear second support vector machines.
Specifically, the weight vectors of all SVMs, W = {w_k}_{k=1,...,K}, are aggregated into r groups, and the following correspondence is established:

Ω(W) = λ₁ Σ_{l=1}^{r} Σ_{k∈I_l} ‖w_k − w̄_l‖² + λ₂ Σ_{k=1}^{K} ‖w_k‖²,  w̄_l = (1/m_l) Σ_{k∈I_l} w_k,

where the first term on the right aggregates the K tasks into r groups: I_l denotes the set of tasks in the l-th group, w̄_l denotes the average weight vector of the tasks in the l-th group, and m_l denotes the number of tasks in the l-th group. The first term measures the variance of the weight vectors of the tasks in the same group; maximizing the likelihood minimizes this variance, so tasks in the same group have similar weight vectors.
The second term on the right is the respective regularization term on the K linear SVM weight vectors; it improves generalization, keeps the decision boundary smooth, and avoids over-fitting.
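A minimal sketch of this group regularizer, under the assumption that it is a within-group variance term plus a per-task L2 penalty with trade-off weights (the names `multitask_reg`, `lam1` and `lam2` are hypothetical, not from the patent):

```python
import numpy as np

def multitask_reg(W, groups, lam1=1.0, lam2=0.1):
    """Omega(W) = lam1 * sum_l sum_{k in I_l} ||w_k - wbar_l||^2
               + lam2 * sum_k ||w_k||^2.
    The first term pulls the weight vectors of tasks in the same group
    toward the group-average weight vector wbar_l; the second is the
    usual L2 penalty on each task's weights."""
    omega = 0.0
    for I_l in groups:                     # I_l: task indices of group l
        wbar = W[I_l].mean(axis=0)         # average weight vector of group l
        omega += lam1 * float(((W[I_l] - wbar) ** 2).sum())
    omega += lam2 * float((W ** 2).sum())
    return omega
```

When all tasks in a group share one weight vector the variance term vanishes, which is precisely the "similar weight vectors within a group" behavior the text describes.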
Step 5: solve the parameters of the locally class-consistent clustering and the parameters of the support vector machines with the expectation-maximization algorithm, update the parameters of the initial classification model with the parameters of the locally class-consistent clustering and the parameters of the support vector machines, and generate the classification model.
In this embodiment, the parameters of the initial model are solved with the expectation-maximization algorithm: the parameters of the locally class-consistent clustering and the parameters of the support vector machines are obtained with the expectation-maximization algorithm, and the parameters of the classification model are generated from these parameters.
In some specific embodiments, solving the parameters of the initial classification model with the expectation-maximization algorithm comprises: computing the probability that each sample of the data to be classified is assigned to each linear second support vector machine; assigning each sample of the data to be classified to a linear second support vector machine according to that probability; and updating the parameters of the second clusters and the parameters of the linear second support vector machines by maximizing the lower bound of the log-likelihood function.
The model parameters are solved with the expectation-maximization (EM) algorithm. Let Θ^(t) denote the parameter set at iteration t. In the E-step of the EM algorithm, the probability that the i-th sample x_i is assigned to the k-th linear SVM is computed.
From this formula, when the samples in a cluster k are not linearly separable, some samples x_i in the cluster will be misclassified by the cluster's linear SVM (with weight vector w_k); the corresponding likelihood then becomes small, which makes the posterior probability P(c_k|x_i)^(t) on the left-hand side of the equation small. That is, the probability of assigning sample x_i to the k-th cluster decreases, and x_i is eventually assigned to another cluster. Iterating in this way, each cluster tends toward linear separability.
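A minimal sketch of this E-step follows. Since the exact probability expression is not reproduced in the text above, the sketch assumes (as one plausible reading) that the responsibility of cluster k for sample x_i combines a Gaussian gating density with an SVM-based likelihood of the form exp(-hinge loss); misclassified samples then receive a small posterior, which is exactly the behavior described:

```python
import numpy as np

def e_step(X, y, pi, mu, sigma2, W, lam=1.0):
    """E-step sketch: responsibility R[i, k] of cluster k for sample i, assuming
    P(c_k | x_i) proportional to pi_k * N(x_i | mu_k, sigma2_k I) * exp(-lam * hinge).
    Samples misclassified by cluster k's linear SVM (weight vector w_k) incur a
    large hinge loss, get a small posterior, and drift to other clusters."""
    n, d = X.shape
    K = len(pi)
    R = np.zeros((n, K))
    for k in range(K):
        diff = X - mu[k]
        gauss = np.exp(-0.5 * (diff ** 2).sum(axis=1) / sigma2[k]) \
                / (2.0 * np.pi * sigma2[k]) ** (d / 2.0)   # isotropic Gaussian density
        hinge = np.maximum(0.0, 1.0 - y * (X @ W[k]))       # SVM loss under w_k
        R[:, k] = pi[k] * gauss * np.exp(-lam * hinge)
    R /= R.sum(axis=1, keepdims=True)                       # normalize rows to posteriors
    return R
```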
In the M-step of the EM algorithm, the parameters of the topic-category-consistent clustering and the parameters of the support vector machines are updated by maximizing a lower bound of the log-likelihood function. By maximizing this lower bound, the parameters are updated to Θ^(t+1); in particular, the parameters {π, μ, Σ} of the topic-category-consistent clustering are updated by three corresponding closed-form formulas.
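The three update formulas for {π, μ, Σ} are not reproduced in the text above. Shown below are the standard responsibility-weighted EM updates that such a mixture model ordinarily uses (an assumption about the form of the patent's formulas, not a verbatim reconstruction):

```python
import numpy as np

def m_step_mixture(X, R):
    """Standard responsibility-weighted M-step updates for a Gaussian mixture.
    X: (n, d) data; R: (n, K) posteriors from the E-step.
    Returns mixing coefficients pi, means mu, covariances Sigma."""
    Nk = R.sum(axis=0)                         # effective cluster sizes
    pi = Nk / R.shape[0]                       # mixing coefficients
    mu = (R.T @ X) / Nk[:, None]               # responsibility-weighted means
    K, d = mu.shape
    Sigma = np.zeros((K, d, d))
    for k in range(K):
        diff = X - mu[k]
        Sigma[k] = (R[:, k, None] * diff).T @ diff / Nk[k]   # weighted covariance
    return pi, mu, Sigma
```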
To update the weight vectors W of the linear SVMs, an optimization problem is solved whose first term is equivalent to a weighted loss function; the problem is therefore equivalent to a clustered multi-task learning problem and can be solved with off-the-shelf methods, yielding the SVM weight vectors W.
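The text states only that this weighted-loss problem "can be solved with off-the-shelf methods." One such stand-in (a sketch; the patent's exact loss is not reproduced above) is plain subgradient descent on a responsibility-weighted hinge loss with an L2 term:

```python
import numpy as np

def weighted_linear_svm(X, y, sample_weight, beta=0.1, lr=0.05, n_iter=500):
    """Sketch of the per-cluster M-step weight update: minimize
    beta * ||w||^2 + sum_i sample_weight[i] * hinge(y_i, w . x_i)
    by subgradient descent. sample_weight would be the E-step responsibilities."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        margin = y * (X @ w)
        active = margin < 1                      # samples inside the margin
        grad = 2 * beta * w - (sample_weight[active] * y[active]) @ X[active]
        w -= lr * grad
    return w
```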
Step 6: classify the data to be classified using the above classification model.
In the present embodiment, based on the classification model solved in step 5 and its parameters, the data to be classified are classified according to the corresponding decision formula.
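The decision formula itself is not reproduced in the text above, but claim 8 states that classification uses a weighted average of the results of the plurality of second support vector machines. A sketch consistent with that claim, assuming (hypothetically) Gaussian gating posteriors as the weights:

```python
import numpy as np

def predict(X, pi, mu, sigma2, W):
    """Label each sample by the gating-weighted average of the cluster SVMs'
    decision values: y(x) = sign( sum_k P(c_k | x) * (w_k . x) ),
    with P(c_k | x) proportional to pi_k * N(x | mu_k, sigma2_k I)."""
    n, K = X.shape[0], len(pi)
    G = np.zeros((n, K))
    for k in range(K):
        diff = X - mu[k]
        G[:, k] = pi[k] * np.exp(-0.5 * (diff ** 2).sum(axis=1) / sigma2[k])
    G /= G.sum(axis=1, keepdims=True)                 # gating posteriors P(c_k | x)
    scores = (G * (X @ np.asarray(W).T)).sum(axis=1)  # weighted SVM decision values
    return np.sign(scores)
```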
The method provided by the above embodiments of the present application replaces a kernel SVM with an ensemble of multiple linear SVMs: by clustering the data with the locally consistent clustering method based on local features, nonlinear data can be classified while the computation is simplified, avoiding the large number of matrix operations involved in training a kernel SVM. Multi-task learning further allows multiple SVMs to be trained simultaneously, improving classification speed.
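To see why an ensemble of linear SVMs can handle nonlinear data, consider a toy XOR problem: no single hyperplane separates it, but after splitting into two local clusters each piece is linearly separable. This is a deliberately simplified sketch: the cluster split is hard-coded (standing in for k-means) and a least-squares fit stands in for a linear SVM.

```python
import numpy as np

# XOR-style data: not linearly separable by any single hyperplane.
X = np.array([[1., 1.], [1., -1.], [-1., 1.], [-1., -1.]])
y = np.array([1, -1, -1, 1])

# Step-1 analogue: split into two local clusters by the sign of the first
# coordinate (a stand-in for k-means with K = 2 on this toy set).
clusters = [np.where(X[:, 0] > 0)[0], np.where(X[:, 0] < 0)[0]]

# Train one linear classifier per cluster (least squares as a dependency-free
# stand-in for a linear SVM; each cluster is linearly separable).
W = []
for idx in clusters:
    w, *_ = np.linalg.lstsq(X[idx], y[idx].astype(float), rcond=None)
    W.append(w)

def predict(x):
    k = 0 if x[0] > 0 else 1          # gate: pick the cluster's local classifier
    return int(np.sign(W[k] @ x))

preds = [predict(x) for x in X]       # piecewise-linear model solves XOR
```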
As another aspect, the present invention also provides a storage device storing a plurality of programs, the programs being adapted to be loaded and executed by a processor to implement: performing a clustering operation on the data to be classified using K-means to generate a plurality of first clusters, training one linear first support vector machine on each first cluster, and generating the parameters of an initial classification model from the first clusters and the first support vector machines; dividing the data to be classified into a plurality of second clusters according to the topic-category-consistent clustering method, and training one linear second support vector machine on each second cluster; fusing the second clusters in a generative graphical model and training the second support vector machines; learning the second support vector machines simultaneously with a multi-task learning method and migrating knowledge between the second support vector machines, the knowledge being feature information composed of attributes or features; solving, with the expectation-maximization algorithm, the parameters of the topic-category-consistent clustering and the parameters of the second support vector machines, updating the parameters of the initial classification model accordingly, and generating the classification model; and classifying the data to be classified using the classification model.
As yet another aspect, the present invention also provides a processing device including a processor and a storage device. The processor is adapted to execute programs; the storage device is adapted to store a plurality of programs; and the programs are adapted to be loaded and executed by the processor to implement the same steps: performing a clustering operation on the data to be classified using K-means to generate a plurality of first clusters, training one linear first support vector machine on each first cluster, and generating the parameters of an initial classification model from the first clusters and the first support vector machines; dividing the data to be classified into a plurality of second clusters according to the topic-category-consistent clustering method, and training one linear second support vector machine on each second cluster; fusing the second clusters in a generative graphical model and training the second support vector machines; learning the second support vector machines simultaneously with a multi-task learning method and migrating knowledge between the second support vector machines, the knowledge being feature information composed of attributes or features; solving, with the expectation-maximization algorithm, the parameters of the topic-category-consistent clustering and the parameters of the second support vector machines, updating the parameters of the initial classification model accordingly, and generating the classification model; and classifying the data to be classified using the classification model.
The technical solutions of the present invention have thus been described with reference to the preferred embodiments shown in the drawings. However, those skilled in the art will readily understand that the scope of protection of the present invention is clearly not limited to these specific embodiments. Without departing from the principle of the present invention, those skilled in the art can make equivalent changes or substitutions to the relevant technical features, and the technical solutions after such changes or substitutions still fall within the scope of protection of the present invention.
Claims (10)
1. A classification method based on topic-category-consistent clustering and multi-task learning, characterized in that the method comprises the following steps:
Step 1: clustering the data to be classified using the K-means clustering method to generate a plurality of first clusters, training one linear first support vector machine on each first cluster, and generating the parameters of an initial classification model from the first clusters and the first support vector machines;
Step 2: dividing the data to be classified into a plurality of second clusters according to the topic-category-consistent clustering method, and training one linear second support vector machine on each second cluster;
Step 3: fusing the second clusters in a generative model and training the second support vector machines;
Step 4: learning the second support vector machines simultaneously using a multi-task learning method and migrating knowledge between the second support vector machines, the knowledge being feature information composed of attributes or features;
Step 5: solving, using the expectation-maximization algorithm, the parameters of the topic-category-consistent clustering and the parameters of the second support vector machines, updating the parameters of the initial classification model according to the solved parameters, and generating a classification model;
Step 6: classifying the data to be classified using the classification model.
2. The method according to claim 1, characterized in that step 1 comprises:
Step 11: performing a clustering operation on the data to be classified using the K-means clustering method to generate a plurality of first clusters;
Step 12: calculating the first mixing coefficient, first mean vector and first covariance matrix of each first cluster;
Step 13: training one linear first support vector machine on each cluster to obtain K linear first support vector machines;
Step 14: calculating the parameters of the initial classification model from the first mixing coefficients, first mean vectors, first covariance matrices and the weight vectors of the first support vector machines.
3. The method according to claim 2, characterized in that step 2 comprises:
Step 21: performing a clustering operation on the data to be classified using a locally consistent Gaussian mixture model to generate the second clusters, wherein the locally consistent Gaussian mixture model is a Gaussian model with a locally consistent regularization term;
Step 22: calculating the second mixing coefficient, second mean vector and second covariance matrix of each second cluster to generate the parameters of the second clusters;
Step 23: training one linear second support vector machine on each second cluster.
4. The method according to claim 3, characterized in that step 3 comprises:
Step 31: fusing, in the generative model, the second clusters and the trained linear second support vector machines;
Step 32: calculating the joint probability over the second clusters according to the sample distribution of the generative model, and deriving the regularized log-likelihood function from the joint probability;
Step 33: incorporating the linear second support vector machines into the log-likelihood function and optimizing the objective function to train the linear second support vector machines, wherein the objective function comprises a regularization term on the linear SVM weight vectors, the total loss function over the training set, and the loss function of assigning data x to cluster y.
5. The method according to claim 4, characterized in that step 4 comprises:
Step 41: training one linear second support vector machine on each second cluster based on the multi-task learning method;
Step 42: clustering all the second support vector machines into a plurality of groups according to their weight vectors;
Step 43: migrating knowledge between the second support vector machines according to the average weight vector of each group and a regularization term on the weight vectors of the second support vector machines.
6. The method according to claim 5, characterized in that, in step 4, "clustering all the second support vector machines into a plurality of groups according to their weight vectors" comprises:
clustering the weight vectors of the K second support vector machines into r groups and establishing the correspondence:
Ω(W) = -α Σ_{l=1}^{r} Σ_{k∈I_l} ||w_k - w̄_l||₂² - β Σ_{k=1}^{K} ||w_k||₂²
wherein I_l denotes the set of tasks in the l-th group, w̄_l denotes the average weight vector of the tasks in the l-th group, m_l denotes the number of tasks in the l-th group, and Ω(W) denotes the regularization term on the linear SVM weight vectors.
7. The method according to claim 5, characterized in that solving the parameters of the initial classification model using the expectation-maximization algorithm comprises:
calculating the probability that each sample in the data to be classified is assigned to each second support vector machine;
assigning each sample in the data to be classified to a second support vector machine according to the probability; and
updating the parameters of the second clusters and the parameters of the second support vector machines by maximizing a lower bound of the log-likelihood function.
8. The method according to claim 6, characterized in that the method further comprises classifying the data to be classified using a weighted average of the results of the plurality of second support vector machines.
9. A storage device storing a plurality of programs, characterized in that the programs are adapted to be loaded and executed by a processor to implement the classification method based on topic-category-consistent clustering and multi-task learning according to any one of claims 1-8.
10. A processing device, comprising:
a processor adapted to execute programs; and
a storage device adapted to store a plurality of programs;
characterized in that the programs are adapted to be loaded and executed by the processor to implement the classification method based on topic-category-consistent clustering and multi-task learning according to any one of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710662859.2A CN107563410A (en) | 2017-08-04 | 2017-08-04 | The sorting technique and equipment with multi-task learning are unanimously clustered based on topic categories |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107563410A true CN107563410A (en) | 2018-01-09 |
Family
ID=60975167
Cited By (10)

Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN108664986A * | 2018-01-16 | 2018-10-16 | 北京工商大学 | Multi-task learning image classification method and system based on l_p-norm regularization
CN108710948A * | 2018-04-25 | 2018-10-26 | 佛山科学技术学院 | Transfer learning method based on cluster balance and weight matrix optimization
FR3074123A1 * | 2018-05-29 | 2019-05-31 | Continental Automotive France | Evaluating the driving style of the driver of a moving road vehicle by machine learning
CN110363359A * | 2019-07-23 | 2019-10-22 | 中国联合网络通信集团有限公司 | Occupation prediction method and system
CN110490027A * | 2018-05-15 | 2019-11-22 | 触景无限科技(北京)有限公司 | Face feature extraction training method and system for face recognition
CN110532384A * | 2019-08-02 | 2019-12-03 | 广东工业大学 | Multi-task dictionary single-classification method, system, device and storage medium
CN111797862A * | 2019-04-09 | 2020-10-20 | Oppo广东移动通信有限公司 | Task processing method and device, storage medium and electronic device
CN113642623A * | 2021-08-05 | 2021-11-12 | 深圳大学 | Complex support vector machine classification method based on unitary-space multi-feature fusion
CN114127698A * | 2019-07-18 | 2022-03-01 | 日本电信电话株式会社 | Learning device, detection system, learning method, and learning program
CN117472587A * | 2023-12-26 | 2024-01-30 | 广东奥飞数据科技股份有限公司 | Resource scheduling system for an AI intelligent computing center
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20180109 |